🚀 Just released version 0.24.0 of the huggingface_hub Python library!
Exciting updates include: ⚡ InferenceClient is now a drop-in replacement for OpenAI's chat completion client!
✨ Support for response_format, adapter_id, truncate, and more in InferenceClient
💾 Serialization module with a save_torch_model helper that handles shared layers, sharding, naming conventions, and safe serialization. It's basically a condensed version of logic previously scattered across safetensors, transformers, and accelerate.
📁 Optimized HfFileSystem to avoid getting rate-limited when browsing HuggingFaceFW/fineweb
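For context, HfFileSystem exposes Hub repos through an fsspec-style filesystem interface, so browsing a large dataset like fineweb is an ordinary `ls` call. A small sketch (the listing is guarded behind a hypothetical RUN_NETWORK_EXAMPLES variable so the snippet also runs offline):

```python
import os

from huggingface_hub import HfFileSystem

# fsspec-compatible view of the Hub; works with pandas, duckdb, etc.
fs = HfFileSystem()

# Guarded: listing a repo calls the Hub API over the network.
if os.environ.get("RUN_NETWORK_EXAMPLES"):
    paths = fs.ls("datasets/HuggingFaceFW/fineweb", detail=False)
    print(paths[:5])  # first few entries of the dataset repo
```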
🔨 HfApi & CLI improvements: prevent empty commits, create a repo inside a resource group, webhooks API, more options in the search API, etc.
🚀 I'm excited to announce that huggingface_hub's InferenceClient now supports OpenAI's Python client syntax! For developers integrating AI into their codebases, this means you can switch to open-source models by changing just three lines of code. Here's a quick example of how easy it is.
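A minimal sketch of the switch (the model name and prompt are placeholders I chose, and the completion call is guarded behind HF_TOKEN so the snippet runs offline):

```python
import os

from huggingface_hub import InferenceClient  # was: from openai import OpenAI

client = InferenceClient()  # was: client = OpenAI()

messages = [{"role": "user", "content": "What is deep learning, in one sentence?"}]

# Guarded: the actual completion call needs a Hugging Face token and network access.
if os.environ.get("HF_TOKEN"):
    response = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3-8B-Instruct",  # was: model="gpt-4o"
        messages=messages,
        max_tokens=60,
    )
    print(response.choices[0].message.content)
```

The request and response objects follow the same shape as OpenAI's client, so the rest of your code stays untouched.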
Why use the InferenceClient?
🔁 Seamless transition: keep your existing code structure while leveraging LLMs hosted on the Hugging Face Hub.
🤗 Direct integration: easily launch a model and run inference using our Inference Endpoints service.
🚀 Stay updated: always be in sync with the latest Text-Generation-Inference (TGI) updates.