📢 [v0.19.0] Inference Endpoints and robustness!

by @Wauplin (HF staff)

EDIT: Release v0.19.0 is now available on PyPI!

🚀 Inference Endpoints API

Inference Endpoints provides a secure solution to easily deploy models hosted on the Hub in a production-ready infrastructure managed by Hugging Face. With huggingface_hub>=0.19.0, you can now manage your Inference Endpoints programmatically. Combined with the InferenceClient, this becomes the go-to solution to deploy models and run jobs in production, either sequentially or in batch!
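For instance, here is a sketch of creating and listing endpoints programmatically (the endpoint name and the instance parameters below are illustrative placeholders):

>>> from huggingface_hub import create_inference_endpoint, list_inference_endpoints

# Deploy a model on a new endpoint (all parameters are illustrative)
>>> endpoint = create_inference_endpoint(
...     "my-endpoint-name",
...     repository="gpt2",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="medium",
...     instance_type="c6i",
... )

# List all your endpoints
>>> for ep in list_inference_endpoints():
...     print(ep.name, ep.status)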

Here is an example of how to get an Inference Endpoint, wake it up, wait for initialization, run jobs in batch, and pause the endpoint afterwards. All of this in a few lines of code! For more details, please check out our dedicated guide.

>>> import asyncio
>>> from huggingface_hub import get_inference_endpoint

# Get endpoint + wait until initialized
>>> endpoint = get_inference_endpoint("batch-endpoint").resume().wait()

# Run inference
>>> async_client = endpoint.async_client
>>> results = await asyncio.gather(*[async_client.text_generation(...) for job in jobs])

# Pause endpoint
>>> endpoint.pause()

โฌ Improved download experience

huggingface_hub is a library primarily used to transfer (huge!) files with the Hugging Face Hub. Our goal is to keep improving the experience for this core part of the library. In this release, we introduce a more robust download mechanism for slow/limited connections, while improving the UX for users with high bandwidth available!

More robust downloads

Getting a connection error in the middle of a download is frustrating. That's why we've implemented a retry mechanism that automatically reconnects if a connection gets closed or a ReadTimeout error is raised. The download restarts exactly where it stopped, without having to re-download any bytes. A simplified sketch of the idea follows the list below.

  • Retry on ConnectionError/ReadTimeout when streaming file from server by @Wauplin in #1766
  • Reset nb_retries if data has been received from the server by @Wauplin in #1784
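To make the idea concrete, here is a simplified, standalone sketch of resume-on-error using an HTTP Range header. This illustrates the concept only, not the library's actual implementation, and it assumes the server supports range requests:

import requests

def download_with_resume(url: str, path: str, max_retries: int = 5) -> None:
    received = 0
    with open(path, "wb") as f:
        for _ in range(max_retries):
            # Ask the server to resume from the last byte already received
            headers = {"Range": f"bytes={received}-"} if received else {}
            try:
                with requests.get(url, headers=headers, stream=True, timeout=10.0) as r:
                    r.raise_for_status()
                    for chunk in r.iter_content(chunk_size=1024 * 1024):
                        f.write(chunk)
                        received += len(chunk)
                return  # completed without a dropped connection
            except (requests.ConnectionError, requests.ReadTimeout):
                continue  # retry, resuming from `received`
    raise RuntimeError(f"Download failed after {max_retries} retries")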

In addition to this, it is possible to configure huggingface_hub with higher timeouts, thanks to @Shahafgo. This should help get around some issues on slower connections (see the sketch after the list below).

  • Adding the ability to configure the timeout of get request by @Shahafgo in #1720
  • Fix a bug to respect the HF_HUB_ETAG_TIMEOUT. by @Shahafgo in #1728
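For example, here is a minimal sketch of raising those timeouts via environment variables. HF_HUB_ETAG_TIMEOUT comes from the PR above; HF_HUB_DOWNLOAD_TIMEOUT is assumed to be its download-side counterpart:

>>> import os

# Set before importing huggingface_hub: the values are read at import time
>>> os.environ["HF_HUB_ETAG_TIMEOUT"] = "30"      # seconds for metadata requests
>>> os.environ["HF_HUB_DOWNLOAD_TIMEOUT"] = "30"  # seconds for file downloads (assumed name)

>>> from huggingface_hub import hf_hub_download
>>> hf_hub_download(repo_id="gpt2", filename="config.json")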

Progress bars while using hf_transfer

hf_transfer is a Rust-based library focused on improving upload and download speed on machines with high bandwidth available. Once installed (pip install -U hf_transfer), it can be used transparently with huggingface_hub simply by setting HF_HUB_ENABLE_HF_TRANSFER=1 as an environment variable. The counterpart of higher performance is the lack of some user-friendly features such as better error handling and a retry mechanism, meaning it is recommended for power users only. In this release, we nonetheless ship a new feature to improve the UX: progress bars. No need to update any existing code; a simple library upgrade is enough.
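As a quick sketch, enabling it from Python might look like this (the repo and file names are illustrative):

>>> import os

# The flag is read when huggingface_hub is imported, so set it first
>>> os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

>>> from huggingface_hub import hf_hub_download
# Downloads now go through hf_transfer, with progress bars as of this release
>>> hf_hub_download(repo_id="gpt2", filename="pytorch_model.bin")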

📚 Documentation

huggingface-cli guide

huggingface-cli is the CLI tool shipped with huggingface_hub. It recently got some nice improvements, especially with commands to download and upload files directly from the terminal. All of this needed a guide, so here it is!

Environment variables

Environment variables are useful for configuring how huggingface_hub should work. Historically, we had some inconsistencies in how those variables were named. This is now improved, with a backward-compatible approach (as illustrated after the list below). Please check the package reference for more details. The goal is to propagate those changes to the whole HF ecosystem, making configuration easier for everyone.

  • Harmonize environment variables by @Wauplin in #1786
  • Ensure backward compatibility for HUGGING_FACE_HUB_TOKEN env variable by @Wauplin in #1795
  • Do not promote HF_ENDPOINT environment variable by @Wauplin in #1799
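As an illustration of the backward-compatible behavior, both the new and the legacy token variables are honored. A hedged sketch, assuming HF_TOKEN is the harmonized name and HUGGING_FACE_HUB_TOKEN the legacy one from the PR above:

>>> import os

# New harmonized name...
>>> os.environ["HF_TOKEN"] = "hf_***"
# ...while the legacy name keeps working for backward compatibility:
# os.environ["HUGGING_FACE_HUB_TOKEN"] = "hf_***"

>>> from huggingface_hub import whoami
>>> whoami()["name"]  # the token is picked up from the environment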

Hindi translation

Hindi documentation landed on the Hub thanks to @aneeshd27! Check out the Hindi version of the quickstart guide here.

  • Added translation of 3 files as mentioned in issue by @aneeshd27 in #1772

Minor docs fixes

💔 Breaking changes

Legacy ModelSearchArguments and DatasetSearchArguments have been completely removed from huggingface_hub. This shouldn't cause problems, as they were already unused (and unusable in practice).

  • Removed GeneralTags, ModelTags and DatasetTags by @VictorHugoPilled in #1761

Classes containing details about a repo (ModelInfo, DatasetInfo and SpaceInfo) have been refactored by @mariosasko to be more Pythonic and aligned with the other classes in huggingface_hub. In particular, those objects are now based on the dataclass module instead of a custom ReprMixin class. Every change is meant to be backward compatible, meaning no breaking changes are expected. However, if you detect any inconsistency, please let us know and we will fix it as soon as possible. A sketch of what this enables is shown below.
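For example, since these info classes are now dataclasses, the standard dataclasses helpers should apply to them (a sketch; attribute access itself is unchanged):

>>> import dataclasses
>>> from huggingface_hub import HfApi

>>> info = HfApi().model_info("gpt2")
>>> print(info)  # clean, dataclass-style repr
# and dataclass utilities now work out of the box
>>> dataclasses.asdict(info)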

The legacy Repository and InferenceAPI classes are now deprecated but will not be removed before the next major release (v1.0).
Instead of the git-based Repository, we advise using the HTTP-based HfApi. Check out this guide explaining the reasons behind it. For InferenceAPI, we recommend switching to InferenceClient, which is much more feature-complete and will keep getting improved.
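A migration sketch under those recommendations (the repo id and file names below are illustrative placeholders):

>>> from huggingface_hub import HfApi, InferenceClient

# HTTP-based upload with HfApi, instead of a git clone/commit/push via Repository
>>> api = HfApi()
>>> api.upload_file(
...     path_or_fileobj="path/to/model.safetensors",
...     path_in_repo="model.safetensors",
...     repo_id="username/my-model",
... )

# InferenceClient instead of the deprecated InferenceAPI
>>> client = InferenceClient(model="gpt2")
>>> client.text_generation("The huggingface_hub library is ", max_new_tokens=12)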

โš™๏ธ Miscellaneous improvements, fixes and maintenance

InferenceClient

HfFileSystem

  • [hffs] Raise NotImplementedError on transaction commits by @Wauplin in #1736
  • Fix huggingface filesystem repo_type not forwarded by @Wauplin in #1791
  • Fix HfFileSystemFile when init fails + improve error message by @Wauplin in #1805

FIPS compliance

  • Set usedforsecurity=False in hashlib methods (FIPS compliance) by @Wauplin in #1782
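Concretely, the pattern here is Python's usedforsecurity flag (available since Python 3.9), which marks hashes used for integrity checks rather than for security:

>>> import hashlib

# File hashes are for integrity/caching, not cryptographic security,
# so they can be flagged as such on FIPS-enabled systems (Python 3.9+)
>>> hashlib.sha256(b"example payload", usedforsecurity=False).hexdigest()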

Misc fixes

Internal

  • Bump version to 0.19 by @Wauplin in #1723
  • Make @retry_endpoint a default for all test by @Wauplin in #1725
  • Retry test on 502 Bad Gateway by @Wauplin in #1737
  • Consolidated mypy type ignores in InferenceClient.post by @jamesbraza in #1742
  • fix: remove useless token by @rtrompier in #1765
  • Fix CI (typing-extensions minimal requirement) by @Wauplin in #1781
  • remove black formatter to use only ruff by @Wauplin in #1783
  • Separate test and prod cache (+ ruff formatter) by @Wauplin in #1789
  • fix 3.8 tensorflow in ci by @Wauplin (direct commit on main)

🤗 Significant community contributions

The following contributors have made significant changes to the library over the last release:

  • @VictorHugoPilled
    • Removed GeneralTags, ModelTags and DatasetTags (#1761)
  • @aneeshd27
    • Added translation of 3 files as mentioned in issue (#1772)

Looks good to me! Nice work :)

Super cool! The inference endpoints API examples are so nice (literally was about to start handcrafting something for this and now I don't need to!)
