Hugging Face on Amazon SageMaker
Deep Learning Containers
Deep Learning Containers (DLCs) are Docker images pre-installed with deep learning frameworks and libraries such as 🤗 Transformers, 🤗 Datasets, and 🤗 Tokenizers. The DLCs allow you to start training models immediately, skipping the complicated process of building and optimizing your training environments from scratch. Our DLCs are thoroughly tested and optimized for deep learning environments, requiring no configuration or maintenance on your part. In particular, the Hugging Face Inference DLC comes with a pre-written serving stack which drastically lowers the technical bar of deep learning serving.
Our DLCs are available everywhere Amazon SageMaker is available. While it is possible to use the DLCs without the SageMaker Python SDK, there are many advantages to using SageMaker to train your model:
- Cost-effective: Training instances are only live for the duration of your job. Once your job is complete, the training cluster stops, and you won’t be billed anymore. SageMaker also supports Spot instances, which can reduce costs up to 90%.
- Built-in automation: SageMaker automatically stores training metadata and logs in a serverless managed metastore and fully manages I/O operations with S3 for your datasets, checkpoints, and model artifacts.
- Multiple security mechanisms: SageMaker offers encryption at rest, in transit, Virtual Private Cloud connectivity, and Identity and Access Management to secure your data and code.
Hugging Face DLCs are open source and licensed under Apache 2.0. Feel free to reach out on our community forum if you have any questions. For premium support, our Expert Acceleration Program gives you direct dedicated support from our team.
Features & benefits 🔥
Hugging Face DLCs make it easier than ever to train Transformer models in SageMaker. Here is why you should consider using Hugging Face DLCs to train and deploy your next machine learning models:
One command is all you need
With the new Hugging Face DLCs, train cutting-edge Transformers-based NLP models in a single line of code. Choose from multiple DLC variants, each one optimized for TensorFlow and PyTorch, single-GPU, single-node multi-GPU, and multi-node clusters.
Accelerate machine learning from science to production
In addition to Hugging Face DLCs, we created a first-class Hugging Face extension for the SageMaker Python SDK to accelerate data science teams, reducing the time required to set up and run experiments from days to minutes.
You can use the Hugging Face DLCs with SageMaker’s automatic model tuning to optimize your training hyperparameters and increase the accuracy of your models.
Deploy your trained models for inference with just one more line of code or select any of the 10,000+ publicly available models from the model Hub and deploy them with SageMaker.
Easily track and compare your experiments and training artifacts in SageMaker Studio’s web-based integrated development environment (IDE).
Built-in performance
Hugging Face DLCs feature built-in performance optimizations for PyTorch and TensorFlow to train NLP models faster. The DLCs also give you the flexibility to choose a training infrastructure that best aligns with the price/performance ratio for your workload.
The Hugging Face Training DLCs are fully integrated with SageMaker distributed training libraries to train models faster than ever, using the latest generation of instances available on Amazon Elastic Compute Cloud.
Hugging Face Inference DLCs provide you with production-ready endpoints that scale quickly with your AWS environment, built-in monitoring, and a ton of enterprise features.
Resources, Documentation & Samples 📄
Take a look at our published blog posts, videos, documentation, sample notebooks and scripts for additional help and more context about Hugging Face DLCs on SageMaker.
Blogs and videos
- AWS: Embracing natural language processing with Hugging Face
- Deploy Hugging Face models easily with Amazon SageMaker
- AWS and Hugging Face collaborate to simplify and accelerate adoption of natural language processing models
- Walkthrough: End-to-End Text Classification
- Working with Hugging Face models on Amazon SageMaker
- Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker
- Deploy a Hugging Face Transformers Model from S3 to Amazon SageMaker
- Deploy a Hugging Face Transformers Model from the Model Hub to Amazon SageMaker
Documentation
- Run training on Amazon SageMaker
- Deploy models to Amazon SageMaker
- Reference
- Amazon SageMaker documentation for Hugging Face
- Python SDK SageMaker documentation for Hugging Face
- Deep Learning Container
- SageMaker’s Distributed Data Parallel Library
- SageMaker’s Distributed Model Parallel Library
Sample notebooks
- All notebooks
- Getting Started with Pytorch
- Getting Started with Tensorflow
- Distributed Training Data Parallelism
- Distributed Training Model Parallelism
- Spot Instances and continue training
- SageMaker Metrics
- Distributed Training Data Parallelism Tensorflow
- Distributed Training Summarization
- Image Classification with Vision Transformer
- Deploy one of the 10 000+ Hugging Face Transformers to Amazon SageMaker for Inference
- Deploy a Hugging Face Transformer model from S3 to SageMaker for inference