🤗 Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency.
The AI ecosystem evolves quickly, and new specialized hardware, each with its own optimizations, emerges regularly. Optimum lets developers use any of these platforms efficiently, with the same ease of use as Transformers.
🤗 Optimum is distributed as a collection of packages. Check out the links below for an in-depth look at each one.
Maximize training throughput and efficiency with Habana's Gaudi processor
Optimize your model to speed up inference with OpenVINO and Neural Compressor
Accelerate your training and inference workflows with AWS Trainium and AWS Inferentia
Accelerate inference with NVIDIA TensorRT-LLM on the NVIDIA platform
Enable performance optimizations for AMD Instinct GPUs and AMD Ryzen AI NPUs
Fast and efficient inference on FuriosaAI WARBOY
Apply quantization and graph optimization to accelerate training and inference of Transformers models with ONNX Runtime
A one-liner integration to use PyTorch's BetterTransformer with Transformers models