Accelerate Transformers on State-of-the-Art Hardware

Hugging Face is partnering with leading AI hardware companies to make state-of-the-art production performance accessible

Optimum: the ML Optimization toolkit for production performance

Hardware-specific acceleration tools

1. Quantize

Make models faster with minimal impact on accuracy, leveraging post-training quantization, quantization-aware training and dynamic quantization from Intel® Low Precision Optimization Tool (LPOT).

from optimum.intel import LpotQuantizerForSequenceClassification

# Create quantizer from config
# (the checkpoint name and config path below are placeholders)
quantizer = LpotQuantizerForSequenceClassification.from_config(
    "path/to/quantization_config"
)

model = quantizer.fit_dynamic()
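To illustrate the underlying idea, dynamic quantization stores weights in int8 and quantizes activations on the fly at inference time. A minimal sketch with plain PyTorch (the toy model below is illustrative, not Optimum's API):

```python
import torch
import torch.nn as nn

# A small stand-in model; any model containing nn.Linear layers works
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 2))

# Convert all Linear layers to int8 weights; activations are
# quantized dynamically at inference time
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
out = quantized(x)  # outputs keep the same shape as the float model
```

The quantized model is a drop-in replacement for inference: its inputs and outputs keep the same shapes and dtypes as the float model, while the int8 weights shrink memory footprint and speed up matrix multiplications on CPU.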

2. Prune

Make models smaller with minimal impact on accuracy, with easy-to-use configurations to remove model weights using Intel® Low Precision Optimization Tool (LPOT).

from optimum.intel import LpotPrunerForSequenceClassification

# Create pruner from config
# (the config path below is a placeholder)
pruner = LpotPrunerForSequenceClassification.from_config(
    "path/to/pruning_config"
)

model = pruner.fit()
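The core technique behind weight removal is magnitude pruning: weights with the smallest absolute values are zeroed out. A minimal sketch with PyTorch's built-in pruning utilities (the single layer below is illustrative, not Optimum's API):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Example layer; in practice this would be a weight matrix inside a transformer
layer = nn.Linear(100, 100)

# Zero out the 50% of weights with the smallest magnitude (L1 criterion)
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Make the pruning permanent by removing the re-parametrization
prune.remove(layer, "weight")

sparsity = float((layer.weight == 0).sum()) / layer.weight.numel()
print(f"sparsity: {sparsity:.0%}")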

3. Train

Train models faster than ever with Graphcore Intelligence Processing Units (IPUs), the latest generation of AI-dedicated hardware, leveraging the built-in IPUTrainer API to train or fine-tune Transformers models (coming soon).

from optimum.graphcore import IPUTrainer
from optimum.graphcore.bert import BertIPUConfig
from transformers import BertForMaskedLM, BertTokenizer
from poptorch.optim import AdamW

# Allocate model and tokenizer as usual
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")

# Optional: customize the Trainer with a poptorch-specific configuration
ipu_config = BertIPUConfig()
trainer = IPUTrainer(model, training_args, config=ipu_config)
optimizer = AdamW(model.parameters())

# This is hidden from the user, it will be handled by the Trainer
with trainer.compile(some_data_loader) as model_f:
    for step in range(...):
        outputs = trainer.step(optimizer)

# Save the model and/or push to hub
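Each trainer.step() above corresponds to one forward/backward/update cycle, which the Trainer compiles for the IPU. The same cycle can be sketched in plain PyTorch with a toy model standing in for a transformer (all names below are illustrative, not the IPUTrainer API):

```python
import torch
import torch.nn as nn

# Toy model and data standing in for a transformer and its data loader
model = nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x = torch.randn(32, 8)
y = x.sum(dim=1, keepdim=True)

# One forward/backward/update cycle per step, as the Trainer performs internally
for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"final loss: {loss.item():.4f}")
```

On an IPU, poptorch compiles this loop into a single on-device program, so the Python-side overhead of each step disappears; the user-facing API stays the familiar Trainer pattern.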

Meet the Hugging Face Hardware Partners

Scale with Xeon

Do more with Snapdragon

Train with IPU