Accelerate Transformers on State-of-the-Art Hardware
Hugging Face is partnering with leading AI hardware accelerator makers to make state-of-the-art production performance accessible.
Meet the Hugging Face Hardware Partners
Train Transformers faster with IPUs
Accelerate Transformers Training on Gaudi
Scale with Xeon
Optimum: the ML Optimization toolkit for production performance
Hardware-specific acceleration tools
Make models faster with minimal impact on accuracy, leveraging post-training quantization, quantization-aware training and dynamic quantization from Intel® Neural Compressor.
from optimum.intel.neural_compressor.quantization import IncQuantizerForSequenceClassification

# Create quantizer from config
# (eval_func is a user-supplied evaluation function)
quantizer = IncQuantizerForSequenceClassification.from_config(
    "echarlaix/bert-base-dynamic-quant-test",
    config_name="quantization.yml",
    eval_func=eval_func,
)

# Apply dynamic quantization and return the quantized model
model = quantizer.fit_dynamic()
Make models smaller with minimal impact on accuracy, with easy-to-use configurations to remove model weights using Intel® Neural Compressor.
from optimum.intel.neural_compressor.pruning import IncPrunerForSequenceClassification

# Create pruner from config
# (eval_func and train_func are user-supplied evaluation and training functions)
pruner = IncPrunerForSequenceClassification.from_config(
    "echarlaix/distilbert-base-uncased-sst2-magnitude-pruning-test",
    config_name="prune.yml",
    eval_func=eval_func,
    train_func=train_func,
)

# Run pruning and retrieve the pruned model
prune = pruner.fit()
model = prune()
Train models faster than ever before with Graphcore Intelligence Processing Units (IPUs), the latest generation of AI-dedicated hardware, leveraging the built-in IPUTrainer API to train or fine-tune Transformers models (coming soon).
from optimum.graphcore import IPUConfig, IPUTrainer
from transformers import BertForPreTraining, BertTokenizer

# Allocate model and tokenizer as usual
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForPreTraining.from_pretrained("bert-base-cased")

# IPU configuration + Trainer
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")
trainer = IPUTrainer(model, ipu_config=ipu_config, args=training_args)

# The Trainer takes care of compiling the model for the IPUs in the background
# to perform training; the user does not have to deal with that
trainer.train()

# Save the model and/or push to hub
model.save_pretrained("...")
model.push_to_hub("...")