Train Transformers faster with IPUs

Graphcore and Hugging Face are working together to make training of Transformer models on IPUs fast and easy. Contact Graphcore to learn more about leveraging IPUs for your training needs.

How to use IPUs with Transformers and Optimum

Take advantage of the power of Graphcore IPUs to train Transformer models with minimal changes to your code, thanks to the IPUTrainer class in Optimum. This plug-and-play experience leverages the full Graphcore software stack so you can train state-of-the-art models on state-of-the-art hardware.
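To run the snippet below you need Optimum's Graphcore support installed alongside the Poplar SDK. At the time of writing it is distributed as the `graphcore` extra of the `optimum` package (check the Optimum documentation for the current package name):

```shell
# Install Optimum with Graphcore support (assumes the Poplar SDK is already set up)
pip install "optimum[graphcore]"
```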

from optimum.graphcore import IPUConfig, IPUTrainer

# The model, training arguments and datasets are prepared exactly as with the
# standard Transformers Trainer; only the trainer class and the IPUConfig change.
model_name_or_path = ...

# IPUConfig specifies parameters related to the IPU, such as pipelining and replication
ipu_config = IPUConfig.from_pretrained(model_name_or_path)
trainer = IPUTrainer(
    model=model,
    ipu_config=ipu_config,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

# Training (checkpoint is an optional path to resume from, or None)
train_result = trainer.train(resume_from_checkpoint=checkpoint)

# Evaluation
metrics = trainer.evaluate()

IPU systems for AI acceleration at scale

Designed to drive the next AI breakthroughs, now integrated with Hugging Face!

Best for NLP & Computer Vision

Accelerate model training and inference with high-performance optimisations across natural language processing, computer vision and more.

Graphcore’s IPU is powering advances in AI applications such as fraud detection for finance, drug discovery for life sciences, defect detection for manufacturing, traffic monitoring for smart cities, and tomorrow’s new breakthroughs.


Poplar Software

The Poplar SDK is a complete software stack co-designed with the IPU for AI application development.

Graphcore’s Poplar graph toolchain is fully integrated with Transformers so developers can easily port existing models. For maximum performance, Poplar enables direct IPU programming in Python and C++.


IPU Systems

Graphcore’s next generation Bow Pod AI computer systems are powered by the world’s first 3D Wafer-on-Wafer processor for AI infrastructure at scale.

Bow systems deliver up to 40% higher performance and 16% better power efficiency for real-world AI applications than their predecessors, all for the same price and with no changes required to existing software.



Contact Graphcore to learn more about leveraging IPUs to train Hugging Face models.