Accelerate Transformers Training on Gaudi

Habana and Hugging Face make training Transformer models on Habana Gaudi® processors faster and easier. Contact Habana to learn how Gaudi processors can help you achieve deep learning training efficiency.

Get Started Training Transformers with Gaudi and Optimum

The Habana SynapseAI® SDK and Hugging Face Optimum provide a robust combination of optimization tools to help you efficiently train and run high-performance Transformer models on Gaudi processors. Together, they give you everything you need to train Transformer models and take advantage of Gaudi's price-performance benefits.

from transformers import AutoModel
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

# Load any model from the Hugging Face Hub (or a local checkpoint)
model = AutoModel.from_pretrained(model_name_or_path)

# Enable training on Gaudi (HPU) devices with lazy-mode graph execution
training_args = GaudiTrainingArguments(output_dir="./results", use_habana=True, use_lazy_mode=True)

# GaudiTrainer mirrors the familiar Transformers Trainer API;
# pass your train/eval datasets here for a full fine-tuning run
gaudi_trainer = GaudiTrainer(model=model, args=training_args)

gaudi_trainer.train()

High-Efficiency Deep Learning Training with Gaudi

Purpose-built for deep learning to drive a new era of training

Deep Learning Efficiency

Architected expressly for deep learning training, Gaudi processors offer inherent design efficiencies that deliver a powerful combination of AI performance and cost-effectiveness.

Announced in October 2021, Amazon EC2 DL1 instances featuring Gaudi accelerators deliver up to 40% better price-performance for training machine learning models compared to the latest GPU-based Amazon EC2 instances.

[Image: Gaudi heterogeneous compute architecture for deep learning]

Deep Learning Usability

Habana’s SynapseAI provides more than 30 reference models for deep learning in computer vision and natural language processing. The Habana Developer Site and GitHub offer the tools, documentation, how-to content, reference models, and community support that make it easy to build new models or migrate existing ones to Gaudi.
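For a sense of what migrating an existing PyTorch model to Gaudi involves, here is a minimal, hypothetical sketch based on SynapseAI's PyTorch integration; the toy model and data stand in for your own training code, and the exact module names should be checked against Habana's migration documentation.

import torch
import habana_frameworks.torch.core as htcore  # Habana's PyTorch bridge, installed with SynapseAI

device = torch.device("hpu")  # Gaudi devices are exposed to PyTorch as "hpu"

# Toy model and data standing in for an existing training setup
model = torch.nn.Linear(16, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()
inputs = torch.randn(32, 16).to(device)
labels = torch.randint(0, 2, (32,)).to(device)

for _ in range(10):
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()
    htcore.mark_step()  # in lazy mode, flush the accumulated graph for execution
    optimizer.step()
    htcore.mark_step()

Aside from moving tensors to the "hpu" device and adding mark_step() calls in lazy mode, the loop is standard PyTorch.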

Now, the combination of SynapseAI and Optimum makes training Transformer models even easier and faster than ever.

[Image: SynapseAI Software Platform]

Deep Learning Versatility

Gaudi processors let customers train models efficiently and easily, wherever and however they need to run deep learning workloads.

Whether in the cloud on Gaudi-based Amazon EC2 DL1 instances or on-premises with systems built on the Supermicro X12 Gaudi Training Server, we've got you covered.
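As a hypothetical sketch of the cloud path, the snippet below launches a Gaudi-based DL1 instance with boto3; the region, AMI ID, and key pair name are placeholders you would replace with your own values (a Habana deep learning AMI is assumed).

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

# dl1.24xlarge instances provide 8 Gaudi accelerators each
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: use a Habana deep learning AMI available in your region
    InstanceType="dl1.24xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",            # placeholder key pair
)
print(response["Instances"][0]["InstanceId"])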

[Image: Flexible Gaudi deployment solutions]

Deep Learning Scalability

Gaudi offers flexible, affordable system scaling: ten 100 Gigabit Ethernet ports are integrated into every Gaudi processor, providing substantial networking capacity.

Seven ports are dedicated to connecting the Gaudi processors within a server, and three ports are dedicated to scale-out, enabling easy, efficient scale-up and scale-out.
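To illustrate scale-out in practice, here is a minimal sketch of multi-card data-parallel training with PyTorch's distributed package and Habana's HCCL backend; it assumes one process per Gaudi card is started by an external launcher that sets WORLD_SIZE and RANK, and module names should be checked against Habana's distributed-training documentation.

import os
import torch
import torch.distributed as dist
import habana_frameworks.torch.distributed.hccl  # registers the "hccl" backend for Gaudi

# Rendezvous settings; a launcher (e.g. torchrun or an MPI wrapper) normally provides these
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
world_size = int(os.environ.get("WORLD_SIZE", "1"))
rank = int(os.environ.get("RANK", "0"))

dist.init_process_group(backend="hccl", world_size=world_size, rank=rank)

device = torch.device("hpu")
model = torch.nn.Linear(16, 2).to(device)

# DistributedDataParallel synchronizes gradients across Gaudi cards during training
ddp_model = torch.nn.parallel.DistributedDataParallel(model)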


Accelerate Transformers Training on Gaudi

Reach out to Habana to learn more about training Hugging Face models on Gaudi processors