πŸ€— Optimum Habana

πŸ€— Optimum Habana is the interface between the πŸ€— Transformers and πŸ€— Diffusers libraries and Habana’s Gaudi processor (HPU). It provides a set of tools that enable easy model loading, training and inference on single- and multi-HPU settings for various downstream tasks, as shown in the tables below.

HPUs offer fast model training and inference as well as a great price-performance ratio. Check out this blog post about BERT pre-training and this article benchmarking Habana Gaudi2 versus Nvidia A100 GPUs for concrete examples. If you are not familiar with HPUs, we recommend you take a look at our conceptual guide.
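In practice, training with πŸ€— Optimum Habana mirrors the usual πŸ€— Transformers workflow: GaudiTrainer and GaudiTrainingArguments stand in for Trainer and TrainingArguments. The sketch below fine-tunes BERT on MRPC as a minimal illustration; the dataset, checkpoint, and hyperparameter choices are placeholders you would swap for your own:

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize a small text-classification dataset
dataset = load_dataset("glue", "mrpc", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["sentence1"], batch["sentence2"], truncation=True, padding="max_length"),
    batched=True,
)

# Gaudi configurations for validated models are hosted on the Hugging Face Hub
gaudi_config = GaudiConfig.from_pretrained("Habana/bert-base-uncased")

# GaudiTrainingArguments extends TrainingArguments with HPU-specific flags
training_args = GaudiTrainingArguments(
    output_dir="./results",
    use_habana=True,     # run on HPU instead of CPU/GPU
    use_lazy_mode=True,  # use Habana's lazy execution mode
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```

The same script scales to multi-HPU and DeepSpeed runs through the launchers shipped with the library, without changes to the training code itself.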

The following model architectures, tasks and device distributions have been validated for πŸ€— Optimum Habana:

In the tables below, βœ… means single-card, multi-card and DeepSpeed have all been validated.

  • Transformers

| Architecture | Training | Inference | Tasks |
|---|:---:|:---:|---|
| BERT | βœ… | βœ… | text classification, question answering, language modeling |
| RoBERTa | βœ… | βœ… | question answering, language modeling |
| ALBERT | βœ… | βœ… | question answering, language modeling |
| DistilBERT | βœ… | βœ… | question answering, language modeling |
| GPT2 | βœ… | βœ… | language modeling, text generation |
| BLOOM(Z) | | DeepSpeed | text generation |
| StarCoder | | Single card | text generation |
| GPT-J | DeepSpeed | Single card, DeepSpeed | language modeling, text generation |
| GPT-NeoX | DeepSpeed | DeepSpeed | language modeling, text generation |
| OPT | | DeepSpeed | text generation |
| Llama 2 / CodeLlama | βœ… | βœ… | language modeling, text generation |
| StableLM | | Single card | text generation |
| Falcon | LoRA | βœ… | text generation |
| CodeGen | | Single card | text generation |
| MPT | | Single card | text generation |
| Mistral | | Single card | text generation |
| Phi | βœ… | Single card | language modeling, text generation |
| Mixtral | | Single card | text generation |
| T5 / Flan T5 | βœ… | βœ… | summarization, translation, question answering |
| BART | | Single card | summarization, translation, question answering |
| ViT | βœ… | βœ… | image classification |
| Swin | βœ… | βœ… | image classification |
| Wav2Vec2 | βœ… | βœ… | audio classification, speech recognition |
| Whisper | βœ… | βœ… | speech recognition |
| SpeechT5 | | Single card | text to speech |
| CLIP | βœ… | βœ… | contrastive image-text training |
| BridgeTower | βœ… | βœ… | contrastive image-text training |
| ESMFold | | Single card | protein folding |
| Blip | | Single card | visual question answering, image to text |
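Many of the text-generation rows above are validated for inference. As a rough sketch of inference-only usage, optimum-habana exposes adapt_transformers_to_gaudi, which patches πŸ€— Transformers with Gaudi-friendly implementations. The snippet below assumes a machine where Habana's PyTorch bridge (habana_frameworks, part of the SynapseAI stack) is installed so that the "hpu" device is available:

```python
import habana_frameworks.torch.core  # noqa: F401 -- imported for its side effect of registering the "hpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

# Swap in Gaudi-optimized model implementations before instantiating the model
adapt_transformers_to_gaudi()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval().to("hpu")

inputs = tokenizer("A list of primary colors:", return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```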
  • Diffusers

| Architecture | Training | Inference | Tasks |
|---|:---:|:---:|---|
| Stable Diffusion | | Single card | text-to-image generation |
| LDM3D | | Single card | text-to-image generation |
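Diffusion pipelines follow the same pattern: GaudiStableDiffusionPipeline mirrors the πŸ€— Diffusers API with HPU-specific options. A minimal sketch, assuming the usual Stable Diffusion v1-5 checkpoint and the Gaudi configuration hosted on the Hub:

```python
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "runwayml/stable-diffusion-v1-5"

scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")
pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,      # run on HPU
    use_hpu_graphs=True,  # capture HPU graphs for faster inference
    gaudi_config="Habana/stable-diffusion",  # Gaudi configuration from the Hub
)

outputs = pipeline(
    prompt="An astronaut riding a horse on Mars",
    num_images_per_prompt=2,
)
outputs.images[0].save("astronaut.png")
```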
  • TRL

| Architecture | Training | Inference | Tasks |
|---|:---:|:---:|---|
| Llama 2 | βœ… | | DPO Pipeline |
| Llama 2 | βœ… | | PPO Pipeline |
  • Other models and tasks supported by the πŸ€— Transformers and πŸ€— Diffusers libraries may also work. You can refer to this section to learn how to use them with πŸ€— Optimum Habana. In addition, this page explains how to modify any example from the πŸ€— Transformers library to make it work with πŸ€— Optimum Habana.
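For instance, a model without a ready-made Gaudi configuration on the Hub can still be trained by building a GaudiConfig by hand. The attribute names below follow optimum-habana's GaudiConfig, and the checkpoint name is hypothetical; treat this as an illustrative sketch rather than a definitive recipe:

```python
from transformers import AutoModelForCausalLM
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

# Build a Gaudi configuration from scratch instead of fetching one from the Hub
gaudi_config = GaudiConfig()
gaudi_config.use_fused_adam = True       # Habana's fused AdamW implementation
gaudi_config.use_fused_clip_norm = True  # fused gradient-norm clipping

model = AutoModelForCausalLM.from_pretrained("my-org/my-model")  # hypothetical checkpoint

training_args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,
    use_lazy_mode=True,
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,
    args=training_args,
    # add train_dataset / eval_dataset as in any Trainer setup before calling train()
)
```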