πŸ€— Optimum Habana

πŸ€— Optimum Habana is the interface between the πŸ€— Transformers and πŸ€— Diffusers libraries and Habana’s Gaudi processor (HPU). It provides a set of tools that enable easy model loading, training and inference on single- and multi-HPU settings for various downstream tasks, as shown in the tables below.

HPUs offer fast model training and inference as well as a great price-performance ratio. Check out this blog post about BERT pre-training and this article benchmarking Habana Gaudi2 versus Nvidia A100 GPUs for concrete examples. If you are not familiar with HPUs, we recommend you take a look at our conceptual guide.
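In practice, training with πŸ€— Optimum Habana mirrors the usual πŸ€— Transformers workflow: GaudiTrainer and GaudiTrainingArguments stand in for Trainer and TrainingArguments. The sketch below fine-tunes BERT on MRPC as a minimal illustration; the dataset, checkpoint, and hyperparameter choices are placeholders you would swap for your own:

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize a small text-classification dataset
dataset = load_dataset("glue", "mrpc", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["sentence1"], batch["sentence2"], truncation=True, padding="max_length"),
    batched=True,
)

# Gaudi configurations for validated models are hosted on the Hugging Face Hub
gaudi_config = GaudiConfig.from_pretrained("Habana/bert-base-uncased")

# GaudiTrainingArguments extends TrainingArguments with HPU-specific flags
training_args = GaudiTrainingArguments(
    output_dir="./results",
    use_habana=True,     # run on HPU instead of CPU/GPU
    use_lazy_mode=True,  # use Habana's lazy execution mode
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```

The same script scales to multi-HPU and DeepSpeed runs through the launchers shipped with the library, without changes to the training code itself.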

The following model architectures, tasks and device distributions have been validated for πŸ€— Optimum Habana:

In the tables below, βœ… means single-card, multi-card and DeepSpeed have all been validated.

  • Transformers

| Architecture | Training | Inference | Tasks |
|---|:---:|:---:|---|
| BERT | βœ… | βœ… | text classification, question answering, language modeling |
| RoBERTa | βœ… | βœ… | question answering, language modeling |
| ALBERT | βœ… | βœ… | question answering, language modeling |
| DistilBERT | βœ… | βœ… | question answering, language modeling |
| GPT2 | βœ… | βœ… | language modeling, text generation |
| BLOOM(Z) | | DeepSpeed | text generation |
| StarCoder | | Single card | text generation |
| GPT-J | DeepSpeed | Single card, DeepSpeed | language modeling, text generation |
| GPT-NeoX | DeepSpeed | DeepSpeed | language modeling, text generation |
| OPT | | DeepSpeed | text generation |
| Llama 2 / CodeLlama | βœ… | βœ… | language modeling, text generation |
| StableLM | | Single card | text generation |
| Falcon | LoRA | βœ… | text generation |
| CodeGen | | Single card | text generation |
| MPT | | Single card | text generation |
| Mistral | | Single card | text generation |
| Phi | βœ… | Single card | language modeling, text generation |
| Mixtral | | Single card | text generation |
| T5 / Flan T5 | βœ… | βœ… | summarization, translation, question answering |
| BART | | Single card | summarization, translation, question answering |
| ViT | βœ… | βœ… | image classification |
| Swin | βœ… | βœ… | image classification |
| Wav2Vec2 | βœ… | βœ… | audio classification, speech recognition |
| Whisper | βœ… | βœ… | speech recognition |
| SpeechT5 | | Single card | text to speech |
| CLIP | βœ… | βœ… | contrastive image-text training |
| BridgeTower | βœ… | βœ… | contrastive image-text training |
| ESMFold | | Single card | protein folding |
| Blip | | Single card | visual question answering, image to text |
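Many of the text-generation rows above are validated for inference. As a rough sketch of inference-only usage, optimum-habana exposes adapt_transformers_to_gaudi, which patches πŸ€— Transformers with Gaudi-friendly implementations. The snippet below assumes a machine where Habana's PyTorch bridge (habana_frameworks, part of the SynapseAI stack) is installed so that the "hpu" device is available:

```python
import habana_frameworks.torch.core  # noqa: F401 -- imported for its side effect of registering the "hpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

# Swap in Gaudi-optimized model implementations before instantiating the model
adapt_transformers_to_gaudi()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval().to("hpu")

inputs = tokenizer("A list of primary colors:", return_tensors="pt").to("hpu")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```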
  • Diffusers

| Architecture | Training | Inference | Tasks |
|---|:---:|:---:|---|
| Stable Diffusion | | Single card | text-to-image generation |
| LDM3D | | Single card | text-to-image generation |
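Diffusion pipelines follow the same pattern: GaudiStableDiffusionPipeline mirrors the πŸ€— Diffusers API with HPU-specific options. A minimal sketch, assuming the usual Stable Diffusion v1-5 checkpoint and the Gaudi configuration hosted on the Hub:

```python
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "runwayml/stable-diffusion-v1-5"

scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")
pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,      # run on HPU
    use_hpu_graphs=True,  # capture HPU graphs for faster inference
    gaudi_config="Habana/stable-diffusion",  # Gaudi configuration from the Hub
)

outputs = pipeline(
    prompt="An astronaut riding a horse on Mars",
    num_images_per_prompt=2,
)
outputs.images[0].save("astronaut.png")
```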
  • TRL

| Architecture | Training | Inference | Tasks |
|---|:---:|:---:|---|
| Llama 2 | βœ… | | DPO Pipeline |
| Llama 2 | βœ… | | PPO Pipeline |
  • Other models and tasks supported by the πŸ€— Transformers and πŸ€— Diffusers libraries may also work. You can refer to this section to learn how to use them with πŸ€— Optimum Habana. In addition, this page explains how to modify any example from the πŸ€— Transformers library to make it work with πŸ€— Optimum Habana.
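For instance, a model without a ready-made Gaudi configuration on the Hub can still be trained by building a GaudiConfig by hand. The attribute names below follow optimum-habana's GaudiConfig, and the checkpoint name is hypothetical; treat this as an illustrative sketch rather than a definitive recipe:

```python
from transformers import AutoModelForCausalLM
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

# Build a Gaudi configuration from scratch instead of fetching one from the Hub
gaudi_config = GaudiConfig()
gaudi_config.use_fused_adam = True       # Habana's fused AdamW implementation
gaudi_config.use_fused_clip_norm = True  # fused gradient-norm clipping

model = AutoModelForCausalLM.from_pretrained("my-org/my-model")  # hypothetical checkpoint

training_args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,
    use_lazy_mode=True,
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,
    args=training_args,
    # add train_dataset / eval_dataset as in any Trainer setup before calling train()
)
```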