🤗 Optimum Habana

You are viewing the main version, which requires installation from source. If you'd like a regular pip install, check out the latest stable version (v1.19.0).

🤗 Optimum Habana is the interface between the 🤗 Transformers and 🤗 Diffusers libraries and Habana's Gaudi processor (HPU). It provides a set of tools that enable easy model loading, training and inference in single- and multi-HPU settings for various downstream tasks, as shown in the tables below.

HPUs offer fast model training and inference as well as a great price-performance ratio. Check out this blog post about BERT pre-training and this article benchmarking Habana Gaudi2 versus Nvidia A100 GPUs for concrete examples. If you are not familiar with HPUs, we recommend you take a look at our conceptual guide.

The following model architectures, tasks and device distributions have been validated for 🤗 Optimum Habana:

In the tables below, ✅ means single-card, multi-card and DeepSpeed have all been validated.

  • Transformers:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| BERT | ✅ | ✅ | text classification, question answering, language modeling |
| RoBERTa | ✅ | ✅ | question answering, language modeling |
| ALBERT | ✅ | ✅ | question answering, language modeling |
| DistilBERT | ✅ | ✅ | question answering, language modeling |
| GPT2 | ✅ | ✅ | language modeling, text generation |
| BLOOM(Z) | | DeepSpeed | text generation |
| StarCoder | | Single card | text generation |
| GPT-J | DeepSpeed | Single card, DeepSpeed | language modeling, text generation |
| GPT-NeoX | DeepSpeed | DeepSpeed | language modeling, text generation |
| OPT | | DeepSpeed | text generation |
| Llama 2 / CodeLlama | ✅ | ✅ | language modeling, text generation |
| StableLM | | Single card | text generation |
| Falcon | LoRA | ✅ | text generation |
| CodeGen | | Single card | text generation |
| MPT | | Single card | text generation |
| Mistral | | Single card | text generation |
| Phi | ✅ | Single card | language modeling, text generation |
| Mixtral | | Single card | text generation |
| T5 / Flan T5 | ✅ | ✅ | summarization, translation, question answering |
| BART | | Single card | summarization, translation, question answering |
| ViT | ✅ | ✅ | image classification |
| Swin | ✅ | ✅ | image classification |
| Wav2Vec2 | ✅ | ✅ | audio classification, speech recognition |
| Whisper | ✅ | ✅ | speech recognition |
| SpeechT5 | | Single card | text to speech |
| CLIP | ✅ | ✅ | contrastive image-text training |
| BridgeTower | ✅ | ✅ | contrastive image-text training |
| ESMFold | | Single card | protein folding |
| Blip | | Single card | visual question answering, image to text |

  • Diffusers:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| Stable Diffusion | | Single card | text-to-image generation |
| LDM3D | | Single card | text-to-image generation |

  • TRL:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| Llama 2 | ✅ | | DPO Pipeline |
| Llama 2 | ✅ | | PPO Pipeline |

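As a reading aid for the legend, the sketch below expands a table cell into explicit entries: the ✅ mark becomes the three validated device distributions, and any other label is passed through unchanged. The helper name `expand_cell`, the `SUPPORT` dictionary and the spelling "Multi card" are illustrative assumptions, not part of 🤗 Optimum Habana; the three sample rows are copied from the Transformers table.

```python
# Hypothetical reading aid for the support tables; not an Optimum Habana API.
ALL_VALIDATED = "✅"
FULL_SET = ("Single card", "Multi card", "DeepSpeed")  # meaning of ✅ per the legend

# A few rows copied from the Transformers table (empty list = empty cell).
SUPPORT = {
    "BERT": {"training": ["✅"], "inference": ["✅"]},
    "GPT-J": {"training": ["DeepSpeed"], "inference": ["Single card", "DeepSpeed"]},
    "Mistral": {"training": [], "inference": ["Single card"]},
}


def expand_cell(labels):
    """Expand the labels of one table cell, replacing ✅ with the full set."""
    expanded = []
    for label in labels:
        items = FULL_SET if label == ALL_VALIDATED else (label,)
        for item in items:
            if item not in expanded:  # keep order, drop duplicates
                expanded.append(item)
    return expanded


if __name__ == "__main__":
    for name, row in SUPPORT.items():
        print(name, expand_cell(row["training"]), expand_cell(row["inference"]))
```

An empty result means the table lists nothing as validated for that column; as noted below, other configurations may still work.
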
Other models and tasks supported by the 🤗 Transformers and 🤗 Diffusers libraries may also work. You can refer to this section for using them with 🤗 Optimum Habana. In addition, this page explains how to modify any example from the 🤗 Transformers library to make it work with 🤗 Optimum Habana.
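
A minimal sketch of what adapting a 🤗 Transformers training script typically involves: the `Trainer` and `TrainingArguments` classes are swapped for `GaudiTrainer` and `GaudiTrainingArguments` from `optimum.habana`, with HPU-specific arguments added. The helper names `gaudi_training_kwargs` and `build_gaudi_trainer`, the output directory and the choice of `Habana/bert-base-uncased` as Gaudi configuration are assumptions made for illustration; check the linked page for the options that apply to your model.

```python
from typing import Any, Dict


def gaudi_training_kwargs(output_dir: str, gaudi_config_name: str) -> Dict[str, Any]:
    """Collect the arguments that differ from a plain Transformers setup.

    Illustrative helper, not part of Optimum Habana.
    """
    return {
        "output_dir": output_dir,
        "use_habana": True,      # run on HPU
        "use_lazy_mode": True,   # lazy-mode execution on Gaudi
        "gaudi_config_name": gaudi_config_name,  # Gaudi config to load (assumed value)
    }


def build_gaudi_trainer(model, train_dataset,
                        output_dir: str = "./out",
                        gaudi_config_name: str = "Habana/bert-base-uncased"):
    """Create a GaudiTrainer where a script would otherwise create a Trainer."""
    # Imported lazily so this file can still be read and tested on a machine
    # without optimum-habana or Gaudi hardware installed.
    from optimum.habana import GaudiTrainer, GaudiTrainingArguments

    args = GaudiTrainingArguments(**gaudi_training_kwargs(output_dir, gaudi_config_name))
    return GaudiTrainer(model=model, args=args, train_dataset=train_dataset)
```

Calling `build_gaudi_trainer(model, train_dataset).train()` would then take the place of the usual `Trainer(...).train()` call; it needs an environment with `optimum-habana` installed and a Gaudi device available.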