Optimum documentation

Optimum for Intel Gaudi

You are viewing version v1.21.4. A newer version, v1.23.3, is available.

Optimum for Intel Gaudi is the interface between the Transformers and Diffusers libraries and Intel® Gaudi® AI Accelerators (HPUs). It provides a set of tools that enable easy model loading, training and inference on single- and multi-HPU settings for various downstream tasks as shown in the table below.

HPUs offer fast model training and inference as well as a great price-performance ratio. For concrete examples, check out this blog post about BERT pre-training and this post benchmarking Intel Gaudi 2 against NVIDIA A100 GPUs. If you are not familiar with HPUs, we recommend you take a look at our conceptual guide.

The following model architectures, tasks and device distributions have been validated for Optimum for Intel Gaudi:

In the tables below, ✅ means single-card, multi-card and DeepSpeed have all been validated.

  • Transformers:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| BERT | ✅ | ✅ | text classification, question answering, language modeling, text feature extraction |
| RoBERTa | ✅ | ✅ | question answering, language modeling |
| ALBERT | ✅ | ✅ | question answering, language modeling |
| DistilBERT | ✅ | ✅ | question answering, language modeling |
| GPT2 | ✅ | ✅ | language modeling, text generation |
| BLOOM(Z) | | DeepSpeed | text generation |
| StarCoder / StarCoder2 | ✅ | Single card | language modeling, text generation |
| GPT-J | DeepSpeed | Single card, DeepSpeed | language modeling, text generation |
| GPT-NeoX | DeepSpeed | DeepSpeed | language modeling, text generation |
| OPT | | DeepSpeed | text generation |
| Llama 2 / CodeLlama / Llama 3 / Llama Guard / Granite | ✅ | ✅ | language modeling, text generation, question answering, text classification (Llama Guard) |
| StableLM | | Single card | text generation |
| Falcon | LoRA | ✅ | text generation |
| CodeGen | | Single card | text generation |
| MPT | | Single card | text generation |
| Mistral | | Single card | text generation |
| Phi | ✅ | Single card | language modeling, text generation |
| Mixtral | | Single card | text generation |
| Gemma | ✅ | Single card | language modeling, text generation |
| Qwen2 | Single card | Single card | language modeling, text generation |
| Persimmon | | Single card | text generation |
| T5 / Flan T5 | ✅ | ✅ | summarization, translation, question answering |
| BART | | Single card | summarization, translation, question answering |
| ViT | ✅ | ✅ | image classification |
| Swin | ✅ | ✅ | image classification |
| Wav2Vec2 | ✅ | ✅ | audio classification, speech recognition |
| Whisper | ✅ | ✅ | speech recognition |
| SpeechT5 | | Single card | text to speech |
| CLIP | ✅ | ✅ | contrastive image-text training |
| BridgeTower | ✅ | ✅ | contrastive image-text training |
| ESMFold | | Single card | protein folding |
| Blip | | Single card | visual question answering, image to text |
| OWLViT | | Single card | zero shot object detection |
| ClipSeg | | Single card | object segmentation |
| Llava / Llava-next | | Single card | image to text |
| SAM | | Single card | object segmentation |
| VideoMAE | | Single card | video classification |
| TableTransformer | | Single card | table object detection |
| DETR | | Single card | object detection |

  • Diffusers:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| Stable Diffusion | textual inversion, ControlNet | Single card | text-to-image generation |
| Stable Diffusion XL | fine-tuning | Single card | text-to-image generation |
| LDM3D | | Single card | text-to-image generation |

  • PyTorch Image Models/TIMM:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| FastViT | | Single card | image classification |

  • TRL:

| Architecture | Training | Inference | Tasks |
|---|---|---|---|
| Llama 2 | ✅ | | DPO Pipeline |
| Llama 2 | ✅ | | PPO Pipeline |
| Stable Diffusion | ✅ | | DDPO Pipeline |

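
To make the legend concrete, here is a minimal sketch of how a table cell could be expanded into the setups it stands for. The function names, the comma-separated cell convention, and the treatment of an empty cell as "nothing validated" are assumptions made for illustration; only the three sample rows and the meaning of ✅ come from the tables above.

```python
# Per the legend, a check mark covers all three setups.
ALL_SETUPS = frozenset({"single card", "multi-card", "deepspeed"})


def expand(cell: str) -> frozenset:
    """Turn one table cell into the set of validated setups it stands for."""
    cell = cell.strip()
    if cell == "✅":
        return ALL_SETUPS
    if not cell:
        # Assumption: an empty cell means no setup has been validated.
        return frozenset()
    return frozenset(part.strip().lower() for part in cell.split(","))


# (training cell, inference cell) for three rows of the Transformers table.
ROWS = {
    "BERT": ("✅", "✅"),
    "GPT-NeoX": ("DeepSpeed", "DeepSpeed"),
    "Qwen2": ("Single card", "Single card"),
}


def is_validated(arch: str, phase: str, setup: str) -> bool:
    """Check whether `setup` is validated for `arch` in 'training' or 'inference'."""
    training, inference = ROWS[arch]
    cell = training if phase == "training" else inference
    return setup.lower() in expand(cell)
```

For example, `is_validated("BERT", "training", "multi-card")` is true because ✅ expands to all three setups, while `is_validated("Qwen2", "inference", "DeepSpeed")` is false because only single card is listed for that row.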
Other models and tasks supported by the 🤗 Transformers and 🤗 Diffusers libraries may also work. You can refer to this section for using them with 🤗 Optimum Habana. In addition, this page explains how to modify any example from the 🤗 Transformers library to make it work with 🤗 Optimum Habana.
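
As a rough illustration of what adapting a Transformers training script usually involves, the sketch below swaps `Trainer` and `TrainingArguments` for `GaudiTrainer` and `GaudiTrainingArguments` from `optimum.habana`. The helper name `build_gaudi_trainer`, the output directory, and the choice of the `Habana/bert-base-uncased` Gaudi configuration are assumptions for the example; it needs the `optimum-habana` package and an HPU to actually run training, and is not taken from the linked page.

```python
def build_gaudi_trainer(model, train_dataset, output_dir="./gaudi_output"):
    """Return a GaudiTrainer set up for HPU training (hypothetical helper)."""
    # Imported lazily so this module can still be loaded on machines
    # where optimum-habana is not installed.
    from optimum.habana import GaudiTrainer, GaudiTrainingArguments

    args = GaudiTrainingArguments(
        output_dir=output_dir,
        use_habana=True,      # run on HPU instead of CPU/GPU
        use_lazy_mode=True,   # lazy execution mode instead of eager mode
        # Assumption: a BERT-like model; pick the Gaudi configuration
        # that matches your own architecture.
        gaudi_config_name="Habana/bert-base-uncased",
    )
    # Used like transformers.Trainer from here on, e.g. trainer.train().
    return GaudiTrainer(model=model, args=args, train_dataset=train_dataset)
```

The rest of a typical script (dataset loading, tokenization, metrics) can usually stay as it is, which is why the change is confined to the trainer and its arguments here.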
