metadata

title: README
emoji: 📚
colorFrom: blue
colorTo: green
sdk: static
pinned: false

together we advance_AI

AI is increasingly pervasive across the modern world. It’s driving our smart technology in retail, cities, factories and healthcare, and transforming our digital homes. AMD offers advanced AI acceleration from data center to edge, enabling high performance and high efficiency to make the world smarter.

Getting Started with Hugging Face Transformers

Want to run the most common Hugging Face transformers for inference on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using AMD ROCm™ software? The steps below cover the basics, which you can then build on to fine-tune a base model or even develop your own. General Linux and ML experience is a prerequisite.

1. Confirm you have a supported AMD hardware platform

Is my hardware supported with ROCm on Linux?

2. Install ROCm driver, libraries and tools

Follow the detailed installation instructions for your Linux-based platform.
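After installation, a quick sanity check can confirm that the ROCm tools are on the PATH and that a GPU is enumerated. The sketch below assumes the `rocminfo` utility that ships with ROCm is installed and that its output names GPU agents with `gfx` architecture identifiers. The helper names `parse_gfx_targets` and `detect_rocm_gpus` are hypothetical and not part of ROCm.

```python
import re
import shutil
import subprocess


def parse_gfx_targets(rocminfo_output: str) -> list[str]:
    """Return the unique gfx architecture names found in rocminfo output, in order."""
    seen = []
    for name in re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_output):
        if name not in seen:
            seen.append(name)
    return seen


def detect_rocm_gpus() -> list[str]:
    """Run rocminfo if it is on the PATH; return [] when it cannot be found."""
    exe = shutil.which("rocminfo")
    if exe is None:
        return []
    result = subprocess.run([exe], capture_output=True, text=True, check=False)
    return parse_gfx_targets(result.stdout)


if __name__ == "__main__":
    targets = detect_rocm_gpus()
    print("ROCm GPU targets:", targets or "none detected")
```

An empty result means either that `rocminfo` is not on the PATH or that no GPU agent was reported, so it is worth revisiting the installation instructions in that case.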

3. Install Machine Learning Frameworks

Pip installation is an easy way to acquire all the required packages and is described in more detail below.

If you prefer a container-based approach, check out the pre-built images at ROCm Docker Hub and AMD Infinity Hub after installing the required dependencies.

PyTorch

AMD ROCm is fully integrated into the mainline PyTorch ecosystem. Pip wheels are built and tested as part of the stable and nightly releases. Go to pytorch.org and use the 'Install PyTorch' widget. Select 'Stable + Linux + Pip + Python + ROCm' to get the specific pip installation command.

An example command line (note the ROCm version in the wheel index URL):

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
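Once the wheel is installed, it can help to confirm that it is a ROCm build and that the GPU is visible. The following is a minimal sketch, not an official check: ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` namespace, and `torch.version.hip` is `None` on non-ROCm builds. The helper `describe_backend` is a hypothetical name introduced for illustration.

```python
import importlib.util


def describe_backend(hip_version, gpu_available: bool) -> str:
    """Summarize which GPU backend a PyTorch build targets."""
    if hip_version is None:
        return "not a ROCm build of PyTorch"
    status = "GPU visible" if gpu_available else "no GPU visible"
    return f"ROCm/HIP {hip_version}, {status}"


if importlib.util.find_spec("torch") is not None:
    import torch

    # ROCm builds reuse the torch.cuda API; torch.version.hip is None otherwise.
    print(describe_backend(torch.version.hip, torch.cuda.is_available()))
    if torch.cuda.is_available():
        print("Device 0:", torch.cuda.get_device_name(0))
else:
    print("PyTorch is not installed in this environment")
```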

TensorFlow

AMD ROCm is upstreamed into the TensorFlow GitHub repository. Pre-built wheels are hosted on pypi.org.

The latest version can be installed with this command:

pip install tensorflow-rocm
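A similar check can be run for TensorFlow. This sketch assumes the `tensorflow-rocm` wheel is importable under the usual `tensorflow` module name and uses `tf.config.list_physical_devices` to list GPUs. The helper `summarize_gpus` is a hypothetical name used only here.

```python
import importlib.util


def summarize_gpus(device_names: list[str]) -> str:
    """Turn a list of physical GPU device names into a one-line summary."""
    if not device_names:
        return "TensorFlow sees no GPU"
    return f"TensorFlow sees {len(device_names)} GPU(s): " + ", ".join(device_names)


if importlib.util.find_spec("tensorflow") is not None:
    import tensorflow as tf

    # Each PhysicalDevice carries a name such as "/physical_device:GPU:0".
    gpus = tf.config.list_physical_devices("GPU")
    print(summarize_gpus([gpu.name for gpu in gpus]))
else:
    print("TensorFlow is not installed in this environment")
```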

4. Use a Hugging Face Model

Now that you have the base requirements installed, get the latest transformer models.

pip install transformers

This allows you to easily import any of the base models into your Python application. Here is an example using GPT2 in PyTorch:

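# Load the pretrained GPT-2 tokenizer and base model from the Hugging Face Hub
from transformers import GPT2Tokenizer, GPT2Model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
# Tokenize the input text into PyTorch tensors
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
# Run a forward pass; output.last_hidden_state holds the per-token features
output = model(**encoded_input)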
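The example above runs on the CPU unless the model and inputs are moved to a device explicitly. The sketch below shows one way to run the same GPT2 forward pass on the GPU, assuming a ROCm build of PyTorch, where AMD GPUs are addressed with the `"cuda"` device string. The function names `pick_device` and `gpt2_hidden_state_shape` are hypothetical and introduced only for illustration.

```python
def pick_device(gpu_available: bool) -> str:
    """ROCm builds of PyTorch address AMD GPUs through the "cuda" device string."""
    return "cuda" if gpu_available else "cpu"


def gpt2_hidden_state_shape(text: str) -> tuple:
    """Run GPT-2 once on the selected device and return the last hidden state shape."""
    # Imported lazily so pick_device stays usable without the ML stack installed.
    import torch
    from transformers import GPT2Model, GPT2Tokenizer

    device = pick_device(torch.cuda.is_available())
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2").to(device).eval()

    # The tokenizer returns a BatchEncoding; .to() moves its tensors to the device.
    encoded_input = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():  # inference only, so skip gradient tracking
        output = model(**encoded_input)
    return tuple(output.last_hidden_state.shape)
```

Calling `gpt2_hidden_state_shape("Replace me by any text you'd like.")` downloads the `gpt2` weights on first use and returns a `(batch, tokens, hidden_size)` tuple; the hidden size of the base `gpt2` checkpoint is 768.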

All of the 200+ standard transformer models are regularly tested with our supported hardware platforms. This implies that all derivatives of those core models should function correctly as well. If you run into issues, let us know on our ROCm Community page.

Here are a few of the more popular ones to get you started:

Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.

5. Optimum Support

For a deeper dive into using Hugging Face libraries on AMD GPUs, check out the Optimum page describing details on Flash Attention 2, GPTQ Quantization and ONNX Runtime integration.

Details on getting started with Hugging Face models are available on the Optimum page.

Serving a model with TGI

Text Generation Inference (a.k.a. “TGI”) provides an end-to-end solution to deploy large language models for inference at scale. TGI is already usable in production on AMD Instinct™ GPUs through the Docker image ghcr.io/huggingface/text-generation-inference:latest-rocm. Make sure to refer to the documentation concerning the support and any limitations.
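Once a TGI container is running, it can be queried over HTTP. The following is a minimal client sketch using only the Python standard library and TGI's `/generate` endpoint. The address `http://localhost:8080` is an assumption about how the container was started, and the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: a TGI container is already running and reachable at this address.
TGI_URL = "http://localhost:8080"


def build_generate_request(prompt: str, max_new_tokens: int = 32) -> dict:
    """Build the JSON body for TGI's /generate endpoint."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def generate(prompt: str, base_url: str = TGI_URL, timeout: float = 60.0) -> str:
    """POST a prompt to a running TGI server and return the generated text."""
    body = json.dumps(build_generate_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["generated_text"]
```

With a server up, `generate("What is ROCm?")` returns the completion as a string; adjust `base_url` to match the port mapping used when launching the container.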

Benchmarking

The Optimum-Benchmark is available as a utility to easily benchmark the performance of transformers on AMD GPUs, across normal and distributed settings, with various supported optimizations and quantization schemes.
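For a rough first measurement before setting up a full benchmark, a hand-rolled timing loop can be enough. The sketch below is not the Optimum-Benchmark API; it is a generic latency harness with warmup runs, and `measure_latency` is a hypothetical helper name.

```python
import statistics
import time
from typing import Callable


def measure_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 10) -> dict:
    """Time fn() over several runs after a warmup and report results in seconds."""
    for _ in range(warmup):
        fn()  # warmup runs absorb one-time costs such as kernel compilation
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


if __name__ == "__main__":
    # Stand-in workload; for a GPU model, call torch.cuda.synchronize() inside fn
    # so that asynchronous kernels finish before the timer stops.
    print(measure_latency(lambda: sum(range(100_000))))
```

Numbers from a loop like this are only indicative; the Optimum-Benchmark utility covers distributed settings, optimizations and quantization schemes that this sketch does not.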

Useful Links and Blogs

Detailed Llama-3 results: Run TGI on AMD Instinct MI300X
Detailed Llama-2 results showcasing the Optimum benchmark on AMD Instinct MI250
Check out our blog titled Run a ChatGPT-like Chatbot on a Single GPU with ROCm
Complete ROCm Documentation for installation and usage
Find extended training content and connect with the development community at the Developer Hub