AMD

AMD Pervasive AI

together we advance_AI

AI is increasingly pervasive across the modern world. It’s driving our smart technology in retail, cities, factories and healthcare, and transforming our digital homes. AMD offers advanced AI acceleration from data center to edge, enabling high performance and high efficiency to make the world smarter.

Getting Started with Hugging Face Transformers

AMD’s Ryzen™ AI family of laptop processors provides an integrated Neural Processing Unit (NPU) that offloads AI processing tasks from the host CPU and GPU. Ryzen™ AI software consists of the Vitis™ AI execution provider (EP) for ONNX Runtime, quantization tools, and a pre-optimized model zoo. All of this is built on Ryzen™ AI technology and the AMD XDNA™ architecture, which is purpose-built to run AI workloads efficiently and locally, offering a host of benefits for developers building the next groundbreaking AI app. Details on getting started with Hugging Face models are available on the Optimum page.

The following section describes how to use the most common transformers on Hugging Face for inference workloads on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using the AMD ROCm software ecosystem. This base knowledge can be leveraged to start fine-tuning from a base model or even to start developing your own model. General Linux and ML experience is a prerequisite.

1. Confirm you have a supported AMD hardware platform

Is my hardware supported with ROCm on Linux?

2. Install ROCm driver, libraries and tools

Follow the detailed installation instructions for your Linux-based platform.
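
Once the installation finishes, a quick sanity check is to confirm that the ROCm command-line tools are visible on your PATH. The sketch below assumes a typical install that exposes `rocminfo` and `rocm-smi`; the helper name `find_rocm_tools` is hypothetical, and tool locations can vary by distribution.

```python
import shutil

# Command-line tools that a typical ROCm install provides (assumed to be on PATH).
ROCM_TOOLS = ("rocminfo", "rocm-smi")


def find_rocm_tools(tools=ROCM_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


for name, path in find_rocm_tools().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If a tool is reported as missing, revisit the installation instructions before moving on to the framework step.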

3. Install Machine Learning Frameworks

Pip installation is an easy way to acquire all the required packages and is described in more detail below.

If you prefer to use a container strategy, check out the pre-built images at ROCm Docker Hub and AMD Infinity Hub after installing the required dependencies.

PyTorch

AMD ROCm is fully integrated into the mainline PyTorch ecosystem. Pip wheels are built and tested as part of the stable and nightly releases. Go to pytorch.org and use the 'Install PyTorch' widget. Select 'Stable + Linux + Pip + Python + ROCm' to get the specific pip installation command.

An example command line (note the versioning of the whl file):

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
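
After installing the wheel, it can help to confirm that the ROCm build of PyTorch was picked up and that a GPU is visible. The following is a minimal sketch; the helper name `describe_torch_backend` is hypothetical. ROCm builds of PyTorch reuse the `torch.cuda` API, and `torch.version.hip` is `None` on builds without ROCm support.

```python
def describe_torch_backend():
    """Return a summary of the installed PyTorch build, or None if PyTorch is absent."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch_version": torch.__version__,
        # Set to a version string on ROCm builds, None otherwise.
        "hip_version": getattr(torch.version, "hip", None),
        # ROCm devices are exposed through the torch.cuda namespace.
        "gpu_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


info = describe_torch_backend()
print(info if info is not None else "PyTorch is not installed")
```

On a working ROCm setup you would expect a non-empty `hip_version` and `gpu_available` set to `True`.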

TensorFlow

AMD ROCm is upstreamed into the TensorFlow GitHub repository. Pre-built wheels are hosted on pypi.org.

The latest version can be installed with this command:

pip install tensorflow-rocm
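
To check that TensorFlow can see your GPU after installation, you can list the physical GPU devices it detects. This is a small sketch using the standard `tf.config.list_physical_devices` call; the helper name `list_tensorflow_gpus` is hypothetical.

```python
def list_tensorflow_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if TensorFlow is absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_tensorflow_gpus()
if gpus is None:
    print("TensorFlow is not installed")
else:
    print(f"TensorFlow sees {len(gpus)} GPU(s): {gpus}")
```

An empty list usually points to a driver or ROCm installation problem rather than a TensorFlow one.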

4. Use a Hugging Face Model

Now that you have the base requirements installed, get the latest transformer models.

pip install transformers

This allows you to easily import any of the base models into your Python application. Here is an example using GPT2 in PyTorch:

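from transformers import GPT2Tokenizer, GPT2Model
# Download (or load from the local cache) the GPT-2 tokenizer and base model
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')
# Tokenize the input and return PyTorch tensors ('pt')
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
# Forward pass; output.last_hidden_state holds the per-token features
output = model(**encoded_input)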
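
The example above runs on the CPU unless the model and inputs are moved to a device. A hedged sketch of running the same GPT2 model on an AMD GPU follows; the helper names `pick_device` and `gpt2_last_hidden_state` are hypothetical. On ROCm builds of PyTorch the GPU is addressed with the device string `"cuda"`.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def gpt2_last_hidden_state(text, device=None):
    """Run GPT2 on the chosen device and return the per-token hidden states."""
    import torch
    from transformers import GPT2Tokenizer, GPT2Model

    device = device or pick_device()
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2").to(device).eval()

    # Move the tokenized inputs to the same device as the model.
    encoded_input = tokenizer(text, return_tensors="pt").to(device)

    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        output = model(**encoded_input)
    return output.last_hidden_state


print(f"Selected device: {pick_device()}")
```

Calling `gpt2_last_hidden_state("Replace me by any text you'd like.")` downloads the model on first use and returns a tensor of shape (batch, tokens, 768) for the base `gpt2` checkpoint.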

All of the 200+ standard transformer models are regularly tested on our supported hardware platforms, so derivatives of those core models should also function correctly. Let us know if you run into issues at our ROCm Community page.

Here are a few of the more popular ones to get you started:

Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.

5. Optimum Support

For a deeper dive into using Hugging Face libraries on AMD GPUs, check out the Optimum page describing details on Flash Attention 2, GPTQ Quantization and ONNX Runtime integration.

Serving a model with TGI

Text Generation Inference (a.k.a. “TGI”) provides an end-to-end solution for deploying large language models for inference at scale. TGI is already usable in production on AMD Instinct™ GPUs through the Docker image ghcr.io/huggingface/text-generation-inference:1.2-rocm. Refer to the documentation for details on support and any limitations.
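
Once a TGI container is running, clients talk to it over HTTP. The sketch below builds a request for TGI's `/generate` endpoint using only the Python standard library. The address `http://localhost:8080` and the helper names are assumptions for illustration; use whatever host and port your deployment maps.

```python
import json
import urllib.request

# Assumed address of a locally running TGI container; adjust to your deployment.
TGI_URL = "http://localhost:8080"


def build_generate_request(prompt, max_new_tokens=32, base_url=TGI_URL):
    """Build a POST request for the TGI /generate endpoint."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as response:
        return json.loads(response.read())["generated_text"]


request = build_generate_request("What is ROCm?", max_new_tokens=20)
print(request.get_method(), request.full_url)
```

With a server running, `generate("What is ROCm?", max_new_tokens=20)` returns the completion as a string.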

Benchmarking

Optimum-Benchmark is a utility for benchmarking the performance of transformers on AMD GPUs, across single-device and distributed settings, with various supported optimizations and quantization schemes.
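
For a quick, informal latency number before setting up a full benchmark, a plain timing loop can be enough. The sketch below is not the Optimum-Benchmark API; it is a generic standard-library harness, and `time_callable` is a hypothetical helper. Because GPU work is launched asynchronously, pass a synchronization function such as `torch.cuda.synchronize` as `sync` when timing GPU code.

```python
import statistics
import time


def time_callable(fn, warmup=2, runs=10, sync=None):
    """Time fn() over several runs and return simple latency statistics in seconds."""
    # Warm-up runs absorb one-time costs such as kernel compilation and caching.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for outstanding device work before stopping the clock
        latencies.append(time.perf_counter() - start)

    return {
        "runs": runs,
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
    }


stats = time_callable(lambda: sum(range(10_000)))
print(stats)
```

To time a model, wrap the forward pass in a function with no arguments and pass it as `fn`.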

Useful Links and Blogs
