|
--- |
|
title: README |
|
emoji: 📚 |
|
colorFrom: blue |
|
colorTo: green |
|
sdk: static |
|
pinned: false |
|
--- |
|
|
|
|
|
# together we advance_AI |
|
|
|
AI is increasingly pervasive across the modern world. |
|
It’s driving our smart technology in retail, cities, factories and healthcare, |
|
and transforming our digital homes. |
|
AMD offers advanced AI acceleration from data center to edge, |
|
enabling high performance and high efficiency to make the world smarter. |
|
|
|
# Getting Started with Hugging Face Transformers |
|
|
|
|
|
|
|
Looking for how to use the most common transformers on Hugging Face |
|
for inference workloads on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using the AMD ROCm™ software? |
|
You can build on this base knowledge to start fine-tuning from a base model or even develop your own model.

General Linux and ML experience is a prerequisite.
|
|
|
## 1. Confirm you have a supported AMD hardware platform |
|
|
|
Is my [hardware supported](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html) with ROCm on Linux? |
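
If you are unsure which GPU is in your system, one quick way to check on most Linux distributions (assuming the `pciutils` package is installed):

> `lspci | grep -iE 'vga|display'`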
|
|
|
## 2. Install ROCm driver, libraries and tools |
|
|
|
Follow the detailed [installation instructions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/) for your Linux based platform. |
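
Once installation completes, a quick sanity check that the driver and tools can see your GPU (assuming a default install):

> `rocm-smi`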
|
|
|
## 3. Install Machine Learning Frameworks |
|
Pip installation is an easy way to acquire all the required packages and is described in more detail below. |
|
|
|
> If you prefer to use a container strategy, check out the pre-built images at
> [ROCm Docker Hub](https://hub.docker.com/u/rocm/)
> and [AMD Infinity Hub](https://www.amd.com/en/developer/resources/infinity-hub.html)
> after installing the required [dependencies](https://rocm.docs.amd.com/en/latest/deploy/docker.html).
|
|
|
### PyTorch |
|
AMD ROCm is fully integrated into the mainline PyTorch ecosystem. Pip wheels are built and tested as part of the stable and nightly releases. |
|
Go to [pytorch.org](https://pytorch.org) and use the 'Install PyTorch' widget. |
|
Select 'Stable + Linux + Pip + Python + ROCm' to get the specific pip installation command. |
|
|
|
An example command line (note the ROCm version in the wheel index URL):
|
|
|
> `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2` |
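
After installation, a minimal sanity check (assuming the ROCm wheel was picked up) confirms that PyTorch was built against ROCm and can see the GPU:

```python
import torch

# On a ROCm build of PyTorch, torch.version.hip is set and the GPU is exposed
# through the familiar "cuda" device API.
print(torch.version.hip)
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```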
|
|
|
### TensorFlow |
|
|
|
AMD ROCm support is upstreamed into the TensorFlow GitHub repository.

Pre-built wheels are hosted on [PyPI](https://pypi.org/project/tensorflow-rocm/).
|
|
|
The latest version can be installed with this command: |
|
|
|
> `pip install tensorflow-rocm` |
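
A quick way to confirm that TensorFlow can see the GPU (a minimal check, assuming the ROCm wheel installed cleanly):

```python
import tensorflow as tf

# Lists ROCm-visible GPUs; an empty list means TensorFlow does not see the device.
print(tf.config.list_physical_devices("GPU"))
```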
|
|
|
## 4. Use a Hugging Face Model |
|
Now that you have the base requirements installed, install the latest Transformers library.
|
|
|
> `pip install transformers` |
|
|
|
This allows you to easily import any of the base models into your Python application.
|
Here is an example using [GPT2](https://huggingface.co/gpt2) in PyTorch: |
|
|
|
```python |
|
from transformers import GPT2Tokenizer, GPT2Model |
|
tokenizer = GPT2Tokenizer.from_pretrained('gpt2') |
|
model = GPT2Model.from_pretrained('gpt2') |
|
text = "Replace me by any text you'd like." |
|
encoded_input = tokenizer(text, return_tensors='pt') |
|
output = model(**encoded_input) |
|
``` |
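
To run the same example on a supported AMD GPU, move the model and inputs to the device. On ROCm builds of PyTorch the GPU is addressed through the "cuda" device name (a minimal sketch):

```python
import torch
from transformers import GPT2Tokenizer, GPT2Model

device = "cuda" if torch.cuda.is_available() else "cpu"  # ROCm GPUs are addressed as "cuda" devices
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2').to(device)

encoded_input = tokenizer("Replace me by any text you'd like.", return_tensors='pt').to(device)
with torch.no_grad():
    output = model(**encoded_input)
print(output.last_hidden_state.shape)
```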
|
|
|
All of the 200+ standard transformer models are regularly tested with our supported hardware platforms. |
|
Note that derivatives of those core models should also function correctly.

Let us know if you run into issues at our [ROCm Community page](https://github.com/RadeonOpenCompute/ROCm/discussions).
|
|
|
Here are a few of the more popular ones to get you started: |
|
- [BERT](https://huggingface.co/bert-base-uncased) |
|
- [BLOOM](https://huggingface.co/bigscience/bloom) |
|
- [LLaMA](https://huggingface.co/huggyllama/llama-7b) |
|
- [OPT](https://huggingface.co/facebook/opt-66b) |
|
- [T5](https://huggingface.co/t5-base) |
|
|
|
Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application. |
|
|
|
## 5. Optimum Support |
|
For a deeper dive into using Hugging Face libraries on AMD GPUs, check out the [Optimum](https://huggingface.co/docs/optimum/main/en/amd/amdgpu/overview) page |
|
describing details on Flash Attention 2, GPTQ Quantization and ONNX Runtime integration. |
|
|
|
Details on getting started |
|
with Hugging Face models are available on the [Optimum page](https://huggingface.co/docs/optimum/main/en/amd/index).
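
For example, Flash Attention 2 can be enabled through the `attn_implementation` argument of `from_pretrained`. A minimal sketch, assuming the ROCm build of the `flash-attn` package is installed and using a hypothetical example model id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # hypothetical example; any supported causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",  # assumes the flash-attn ROCm package is installed
).to("cuda")

inputs = tokenizer("ROCm is", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```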
|
|
|
# Serving a model with TGI |
|
|
|
Text Generation Inference (a.k.a. "TGI") provides an end-to-end solution for deploying large language models for inference at scale.
|
TGI is already usable in production on AMD Instinct™ GPUs through the docker image `ghcr.io/huggingface/text-generation-inference:latest-rocm`. |
|
Make sure to refer to the [documentation](https://huggingface.co/docs/text-generation-inference/supported_models#supported-hardware)

for details on hardware support and any limitations.
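
Once a TGI container is serving a model, you can query it from Python with `huggingface_hub`. A minimal sketch, assuming the server is listening locally on port 8080:

```python
from huggingface_hub import InferenceClient

# Assumes a TGI container is already serving a model on this endpoint.
client = InferenceClient("http://localhost:8080")
print(client.text_generation("What is ROCm?", max_new_tokens=64))
```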
|
|
|
# Benchmarking |
|
|
|
[Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) is a utility that makes it easy to benchmark the performance of transformers on AMD GPUs,

in both single-device and distributed settings, with various supported optimizations and quantization schemes.
|
|
|
# Useful Links and Blogs |
|
|
|
- Detailed Llama-3 results in [Run TGI on AMD Instinct MI300X](https://huggingface.co/blog/huggingface-amd-mi300)

- Detailed Llama-2 results showcasing the [Optimum benchmark on AMD Instinct MI250](https://huggingface.co/blog/huggingface-and-optimum-amd)

- Check out our blog titled [Run a ChatGPT-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)
|
- Complete ROCm [Documentation](https://rocm.docs.amd.com/en/latest/) for installation and usage |
|
- Find extended training content and connect with the development community at the [Developer Hub](https://www.amd.com/en/developer/resources/rocm-hub.html)
|
|
|
|
|
|
|
|
|
|
|
|