Transformers

Join the Hugging Face community

and get access to the augmented documentation experience

Collaborate on models, datasets and Spaces

Faster examples with accelerated inference

Switch between documentation themes

to get started

Efficient Inference on a Single GPU

This document will be completed soon with information on how to infer on a single GPU. In the meantime you can check out the guide for training on a single GPU and the guide for inference on CPUs.