Transformers documentation

Efficient Inference on a Single GPU

This document will be completed soon with information on how to run inference on a single GPU. In the meantime, you can check out the guide for training on a single GPU and the guide for inference on CPUs.