Transformers documentation

Efficient Inference on a Single GPU

You are viewing v4.21.3 version. A newer version v4.40.1 is available.

This document will be completed soon with information on how to run inference on a single GPU. In the meantime, you can check out the guide for training on a single GPU and the guide for inference on CPUs.