Efficient Inference on Multiple GPUs


This document contains information on how to run inference efficiently on multiple GPUs.

Note: A multi-GPU setup can use most of the strategies described in the single-GPU section. However, you should be aware of a few simple techniques, specific to multi-GPU setups, that can improve utilization; a minimal loading sketch follows below.
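For instance, a common starting point is to shard a model across all visible GPUs by passing device_map="auto" when loading it. The snippet below is a minimal sketch, assuming the accelerate package is installed and using "bigscience/bloom-3b" purely as an example checkpoint; it is an illustration rather than a complete recipe.

```python
# Minimal sketch: shard a model across the available GPUs for inference.
# Assumes `accelerate` is installed; "bigscience/bloom-3b" is only an example checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-3b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # spread the layers across all available GPUs
    torch_dtype=torch.float16,  # half precision to reduce memory per GPU
)

# Inputs go on the device holding the first model shard (typically cuda:0).
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```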