AWS Trainium & Inferentia

NeuronX Text-generation-inference for AWS Inferentia2

Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs).

A Neuron backend allows TGI to be deployed on AWS Trainium and Inferentia chips.
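
To give a sense of what serving a model through TGI looks like from the client side, here is a minimal sketch in Python. It assumes a TGI server is already running and reachable over HTTP, and that it exposes TGI's `/generate` route. The helper names `build_generate_request` and `generate`, the base URL, and the default parameter values are illustrative assumptions, not part of this page.

```python
import json
import urllib.request


def build_generate_request(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON body for a TGI `/generate` call (hypothetical helper)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def generate(base_url: str, prompt: str, max_new_tokens: int = 64,
             timeout: float = 60.0) -> str:
    """POST a prompt to a running TGI server and return the generated text.

    `base_url` is an assumption: replace it with the address of your own
    deployment. Raises urllib.error.URLError if no server is listening there.
    """
    body = json.dumps(build_generate_request(prompt, max_new_tokens)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # The response is expected to carry the completion under "generated_text".
    return payload["generated_text"]
```

With a server listening on, say, `http://localhost:8080` (an assumed address), calling `generate("http://localhost:8080", "What is AWS Inferentia2?")` would return the model's completion as a string. The client code is the same whether the server runs on the Neuron backend or another one; only the deployment differs.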