Google Cloud documentation

(Preview) Cloud Run Examples


This directory contains usage examples of the Hugging Face Deep Learning Containers (DLCs) on Cloud Run. For now, the examples cover inference only, with a focus on Large Language Models (LLMs).

Cloud Run now offers on-demand access to NVIDIA L4 GPUs for running AI inference workloads. This GPU support is still in preview, so the Cloud Run examples in this repository are intended solely for testing and experimentation; please avoid using them for production workloads. We are actively working towards general availability and appreciate your understanding.

Inference Examples

Example | Title
deploy-gemma-2-on-cloud-run | Deploy Gemma2 9B with TGI DLC on Cloud Run
deploy-llama-3-1-on-cloud-run | Deploy Llama 3.1 8B with TGI DLC on Cloud Run
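
Both examples serve the model with TGI (Text Generation Inference), which exposes a `/generate` HTTP route. As a rough illustration of what querying such a deployment might look like, here is a minimal Python sketch. The service URL, the helper names `build_generate_request` and `generate`, and the optional bearer token are assumptions made for this illustration and do not come from the examples themselves; the linked examples remain the reference for the actual deployment and invocation steps.

```python
import json
import urllib.request


def build_generate_request(service_url, prompt, max_new_tokens=64, token=None):
    """Build (but do not send) a POST request for TGI's /generate route.

    service_url and token are placeholders: substitute the URL of your own
    Cloud Run service and, if the service requires authentication, a valid
    identity token.
    """
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=service_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def generate(service_url, prompt, **kwargs):
    """Send the request and return the generated text from the JSON response."""
    request = build_generate_request(service_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["generated_text"]
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made; `generate` only works against a deployed and reachable service.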

Training Examples

Coming soon!

