TRL documentation

Community Tutorials

Community tutorials are written by active members of the Hugging Face community who want to share their knowledge and expertise with others. They are a good way to learn about the library and its features, and to get started with its core classes and modalities.

Language Models

| Task | Class | Description | Author | Tutorial | Colab |
| --- | --- | --- | --- | --- | --- |
| Instruction tuning | SFTTrainer | Fine-tuning Google Gemma LLMs using ChatML format with QLoRA | Philipp Schmid | Link | Open In Colab |
| Structured Generation | SFTTrainer | Fine-tuning Llama-2-7B to generate Persian product catalogs in JSON using QLoRA and PEFT | Mohammadreza Esmaeilian | Link | Open In Colab |
| Preference Optimization | DPOTrainer | Align Mistral-7b using Direct Preference Optimization for human preference alignment | Maxime Labonne | Link | Open In Colab |
| Preference Optimization | ORPOTrainer | Fine-tuning Llama 3 with ORPO combining instruction tuning and preference alignment | Maxime Labonne | Link | Open In Colab |

Vision Language Models

| Task | Class | Description | Author | Tutorial | Colab |
| --- | --- | --- | --- | --- | --- |
| Visual QA | SFTTrainer | Fine-tuning Qwen2-VL-7B for visual question answering on ChartQA dataset | Sergio Paniego | Link | Open In Colab |
| SEO Description | SFTTrainer | Fine-tuning Qwen2-VL-7B for generating SEO-friendly descriptions from images | Philipp Schmid | Link | Open In Colab |
| Visual QA | DPOTrainer | PaliGemma 🤝 Direct Preference Optimization | Merve Noyan | Link | Open In Colab |

Contributing

If you have a tutorial that you would like to see on this list, please open a PR to add it. We will review it and merge it if it is relevant to the community.
