Supervised Fine-Tuning
Supervised Fine-Tuning (SFT) is a process primarily used to adapt pre-trained language models to follow instructions, engage in dialogue, and use specific output formats. While pre-trained models have impressive general capabilities, SFT helps transform them into assistant-like models that can better understand and respond to user prompts. This is typically done by training on datasets of human-written conversations and instructions.
This page provides a step-by-step guide to fine-tuning the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B model using the SFTTrainer. By following these steps, you can adapt the model to perform specific tasks more effectively.
When to Use SFT
Before diving into implementation, it’s important to understand when SFT is the right choice for your project. As a first step, you should consider whether using an existing instruction-tuned model with well-crafted prompts would suffice for your use case. SFT involves significant computational resources and engineering effort, so it should only be pursued when prompting existing models proves insufficient.
If you determine that SFT is necessary, the decision to proceed depends on two primary factors:
Template Control
SFT allows precise control over the model’s output structure. This is particularly valuable when you need the model to:
- Generate responses in a specific chat template format (see the sketch after this list)
- Follow strict output schemas
- Maintain consistent styling across responses
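For instance, a chat template converts structured messages into the exact string the model sees during training. Here is a minimal sketch using the tokenizer's built-in template (the model name follows the example used later on this page):
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

messages = [
    {"role": "user", "content": "What is supervised fine-tuning?"},
]

# Render the conversation as the literal string the model is trained on
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)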
Domain Adaptation
When working in specialized domains, SFT helps align the model with domain-specific requirements by:
- Teaching domain terminology and concepts
- Enforcing professional standards
- Handling technical queries appropriately
- Following industry-specific guidelines
This evaluation will help determine if SFT is the right approach for your needs.
Dataset Preparation
The supervised fine-tuning process requires a task-specific dataset structured with input-output pairs. Each pair should consist of:
- An input prompt
- The expected model response
- Any additional context or metadata
The quality of your training data is crucial for successful fine-tuning. Let’s look at how to prepare and validate your dataset:
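As a sketch of what that inspection might look like, here we load the conversational dataset used later on this page and run a couple of illustrative sanity checks (the specific checks are assumptions, not a fixed recipe):
from datasets import load_dataset

# smoltalk stores each example as a list of chat messages under "messages"
dataset = load_dataset("HuggingFaceTB/smoltalk", "all")

# Inspect one example to confirm the input/response structure
print(dataset["train"][0]["messages"])

# Illustrative sanity check: conversations should contain at least one
# full exchange and no empty messages
def is_valid(example):
    messages = example["messages"]
    return len(messages) >= 2 and all(m["content"].strip() for m in messages)

print(dataset["train"].filter(is_valid).num_rows)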
Training Configuration
The success of your fine-tuning depends heavily on choosing the right training parameters. The SFTTrainer configuration requires consideration of several parameters that control the training process. Let's explore each of them and their purpose:
Training Duration Parameters:
- num_train_epochs: Controls total training duration
- max_steps: Alternative to epochs, sets the maximum number of training steps
- More epochs allow better learning but risk overfitting

Batch Size Parameters:
- per_device_train_batch_size: Determines memory usage and training stability
- gradient_accumulation_steps: Enables larger effective batch sizes
- Larger batches provide more stable gradients but require more memory

Learning Rate Parameters:
- learning_rate: Controls the size of weight updates
- warmup_ratio: Portion of training used for learning rate warmup
- Too high can cause instability; too low results in slow learning

Monitoring Parameters:
- logging_steps: Frequency of metric logging
- eval_steps: How often to evaluate on validation data
- save_steps: Frequency of model checkpoint saves
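To build intuition for how these parameters interact, here is a small arithmetic sketch (the numbers are illustrative, not recommendations):
# Illustrative arithmetic for common SFT parameter interactions
per_device_train_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1

# Effective batch size: examples contributing to each optimizer update
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 32

max_steps = 1000
# Total examples seen over the whole run
print(max_steps * effective_batch_size)  # 32000

warmup_ratio = 0.1
# Steps spent linearly ramping up the learning rate
print(int(max_steps * warmup_ratio))  # 100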
Implementation with TRL
Now that we understand the key components, let's implement the training with proper validation and monitoring. We will use the SFTTrainer class from the Transformers Reinforcement Learning (TRL) library, which is built on top of the transformers library. Here's a complete example using the TRL library:
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer
import torch

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model and tokenizer
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load dataset
dataset = load_dataset("HuggingFaceTB/smoltalk", "all")

# Configure trainer
training_args = SFTConfig(
    output_dir="./sft_output",
    max_steps=1000,
    per_device_train_batch_size=4,
    learning_rate=5e-5,
    logging_steps=10,
    save_steps=100,
    eval_strategy="steps",
    eval_steps=50,
)

# Initialize trainer
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    processing_class=tokenizer,
)

# Start training
trainer.train()
Packing the Dataset
The SFTTrainer supports example packing to optimize training efficiency. This feature allows multiple short examples to be packed into the same input sequence, maximizing GPU utilization during training. To enable packing, simply set packing=True in the SFTConfig constructor. When using packed datasets with max_steps, be aware that you may train for more epochs than expected depending on your packing configuration. You can customize how examples are combined using a formatting function, which is particularly useful when working with datasets that have multiple fields like question-answer pairs. For evaluation datasets, you can disable packing by setting eval_packing=False in the SFTConfig. Here's a basic example of customizing the packing configuration:
# Configure packing
training_args = SFTConfig(packing=True)
trainer = SFTTrainer(model=model, train_dataset=dataset, args=training_args)
trainer.train()
When packing a dataset with multiple fields, you can define a custom formatting function to combine the fields into a single input sequence. This function should take an example and return a string containing the formatted text. Here's an example of a custom formatting function:
def formatting_func(example):
    text = f"### Question: {example['question']}\n ### Answer: {example['answer']}"
    return text


training_args = SFTConfig(packing=True)
trainer = SFTTrainer(
    "facebook/opt-350m",  # the model can also be passed as a name; SFTTrainer loads it
    train_dataset=dataset,
    args=training_args,
    formatting_func=formatting_func,
)
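Calling the formatting function on a raw example shows the combined string that packing will then concatenate into longer sequences:
example = {"question": "What is SFT?", "answer": "Supervised fine-tuning."}
print(formatting_func(example))
# ### Question: What is SFT?
#  ### Answer: Supervised fine-tuning.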
Monitoring Training Progress
Effective monitoring is crucial for successful fine-tuning. Let’s explore what to watch for during training:
Understanding Loss Patterns
Training loss typically follows three distinct phases:
- Initial Sharp Drop: Rapid adaptation to new data distribution
- Gradual Stabilization: Learning rate slows as model fine-tunes
- Convergence: Loss values stabilize, indicating training completion

Metrics to Monitor
Effective monitoring involves tracking quantitative metrics as well as evaluating qualitative behavior. The metrics available during training (see the sketch after this list) are:
- Training loss
- Validation loss
- Learning rate progression
- Gradient norms
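Most of these metrics are logged automatically by the trainer. One way to inspect them after (or during) a run is through the trainer's log history; the following is a sketch assuming the trainer from the example above:
# After trainer.train(), the trainer keeps a list of logged metric dicts
history = trainer.state.log_history

# Separate training-loss and validation-loss entries
train_loss = [(log["step"], log["loss"]) for log in history if "loss" in log]
eval_loss = [(log["step"], log["eval_loss"]) for log in history if "eval_loss" in log]

print(train_loss[-3:])  # most recent training-loss readings
print(eval_loss[-3:])   # most recent validation-loss readings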
The Path to Convergence
As training progresses, the loss curve should gradually stabilize. The key indicator of healthy training is a small gap between training and validation loss, suggesting the model is learning generalizable patterns rather than memorizing specific examples. The absolute loss values will vary depending on your task and dataset.
[Figure: training and validation loss curves decreasing sharply, then leveling off]
The graph above shows a typical training progression. Notice how both training and validation loss decrease sharply at first, then gradually level off. This pattern indicates the model is learning effectively while maintaining generalization ability.
Warning Signs to Watch For
Several patterns in the loss curves can indicate potential issues. Below are common warning signs and the remedies to consider.

If the validation loss decreases at a significantly slower rate than training loss, your model is likely overfitting to the training data. Consider:
- Reducing the training steps
- Increasing the dataset size
- Validating dataset quality and diversity
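One concrete way to act on an overfitting signal is early stopping on validation loss. Here is a hedged sketch using the EarlyStoppingCallback from transformers, reusing the model, tokenizer, and dataset from the training example above (the patience value is illustrative):
from transformers import EarlyStoppingCallback
from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="./sft_output",
    max_steps=1000,
    eval_strategy="steps",
    eval_steps=50,
    save_steps=100,                     # must be a multiple of eval_steps
    load_best_model_at_end=True,        # restore the best checkpoint when training stops
    metric_for_best_model="eval_loss",  # watch validation loss
    greater_is_better=False,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    processing_class=tokenizer,
    # Stop if validation loss fails to improve for 3 consecutive evaluations
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)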

If the loss doesn’t show significant improvement, the model might be:
- Learning too slowly (try increasing the learning rate)
- Struggling with the task (check data quality and task complexity)
- Hitting architecture limitations (consider a different model)

Extremely low loss values could suggest memorization rather than learning. This is particularly concerning if:
- The model performs poorly on new, similar examples
- The outputs lack diversity
- The responses are too similar to training examples
Note that the interpretation of loss values outlined here targets the most common case; in practice, loss values can behave in various ways depending on the model, the dataset, the training parameters, and more. If you are interested in exploring these patterns further, check out this blog post by the people at Fast AI.
Evaluation after SFT
In section 11.4 we will learn how to evaluate the model using benchmark datasets. For now, we will focus on the qualitative evaluation of the model.
After completing SFT, consider these follow-up actions:
- Evaluate the model thoroughly on held-out test data
- Validate template adherence across various inputs
- Test domain-specific knowledge retention
- Monitor real-world performance metrics
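As a minimal sketch of the qualitative side of this checklist, you can load the fine-tuned checkpoint and eyeball a generated response for template adherence (the checkpoint path and prompt are illustrative):
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned checkpoint saved by the trainer (path is illustrative)
checkpoint = "./sft_output/checkpoint-1000"
model = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "Summarize what supervised fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=200)

# Check that the response follows the expected chat template and style
print(tokenizer.decode(outputs[0], skip_special_tokens=True))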
Quiz
1. What parameters control the training duration in SFT?
2. Which pattern in the loss curves indicates potential overfitting?
3. What is gradient_accumulation_steps used for?
4. What should you monitor during SFT training?
5. What indicates healthy convergence during training?
💐 Nice work!
You’ve learned how to fine-tune models using SFT! To continue your learning:
- Try the notebook with different parameters
- Experiment with other datasets
- Contribute improvements to the course material