GPT-2 Instruction Tuning with a Single GPU
The model and all artifacts in this repository were obtained by fully fine-tuning the base GPT-2 model on the Stanford Alpaca dataset, using a single GPU with less than 8 GB of memory. The dataset was filtered with the max_length
hyperparameter (maximum number of tokens per example), and gradient accumulation was applied to significantly reduce the memory footprint.
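The length filter can be sketched as follows. This is an illustration, not the repository's actual preprocessing code: `count_tokens` is a whitespace stand-in for the GPT-2 tokenizer, and the example records are made up.

```python
# Drop Alpaca-style examples whose full text exceeds max_length tokens.
# count_tokens is a whitespace-split stand-in for the real GPT-2 tokenizer.

MAX_LENGTH = 140  # same value as the max_length hyperparameter below

def count_tokens(text: str) -> int:
    return len(text.split())

def filter_by_length(examples, max_length=MAX_LENGTH):
    kept = []
    for ex in examples:
        full_text = " ".join([ex["instruction"], ex.get("input", ""), ex["output"]])
        if count_tokens(full_text) <= max_length:
            kept.append(ex)
    return kept

# Hypothetical records in the Alpaca instruction/input/output format.
examples = [
    {"instruction": "What is the capital of France?", "input": "", "output": "Paris."},
    {"instruction": "Explain everything.", "input": "", "output": "word " * 200},
]
short = filter_by_length(examples)
print(len(short))  # the 200-word response exceeds the limit and is dropped
```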
Results with the base model (124M parameters) are acceptable, but further hyperparameter search and training tricks would most likely improve them. Still, it is an "affordable" playground for experimenting with this important step of the pipeline, the one that turns a base LLM into a usable, queryable model.
1.0 Fine-Tuning Hyperparameters
| Parameter | Value |
|---|---|
| num_epochs | 35 |
| lr | 1e-5 |
| num_warmup_steps | 300 |
| weight_decay | 0.0 |
| batch_size | 16 |
| gradient_accumulation_steps | 8 |
| max_grad_norm | 2.0 |
| checkpoint_path | 'modelstore' |
| logs_path | 'logs' |
| max_length | 140 |
| eval_split | 0.1 |
| seed | 9 |
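With batch_size 16 and gradient_accumulation_steps 8, the effective batch size is 16 × 8 = 128, while only 16 examples ever sit in GPU memory at once. A minimal numeric sketch of why this works, using a hand-derived gradient for a one-parameter least-squares model rather than the actual GPT-2 training loop:

```python
# Gradient accumulation: average gradients over several micro-batches,
# then take one optimizer step, matching a single large-batch gradient.
# Toy model: y = w * x with squared-error loss, so dL/dw = 2*x*(w*x - y).

def grad(w, batch):
    """Mean gradient of the squared error over one batch."""
    return sum(2 * x * (w * x - y) for x, y in batch) / len(batch)

data = [(float(i), 3.0 * i) for i in range(1, 129)]          # 128 examples, true w = 3
micro_batches = [data[i:i + 16] for i in range(0, 128, 16)]  # 8 micro-batches of 16

w = 0.0
accum = 0.0
for mb in micro_batches:
    # Scale each micro-batch gradient by 1/accumulation_steps,
    # as one would scale the loss before backward().
    accum += grad(w, mb) / len(micro_batches)

big_batch_grad = grad(w, data)  # what one 128-example batch would give
print(abs(accum - big_batch_grad) < 1e-9)  # accumulated gradient matches
```

In the real run, the same idea is applied by scaling the loss by 1/gradient_accumulation_steps, calling backward() on each micro-batch, and stepping the optimizer every 8 micro-batches.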
2.0 Examples
| Key | Value |
|---|---|
| temperature | 0.7 |
| do_sample | True |
| max_new_tokens | 150 |
| top_p | 0.92 |
| top_k | 0 |
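With top_k set to 0, filtering is nucleus (top-p) only. A rough sketch of what temperature scaling plus top-p filtering does to a single next-token distribution; the vocabulary size and logits below are made up for illustration:

```python
import math
import random

def top_p_sample(logits, temperature=0.7, top_p=0.92, rng=None):
    """Sample one token id after temperature scaling and nucleus filtering."""
    # Temperature-scaled softmax (shifted by the max for numerical stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of tokens whose cumulative probability >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalise over the kept tokens and sample from them.
    mass = sum(probs[i] for i in kept)
    rng = rng or random.Random(0)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]

logits = [5.0, 4.0, 1.0, 0.5, -2.0]  # made-up next-token logits
token = top_p_sample(logits)
print(token)  # one of the two high-probability tokens survives the filter
```

With these logits, the two most likely tokens already cover more than 92% of the probability mass, so the low-probability tail is never sampled.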
2.1 First example
Instruction:
What is the capital of France?
Response:
The capital of France is Paris.
2.2 Second example
Instruction:
Give three tips to improve my writing.
Response:
1. Break up the sentences into smaller, more manageable chunks.
2. Use regular and non-consecutive paragraphs.
3. Include a clear goal in your writing.
2.3 Third example
Instruction:
What is life?
Response:
Life is like a rollercoaster ride, filled with ups and downs and twists and turns. It is a journey full of highs and lows, but there are also rewards and punishments to be found along the way. Life can be unpredictable and unpredictable, so it is important to have a plan in place to stay on track and make the most of your opportunities.
3.0 References
- Dive into the Code: GitHub Repository
- Explore the Dataset: Stanford Alpaca Dataset