---
base_model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
library_name: peft
license: llama3.1
tags:
- trl
- sft
- unsloth
- generated_from_trainer
model-index:
- name: meta-llama-Meta-Llama-3.1-8B-Instruct_SFT_E1_D30002
  results: []
---
# meta-llama-Meta-Llama-3.1-8B-Instruct_SFT_E1_D30002
This model is a fine-tuned version of [unsloth/meta-llama-3.1-8b-instruct-bnb-4bit](https://huggingface.co/unsloth/meta-llama-3.1-8b-instruct-bnb-4bit) on the D30002 dataset.
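Since this is a PEFT adapter (presumably LoRA) on top of the 4-bit Unsloth base model, loading it for inference might look like the sketch below. The repo ID is a placeholder for wherever this adapter is published; adjust it to the actual Hub path.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Placeholder -- replace with the actual Hub path of this adapter.
adapter_id = "meta-llama-Meta-Llama-3.1-8B-Instruct_SFT_E1_D30002"

# Loads the 4-bit base model and applies the adapter weights on top.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```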
## Model description
This model was trained on successful episodes of the top 10 models, similar to D20001, but instead of using each whole episode as a single training input, every episode was split into conversation pieces (a code sketch follows the example below). For example, the episode
```python
[
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
]
```
is split into:
```python
[
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
]
```

and

```python
[
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
]
```
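A minimal sketch of this splitting, assuming each episode is a list of `{"role", "content"}` dicts. The function name `split_episode` is illustrative, not the actual training code: each piece is the prefix of the episode up to and including an assistant turn, so the 4-turn episode above yields the 2 pieces shown.

```python
def split_episode(episode):
    """Split one episode into cumulative conversation pieces.

    Emits one piece per assistant turn: the prefix of the episode
    up to and including that turn.
    """
    pieces = []
    for i, turn in enumerate(episode):
        if turn["role"] == "assistant":
            pieces.append(episode[: i + 1])
    return pieces


episode = [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
]
assert len(split_episode(episode)) == 2
```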
## Training and evaluation data
After splitting, the dataset contains about 6,635 conversation pieces across all games.
The dataset ID is D30002.
## Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 7331
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- lr_scheduler_warmup_steps: 5
- num_epochs: 1
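As a rough sketch, these settings might map onto `transformers.TrainingArguments` as follows; the dataset and trainer wiring are omitted, and `output_dir` is a placeholder. Note that when both are set, `warmup_steps` takes precedence over `warmup_ratio` in the Transformers scheduler.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
args = TrainingArguments(
    output_dir="meta-llama-Meta-Llama-3.1-8B-Instruct_SFT_E1_D30002",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=7331,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    warmup_steps=5,  # overrides warmup_ratio when > 0
    num_train_epochs=1,
)
```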
## Training results
## Framework versions
- PEFT 0.12.0
- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1