saleperson_model
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.6152
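For readers who prefer perplexity, the loss above can be converted with a one-line formula. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

# Evaluation loss copied from the summary above.
EVAL_LOSS = 0.6152


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


if __name__ == "__main__":
    print(f"perplexity: {perplexity(EVAL_LOSS):.3f}")
```

Under that assumption the evaluation perplexity comes out at roughly 1.85.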
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 48
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 10
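The relationship between these values can be sketched in a few lines: the total train batch size is the per-device batch size multiplied by the gradient accumulation steps, and the cosine scheduler decays the learning rate from its peak toward zero. The sketch below assumes a single device, no warmup, and 375 optimizer steps in total (inferred from the results table, where step 150 falls at epoch 4.0). None of these three values is stated in the card, and the function names are illustrative only.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 24
GRADIENT_ACCUMULATION_STEPS = 2

# Assumptions, not stated in the card: one device, no warmup,
# and 375 optimizer steps (37.5 steps per epoch x 10 epochs).
NUM_DEVICES = 1
WARMUP_STEPS = 0
TOTAL_STEPS = 375


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def cosine_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = WARMUP_STEPS) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)  # clamp so steps past the end stay at 0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    size = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES)
    print(f"total_train_batch_size: {size}")
    for step in (0, 100, 200, 300, 375):
        print(f"step {step:3d}: lr = {cosine_lr(step):.6f}")
```

With these assumptions the learning rate starts at 0.0002 and reaches zero at the final step. If the actual run used warmup or a different step count, the curve shifts accordingly.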
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.2943 | 0.27 | 10 | 1.0754 |
0.9998 | 0.53 | 20 | 0.9170 |
0.8633 | 0.8 | 30 | 0.8350 |
0.7769 | 1.07 | 40 | 0.7880 |
0.7549 | 1.33 | 50 | 0.7466 |
0.6801 | 1.6 | 60 | 0.7187 |
0.6115 | 1.87 | 70 | 0.6902 |
0.5801 | 2.13 | 80 | 0.6871 |
0.6014 | 2.4 | 90 | 0.6623 |
0.5491 | 2.67 | 100 | 0.6473 |
0.5215 | 2.93 | 110 | 0.6394 |
0.5051 | 3.2 | 120 | 0.6475 |
0.4746 | 3.47 | 130 | 0.6259 |
0.4593 | 3.73 | 140 | 0.6188 |
0.4788 | 4.0 | 150 | 0.6096 |
0.4217 | 4.27 | 160 | 0.6232 |
0.4297 | 4.53 | 170 | 0.6120 |
0.4225 | 4.8 | 180 | 0.6028 |
0.3932 | 5.07 | 190 | 0.6156 |
0.3799 | 5.33 | 200 | 0.6146 |
0.389 | 5.6 | 210 | 0.5977 |
0.3759 | 5.87 | 220 | 0.6067 |
0.3584 | 6.13 | 230 | 0.6216 |
0.3451 | 6.4 | 240 | 0.6044 |
0.3517 | 6.67 | 250 | 0.6069 |
0.3476 | 6.93 | 260 | 0.6071 |
0.3161 | 7.2 | 270 | 0.6183 |
0.3164 | 7.47 | 280 | 0.6077 |
0.3196 | 7.73 | 290 | 0.6139 |
0.3294 | 8.0 | 300 | 0.6065 |
0.3197 | 8.27 | 310 | 0.6161 |
0.3249 | 8.53 | 320 | 0.6148 |
0.2774 | 8.8 | 330 | 0.6174 |
0.3011 | 9.07 | 340 | 0.6153 |
0.2937 | 9.33 | 350 | 0.6153 |
0.3168 | 9.6 | 360 | 0.6152 |
0.3103 | 9.87 | 370 | 0.6152 |
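The table shows training loss still falling after the midpoint of the run while validation loss plateaus near 0.61, which may indicate mild overfitting in the later epochs. The sketch below locates the logged step with the lowest validation loss, using values copied from the table. Whether a checkpoint was actually saved at that step is not stated in this card.

```python
# (step -> validation loss), copied from the training results table.
VALIDATION_LOSS_BY_STEP = {
    10: 1.0754, 20: 0.9170, 30: 0.8350, 40: 0.7880, 50: 0.7466,
    60: 0.7187, 70: 0.6902, 80: 0.6871, 90: 0.6623, 100: 0.6473,
    110: 0.6394, 120: 0.6475, 130: 0.6259, 140: 0.6188, 150: 0.6096,
    160: 0.6232, 170: 0.6120, 180: 0.6028, 190: 0.6156, 200: 0.6146,
    210: 0.5977, 220: 0.6067, 230: 0.6216, 240: 0.6044, 250: 0.6069,
    260: 0.6071, 270: 0.6183, 280: 0.6077, 290: 0.6139, 300: 0.6065,
    310: 0.6161, 320: 0.6148, 330: 0.6174, 340: 0.6153, 350: 0.6153,
    360: 0.6152, 370: 0.6152,
}


def best_checkpoint(losses: dict[int, float]) -> tuple[int, float]:
    """Return the (step, loss) pair with the lowest validation loss."""
    step = min(losses, key=losses.get)
    return step, losses[step]


best_step, best_loss = best_checkpoint(VALIDATION_LOSS_BY_STEP)
final_step = max(VALIDATION_LOSS_BY_STEP)
final_loss = VALIDATION_LOSS_BY_STEP[final_step]

if __name__ == "__main__":
    print(f"best:  step {best_step}, validation loss {best_loss:.4f}")
    print(f"final: step {final_step}, validation loss {final_loss:.4f}")
    print(f"gap:   {final_loss - best_loss:.4f}")
```

The lowest logged validation loss is 0.5977 at step 210, compared with 0.6152 at the final logged step, so an earlier checkpoint, if one exists, could be worth evaluating.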
Framework versions
- PEFT 0.7.1
- Transformers 4.38.2
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.15.2