results
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unknown dataset. At the final logged evaluation (step 928, epoch 3.9658), it achieves the following results on the evaluation set:
- Loss: 0.4335
- F1: 0.8190
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 70
- num_epochs: 4
- mixed_precision_training: Native AMP
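As a rough illustration of how a linear schedule with 70 warmup steps shapes the learning rate, the sketch below reimplements the multiplier in plain Python. The total of 936 optimizer steps is an assumption inferred from the logged steps and epochs (about 234 steps per epoch over 4 epochs), and `linear_lr` is a hypothetical helper name, not part of Transformers.

```python
PEAK_LR = 2e-05        # learning_rate from the list above
WARMUP_STEPS = 70      # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 936      # assumption: ~234 steps/epoch * 4 epochs, inferred from the logs


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Start of warmup, mid-warmup, peak, mid-decay, end of training.
    for step in (0, 35, 70, 503, 936):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 2e-05 at step 70 and falls back to half of that around step 503.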
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
1.0065 | 0.0684 | 16 | 0.9778 | 0.5401 |
1.005 | 0.1368 | 32 | 0.9209 | 0.5508 |
0.8912 | 0.2051 | 48 | 0.8197 | 0.5698 |
0.8738 | 0.2735 | 64 | 0.7217 | 0.5946 |
0.6965 | 0.3419 | 80 | 0.6439 | 0.6593 |
0.6463 | 0.4103 | 96 | 0.6081 | 0.6828 |
0.6129 | 0.4786 | 112 | 0.5541 | 0.7278 |
0.5931 | 0.5470 | 128 | 0.5693 | 0.6868 |
0.5643 | 0.6154 | 144 | 0.5290 | 0.7454 |
0.5601 | 0.6838 | 160 | 0.5402 | 0.7159 |
0.5259 | 0.7521 | 176 | 0.5021 | 0.7613 |
0.5361 | 0.8205 | 192 | 0.5051 | 0.7653 |
0.5235 | 0.8889 | 208 | 0.4816 | 0.7747 |
0.526 | 0.9573 | 224 | 0.4726 | 0.7765 |
0.486 | 1.0256 | 240 | 0.4786 | 0.7712 |
0.4757 | 1.0940 | 256 | 0.4669 | 0.7804 |
0.4635 | 1.1624 | 272 | 0.4682 | 0.7891 |
0.4691 | 1.2308 | 288 | 0.4561 | 0.7898 |
0.4682 | 1.2991 | 304 | 0.4818 | 0.7542 |
0.4229 | 1.3675 | 320 | 0.4704 | 0.7831 |
0.4192 | 1.4359 | 336 | 0.4544 | 0.7964 |
0.4249 | 1.5043 | 352 | 0.4493 | 0.7928 |
0.4339 | 1.5726 | 368 | 0.4597 | 0.7921 |
0.4513 | 1.6410 | 384 | 0.4478 | 0.7931 |
0.4553 | 1.7094 | 400 | 0.4474 | 0.7916 |
0.42 | 1.7778 | 416 | 0.4473 | 0.7917 |
0.4194 | 1.8462 | 432 | 0.4416 | 0.8002 |
0.4265 | 1.9145 | 448 | 0.4370 | 0.8054 |
0.4216 | 1.9829 | 464 | 0.4324 | 0.8117 |
0.3869 | 2.0513 | 480 | 0.4369 | 0.8010 |
0.3617 | 2.1197 | 496 | 0.4424 | 0.8096 |
0.3773 | 2.1880 | 512 | 0.4558 | 0.8042 |
0.3852 | 2.2564 | 528 | 0.4311 | 0.8109 |
0.3726 | 2.3248 | 544 | 0.4403 | 0.8096 |
0.3586 | 2.3932 | 560 | 0.4381 | 0.8125 |
0.3756 | 2.4615 | 576 | 0.4337 | 0.8109 |
0.3765 | 2.5299 | 592 | 0.4341 | 0.8110 |
0.4104 | 2.5983 | 608 | 0.4263 | 0.8120 |
0.3704 | 2.6667 | 624 | 0.4404 | 0.8063 |
0.4087 | 2.7350 | 640 | 0.4271 | 0.8171 |
0.3498 | 2.8034 | 656 | 0.4336 | 0.8162 |
0.3606 | 2.8718 | 672 | 0.4286 | 0.8180 |
0.343 | 2.9402 | 688 | 0.4343 | 0.8039 |
0.378 | 3.0085 | 704 | 0.4491 | 0.8018 |
0.3199 | 3.0769 | 720 | 0.4344 | 0.8131 |
0.3529 | 3.1453 | 736 | 0.4332 | 0.8148 |
0.3228 | 3.2137 | 752 | 0.4362 | 0.8170 |
0.3061 | 3.2821 | 768 | 0.4390 | 0.8162 |
0.3277 | 3.3504 | 784 | 0.4385 | 0.8170 |
0.2973 | 3.4188 | 800 | 0.4389 | 0.8143 |
0.3162 | 3.4872 | 816 | 0.4348 | 0.8181 |
0.3078 | 3.5556 | 832 | 0.4345 | 0.8171 |
0.3482 | 3.6239 | 848 | 0.4359 | 0.8125 |
0.3243 | 3.6923 | 864 | 0.4336 | 0.8170 |
0.3465 | 3.7607 | 880 | 0.4337 | 0.8175 |
0.2984 | 3.8291 | 896 | 0.4329 | 0.8194 |
0.3159 | 3.8974 | 912 | 0.4332 | 0.8190 |
0.3327 | 3.9658 | 928 | 0.4335 | 0.8190 |
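The final row is not necessarily the best checkpoint on every metric. A minimal sketch for comparing checkpoints, using a handful of rows copied from the table above (the `rows` list is an illustrative subset, and the variable names are hypothetical):

```python
# (step, validation_loss, f1) copied from selected rows of the table above.
rows = [
    (464, 0.4324, 0.8117),
    (608, 0.4263, 0.8120),
    (672, 0.4286, 0.8180),
    (896, 0.4329, 0.8194),
    (928, 0.4335, 0.8190),  # final evaluation, reported at the top of the card
]

best_loss = min(rows, key=lambda r: r[1])  # lowest validation loss in this subset
best_f1 = max(rows, key=lambda r: r[2])    # highest F1 in this subset
final = rows[-1]

print("lowest validation loss:", best_loss)
print("highest F1:", best_f1)
print("F1 gap, best vs final:", round(best_f1[2] - final[2], 4))
```

Across the full table, step 608 has the lowest validation loss (0.4263) and step 896 the highest F1 (0.8194). The card does not state whether either checkpoint was saved.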
Framework versions
- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1