---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B
tags:
- generated_from_trainer
metrics:
- f1
model-index:
- name: results
  results: []
---

# results

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4335
- F1: 0.8190

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 70
- num_epochs: 4
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch  | Step | Validation Loss | F1     |
|:-------------:|:------:|:----:|:---------------:|:------:|
| 1.0065        | 0.0684 | 16   | 0.9778          | 0.5401 |
| 1.005         | 0.1368 | 32   | 0.9209          | 0.5508 |
| 0.8912        | 0.2051 | 48   | 0.8197          | 0.5698 |
| 0.8738        | 0.2735 | 64   | 0.7217          | 0.5946 |
| 0.6965        | 0.3419 | 80   | 0.6439          | 0.6593 |
| 0.6463        | 0.4103 | 96   | 0.6081          | 0.6828 |
| 0.6129        | 0.4786 | 112  | 0.5541          | 0.7278 |
| 0.5931        | 0.5470 | 128  | 0.5693          | 0.6868 |
| 0.5643        | 0.6154 | 144  | 0.5290          | 0.7454 |
| 0.5601        | 0.6838 | 160  | 0.5402          | 0.7159 |
| 0.5259        | 0.7521 | 176  | 0.5021          | 0.7613 |
| 0.5361        | 0.8205 | 192  | 0.5051          | 0.7653 |
| 0.5235        | 0.8889 | 208  | 0.4816          | 0.7747 |
| 0.526         | 0.9573 | 224  | 0.4726          | 0.7765 |
| 0.486         | 1.0256 | 240  | 0.4786          | 0.7712 |
| 0.4757        | 1.0940 | 256  | 0.4669          | 0.7804 |
| 0.4635        | 1.1624 | 272  | 0.4682          | 0.7891 |
| 0.4691        | 1.2308 | 288  | 0.4561          | 0.7898 |
| 0.4682        | 1.2991 | 304  | 0.4818          | 0.7542 |
| 0.4229        | 1.3675 | 320  | 0.4704          | 0.7831 |
| 0.4192        | 1.4359 | 336  | 0.4544          | 0.7964 |
| 0.4249        | 1.5043 | 352  | 0.4493          | 0.7928 |
| 0.4339        | 1.5726 | 368  | 0.4597          | 0.7921 |
| 0.4513        | 1.6410 | 384  | 0.4478          | 0.7931 |
| 0.4553        | 1.7094 | 400  | 0.4474          | 0.7916 |
| 0.42          | 1.7778 | 416  | 0.4473          | 0.7917 |
| 0.4194        | 1.8462 | 432  | 0.4416          | 0.8002 |
| 0.4265        | 1.9145 | 448  | 0.4370          | 0.8054 |
| 0.4216        | 1.9829 | 464  | 0.4324          | 0.8117 |
| 0.3869        | 2.0513 | 480  | 0.4369          | 0.8010 |
| 0.3617        | 2.1197 | 496  | 0.4424          | 0.8096 |
| 0.3773        | 2.1880 | 512  | 0.4558          | 0.8042 |
| 0.3852        | 2.2564 | 528  | 0.4311          | 0.8109 |
| 0.3726        | 2.3248 | 544  | 0.4403          | 0.8096 |
| 0.3586        | 2.3932 | 560  | 0.4381          | 0.8125 |
| 0.3756        | 2.4615 | 576  | 0.4337          | 0.8109 |
| 0.3765        | 2.5299 | 592  | 0.4341          | 0.8110 |
| 0.4104        | 2.5983 | 608  | 0.4263          | 0.8120 |
| 0.3704        | 2.6667 | 624  | 0.4404          | 0.8063 |
| 0.4087        | 2.7350 | 640  | 0.4271          | 0.8171 |
| 0.3498        | 2.8034 | 656  | 0.4336          | 0.8162 |
| 0.3606        | 2.8718 | 672  | 0.4286          | 0.8180 |
| 0.343         | 2.9402 | 688  | 0.4343          | 0.8039 |
| 0.378         | 3.0085 | 704  | 0.4491          | 0.8018 |
| 0.3199        | 3.0769 | 720  | 0.4344          | 0.8131 |
| 0.3529        | 3.1453 | 736  | 0.4332          | 0.8148 |
| 0.3228        | 3.2137 | 752  | 0.4362          | 0.8170 |
| 0.3061        | 3.2821 | 768  | 0.4390          | 0.8162 |
| 0.3277        | 3.3504 | 784  | 0.4385          | 0.8170 |
| 0.2973        | 3.4188 | 800  | 0.4389          | 0.8143 |
| 0.3162        | 3.4872 | 816  | 0.4348          | 0.8181 |
| 0.3078        | 3.5556 | 832  | 0.4345          | 0.8171 |
| 0.3482        | 3.6239 | 848  | 0.4359          | 0.8125 |
| 0.3243        | 3.6923 | 864  | 0.4336          | 0.8170 |
| 0.3465        | 3.7607 | 880  | 0.4337          | 0.8175 |
| 0.2984        | 3.8291 | 896  | 0.4329          | 0.8194 |
| 0.3159        | 3.8974 | 912  | 0.4332          | 0.8190 |
| 0.3327        | 3.9658 | 928  | 0.4335          | 0.8190 |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
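
The training hyperparameters list a linear scheduler with 70 warmup steps and a peak learning rate of 2e-05. As a rough illustration of what that schedule implies, the sketch below computes the learning rate at a given optimizer step in plain Python. The function name `linear_lr` is hypothetical, and the total of 936 steps is an assumption inferred from the results table (step 16 is logged at epoch 0.0684, which suggests about 234 steps per epoch over 4 epochs); the card itself does not state it.

```python
def linear_lr(step, base_lr=2e-5, warmup_steps=70, total_steps=936):
    """Learning rate under linear warmup followed by linear decay to zero.

    total_steps=936 is an assumption (about 234 steps/epoch * 4 epochs),
    inferred from the Epoch and Step columns, not stated in the card.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at a few of the logged evaluation steps.
    for s in (16, 64, 80, 464, 928):
        print(f"step {s:4d}: lr = {linear_lr(s):.3e}")
```

Under these assumptions the rate peaks at step 70, between the evaluations logged at steps 64 and 80, and is close to zero by the final logged step 928, which would be consistent with the validation loss and F1 flattening out over the last epoch.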