---
license: llama3
library_name: peft
tags:
- trl
- orpo
- generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B
model-index:
- name: results
  results: []
---

# results

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.2477
- Rewards/chosen: -0.2025
- Rewards/rejected: -0.2831
- Rewards/accuracies: 0.8875
- Rewards/margins: 0.0806
- Logps/rejected: -2.8313
- Logps/chosen: -2.0249
- Logits/rejected: -2.1125
- Logits/chosen: -1.7341
- Nll Loss: 2.2267
- Log Odds Ratio: -0.3842
- Log Odds Chosen: 0.8874

## Model description

More information needed

## Intended uses & limitations

More information needed. A sketch of how to load the adapter for inference is given at the end of this card.

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 10
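The card does not include the training script. For context, ORPO's objective combines a supervised NLL term with a log-odds-ratio term weighted by `beta`, which is why `Nll Loss` and `Log Odds Ratio` are logged as separate columns below. What follows is a minimal sketch of how the hyperparameters above would map onto `trl`'s `ORPOTrainer`; the dataset name, LoRA settings, and sequence lengths are illustrative placeholders, and `beta=0.1` is inferred from the table (the reward columns are exactly 0.1 × the corresponding logps), not stated anywhere in the card.

```python
# Sketch only -- not the original training script. Placeholders are marked.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Placeholder: any preference dataset with prompt/chosen/rejected columns.
dataset = load_dataset("your/preference-dataset")

# Placeholder adapter settings; the card does not state the LoRA config.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

args = ORPOConfig(
    output_dir="results",
    learning_rate=1e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=10,
    seed=42,
    beta=0.1,           # inferred: rewards in the table are 0.1 * logps
    max_length=1024,    # placeholder
    max_prompt_length=512,  # placeholder
    eval_strategy="steps",
    eval_steps=50,      # matches the 50-step eval cadence in the table below
    logging_steps=50,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```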
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
| 5.5008 | 0.2907 | 50 | 5.6262 | -0.5231 | -0.6023 | 0.8250 | 0.0792 | -6.0233 | -5.2314 | -2.0311 | -1.8904 | 5.5816 | -0.4363 | 0.7951 |
| 4.9200 | 0.5814 | 100 | 5.1023 | -0.4828 | -0.5584 | 0.8250 | 0.0756 | -5.5836 | -4.8278 | -2.1181 | -2.0055 | 5.0596 | -0.4441 | 0.7604 |
| 4.6969 | 0.8721 | 150 | 4.6774 | -0.4489 | -0.5171 | 0.8500 | 0.0682 | -5.1705 | -4.4885 | -2.1660 | -2.0410 | 4.6355 | -0.4630 | 0.6879 |
| 3.9492 | 1.1628 | 200 | 3.8213 | -0.3674 | -0.4438 | 0.8750 | 0.0765 | -4.4384 | -3.6736 | -2.2855 | -1.9961 | 3.8167 | -0.4302 | 0.7799 |
| 3.4500 | 1.4535 | 250 | 3.4864 | -0.3342 | -0.4227 | 0.9125 | 0.0885 | -4.2266 | -3.3420 | -2.2557 | -1.8804 | 3.4837 | -0.3910 | 0.9067 |
| 3.2561 | 1.7442 | 300 | 3.2679 | -0.3119 | -0.3956 | 0.9000 | 0.0837 | -3.9559 | -3.1191 | -2.2849 | -1.9045 | 3.2595 | -0.4022 | 0.8630 |
| 3.0471 | 2.0349 | 350 | 3.1300 | -0.3005 | -0.3768 | 0.9000 | 0.0763 | -3.7679 | -3.0046 | -2.2584 | -1.8626 | 3.1220 | -0.4214 | 0.7911 |
| 2.9312 | 2.3256 | 400 | 2.9729 | -0.2816 | -0.3469 | 0.8750 | 0.0653 | -3.4686 | -2.8161 | -2.2750 | -1.8891 | 2.9539 | -0.4551 | 0.6823 |
| 2.6856 | 2.6163 | 450 | 2.8281 | -0.2630 | -0.3133 | 0.8375 | 0.0503 | -3.1333 | -2.6298 | -2.2692 | -1.8896 | 2.8010 | -0.5058 | 0.5330 |
| 2.7304 | 2.9070 | 500 | 2.7191 | -0.2493 | -0.2893 | 0.7875 | 0.0400 | -2.8928 | -2.4927 | -2.2573 | -1.8775 | 2.6907 | -0.5448 | 0.4286 |
| 2.6224 | 3.1977 | 550 | 2.6362 | -0.2406 | -0.2809 | 0.7750 | 0.0403 | -2.8089 | -2.4062 | -2.2342 | -1.8500 | 2.6066 | -0.5412 | 0.4341 |
| 2.5026 | 3.4884 | 600 | 2.5858 | -0.2354 | -0.2761 | 0.7750 | 0.0407 | -2.7606 | -2.3537 | -2.2217 | -1.8389 | 2.5555 | -0.5383 | 0.4406 |
| 2.6062 | 3.7791 | 650 | 2.5413 | -0.2315 | -0.2783 | 0.7875 | 0.0468 | -2.7833 | -2.3151 | -2.2000 | -1.8150 | 2.5111 | -0.5115 | 0.5079 |
| 2.3809 | 4.0698 | 700 | 2.4987 | -0.2264 | -0.2712 | 0.8000 | 0.0448 | -2.7123 | -2.2642 | -2.1931 | -1.8048 | 2.4689 | -0.5187 | 0.4884 |
| 2.4307 | 4.3605 | 750 | 2.4637 | -0.2232 | -0.2721 | 0.8000 | 0.0489 | -2.7213 | -2.2323 | -2.1814 | -1.7947 | 2.4350 | -0.5014 | 0.5339 |
| 2.4116 | 4.6512 | 800 | 2.4364 | -0.2203 | -0.2709 | 0.8000 | 0.0506 | -2.7095 | -2.2034 | -2.1728 | -1.7871 | 2.4081 | -0.4942 | 0.5536 |
| 2.3713 | 4.9419 | 850 | 2.4145 | -0.2180 | -0.2716 | 0.8125 | 0.0535 | -2.7157 | -2.1803 | -2.1681 | -1.7788 | 2.3873 | -0.4823 | 0.5863 |
| 2.3885 | 5.2326 | 900 | 2.3904 | -0.2160 | -0.2735 | 0.8250 | 0.0575 | -2.7352 | -2.1603 | -2.1621 | -1.7749 | 2.3630 | -0.4664 | 0.6301 |
| 2.3782 | 5.5233 | 950 | 2.3710 | -0.2141 | -0.2735 | 0.8250 | 0.0595 | -2.7355 | -2.1408 | -2.1522 | -1.7627 | 2.3448 | -0.4588 | 0.6524 |
| 2.2396 | 5.8140 | 1000 | 2.3565 | -0.2130 | -0.2767 | 0.8500 | 0.0637 | -2.7666 | -2.1295 | -2.1432 | -1.7523 | 2.3312 | -0.4429 | 0.6988 |
| 2.2947 | 6.1047 | 1050 | 2.3363 | -0.2109 | -0.2761 | 0.8625 | 0.0652 | -2.7607 | -2.1086 | -2.1430 | -1.7592 | 2.3118 | -0.4374 | 0.7162 |
| 2.2506 | 6.3953 | 1100 | 2.3212 | -0.2094 | -0.2765 | 0.8625 | 0.0671 | -2.7653 | -2.0941 | -2.1394 | -1.7585 | 2.2969 | -0.4304 | 0.7376 |
| 2.2421 | 6.6860 | 1150 | 2.3090 | -0.2084 | -0.2781 | 0.8625 | 0.0697 | -2.7808 | -2.0840 | -2.1324 | -1.7495 | 2.2853 | -0.4213 | 0.7657 |
| 2.2733 | 6.9767 | 1200 | 2.2972 | -0.2072 | -0.2788 | 0.8750 | 0.0715 | -2.7878 | -2.0724 | -2.1276 | -1.7452 | 2.2739 | -0.4147 | 0.7865 |
| 2.2690 | 7.2674 | 1250 | 2.2879 | -0.2064 | -0.2803 | 0.8750 | 0.0738 | -2.8025 | -2.0641 | -2.1251 | -1.7449 | 2.2651 | -0.4067 | 0.8118 |
| 2.1922 | 7.5581 | 1300 | 2.2843 | -0.2056 | -0.2779 | 0.8750 | 0.0723 | -2.7791 | -2.0565 | -2.1274 | -1.7480 | 2.2614 | -0.4121 | 0.7953 |
| 2.1969 | 7.8488 | 1350 | 2.2745 | -0.2050 | -0.2797 | 0.8750 | 0.0748 | -2.7975 | -2.0497 | -2.1249 | -1.7453 | 2.2520 | -0.4034 | 0.8228 |
| 2.1968 | 8.1395 | 1400 | 2.2674 | -0.2043 | -0.2805 | 0.8750 | 0.0762 | -2.8054 | -2.0433 | -2.1219 | -1.7424 | 2.2452 | -0.3987 | 0.8385 |
| 2.2984 | 8.4302 | 1450 | 2.2618 | -0.2038 | -0.2810 | 0.8875 | 0.0772 | -2.8104 | -2.0379 | -2.1210 | -1.7416 | 2.2398 | -0.3952 | 0.8501 |
| 2.2809 | 8.7209 | 1500 | 2.2636 | -0.2041 | -0.2852 | 0.9125 | 0.0811 | -2.8523 | -2.0408 | -2.1185 | -1.7341 | 2.2419 | -0.3823 | 0.8918 |
| 2.2605 | 9.0116 | 1550 | 2.2537 | -0.2032 | -0.2833 | 0.9000 | 0.0801 | -2.8331 | -2.0316 | -2.1153 | -1.7363 | 2.2324 | -0.3857 | 0.8816 |
| 2.1305 | 9.3023 | 1600 | 2.2505 | -0.2028 | -0.2832 | 0.9000 | 0.0804 | -2.8322 | -2.0279 | -2.1129 | -1.7336 | 2.2294 | -0.3849 | 0.8848 |
| 2.1614 | 9.5930 | 1650 | 2.2487 | -0.2026 | -0.2833 | 0.9000 | 0.0807 | -2.8330 | -2.0261 | -2.1129 | -1.7343 | 2.2276 | -0.3841 | 0.8878 |
| 2.1278 | 9.8837 | 1700 | 2.2478 | -0.2025 | -0.2832 | 0.8875 | 0.0807 | -2.8322 | -2.0250 | -2.1129 | -1.7345 | 2.2268 | -0.3839 | 0.8882 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1
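## How to use

Since this repo contains a PEFT adapter rather than full model weights, inference requires loading the adapter on top of the base model. A minimal sketch with `peft`; the `adapter_id` below is a placeholder for wherever the adapter weights are actually published:

```python
# Sketch: load the LoRA adapter on top of the base model for inference.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "your-username/results"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the base model is gated under the Llama 3 license, so access must be granted on the Hub before it can be downloaded.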