---
license: apache-2.0
base_model: TheBloke/OpenHermes-2-Mistral-7B-GPTQ
tags:
- generated_from_trainer
model-index:
- name: covid3-mistral-dpo-gptq
  results: []
---

# covid3-mistral-dpo-gptq

This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.2375
- Rewards/chosen: -2.8294
- Rewards/rejected: -1.7077
- Rewards/accuracies: 0.25
- Rewards/margins: -1.1217
- Logps/rejected: -24.0692
- Logps/chosen: -35.7956
- Logits/rejected: -2.8653
- Logits/chosen: -2.8666

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- training_steps: 1000
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6957 | 0.0 | 10 | 0.6940 | 0.0226 | 0.0252 | 0.375 | -0.0026 | -6.7409 | -7.2761 | -2.8058 | -2.8067 |
| 0.6925 | 0.0 | 20 | 0.6971 | 0.0317 | 0.0422 | 0.3333 | -0.0105 | -6.5702 | -7.1844 | -2.8074 | -2.8082 |
| 0.6876 | 0.01 | 30 | 0.6995 | 0.0202 | 0.0373 | 0.375 | -0.0170 | -6.6197 | -7.2995 | -2.8093 | -2.8102 |
| 0.6961 | 0.01 | 40 | 0.6982 | 0.0054 | 0.0189 | 0.375 | -0.0135 | -6.8034 | -7.4475 | -2.8113 | -2.8122 |
| 0.6863 | 0.01 | 50 | 0.6998 | 0.0019 | 0.0188 | 0.3333 | -0.0169 | -6.8044 | -7.4830 | -2.8121 | -2.8130 |
| 0.6965 | 0.01 | 60 | 0.6977 | 0.0119 | 0.0251 | 0.2917 | -0.0132 | -6.7419 | -7.3829 | -2.8120 | -2.8129 |
| 0.7209 | 0.01 | 70 | 0.6993 | 0.0336 | 0.0497 | 0.3333 | -0.0161 | -6.4949 | -7.1656 | -2.8103 | -2.8112 |
| 0.6988 | 0.01 | 80 | 0.6984 | 0.0294 | 0.0432 | 0.375 | -0.0138 | -6.5605 | -7.2080 | -2.8085 | -2.8094 |
| 0.6913 | 0.01 | 90 | 0.6981 | 0.0216 | 0.0342 | 0.4167 | -0.0126 | -6.6501 | -7.2856 | -2.8084 | -2.8093 |
| 0.6641 | 0.02 | 100 | 0.7030 | 0.0493 | 0.0702 | 0.3333 | -0.0209 | -6.2907 | -7.0088 | -2.8098 | -2.8107 |
| 0.7083 | 0.02 | 110 | 0.7072 | 0.0575 | 0.0870 | 0.3333 | -0.0295 | -6.1225 | -6.9268 | -2.8105 | -2.8114 |
| 0.6307 | 0.02 | 120 | 0.7128 | 0.0727 | 0.1120 | 0.3333 | -0.0393 | -5.8727 | -6.7749 | -2.8105 | -2.8114 |
| 0.7216 | 0.02 | 130 | 0.7158 | 0.0814 | 0.1250 | 0.3333 | -0.0436 | -5.7422 | -6.6879 | -2.8108 | -2.8117 |
| 0.7189 | 0.02 | 140 | 0.7135 | 0.0948 | 0.1343 | 0.3333 | -0.0395 | -5.6489 | -6.5536 | -2.8099 | -2.8108 |
| 0.7177 | 0.03 | 150 | 0.7128 | 0.0954 | 0.1335 | 0.3333 | -0.0381 | -5.6579 | -6.5481 | -2.8100 | -2.8109 |
| 0.639 | 0.03 | 160 | 0.7232 | 0.0823 | 0.1404 | 0.3333 | -0.0581 | -5.5880 | -6.6785 | -2.8135 | -2.8144 |
| 0.7128 | 0.03 | 170 | 0.7361 | 0.0571 | 0.1393 | 0.375 | -0.0822 | -5.5991 | -6.9308 | -2.8165 | -2.8174 |
| 0.709 | 0.03 | 180 | 0.7361 | 0.0690 | 0.1519 | 0.375 | -0.0829 | -5.4739 | -6.8120 | -2.8159 | -2.8168 |
| 0.6167 | 0.03 | 190 | 0.7483 | 0.0424 | 0.1461 | 0.375 | -0.1038 | -5.5311 | -7.0782 | -2.8180 | -2.8189 |
| 0.7521 | 0.03 | 200 | 0.7589 | 0.0180 | 0.1360 | 0.3333 | -0.1180 | -5.6325 | -7.3223 | -2.8199 | -2.8209 |
| 0.6204 | 0.04 | 210 | 0.7726 | -0.0220 | 0.1130 | 0.375 | -0.1350 | -5.8622 | -7.7214 | -2.8217 | -2.8227 |
| 0.6578 | 0.04 | 220 | 0.7839 | -0.0525 | 0.0994 | 0.3333 | -0.1520 | -5.9980 | -8.0273 | -2.8232 | -2.8242 |
| 0.7633 | 0.04 | 230 | 0.7868 | -0.0613 | 0.0902 | 0.375 | -0.1516 | -6.0903 | -8.1152 | -2.8235 | -2.8245 |
| 0.7391 | 0.04 | 240 | 0.7917 | -0.0742 | 0.0850 | 0.375 | -0.1592 | -6.1429 | -8.2441 | -2.8246 | -2.8256 |
| 0.6759 | 0.04 | 250 | 0.8023 | -0.1101 | 0.0656 | 0.3333 | -0.1757 | -6.3368 | -8.6031 | -2.8262 | -2.8272 |
| 0.6768 | 0.04 | 260 | 0.8107 | -0.1470 | 0.0326 | 0.375 | -0.1796 | -6.6662 | -8.9720 | -2.8264 | -2.8274 |
| 0.5398 | 0.04 | 270 | 0.8411 | -0.2390 | -0.0341 | 0.375 | -0.2049 | -7.3331 | -9.8918 | -2.8279 | -2.8289 |
| 0.5617 | 0.05 | 280 | 0.8797 | -0.3532 | -0.1075 | 0.375 | -0.2457 | -8.0674 | -11.0340 | -2.8306 | -2.8317 |
| 0.7585 | 0.05 | 290 | 0.9009 | -0.4183 | -0.1540 | 0.375 | -0.2642 | -8.5328 | -11.6845 | -2.8318 | -2.8329 |
| 0.4971 | 0.05 | 300 | 0.9602 | -0.5793 | -0.2520 | 0.375 | -0.3274 | -9.5121 | -13.2952 | -2.8362 | -2.8373 |
| 0.5759 | 0.05 | 310 | 1.0568 | -0.8155 | -0.3982 | 0.375 | -0.4173 | -10.9749 | -15.6568 | -2.8426 | -2.8437 |
| 0.451 | 0.05 | 320 | 1.1605 | -1.0527 | -0.5383 | 0.375 | -0.5144 | -12.3754 | -18.0287 | -2.8482 | -2.8493 |
| 1.4199 | 0.06 | 330 | 1.1756 | -1.1393 | -0.6287 | 0.375 | -0.5106 | -13.2791 | -18.8948 | -2.8505 | -2.8516 |
| 0.6853 | 0.06 | 340 | 1.1875 | -1.1840 | -0.6936 | 0.375 | -0.4904 | -13.9281 | -19.3416 | -2.8530 | -2.8541 |
| 0.3956 | 0.06 | 350 | 1.2550 | -1.2944 | -0.7654 | 0.375 | -0.5291 | -14.6460 | -20.4463 | -2.8568 | -2.8579 |
| 0.8692 | 0.06 | 360 | 1.3093 | -1.4107 | -0.8644 | 0.375 | -0.5463 | -15.6363 | -21.6084 | -2.8602 | -2.8613 |
| 1.4214 | 0.06 | 370 | 1.2759 | -1.3853 | -0.8782 | 0.375 | -0.5071 | -15.7746 | -21.3549 | -2.8579 | -2.8590 |
| 0.6163 | 0.06 | 380 | 1.3124 | -1.4537 | -0.9274 | 0.375 | -0.5263 | -16.2665 | -22.0389 | -2.8580 | -2.8591 |
| 0.586 | 0.07 | 390 | 1.4060 | -1.6073 | -1.0263 | 0.375 | -0.5810 | -17.2554 | -23.5750 | -2.8594 | -2.8605 |
| 1.7565 | 0.07 | 400 | 1.3869 | -1.5469 | -0.9534 | 0.375 | -0.5936 | -16.5259 | -22.9709 | -2.8611 | -2.8623 |
| 0.749 | 0.07 | 410 | 1.4037 | -1.5658 | -0.9400 | 0.375 | -0.6258 | -16.3927 | -23.1602 | -2.8615 | -2.8626 |
| 0.7682 | 0.07 | 420 | 1.4444 | -1.6154 | -0.9575 | 0.375 | -0.6578 | -16.5678 | -23.6556 | -2.8618 | -2.8630 |
| 0.5276 | 0.07 | 430 | 1.5646 | -1.7833 | -1.0365 | 0.375 | -0.7467 | -17.3576 | -25.3345 | -2.8645 | -2.8658 |
| 1.2132 | 0.07 | 440 | 1.6229 | -1.8510 | -1.0641 | 0.375 | -0.7869 | -17.6336 | -26.0119 | -2.8657 | -2.8670 |
| 1.0323 | 0.07 | 450 | 1.6468 | -1.8672 | -1.0528 | 0.3333 | -0.8143 | -17.5208 | -26.1736 | -2.8655 | -2.8668 |
| 1.1453 | 0.08 | 460 | 1.6741 | -1.8759 | -1.0266 | 0.3333 | -0.8494 | -17.2580 | -26.2613 | -2.8659 | -2.8672 |
| 1.526 | 0.08 | 470 | 1.6465 | -1.8347 | -1.0076 | 0.3333 | -0.8271 | -17.0681 | -25.8488 | -2.8671 | -2.8684 |
| 1.1323 | 0.08 | 480 | 1.5543 | -1.7064 | -0.9557 | 0.3333 | -0.7507 | -16.5494 | -24.5655 | -2.8682 | -2.8694 |
| 1.0389 | 0.08 | 490 | 1.5824 | -1.7717 | -1.0002 | 0.3333 | -0.7715 | -16.9945 | -25.2190 | -2.8694 | -2.8706 |
| 0.8626 | 0.08 | 500 | 1.6038 | -1.8376 | -1.0545 | 0.3333 | -0.7831 | -17.5374 | -25.8781 | -2.8693 | -2.8706 |
| 0.8392 | 0.09 | 510 | 1.6952 | -1.9873 | -1.1387 | 0.3333 | -0.8486 | -18.3790 | -27.3744 | -2.8697 | -2.8710 |
| 0.6528 | 0.09 | 520 | 1.7895 | -2.1144 | -1.1842 | 0.25 | -0.9302 | -18.8344 | -28.6457 | -2.8693 | -2.8707 |
| 1.3843 | 0.09 | 530 | 1.8088 | -2.1501 | -1.2043 | 0.25 | -0.9458 | -19.0354 | -29.0030 | -2.8696 | -2.8710 |
| 1.296 | 0.09 | 540 | 1.7833 | -2.1309 | -1.2130 | 0.25 | -0.9178 | -19.1228 | -28.8106 | -2.8691 | -2.8705 |
| 0.7343 | 0.09 | 550 | 1.8244 | -2.1833 | -1.2404 | 0.25 | -0.9428 | -19.3968 | -29.3344 | -2.8676 | -2.8689 |
| 1.089 | 0.09 | 560 | 1.8288 | -2.1789 | -1.2313 | 0.25 | -0.9476 | -19.3059 | -29.2912 | -2.8690 | -2.8704 |
| 0.8322 | 0.1 | 570 | 1.9009 | -2.2715 | -1.2811 | 0.25 | -0.9903 | -19.8038 | -30.2165 | -2.8697 | -2.8711 |
| 0.8684 | 0.1 | 580 | 1.9310 | -2.3144 | -1.3151 | 0.25 | -0.9993 | -20.1433 | -30.6454 | -2.8722 | -2.8736 |
| 0.9827 | 0.1 | 590 | 1.9558 | -2.3309 | -1.3222 | 0.25 | -1.0087 | -20.2145 | -30.8112 | -2.8740 | -2.8754 |
| 0.5176 | 0.1 | 600 | 1.9731 | -2.3665 | -1.3574 | 0.25 | -1.0091 | -20.5666 | -31.1672 | -2.8754 | -2.8768 |
| 1.0789 | 0.1 | 610 | 2.0276 | -2.4550 | -1.4152 | 0.25 | -1.0398 | -21.1444 | -32.0516 | -2.8756 | -2.8769 |
| 0.8444 | 0.1 | 620 | 2.1331 | -2.6253 | -1.5121 | 0.25 | -1.1132 | -22.1132 | -33.7550 | -2.8726 | -2.8739 |
| 1.6609 | 0.1 | 630 | 2.1160 | -2.6511 | -1.5573 | 0.25 | -1.0938 | -22.5657 | -34.0127 | -2.8740 | -2.8753 |
| 1.3086 | 0.11 | 640 | 2.0791 | -2.6721 | -1.6152 | 0.25 | -1.0569 | -23.1446 | -34.2231 | -2.8749 | -2.8762 |
| 1.0659 | 0.11 | 650 | 2.0520 | -2.6575 | -1.6184 | 0.25 | -1.0391 | -23.1760 | -34.0763 | -2.8766 | -2.8778 |
| 1.3081 | 0.11 | 660 | 2.0481 | -2.6650 | -1.6332 | 0.25 | -1.0318 | -23.3247 | -34.1520 | -2.8756 | -2.8769 |
| 0.769 | 0.11 | 670 | 2.0971 | -2.7165 | -1.6666 | 0.25 | -1.0500 | -23.6581 | -34.6672 | -2.8745 | -2.8758 |
| 1.1385 | 0.11 | 680 | 2.1554 | -2.7771 | -1.7021 | 0.25 | -1.0750 | -24.0137 | -35.2731 | -2.8735 | -2.8748 |
| 1.0306 | 0.12 | 690 | 2.2076 | -2.8501 | -1.7587 | 0.25 | -1.0914 | -24.5793 | -36.0025 | -2.8714 | -2.8727 |
| 1.3893 | 0.12 | 700 | 2.2299 | -2.8955 | -1.7944 | 0.25 | -1.1010 | -24.9367 | -36.4564 | -2.8682 | -2.8695 |
| 2.2234 | 0.12 | 710 | 2.2237 | -2.9162 | -1.8126 | 0.25 | -1.1036 | -25.1184 | -36.6639 | -2.8654 | -2.8667 |
| 0.4678 | 0.12 | 720 | 2.2379 | -2.9096 | -1.7873 | 0.25 | -1.1223 | -24.8652 | -36.5974 | -2.8658 | -2.8671 |
| 0.8098 | 0.12 | 730 | 2.2768 | -2.9290 | -1.7762 | 0.25 | -1.1529 | -24.7543 | -36.7922 | -2.8652 | -2.8665 |
| 1.8821 | 0.12 | 740 | 2.2740 | -2.9198 | -1.7623 | 0.25 | -1.1574 | -24.6159 | -36.6994 | -2.8641 | -2.8654 |
| 1.095 | 0.12 | 750 | 2.2689 | -2.8862 | -1.7174 | 0.25 | -1.1688 | -24.1662 | -36.3637 | -2.8647 | -2.8660 |
| 1.7464 | 0.13 | 760 | 2.2488 | -2.8828 | -1.7320 | 0.25 | -1.1508 | -24.3128 | -36.3297 | -2.8640 | -2.8653 |
| 0.9967 | 0.13 | 770 | 2.2235 | -2.8783 | -1.7502 | 0.25 | -1.1281 | -24.4945 | -36.2849 | -2.8622 | -2.8634 |
| 0.7823 | 0.13 | 780 | 2.2370 | -2.9074 | -1.7744 | 0.25 | -1.1330 | -24.7361 | -36.5759 | -2.8593 | -2.8606 |
| 1.3903 | 0.13 | 790 | 2.2755 | -2.9143 | -1.7485 | 0.25 | -1.1658 | -24.4774 | -36.6450 | -2.8587 | -2.8600 |
| 2.0372 | 0.13 | 800 | 2.2250 | -2.7892 | -1.6505 | 0.25 | -1.1387 | -23.4972 | -35.3939 | -2.8629 | -2.8642 |
| 0.7111 | 0.14 | 810 | 2.2409 | -2.7911 | -1.6348 | 0.25 | -1.1562 | -23.3407 | -35.4124 | -2.8642 | -2.8654 |
| 0.8446 | 0.14 | 820 | 2.2740 | -2.8395 | -1.6646 | 0.25 | -1.1749 | -23.6383 | -35.8968 | -2.8638 | -2.8651 |
| 1.2303 | 0.14 | 830 | 2.2812 | -2.8540 | -1.6787 | 0.25 | -1.1752 | -23.7798 | -36.0417 | -2.8648 | -2.8661 |
| 0.5053 | 0.14 | 840 | 2.2834 | -2.8740 | -1.7065 | 0.25 | -1.1675 | -24.0571 | -36.2418 | -2.8640 | -2.8653 |
| 0.5767 | 0.14 | 850 | 2.3105 | -2.9262 | -1.7448 | 0.25 | -1.1814 | -24.4399 | -36.7635 | -2.8618 | -2.8631 |
| 1.7435 | 0.14 | 860 | 2.3174 | -2.9360 | -1.7519 | 0.25 | -1.1841 | -24.5119 | -36.8619 | -2.8627 | -2.8639 |
| 1.6134 | 0.14 | 870 | 2.3028 | -2.9288 | -1.7659 | 0.25 | -1.1629 | -24.6517 | -36.7902 | -2.8635 | -2.8647 |
| 1.747 | 0.15 | 880 | 2.2686 | -2.8780 | -1.7398 | 0.25 | -1.1382 | -24.3902 | -36.2816 | -2.8658 | -2.8671 |
| 1.3341 | 0.15 | 890 | 2.2555 | -2.8559 | -1.7244 | 0.25 | -1.1315 | -24.2361 | -36.0610 | -2.8673 | -2.8686 |
| 1.884 | 0.15 | 900 | 2.2349 | -2.8291 | -1.7129 | 0.25 | -1.1162 | -24.1211 | -35.7924 | -2.8677 | -2.8689 |
| 0.5031 | 0.15 | 910 | 2.2361 | -2.8327 | -1.7156 | 0.25 | -1.1171 | -24.1479 | -35.8284 | -2.8671 | -2.8684 |
| 0.7273 | 0.15 | 920 | 2.2545 | -2.8595 | -1.7291 | 0.25 | -1.1304 | -24.2834 | -36.0963 | -2.8665 | -2.8678 |
| 1.2208 | 0.15 | 930 | 2.2655 | -2.8756 | -1.7364 | 0.25 | -1.1393 | -24.3561 | -36.2580 | -2.8656 | -2.8669 |
| 0.6928 | 0.16 | 940 | 2.2697 | -2.8817 | -1.7405 | 0.25 | -1.1412 | -24.3971 | -36.3184 | -2.8652 | -2.8665 |
| 2.2099 | 0.16 | 950 | 2.2581 | -2.8642 | -1.7302 | 0.25 | -1.1340 | -24.2945 | -36.1442 | -2.8656 | -2.8668 |
| 1.6883 | 0.16 | 960 | 2.2544 | -2.8575 | -1.7258 | 0.25 | -1.1318 | -24.2503 | -36.0772 | -2.8656 | -2.8668 |
| 1.9968 | 0.16 | 970 | 2.2455 | -2.8405 | -1.7135 | 0.25 | -1.1271 | -24.1270 | -35.9072 | -2.8657 | -2.8670 |
| 2.1044 | 0.16 | 980 | 2.2400 | -2.8308 | -1.7064 | 0.25 | -1.1243 | -24.0569 | -35.8097 | -2.8656 | -2.8668 |
| 0.7207 | 0.17 | 990 | 2.2376 | -2.8286 | -1.7067 | 0.25 | -1.1218 | -24.0597 | -35.7875 | -2.8654 | -2.8667 |
| 1.1388 | 0.17 | 1000 | 2.2375 | -2.8294 | -1.7077 | 0.25 | -1.1217 | -24.0692 | -35.7956 | -2.8653 | -2.8666 |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.0.1+cu117
- Datasets 2.15.0
- Tokenizers 0.15.0
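
### Reading the reward metrics

The sketch below is a small sanity check on the reported evaluation numbers, not part of the training code. It assumes the usual DPO logging convention, which this card does not state explicitly: `Rewards/margins` is the chosen reward minus the rejected reward, and the per-pair loss is `-log(sigmoid(margin))`. The helper names `reward_margin` and `dpo_loss_at_margin` are hypothetical and exist only for this illustration.

```python
import math

# Final evaluation metrics copied from the results table (step 1000).
final_eval = {
    "rewards_chosen": -2.8294,
    "rewards_rejected": -1.7077,
    "rewards_margins": -1.1217,
}


def reward_margin(chosen: float, rejected: float) -> float:
    """Margin under the assumed convention: chosen reward minus rejected reward."""
    return chosen - rejected


def dpo_loss_at_margin(margin: float) -> float:
    """Assumed sigmoid DPO loss for one pair: -log(sigmoid(margin))."""
    # log(1 + exp(-m)) equals -log(sigmoid(m)); stable enough for small |m|.
    return math.log1p(math.exp(-margin))


margin = reward_margin(final_eval["rewards_chosen"], final_eval["rewards_rejected"])
print(f"recomputed margin:   {margin:.4f}")  # -1.1217, matching the table
print(f"loss at zero margin: {dpo_loss_at_margin(0.0):.4f}")  # 0.6931, i.e. ln 2
print(f"loss at mean margin: {dpo_loss_at_margin(margin):.4f}")
```

Under these assumptions, the loss of about 0.69 in the first rows corresponds to a margin near zero, and the negative final margin means the rejected responses scored higher than the chosen ones on average. The reported validation loss is a mean over pairs, so it need not equal the loss evaluated at the mean margin.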