tinyllama-1.1b-sum-dpo-full_LR5e-8_3epochs_old

This model is a fine-tuned version of martimfasantos/tinyllama-1.1b-sum-sft-full_old on the openai/summarize_from_feedback dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6687
  • Rewards/chosen: -0.2893
  • Rewards/rejected: -0.3487
  • Rewards/accuracies: 0.6008
  • Rewards/margins: 0.0594
  • Logps/rejected: -98.0463
  • Logps/chosen: -87.6427
  • Logits/rejected: -2.7624
  • Logits/chosen: -2.7684
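
The reported numbers can be cross-checked against each other. The sketch below assumes the standard sigmoid DPO objective, in which the per-pair loss is -log(sigmoid(margin)) and the margin is the chosen reward minus the rejected reward. The card does not state the loss variant or the beta value, so both are assumptions; the function name is illustrative.

```python
import math


def dpo_loss_from_margin(margin: float) -> float:
    """Per-pair sigmoid DPO loss, -log(sigmoid(margin)) (assumed loss variant)."""
    return math.log1p(math.exp(-margin))


# Final evaluation values copied from the list above.
rewards_chosen = -0.2893
rewards_rejected = -0.3487

# The margin is the difference between the two reward terms.
margin = rewards_chosen - rewards_rejected
print(round(margin, 4))  # 0.0594, matching Rewards/margins

# With a zero margin (policy identical to the reference) the loss is ln 2.
print(round(dpo_loss_from_margin(0.0), 4))  # 0.6931
```

Under this assumption, the reported loss of 0.6687 sits only slightly below ln 2, which suggests the policy moved a small distance from the reference model. The reported loss is an average of per-pair losses, so it need not equal the loss evaluated at the mean margin.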

Model description

More information needed

Intended uses & limitations

More information needed
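
The card does not document usage. As a minimal sketch, assuming the checkpoint loads with the standard Transformers causal-LM classes and that the repository id is the one named on this card, text could be generated as follows. The helper name `generate_text` and the `max_new_tokens` default are illustrative, and the prompt format the model expects is not documented here, so the prompt is passed through unchanged.

```python
MODEL_ID = "martimfasantos/tinyllama-1.1b-sum-dpo-full_LR5e-8_3epochs_old"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for `prompt` (hypothetical helper, not from the card)."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_text(...)` downloads the weights on first use, so it is best run on a machine with enough memory for a 1.1B-parameter model in F32.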

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-08
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
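
To illustrate how the learning rate, warmup ratio, and cosine scheduler interact, the sketch below computes the learning rate at a given optimizer step using the usual linear-warmup-then-cosine-decay shape. The total step count of 17,412 is an estimate derived from the results table (step 17400 at epoch 2.9979), not a value stated on the card, and `lr_at` is an illustrative name.

```python
import math


def lr_at(step: int, base_lr: float = 5e-8, total_steps: int = 17_412,
          warmup_ratio: float = 0.1) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 871, 1742, 9577, 17_412):
    print(step, lr_at(step))
```

With these assumed values the peak of 5e-08 is reached around step 1742 and the rate decays towards zero by the end of the third epoch.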

Training results

Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen
0.6931 0.0172 100 0.6932 -0.0000 0.0001 0.4851 -0.0001 -63.1729 -58.7138 -3.1573 -3.1630
0.6931 0.0345 200 0.6932 -0.0000 0.0001 0.4730 -0.0001 -63.1741 -58.7133 -3.1575 -3.1631
0.6932 0.0517 300 0.6932 0.0001 0.0001 0.4942 -0.0000 -63.1702 -58.7051 -3.1574 -3.1631
0.6932 0.0689 400 0.6932 0.0001 0.0001 0.4884 -0.0001 -63.1678 -58.7049 -3.1574 -3.1631
0.6931 0.0861 500 0.6932 -0.0000 0.0001 0.4737 -0.0001 -63.1733 -58.7135 -3.1577 -3.1633
0.693 0.1034 600 0.6932 0.0001 0.0001 0.4923 -0.0000 -63.1656 -58.7003 -3.1575 -3.1632
0.6932 0.1206 700 0.6931 0.0002 0.0002 0.5100 0.0001 -63.1644 -58.6897 -3.1574 -3.1631
0.6929 0.1378 800 0.6932 0.0002 0.0003 0.4668 -0.0001 -63.1484 -58.6918 -3.1571 -3.1627
0.6931 0.1551 900 0.6931 0.0003 0.0002 0.5058 0.0000 -63.1556 -58.6837 -3.1569 -3.1625
0.6931 0.1723 1000 0.6931 0.0004 0.0002 0.5051 0.0001 -63.1557 -58.6755 -3.1567 -3.1624
0.6929 0.1895 1100 0.6931 0.0005 0.0004 0.5160 0.0001 -63.1450 -58.6627 -3.1565 -3.1621
0.6927 0.2068 1200 0.6930 0.0007 0.0005 0.5160 0.0002 -63.1294 -58.6411 -3.1560 -3.1616
0.6929 0.2240 1300 0.6930 0.0009 0.0006 0.5230 0.0003 -63.1224 -58.6264 -3.1548 -3.1605
0.692 0.2412 1400 0.6929 0.0010 0.0005 0.5407 0.0005 -63.1333 -58.6153 -3.1542 -3.1598
0.6918 0.2584 1500 0.6929 0.0011 0.0006 0.5351 0.0005 -63.1157 -58.5976 -3.1532 -3.1588
0.6921 0.2757 1600 0.6928 0.0015 0.0007 0.5611 0.0008 -63.1099 -58.5639 -3.1517 -3.1574
0.692 0.2929 1700 0.6926 0.0018 0.0008 0.5662 0.0010 -63.1046 -58.5339 -3.1502 -3.1558
0.6904 0.3101 1800 0.6926 0.0018 0.0007 0.5699 0.0012 -63.1148 -58.5277 -3.1485 -3.1542
0.691 0.3274 1900 0.6924 0.0018 0.0003 0.5581 0.0015 -63.1539 -58.5341 -3.1473 -3.1529
0.6909 0.3446 2000 0.6923 0.0020 0.0002 0.5723 0.0018 -63.1632 -58.5155 -3.1452 -3.1509
0.6903 0.3618 2100 0.6921 0.0019 -0.0002 0.5697 0.0021 -63.1963 -58.5193 -3.1434 -3.1490
0.6884 0.3790 2200 0.6920 0.0018 -0.0006 0.5757 0.0024 -63.2422 -58.5311 -3.1407 -3.1464
0.6876 0.3963 2300 0.6918 0.0015 -0.0012 0.5769 0.0027 -63.3015 -58.5638 -3.1381 -3.1437
0.6898 0.4135 2400 0.6917 0.0012 -0.0018 0.5625 0.0030 -63.3619 -58.5900 -3.1348 -3.1404
0.6905 0.4307 2500 0.6915 0.0007 -0.0028 0.5743 0.0035 -63.4609 -58.6445 -3.1321 -3.1378
0.6864 0.4480 2600 0.6913 -0.0001 -0.0039 0.5732 0.0038 -63.5690 -58.7216 -3.1295 -3.1352
0.6866 0.4652 2700 0.6911 -0.0014 -0.0057 0.5709 0.0043 -63.7456 -58.8490 -3.1270 -3.1327
0.6869 0.4824 2800 0.6909 -0.0025 -0.0071 0.5750 0.0046 -63.8913 -58.9609 -3.1248 -3.1305
0.6888 0.4997 2900 0.6907 -0.0042 -0.0093 0.5855 0.0051 -64.1121 -59.1289 -3.1214 -3.1271
0.6885 0.5169 3000 0.6905 -0.0061 -0.0118 0.5804 0.0057 -64.3621 -59.3245 -3.1180 -3.1236
0.686 0.5341 3100 0.6904 -0.0071 -0.0130 0.5857 0.0059 -64.4774 -59.4209 -3.1160 -3.1217
0.6869 0.5513 3200 0.6902 -0.0095 -0.0159 0.5878 0.0064 -64.7659 -59.6584 -3.1119 -3.1176
0.6834 0.5686 3300 0.6900 -0.0122 -0.0190 0.5809 0.0068 -65.0782 -59.9308 -3.1072 -3.1130
0.6795 0.5858 3400 0.6897 -0.0147 -0.0221 0.5881 0.0074 -65.3901 -60.1840 -3.1036 -3.1093
0.6848 0.6030 3500 0.6895 -0.0171 -0.0250 0.5897 0.0079 -65.6826 -60.4227 -3.1007 -3.1064
0.6834 0.6203 3600 0.6893 -0.0196 -0.0280 0.5857 0.0084 -65.9796 -60.6710 -3.0969 -3.1026
0.6788 0.6375 3700 0.6890 -0.0219 -0.0308 0.5813 0.0089 -66.2620 -60.8999 -3.0922 -3.0979
0.6825 0.6547 3800 0.6888 -0.0253 -0.0348 0.5904 0.0095 -66.6623 -61.2404 -3.0889 -3.0946
0.6791 0.6720 3900 0.6885 -0.0287 -0.0389 0.5943 0.0103 -67.0740 -61.5806 -3.0858 -3.0915
0.6816 0.6892 4000 0.6881 -0.0328 -0.0438 0.5897 0.0110 -67.5621 -61.9903 -3.0815 -3.0872
0.6749 0.7064 4100 0.6879 -0.0340 -0.0456 0.5901 0.0116 -67.7361 -62.1084 -3.0755 -3.0812
0.6839 0.7236 4200 0.6877 -0.0364 -0.0484 0.5964 0.0120 -68.0226 -62.3546 -3.0712 -3.0769
0.6827 0.7409 4300 0.6876 -0.0377 -0.0500 0.5897 0.0123 -68.1844 -62.4844 -3.0675 -3.0732
0.6815 0.7581 4400 0.6873 -0.0402 -0.0531 0.5950 0.0129 -68.4913 -62.7319 -3.0645 -3.0702
0.6829 0.7753 4500 0.6870 -0.0443 -0.0578 0.5939 0.0136 -68.9615 -63.1372 -3.0609 -3.0666
0.6747 0.7926 4600 0.6868 -0.0476 -0.0617 0.5915 0.0141 -69.3541 -63.4724 -3.0573 -3.0630
0.6828 0.8098 4700 0.6864 -0.0518 -0.0669 0.5936 0.0151 -69.8725 -63.8948 -3.0542 -3.0599
0.6821 0.8270 4800 0.6861 -0.0560 -0.0717 0.5939 0.0156 -70.3462 -64.3141 -3.0504 -3.0562
0.6767 0.8442 4900 0.6858 -0.0602 -0.0766 0.5948 0.0164 -70.8421 -64.7344 -3.0474 -3.0532
0.6765 0.8615 5000 0.6856 -0.0618 -0.0786 0.5934 0.0168 -71.0357 -64.8873 -3.0427 -3.0484
0.6792 0.8787 5100 0.6853 -0.0665 -0.0841 0.5936 0.0176 -71.5851 -65.3618 -3.0385 -3.0443
0.6753 0.8959 5200 0.6851 -0.0697 -0.0877 0.5929 0.0180 -71.9544 -65.6814 -3.0354 -3.0413
0.6749 0.9132 5300 0.6849 -0.0732 -0.0918 0.5922 0.0186 -72.3637 -66.0356 -3.0313 -3.0370
0.6762 0.9304 5400 0.6846 -0.0747 -0.0940 0.5932 0.0192 -72.5755 -66.1839 -3.0282 -3.0340
0.6757 0.9476 5500 0.6845 -0.0761 -0.0955 0.5962 0.0194 -72.7312 -66.3251 -3.0247 -3.0305
0.6795 0.9649 5600 0.6844 -0.0758 -0.0955 0.6018 0.0197 -72.7251 -66.2887 -3.0221 -3.0279
0.6736 0.9821 5700 0.6842 -0.0786 -0.0989 0.6008 0.0202 -73.0675 -66.5758 -3.0181 -3.0239
0.6701 0.9993 5800 0.6839 -0.0831 -0.1040 0.6029 0.0209 -73.5774 -67.0210 -3.0139 -3.0198
0.6725 1.0165 5900 0.6836 -0.0839 -0.1053 0.6039 0.0214 -73.7143 -67.1023 -3.0090 -3.0148
0.6742 1.0338 6000 0.6834 -0.0850 -0.1069 0.6043 0.0219 -73.8738 -67.2139 -3.0056 -3.0114
0.6712 1.0510 6100 0.6833 -0.0878 -0.1100 0.6046 0.0223 -74.1846 -67.4874 -3.0008 -3.0066
0.675 1.0682 6200 0.6831 -0.0903 -0.1131 0.6043 0.0228 -74.4897 -67.7427 -2.9969 -3.0027
0.6766 1.0855 6300 0.6828 -0.0936 -0.1170 0.6036 0.0234 -74.8753 -68.0717 -2.9936 -2.9994
0.6754 1.1027 6400 0.6826 -0.0972 -0.1212 0.6094 0.0240 -75.2993 -68.4308 -2.9896 -2.9954
0.6769 1.1199 6500 0.6823 -0.0999 -0.1244 0.6059 0.0246 -75.6244 -68.6977 -2.9850 -2.9909
0.6764 1.1371 6600 0.6821 -0.1041 -0.1293 0.6076 0.0252 -76.1111 -69.1214 -2.9809 -2.9867
0.6734 1.1544 6700 0.6817 -0.1081 -0.1341 0.6022 0.0260 -76.5930 -69.5220 -2.9770 -2.9828
0.6654 1.1716 6800 0.6814 -0.1138 -0.1407 0.6053 0.0268 -77.2464 -70.0935 -2.9716 -2.9774
0.679 1.1888 6900 0.6812 -0.1168 -0.1441 0.6090 0.0272 -77.5858 -70.3942 -2.9678 -2.9737
0.6652 1.2061 7000 0.6809 -0.1215 -0.1495 0.6057 0.0280 -78.1280 -70.8571 -2.9641 -2.9700
0.6668 1.2233 7100 0.6808 -0.1224 -0.1507 0.6071 0.0283 -78.2466 -70.9482 -2.9603 -2.9661
0.6655 1.2405 7200 0.6806 -0.1254 -0.1542 0.6083 0.0288 -78.5984 -71.2532 -2.9555 -2.9614
0.6783 1.2578 7300 0.6804 -0.1273 -0.1565 0.6087 0.0292 -78.8264 -71.4380 -2.9521 -2.9580
0.6703 1.2750 7400 0.6802 -0.1295 -0.1593 0.6071 0.0297 -79.1055 -71.6647 -2.9497 -2.9555
0.6709 1.2922 7500 0.6802 -0.1302 -0.1601 0.6080 0.0299 -79.1917 -71.7369 -2.9461 -2.9519
0.6774 1.3094 7600 0.6799 -0.1334 -0.1639 0.6097 0.0305 -79.5669 -72.0519 -2.9409 -2.9468
0.6667 1.3267 7700 0.6796 -0.1379 -0.1690 0.6078 0.0311 -80.0833 -72.5013 -2.9364 -2.9423
0.6631 1.3439 7800 0.6793 -0.1427 -0.1747 0.6076 0.0321 -80.6536 -72.9770 -2.9325 -2.9384
0.6734 1.3611 7900 0.6790 -0.1469 -0.1797 0.6094 0.0327 -81.1455 -73.4038 -2.9286 -2.9346
0.6646 1.3784 8000 0.6786 -0.1515 -0.1852 0.6092 0.0337 -81.6967 -73.8575 -2.9249 -2.9308
0.6717 1.3956 8100 0.6783 -0.1560 -0.1904 0.6111 0.0344 -82.2197 -74.3164 -2.9212 -2.9271
0.6674 1.4128 8200 0.6779 -0.1608 -0.1962 0.6087 0.0354 -82.7997 -74.7964 -2.9181 -2.9240
0.6659 1.4300 8300 0.6779 -0.1625 -0.1979 0.6087 0.0354 -82.9745 -74.9664 -2.9143 -2.9202
0.6642 1.4473 8400 0.6777 -0.1647 -0.2007 0.6092 0.0360 -83.2477 -75.1821 -2.9110 -2.9169
0.6579 1.4645 8500 0.6775 -0.1650 -0.2013 0.6080 0.0363 -83.3130 -75.2138 -2.9067 -2.9125
0.6725 1.4817 8600 0.6774 -0.1676 -0.2043 0.6101 0.0367 -83.6107 -75.4718 -2.9030 -2.9089
0.6646 1.4990 8700 0.6774 -0.1665 -0.2032 0.6101 0.0367 -83.4985 -75.3618 -2.9012 -2.9071
0.6681 1.5162 8800 0.6771 -0.1691 -0.2064 0.6092 0.0373 -83.8169 -75.6183 -2.8978 -2.9037
0.6635 1.5334 8900 0.6768 -0.1758 -0.2138 0.6087 0.0381 -84.5617 -76.2875 -2.8935 -2.8994
0.6509 1.5507 9000 0.6766 -0.1793 -0.2180 0.6092 0.0386 -84.9755 -76.6455 -2.8897 -2.8956
0.663 1.5679 9100 0.6764 -0.1824 -0.2216 0.6073 0.0391 -85.3355 -76.9553 -2.8858 -2.8918
0.6614 1.5851 9200 0.6762 -0.1856 -0.2252 0.6076 0.0396 -85.7006 -77.2724 -2.8834 -2.8894
0.6605 1.6023 9300 0.6761 -0.1847 -0.2246 0.6078 0.0398 -85.6352 -77.1840 -2.8793 -2.8852
0.6616 1.6196 9400 0.6759 -0.1879 -0.2282 0.6053 0.0403 -86.0049 -77.5025 -2.8759 -2.8818
0.6595 1.6368 9500 0.6757 -0.1905 -0.2315 0.6085 0.0410 -86.3271 -77.7626 -2.8721 -2.8781
0.6612 1.6540 9600 0.6753 -0.1938 -0.2356 0.6069 0.0418 -86.7373 -78.0935 -2.8679 -2.8738
0.6563 1.6713 9700 0.6751 -0.1979 -0.2402 0.6083 0.0423 -87.2033 -78.5057 -2.8649 -2.8708
0.6526 1.6885 9800 0.6750 -0.2017 -0.2444 0.6069 0.0427 -87.6160 -78.8784 -2.8620 -2.8680
0.6392 1.7057 9900 0.6747 -0.2051 -0.2485 0.6094 0.0434 -88.0276 -79.2194 -2.8594 -2.8653
0.6528 1.7229 10000 0.6746 -0.2062 -0.2500 0.6087 0.0437 -88.1775 -79.3360 -2.8562 -2.8622
0.6542 1.7402 10100 0.6744 -0.2075 -0.2516 0.6066 0.0441 -88.3364 -79.4595 -2.8532 -2.8592
0.6559 1.7574 10200 0.6739 -0.2141 -0.2595 0.6078 0.0454 -89.1350 -80.1233 -2.8483 -2.8543
0.6708 1.7746 10300 0.6737 -0.2171 -0.2629 0.6104 0.0458 -89.4692 -80.4205 -2.8439 -2.8500
0.6454 1.7919 10400 0.6737 -0.2178 -0.2638 0.6048 0.0460 -89.5570 -80.4903 -2.8419 -2.8479
0.6495 1.8091 10500 0.6735 -0.2211 -0.2676 0.6036 0.0465 -89.9389 -80.8204 -2.8383 -2.8444
0.6648 1.8263 10600 0.6732 -0.2247 -0.2719 0.6034 0.0472 -90.3731 -81.1833 -2.8349 -2.8409
0.6568 1.8436 10700 0.6731 -0.2275 -0.2752 0.6039 0.0476 -90.6979 -81.4662 -2.8311 -2.8372
0.6536 1.8608 10800 0.6728 -0.2303 -0.2785 0.6043 0.0482 -91.0335 -81.7461 -2.8295 -2.8355
0.6574 1.8780 10900 0.6726 -0.2320 -0.2808 0.6032 0.0487 -91.2560 -81.9128 -2.8271 -2.8331
0.6601 1.8952 11000 0.6725 -0.2331 -0.2820 0.6018 0.0489 -91.3829 -82.0227 -2.8250 -2.8311
0.6562 1.9125 11100 0.6722 -0.2383 -0.2881 0.6029 0.0498 -91.9931 -82.5429 -2.8218 -2.8278
0.6536 1.9297 11200 0.6720 -0.2416 -0.2919 0.6025 0.0503 -92.3716 -82.8687 -2.8187 -2.8248
0.674 1.9469 11300 0.6718 -0.2432 -0.2940 0.6041 0.0508 -92.5781 -83.0317 -2.8164 -2.8225
0.6536 1.9642 11400 0.6717 -0.2439 -0.2949 0.6032 0.0511 -92.6723 -83.0980 -2.8133 -2.8194
0.6693 1.9814 11500 0.6717 -0.2456 -0.2969 0.6018 0.0513 -92.8725 -83.2765 -2.8119 -2.8179
0.6529 1.9986 11600 0.6714 -0.2469 -0.2988 0.6036 0.0518 -93.0569 -83.4057 -2.8097 -2.8158
0.6454 2.0159 11700 0.6713 -0.2488 -0.3010 0.6025 0.0522 -93.2831 -83.5962 -2.8079 -2.8140
0.6643 2.0331 11800 0.6711 -0.2513 -0.3040 0.6027 0.0527 -93.5825 -83.8399 -2.8052 -2.8113
0.6478 2.0503 11900 0.6710 -0.2554 -0.3084 0.5985 0.0530 -94.0157 -84.2502 -2.8025 -2.8086
0.6512 2.0675 12000 0.6708 -0.2561 -0.3095 0.6050 0.0535 -94.1316 -84.3177 -2.8001 -2.8061
0.6517 2.0848 12100 0.6708 -0.2574 -0.3109 0.6053 0.0536 -94.2719 -84.4484 -2.7988 -2.8048
0.646 2.1020 12200 0.6707 -0.2592 -0.3130 0.6025 0.0538 -94.4818 -84.6297 -2.7972 -2.8033
0.6439 2.1192 12300 0.6706 -0.2607 -0.3147 0.6029 0.0540 -94.6511 -84.7795 -2.7953 -2.8014
0.6432 2.1365 12400 0.6705 -0.2646 -0.3191 0.6053 0.0545 -95.0945 -85.1767 -2.7925 -2.7985
0.6437 2.1537 12500 0.6704 -0.2662 -0.3209 0.6018 0.0548 -95.2735 -85.3289 -2.7907 -2.7968
0.6581 2.1709 12600 0.6702 -0.2678 -0.3229 0.6029 0.0552 -95.4749 -85.4889 -2.7888 -2.7948
0.6509 2.1881 12700 0.6700 -0.2692 -0.3248 0.6036 0.0556 -95.6598 -85.6304 -2.7870 -2.7930
0.6603 2.2054 12800 0.6700 -0.2697 -0.3254 0.6004 0.0557 -95.7213 -85.6830 -2.7854 -2.7914
0.6459 2.2226 12900 0.6700 -0.2702 -0.3259 0.6027 0.0556 -95.7675 -85.7359 -2.7844 -2.7904
0.6501 2.2398 13000 0.6698 -0.2723 -0.3285 0.6011 0.0562 -96.0266 -85.9425 -2.7827 -2.7887
0.6452 2.2571 13100 0.6698 -0.2721 -0.3282 0.6025 0.0561 -96.0042 -85.9225 -2.7811 -2.7872
0.6553 2.2743 13200 0.6697 -0.2732 -0.3296 0.6034 0.0564 -96.1360 -86.0296 -2.7798 -2.7859
0.6627 2.2915 13300 0.6697 -0.2745 -0.3311 0.6020 0.0566 -96.2910 -86.1636 -2.7781 -2.7842
0.6393 2.3088 13400 0.6697 -0.2741 -0.3307 0.6013 0.0566 -96.2503 -86.1255 -2.7777 -2.7838
0.6366 2.3260 13500 0.6696 -0.2757 -0.3325 0.6027 0.0568 -96.4266 -86.2794 -2.7767 -2.7827
0.6522 2.3432 13600 0.6696 -0.2765 -0.3334 0.6032 0.0569 -96.5202 -86.3612 -2.7753 -2.7814
0.6535 2.3604 13700 0.6695 -0.2780 -0.3351 0.6022 0.0572 -96.6946 -86.5112 -2.7742 -2.7802
0.6555 2.3777 13800 0.6694 -0.2786 -0.3360 0.6022 0.0574 -96.7815 -86.5683 -2.7734 -2.7795
0.6658 2.3949 13900 0.6694 -0.2781 -0.3355 0.6032 0.0574 -96.7320 -86.5236 -2.7727 -2.7788
0.6453 2.4121 14000 0.6693 -0.2789 -0.3364 0.6018 0.0575 -96.8240 -86.6049 -2.7718 -2.7778
0.6451 2.4294 14100 0.6692 -0.2797 -0.3375 0.6034 0.0578 -96.9303 -86.6776 -2.7708 -2.7769
0.636 2.4466 14200 0.6693 -0.2803 -0.3378 0.6008 0.0576 -96.9631 -86.7390 -2.7706 -2.7766
0.6251 2.4638 14300 0.6691 -0.2812 -0.3393 0.6011 0.0581 -97.1110 -86.8353 -2.7697 -2.7757
0.6517 2.4810 14400 0.6691 -0.2827 -0.3409 0.6025 0.0583 -97.2740 -86.9799 -2.7687 -2.7747
0.633 2.4983 14500 0.6690 -0.2837 -0.3422 0.6006 0.0585 -97.3994 -87.0852 -2.7680 -2.7740
0.6407 2.5155 14600 0.6690 -0.2842 -0.3426 0.6011 0.0584 -97.4438 -87.1331 -2.7679 -2.7739
0.6298 2.5327 14700 0.6690 -0.2853 -0.3438 0.6013 0.0584 -97.5570 -87.2438 -2.7671 -2.7731
0.6432 2.5500 14800 0.6690 -0.2862 -0.3447 0.6018 0.0585 -97.6493 -87.3336 -2.7663 -2.7723
0.6492 2.5672 14900 0.6689 -0.2866 -0.3453 0.6013 0.0587 -97.7090 -87.3695 -2.7660 -2.7721
0.65 2.5844 15000 0.6689 -0.2870 -0.3457 0.6011 0.0587 -97.7523 -87.4156 -2.7655 -2.7715
0.6519 2.6017 15100 0.6689 -0.2874 -0.3462 0.6008 0.0588 -97.8011 -87.4534 -2.7657 -2.7718
0.6308 2.6189 15200 0.6689 -0.2880 -0.3469 0.6011 0.0589 -97.8694 -87.5090 -2.7649 -2.7709
0.6465 2.6361 15300 0.6689 -0.2880 -0.3469 0.6025 0.0589 -97.8726 -87.5095 -2.7649 -2.7710
0.6609 2.6533 15400 0.6688 -0.2883 -0.3473 0.6025 0.0590 -97.9052 -87.5417 -2.7643 -2.7703
0.6597 2.6706 15500 0.6688 -0.2883 -0.3474 0.6022 0.0591 -97.9180 -87.5395 -2.7639 -2.7700
0.6491 2.6878 15600 0.6687 -0.2885 -0.3479 0.6034 0.0593 -97.9666 -87.5668 -2.7639 -2.7700
0.6423 2.7050 15700 0.6687 -0.2885 -0.3477 0.6008 0.0592 -97.9538 -87.5659 -2.7638 -2.7699
0.6405 2.7223 15800 0.6687 -0.2886 -0.3479 0.6018 0.0593 -97.9676 -87.5701 -2.7633 -2.7694
0.6457 2.7395 15900 0.6687 -0.2889 -0.3481 0.6020 0.0592 -97.9878 -87.5970 -2.7633 -2.7694
0.6549 2.7567 16000 0.6687 -0.2888 -0.3481 0.6032 0.0593 -97.9933 -87.5928 -2.7630 -2.7692
0.6288 2.7739 16100 0.6688 -0.2889 -0.3481 0.6050 0.0592 -97.9868 -87.6035 -2.7631 -2.7692
0.6431 2.7912 16200 0.6688 -0.2892 -0.3484 0.6022 0.0592 -98.0221 -87.6322 -2.7633 -2.7694
0.6499 2.8084 16300 0.6687 -0.2893 -0.3485 0.6032 0.0593 -98.0337 -87.6372 -2.7627 -2.7688
0.6524 2.8256 16400 0.6687 -0.2892 -0.3486 0.6013 0.0594 -98.0451 -87.6369 -2.7630 -2.7690
0.6545 2.8429 16500 0.6687 -0.2892 -0.3486 0.6039 0.0594 -98.0392 -87.6310 -2.7631 -2.7691
0.6692 2.8601 16600 0.6688 -0.2894 -0.3485 0.6022 0.0591 -98.0347 -87.6520 -2.7624 -2.7686
0.6587 2.8773 16700 0.6687 -0.2895 -0.3489 0.6011 0.0594 -98.0697 -87.6612 -2.7623 -2.7684
0.6612 2.8946 16800 0.6687 -0.2890 -0.3484 0.6055 0.0593 -98.0176 -87.6163 -2.7631 -2.7692
0.6561 2.9118 16900 0.6688 -0.2893 -0.3485 0.6020 0.0592 -98.0284 -87.6390 -2.7627 -2.7688
0.6548 2.9290 17000 0.6688 -0.2892 -0.3483 0.6006 0.0591 -98.0120 -87.6341 -2.7624 -2.7684
0.6468 2.9462 17100 0.6687 -0.2892 -0.3485 0.6029 0.0593 -98.0333 -87.6348 -2.7623 -2.7683
0.666 2.9635 17200 0.6686 -0.2892 -0.3486 0.6029 0.0594 -98.0413 -87.6310 -2.7622 -2.7683
0.6571 2.9807 17300 0.6687 -0.2893 -0.3485 0.6039 0.0592 -98.0332 -87.6411 -2.7624 -2.7684
0.6414 2.9979 17400 0.6687 -0.2893 -0.3487 0.6008 0.0594 -98.0463 -87.6427 -2.7624 -2.7684
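
The rows above are whitespace-separated with twelve values each. As a rough sketch, they can be parsed and sanity-checked as shown below, using three rows copied from the table. The column keys and the helper `parse_row` are illustrative, and the steps-per-epoch figure is an estimate derived from the logged Step and Epoch values rather than a number stated on the card.

```python
COLUMNS = [
    "train_loss", "epoch", "step", "eval_loss",
    "rewards/chosen", "rewards/rejected", "rewards/accuracies", "rewards/margins",
    "logps/rejected", "logps/chosen", "logits/rejected", "logits/chosen",
]

# Three rows copied verbatim from the table (first, end of epoch 1, last).
ROWS = """\
0.6931 0.0172 100 0.6932 -0.0000 0.0001 0.4851 -0.0001 -63.1729 -58.7138 -3.1573 -3.1630
0.6701 0.9993 5800 0.6839 -0.0831 -0.1040 0.6029 0.0209 -73.5774 -67.0210 -3.0139 -3.0198
0.6414 2.9979 17400 0.6687 -0.2893 -0.3487 0.6008 0.0594 -98.0463 -87.6427 -2.7624 -2.7684
"""


def parse_row(line: str) -> dict:
    """Map one whitespace-separated table row onto the column keys."""
    values = [float(v) for v in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return dict(zip(COLUMNS, values))


records = [parse_row(line) for line in ROWS.splitlines()]

# The margin column should equal chosen minus rejected, up to rounding.
for r in records:
    diff = r["rewards/chosen"] - r["rewards/rejected"]
    assert abs(diff - r["rewards/margins"]) < 2e-4

# Estimated optimizer steps per epoch from the last logged row.
last = records[-1]
steps_per_epoch = round(last["step"] / last["epoch"])
print(steps_per_epoch)  # 5804
```

The tolerance of 2e-4 allows for the four-decimal rounding in the logged values.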

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1
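
When reproducing results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library. The mapping of "Pytorch" to the `torch` package name is an assumption about how the library was installed, and `compare_versions` is an illustrative helper.

```python
from importlib import metadata

# Versions as listed on the card; "torch" is assumed to be the PyTorch package name.
CARD_VERSIONS = {
    "transformers": "4.41.2",
    "torch": "2.1.2",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}


def compare_versions(expected: dict = CARD_VERSIONS) -> dict:
    """Return {package: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu121" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    print(f"{package}: card={wanted} installed={installed} match={matches}")
```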

Safetensors

  • Model size: 1.1B params
  • Tensor type: F32
