
tinyllama-1.1b-sum-dpo-full_LR1e-7_3epochs_old

This model is a fine-tuned version of martimfasantos/tinyllama-1.1b-sum-sft-full_old on the openai/summarize_from_feedback dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6382
  • Rewards/chosen: -0.8614
  • Rewards/rejected: -1.0551
  • Rewards/accuracies: 0.6341
  • Rewards/margins: 0.1937
  • Logps/rejected: -168.6898
  • Logps/chosen: -144.8481
  • Logits/rejected: -2.0951
  • Logits/chosen: -2.1077
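The reported margin is the difference between the chosen and rejected rewards. The short check below reproduces it from the values listed above. The variable names are illustrative and not part of any training script. Reading the rewards as scaled log-probability ratios against the reference model is an assumption based on how DPO trainers commonly log these metrics; the card itself does not define them.

```python
# Final evaluation metrics copied from the list above.
rewards_chosen = -0.8614
rewards_rejected = -1.0551
reported_margin = 0.1937

# Margin = reward of the chosen summary minus reward of the rejected one.
margin = rewards_chosen - rewards_rejected

# The recomputed value should agree with the reported one up to rounding.
assert abs(margin - reported_margin) < 1e-4
print(f"margin = {margin:.4f}")
```

Both rewards are negative while the margin is positive: the rejected summaries are pushed down further than the chosen ones, which is what the accuracy of 0.6341 measures per example.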

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-07
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
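As a rough illustration of what these settings imply, the sketch below computes the effective batch size and a cosine learning-rate schedule with linear warmup. The function `lr_at` is a hypothetical helper written for this card, not the trainer's own code, and `TOTAL_STEPS` is an assumption taken from the last logged step in the results table (17400 at epoch 2.9979), so the true total may be slightly higher.

```python
import math

def lr_at(step, total_steps, base_lr=1e-7, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero (illustrative)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# train_batch_size * gradient_accumulation_steps, matching total_train_batch_size.
effective_batch = 8 * 2

TOTAL_STEPS = 17_400  # assumption: last logged step, not a documented total
for step in (0, 1_000, 2_000, 9_000, 17_400):
    print(f"step {step:>6}: lr = {lr_at(step, TOTAL_STEPS):.3e}")
```

With a peak of 1e-07 and a 10% warmup, the learning rate is still ramping up for roughly the first 1,740 steps, which is consistent with the near-constant loss in the earliest rows of the results table.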

Training results

Training Loss Epoch Step Validation Loss Rewards/chosen Rewards/rejected Rewards/accuracies Rewards/margins Logps/rejected Logps/chosen Logits/rejected Logits/chosen
0.6931 0.0172 100 0.6932 -0.0000 0.0000 0.4993 -0.0000 -63.1760 -58.7121 -3.1570 -3.1626
0.6932 0.0345 200 0.6932 -0.0000 0.0000 0.4902 -0.0001 -63.1777 -58.7161 -3.1578 -3.1634
0.6932 0.0517 300 0.6932 0.0001 0.0001 0.4847 -0.0001 -63.1684 -58.7055 -3.1576 -3.1633
0.6932 0.0689 400 0.6932 0.0001 0.0001 0.4814 -0.0001 -63.1658 -58.7068 -3.1575 -3.1631
0.6931 0.0861 500 0.6932 0.0001 0.0001 0.4847 -0.0000 -63.1715 -58.7052 -3.1577 -3.1633
0.6929 0.1034 600 0.6931 0.0002 0.0002 0.5037 0.0000 -63.1560 -58.6876 -3.1571 -3.1628
0.693 0.1206 700 0.6931 0.0003 0.0001 0.5214 0.0002 -63.1660 -58.6822 -3.1562 -3.1619
0.6927 0.1378 800 0.6931 0.0006 0.0005 0.5204 0.0001 -63.1322 -58.6491 -3.1561 -3.1618
0.6927 0.1551 900 0.6930 0.0008 0.0005 0.5300 0.0003 -63.1317 -58.6345 -3.1554 -3.1610
0.6928 0.1723 1000 0.6930 0.0011 0.0007 0.5258 0.0003 -63.1075 -58.6060 -3.1540 -3.1596
0.6922 0.1895 1100 0.6929 0.0013 0.0007 0.5455 0.0006 -63.1103 -58.5820 -3.1523 -3.1579
0.6921 0.2068 1200 0.6927 0.0017 0.0008 0.5574 0.0009 -63.1011 -58.5416 -3.1500 -3.1556
0.692 0.2240 1300 0.6925 0.0020 0.0007 0.5599 0.0013 -63.1123 -58.5097 -3.1479 -3.1535
0.6898 0.2412 1400 0.6923 0.0021 0.0002 0.5743 0.0018 -63.1581 -58.5058 -3.1443 -3.1500
0.6889 0.2584 1500 0.6920 0.0017 -0.0007 0.5827 0.0024 -63.2512 -58.5426 -3.1406 -3.1462
0.69 0.2757 1600 0.6917 0.0011 -0.0018 0.5785 0.0030 -63.3644 -58.5982 -3.1355 -3.1411
0.6897 0.2929 1700 0.6913 0.0001 -0.0037 0.5727 0.0038 -63.5467 -58.6985 -3.1294 -3.1351
0.6857 0.3101 1800 0.6910 -0.0016 -0.0061 0.5734 0.0045 -63.7882 -58.8688 -3.1244 -3.1301
0.6866 0.3274 1900 0.6907 -0.0038 -0.0090 0.5843 0.0052 -64.0830 -59.0939 -3.1188 -3.1245
0.6872 0.3446 2000 0.6903 -0.0075 -0.0134 0.5862 0.0060 -64.5228 -59.4572 -3.1120 -3.1176
0.6854 0.3618 2100 0.6899 -0.0124 -0.0194 0.5813 0.0070 -65.1230 -59.9534 -3.1057 -3.1113
0.6786 0.3790 2200 0.6894 -0.0185 -0.0267 0.5836 0.0082 -65.8538 -60.5638 -3.0978 -3.1035
0.6801 0.3963 2300 0.6889 -0.0230 -0.0323 0.5915 0.0093 -66.4100 -61.0095 -3.0912 -3.0969
0.683 0.4135 2400 0.6882 -0.0304 -0.0413 0.5867 0.0108 -67.3051 -61.7559 -3.0824 -3.0881
0.6853 0.4307 2500 0.6876 -0.0392 -0.0515 0.5841 0.0123 -68.3329 -62.6367 -3.0733 -3.0790
0.6775 0.4480 2600 0.6870 -0.0464 -0.0600 0.5834 0.0136 -69.1773 -63.3517 -3.0671 -3.0728
0.6788 0.4652 2700 0.6864 -0.0532 -0.0681 0.5895 0.0150 -69.9938 -64.0275 -3.0610 -3.0668
0.6781 0.4824 2800 0.6860 -0.0581 -0.0740 0.5876 0.0159 -70.5769 -64.5225 -3.0538 -3.0595
0.6796 0.4997 2900 0.6857 -0.0610 -0.0777 0.5892 0.0166 -70.9456 -64.8128 -3.0460 -3.0517
0.6805 0.5169 3000 0.6853 -0.0658 -0.0834 0.5994 0.0176 -71.5177 -65.2877 -3.0368 -3.0425
0.673 0.5341 3100 0.6849 -0.0663 -0.0847 0.5987 0.0184 -71.6468 -65.3387 -3.0324 -3.0381
0.6747 0.5513 3200 0.6842 -0.0780 -0.0982 0.6027 0.0202 -72.9963 -66.5094 -3.0209 -3.0267
0.6743 0.5686 3300 0.6836 -0.0836 -0.1053 0.6022 0.0216 -73.7081 -67.0762 -3.0078 -3.0136
0.6653 0.5858 3400 0.6833 -0.0846 -0.1069 0.6011 0.0222 -73.8674 -67.1758 -2.9991 -3.0049
0.6764 0.6030 3500 0.6827 -0.0900 -0.1136 0.5999 0.0236 -74.5369 -67.7069 -2.9912 -2.9971
0.6737 0.6203 3600 0.6823 -0.0962 -0.1207 0.6104 0.0245 -75.2502 -68.3295 -2.9812 -2.9871
0.6664 0.6375 3700 0.6816 -0.1051 -0.1313 0.6080 0.0263 -76.3151 -69.2178 -2.9692 -2.9751
0.6667 0.6547 3800 0.6807 -0.1172 -0.1456 0.6085 0.0284 -77.7401 -70.4287 -2.9595 -2.9654
0.6678 0.6720 3900 0.6799 -0.1299 -0.1602 0.6092 0.0304 -79.2047 -71.6971 -2.9499 -2.9558
0.6671 0.6892 4000 0.6792 -0.1408 -0.1729 0.6078 0.0321 -80.4742 -72.7925 -2.9368 -2.9426
0.6554 0.7064 4100 0.6787 -0.1458 -0.1791 0.6120 0.0333 -81.0925 -73.2962 -2.9179 -2.9238
0.6742 0.7236 4200 0.6780 -0.1580 -0.1932 0.6127 0.0352 -82.5005 -74.5101 -2.9044 -2.9103
0.6632 0.7409 4300 0.6774 -0.1672 -0.2038 0.6078 0.0366 -83.5592 -75.4285 -2.8933 -2.8992
0.6639 0.7581 4400 0.6765 -0.1825 -0.2215 0.6064 0.0390 -85.3312 -76.9653 -2.8808 -2.8867
0.6617 0.7753 4500 0.6753 -0.2011 -0.2431 0.6078 0.0421 -87.4948 -78.8183 -2.8704 -2.8763
0.6446 0.7926 4600 0.6742 -0.2184 -0.2634 0.6080 0.0450 -89.5165 -80.5508 -2.8604 -2.8664
0.6536 0.8098 4700 0.6733 -0.2347 -0.2821 0.6078 0.0474 -91.3895 -82.1787 -2.8507 -2.8567
0.661 0.8270 4800 0.6723 -0.2469 -0.2967 0.6071 0.0498 -92.8502 -83.4062 -2.8410 -2.8470
0.6655 0.8442 4900 0.6714 -0.2622 -0.3144 0.6059 0.0522 -94.6208 -84.9348 -2.8302 -2.8362
0.65 0.8615 5000 0.6706 -0.2730 -0.3273 0.5957 0.0544 -95.9136 -86.0080 -2.8112 -2.8172
0.6625 0.8787 5100 0.6695 -0.2893 -0.3467 0.5997 0.0574 -97.8500 -87.6453 -2.8012 -2.8071
0.6509 0.8959 5200 0.6690 -0.2924 -0.3512 0.5985 0.0588 -98.3012 -87.9499 -2.7931 -2.7991
0.6469 0.9132 5300 0.6686 -0.2979 -0.3577 0.5978 0.0598 -98.9499 -88.5002 -2.7822 -2.7882
0.6482 0.9304 5400 0.6680 -0.3024 -0.3637 0.6039 0.0613 -99.5495 -88.9507 -2.7739 -2.7799
0.639 0.9476 5500 0.6673 -0.3146 -0.3781 0.6066 0.0635 -100.9877 -90.1737 -2.7615 -2.7675
0.6515 0.9649 5600 0.6668 -0.3113 -0.3759 0.6080 0.0647 -100.7733 -89.8396 -2.7543 -2.7603
0.6512 0.9821 5700 0.6657 -0.3303 -0.3982 0.6094 0.0680 -103.0038 -91.7385 -2.7432 -2.7493
0.6323 0.9993 5800 0.6645 -0.3552 -0.4268 0.6078 0.0716 -105.8584 -94.2304 -2.7257 -2.7318
0.632 1.0165 5900 0.6629 -0.3911 -0.4682 0.6085 0.0771 -109.9998 -97.8232 -2.7023 -2.7085
0.654 1.0338 6000 0.6632 -0.3807 -0.4571 0.6076 0.0764 -108.8926 -96.7834 -2.6907 -2.6969
0.6293 1.0510 6100 0.6624 -0.3916 -0.4703 0.6111 0.0787 -110.2114 -97.8768 -2.6768 -2.6831
0.6314 1.0682 6200 0.6611 -0.4228 -0.5060 0.6120 0.0832 -113.7813 -100.9947 -2.6635 -2.6697
0.6526 1.0855 6300 0.6599 -0.4394 -0.5262 0.6145 0.0869 -115.8035 -102.6482 -2.6530 -2.6593
0.6347 1.1027 6400 0.6593 -0.4394 -0.5278 0.6180 0.0884 -115.9650 -102.6523 -2.6435 -2.6499
0.6393 1.1199 6500 0.6588 -0.4468 -0.5370 0.6238 0.0901 -116.8754 -103.3932 -2.6289 -2.6354
0.6374 1.1371 6600 0.6590 -0.4501 -0.5403 0.6166 0.0901 -117.2051 -103.7237 -2.6225 -2.6289
0.6359 1.1544 6700 0.6581 -0.4668 -0.5605 0.6190 0.0936 -119.2262 -105.3939 -2.6058 -2.6123
0.6146 1.1716 6800 0.6567 -0.4994 -0.5980 0.6173 0.0987 -122.9848 -108.6496 -2.5870 -2.5937
0.6367 1.1888 6900 0.6561 -0.5093 -0.6101 0.6227 0.1008 -124.1880 -109.6397 -2.5753 -2.5820
0.6185 1.2061 7000 0.6549 -0.5406 -0.6465 0.6159 0.1059 -127.8333 -112.7735 -2.5638 -2.5706
0.6226 1.2233 7100 0.6558 -0.5185 -0.6213 0.6180 0.1028 -125.3109 -110.5579 -2.5582 -2.5651
0.6173 1.2405 7200 0.6550 -0.5301 -0.6358 0.6162 0.1057 -126.7555 -111.7189 -2.5488 -2.5557
0.6472 1.2578 7300 0.6553 -0.5020 -0.6054 0.6197 0.1034 -123.7222 -108.9138 -2.5474 -2.5543
0.6388 1.2750 7400 0.6552 -0.4984 -0.6021 0.6206 0.1037 -123.3937 -108.5536 -2.5418 -2.5489
0.641 1.2922 7500 0.6543 -0.5020 -0.6078 0.6227 0.1058 -123.9613 -108.9147 -2.5332 -2.5404
0.6721 1.3094 7600 0.6531 -0.5286 -0.6388 0.6229 0.1102 -127.0605 -111.5723 -2.5152 -2.5224
0.6262 1.3267 7700 0.6528 -0.5440 -0.6568 0.6199 0.1127 -128.8555 -113.1147 -2.4986 -2.5058
0.6077 1.3439 7800 0.6520 -0.5730 -0.6901 0.6231 0.1172 -132.1913 -116.0070 -2.4824 -2.4898
0.6293 1.3611 7900 0.6511 -0.5869 -0.7073 0.6234 0.1204 -133.9143 -117.4017 -2.4749 -2.4824
0.6065 1.3784 8000 0.6502 -0.5931 -0.7166 0.6236 0.1235 -134.8416 -118.0241 -2.4667 -2.4743
0.6328 1.3956 8100 0.6499 -0.6051 -0.7307 0.6255 0.1256 -136.2457 -119.2178 -2.4558 -2.4635
0.646 1.4128 8200 0.6494 -0.6002 -0.7264 0.6231 0.1262 -135.8235 -118.7345 -2.4523 -2.4600
0.6384 1.4300 8300 0.6500 -0.5815 -0.7052 0.6234 0.1237 -133.6977 -116.8619 -2.4491 -2.4568
0.6173 1.4473 8400 0.6504 -0.5677 -0.6897 0.6217 0.1219 -132.1456 -115.4836 -2.4449 -2.4526
0.6041 1.4645 8500 0.6501 -0.5732 -0.6969 0.6271 0.1237 -132.8701 -116.0278 -2.4292 -2.4370
0.6635 1.4817 8600 0.6490 -0.6018 -0.7304 0.6252 0.1286 -136.2163 -118.8894 -2.4140 -2.4220
0.6377 1.4990 8700 0.6499 -0.5709 -0.6951 0.6255 0.1243 -132.6951 -115.7986 -2.4168 -2.4247
0.6376 1.5162 8800 0.6488 -0.5866 -0.7147 0.6301 0.1281 -134.6506 -117.3752 -2.4074 -2.4155
0.6174 1.5334 8900 0.6478 -0.6255 -0.7594 0.6336 0.1339 -139.1249 -121.2650 -2.3887 -2.3969
0.6228 1.5507 9000 0.6478 -0.6245 -0.7587 0.6292 0.1342 -139.0503 -121.1639 -2.3815 -2.3898
0.6372 1.5679 9100 0.6480 -0.6203 -0.7539 0.6336 0.1335 -138.5676 -120.7465 -2.3769 -2.3852
0.6 1.5851 9200 0.6474 -0.6400 -0.7768 0.6329 0.1368 -140.8612 -122.7150 -2.3665 -2.3751
0.5989 1.6023 9300 0.6468 -0.6474 -0.7867 0.6341 0.1394 -141.8543 -123.4491 -2.3576 -2.3662
0.614 1.6196 9400 0.6459 -0.6825 -0.8279 0.6368 0.1454 -145.9700 -126.9618 -2.3413 -2.3500
0.596 1.6368 9500 0.6456 -0.6809 -0.8268 0.6368 0.1459 -145.8628 -126.8059 -2.3333 -2.3420
0.6174 1.6540 9600 0.6448 -0.7214 -0.8733 0.6364 0.1519 -150.5126 -130.8547 -2.3123 -2.3212
0.6332 1.6713 9700 0.6452 -0.6900 -0.8381 0.6357 0.1480 -146.9875 -127.7156 -2.3143 -2.3232
0.6115 1.6885 9800 0.6452 -0.6884 -0.8368 0.6341 0.1484 -146.8605 -127.5543 -2.3134 -2.3225
0.5539 1.7057 9900 0.6446 -0.6932 -0.8433 0.6322 0.1501 -147.5115 -128.0289 -2.3106 -2.3197
0.5881 1.7229 10000 0.6446 -0.6998 -0.8514 0.6357 0.1516 -148.3202 -128.6942 -2.3004 -2.3096
0.6197 1.7402 10100 0.6450 -0.6864 -0.8362 0.6343 0.1498 -146.7977 -127.3522 -2.2940 -2.3033
0.6029 1.7574 10200 0.6433 -0.7383 -0.8977 0.6336 0.1593 -152.9491 -132.5467 -2.2721 -2.2816
0.6441 1.7746 10300 0.6435 -0.7404 -0.8998 0.6324 0.1594 -153.1610 -132.7534 -2.2664 -2.2760
0.5718 1.7919 10400 0.6444 -0.7047 -0.8588 0.6341 0.1541 -149.0603 -129.1777 -2.2712 -2.2807
0.5866 1.8091 10500 0.6437 -0.7266 -0.8854 0.6343 0.1588 -151.7161 -131.3703 -2.2598 -2.2695
0.6278 1.8263 10600 0.6437 -0.7187 -0.8763 0.6348 0.1576 -150.8070 -130.5783 -2.2553 -2.2651
0.6083 1.8436 10700 0.6428 -0.7398 -0.9018 0.6306 0.1621 -153.3647 -132.6900 -2.2435 -2.2534
0.5999 1.8608 10800 0.6425 -0.7467 -0.9104 0.6324 0.1637 -154.2222 -133.3793 -2.2412 -2.2513
0.6016 1.8780 10900 0.6423 -0.7546 -0.9199 0.6343 0.1654 -155.1725 -134.1676 -2.2317 -2.2420
0.6056 1.8952 11000 0.6424 -0.7430 -0.9074 0.6303 0.1644 -153.9158 -133.0090 -2.2336 -2.2438
0.6068 1.9125 11100 0.6415 -0.7764 -0.9467 0.6315 0.1703 -157.8523 -136.3506 -2.2170 -2.2275
0.5907 1.9297 11200 0.6416 -0.7643 -0.9335 0.6324 0.1692 -156.5323 -135.1456 -2.2154 -2.2259
0.6504 1.9469 11300 0.6420 -0.7478 -0.9145 0.6289 0.1667 -154.6342 -133.4948 -2.2172 -2.2276
0.6037 1.9642 11400 0.6413 -0.7627 -0.9329 0.6296 0.1702 -156.4750 -134.9861 -2.2093 -2.2199
0.6435 1.9814 11500 0.6415 -0.7615 -0.9315 0.6301 0.1700 -156.3274 -134.8601 -2.2078 -2.2184
0.6037 1.9986 11600 0.6418 -0.7425 -0.9097 0.6294 0.1671 -154.1468 -132.9645 -2.2119 -2.2224
0.6036 2.0159 11700 0.6414 -0.7444 -0.9128 0.6289 0.1684 -154.4553 -133.1498 -2.2068 -2.2174
0.6111 2.0331 11800 0.6408 -0.7710 -0.9439 0.6285 0.1729 -157.5724 -135.8124 -2.1917 -2.2026
0.5739 2.0503 11900 0.6401 -0.8062 -0.9851 0.6283 0.1788 -161.6872 -139.3363 -2.1752 -2.1862
0.5807 2.0675 12000 0.6400 -0.8128 -0.9929 0.6327 0.1801 -162.4718 -139.9921 -2.1663 -2.1776
0.5904 2.0848 12100 0.6396 -0.8183 -0.9996 0.6317 0.1814 -163.1447 -140.5391 -2.1626 -2.1739
0.5722 2.1020 12200 0.6397 -0.8246 -1.0067 0.6327 0.1821 -163.8479 -141.1671 -2.1591 -2.1704
0.5874 2.1192 12300 0.6397 -0.8221 -1.0035 0.6343 0.1814 -163.5287 -140.9182 -2.1576 -2.1690
0.5575 2.1365 12400 0.6391 -0.8641 -1.0517 0.6341 0.1876 -168.3473 -145.1188 -2.1426 -2.1543
0.59 2.1537 12500 0.6392 -0.8708 -1.0586 0.6341 0.1878 -169.0439 -145.7953 -2.1364 -2.1481
0.6028 2.1709 12600 0.6394 -0.8507 -1.0363 0.6336 0.1856 -166.8094 -143.7794 -2.1403 -2.1519
0.5745 2.1881 12700 0.6394 -0.8476 -1.0328 0.6331 0.1852 -166.4608 -143.4725 -2.1395 -2.1511
0.6037 2.2054 12800 0.6395 -0.8490 -1.0347 0.6317 0.1857 -166.6464 -143.6127 -2.1340 -2.1457
0.5773 2.2226 12900 0.6393 -0.8462 -1.0320 0.6315 0.1858 -166.3826 -143.3317 -2.1329 -2.1446
0.5747 2.2398 13000 0.6391 -0.8618 -1.0498 0.6320 0.1880 -168.1579 -144.8899 -2.1262 -2.1381
0.5788 2.2571 13100 0.6392 -0.8607 -1.0489 0.6331 0.1882 -168.0727 -144.7845 -2.1216 -2.1335
0.6091 2.2743 13200 0.6390 -0.8603 -1.0494 0.6327 0.1891 -168.1196 -144.7427 -2.1177 -2.1296
0.6213 2.2915 13300 0.6393 -0.8616 -1.0503 0.6301 0.1886 -168.2058 -144.8738 -2.1141 -2.1261
0.5545 2.3088 13400 0.6397 -0.8361 -1.0209 0.6310 0.1848 -165.2700 -142.3214 -2.1231 -2.1350
0.5633 2.3260 13500 0.6392 -0.8526 -1.0406 0.6336 0.1879 -167.2357 -143.9755 -2.1181 -2.1301
0.5982 2.3432 13600 0.6391 -0.8544 -1.0431 0.6320 0.1886 -167.4862 -144.1549 -2.1134 -2.1255
0.6165 2.3604 13700 0.6390 -0.8581 -1.0475 0.6336 0.1894 -167.9277 -144.5217 -2.1098 -2.1221
0.5863 2.3777 13800 0.6393 -0.8480 -1.0361 0.6322 0.1881 -166.7901 -143.5142 -2.1112 -2.1233
0.6023 2.3949 13900 0.6395 -0.8345 -1.0207 0.6322 0.1862 -165.2497 -142.1660 -2.1148 -2.1269
0.551 2.4121 14000 0.6389 -0.8440 -1.0328 0.6331 0.1888 -166.4650 -143.1130 -2.1104 -2.1226
0.565 2.4294 14100 0.6394 -0.8393 -1.0266 0.6322 0.1874 -165.8436 -142.6391 -2.1116 -2.1238
0.555 2.4466 14200 0.6396 -0.8346 -1.0211 0.6317 0.1865 -165.2906 -142.1683 -2.1129 -2.1251
0.5303 2.4638 14300 0.6392 -0.8468 -1.0356 0.6313 0.1888 -166.7382 -143.3939 -2.1079 -2.1202
0.5998 2.4810 14400 0.6390 -0.8530 -1.0429 0.6350 0.1899 -167.4716 -144.0141 -2.1038 -2.1161
0.5688 2.4983 14500 0.6387 -0.8590 -1.0506 0.6338 0.1916 -168.2381 -144.6089 -2.1014 -2.1137
0.5601 2.5155 14600 0.6386 -0.8520 -1.0429 0.6341 0.1909 -167.4715 -143.9122 -2.1035 -2.1158
0.5694 2.5327 14700 0.6385 -0.8549 -1.0466 0.6336 0.1917 -167.8379 -144.2034 -2.1025 -2.1148
0.5762 2.5500 14800 0.6388 -0.8514 -1.0423 0.6327 0.1909 -167.4103 -143.8544 -2.1027 -2.1151
0.5944 2.5672 14900 0.6388 -0.8497 -1.0403 0.6322 0.1906 -167.2102 -143.6825 -2.1028 -2.1151
0.5766 2.5844 15000 0.6386 -0.8528 -1.0444 0.6327 0.1916 -167.6185 -143.9918 -2.1007 -2.1131
0.6066 2.6017 15100 0.6387 -0.8545 -1.0460 0.6334 0.1915 -167.7836 -144.1632 -2.1001 -2.1125
0.557 2.6189 15200 0.6385 -0.8591 -1.0515 0.6331 0.1924 -168.3309 -144.6236 -2.0980 -2.1104
0.5819 2.6361 15300 0.6384 -0.8621 -1.0552 0.6329 0.1931 -168.6976 -144.9198 -2.0966 -2.1092
0.6353 2.6533 15400 0.6384 -0.8617 -1.0548 0.6331 0.1931 -168.6601 -144.8850 -2.0966 -2.1091
0.6352 2.6706 15500 0.6385 -0.8591 -1.0515 0.6341 0.1924 -168.3342 -144.6245 -2.0974 -2.1098
0.5882 2.6878 15600 0.6384 -0.8581 -1.0511 0.6329 0.1930 -168.2865 -144.5229 -2.0972 -2.1097
0.5698 2.7050 15700 0.6384 -0.8579 -1.0506 0.6334 0.1928 -168.2427 -144.4972 -2.0972 -2.1098
0.5774 2.7223 15800 0.6383 -0.8576 -1.0507 0.6317 0.1931 -168.2498 -144.4737 -2.0970 -2.1095
0.5948 2.7395 15900 0.6385 -0.8583 -1.0511 0.6329 0.1928 -168.2885 -144.5436 -2.0963 -2.1088
0.5977 2.7567 16000 0.6382 -0.8592 -1.0527 0.6343 0.1935 -168.4506 -144.6316 -2.0959 -2.1084
0.5412 2.7739 16100 0.6385 -0.8607 -1.0535 0.6341 0.1927 -168.5258 -144.7848 -2.0957 -2.1081
0.6015 2.7912 16200 0.6385 -0.8599 -1.0527 0.6320 0.1927 -168.4485 -144.7054 -2.0961 -2.1086
0.5921 2.8084 16300 0.6382 -0.8602 -1.0537 0.6338 0.1935 -168.5526 -144.7336 -2.0959 -2.1084
0.5958 2.8256 16400 0.6384 -0.8602 -1.0534 0.6322 0.1932 -168.5213 -144.7309 -2.0953 -2.1078
0.5977 2.8429 16500 0.6384 -0.8601 -1.0531 0.6334 0.1931 -168.4950 -144.7180 -2.0952 -2.1077
0.6289 2.8601 16600 0.6382 -0.8611 -1.0549 0.6338 0.1937 -168.6687 -144.8262 -2.0951 -2.1076
0.6271 2.8773 16700 0.6385 -0.8602 -1.0531 0.6336 0.1929 -168.4876 -144.7302 -2.0954 -2.1080
0.5918 2.8946 16800 0.6384 -0.8615 -1.0546 0.6331 0.1931 -168.6371 -144.8581 -2.0953 -2.1078
0.5885 2.9118 16900 0.6383 -0.8598 -1.0533 0.6331 0.1935 -168.5110 -144.6941 -2.0954 -2.1080
0.6058 2.9290 17000 0.6384 -0.8615 -1.0547 0.6331 0.1933 -168.6532 -144.8587 -2.0949 -2.1075
0.5841 2.9462 17100 0.6384 -0.8599 -1.0531 0.6322 0.1932 -168.4870 -144.7006 -2.0956 -2.1082
0.6214 2.9635 17200 0.6385 -0.8609 -1.0538 0.6341 0.1930 -168.5645 -144.7976 -2.0955 -2.1081
0.5905 2.9807 17300 0.6385 -0.8611 -1.0541 0.6327 0.1931 -168.5945 -144.8186 -2.0951 -2.1076
0.5878 2.9979 17400 0.6382 -0.8614 -1.0551 0.6341 0.1937 -168.6898 -144.8481 -2.0951 -2.1077
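
The first rows sit at 0.6931 to 0.6932 because that is ln 2, the value of a sigmoid preference loss when the reward margin is zero. The sketch below assumes the standard sigmoid DPO loss, -log(sigmoid(margin)), with the margin taken as the already-scaled reward difference; the card does not state the loss type, and `dpo_sigmoid_loss` is a hypothetical helper.

```python
import math

def dpo_sigmoid_loss(margin):
    """-log(sigmoid(margin)), computed in a numerically stable form."""
    if margin >= 0:
        # log(1 + exp(-m)) is safe when m is non-negative.
        return math.log1p(math.exp(-margin))
    # For negative m, rewrite as -m + log(1 + exp(m)) to avoid overflow.
    return -margin + math.log1p(math.exp(margin))

# At the start of training the policy equals the reference, so the margin is 0.
initial_loss = dpo_sigmoid_loss(0.0)

# Loss evaluated at the final mean margin from the last row of the table.
loss_at_mean_margin = dpo_sigmoid_loss(0.1937)

print(f"loss at zero margin:       {initial_loss:.4f}")
print(f"loss at final mean margin: {loss_at_mean_margin:.4f}")
```

The second value comes out below the reported validation loss of 0.6382. That is expected: the reported loss averages the per-example losses, and because this loss is convex in the margin, the average can only be greater than or equal to the loss evaluated at the average margin.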

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1
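
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library; `EXPECTED`, `installed_version`, and `version_mismatches` are names introduced here for illustration, and `torch` is assumed to be the installed distribution name for PyTorch.

```python
from importlib import metadata

# Versions listed in this card; "torch" is the distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.1.2",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def version_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to (wanted, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Ignore local build tags such as "+cu121" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches

if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{package}: card lists {wanted}, found {found}")
```

A mismatch does not necessarily prevent loading the model; the check only reports where the environment differs from the one recorded here.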

Model size: 1.1B params (Safetensors), tensor type F32