---
license: gemma
base_model: google/gemma-2-2b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: collapse_gemma-2-2b_hs2_accumulate_iter17_sftsd1
  results: []
---

# collapse_gemma-2-2b_hs2_accumulate_iter17_sftsd1

This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0990
- Num Input Tokens Seen: 86903648

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 1
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log | 0 | 0 | 1.3909 | 0 |
| 1.6971 | 0.0031 | 5 | 1.3908 | 266416 |
| 1.7242 | 0.0061 | 10 | 1.3830 | 532160 |
| 1.62 | 0.0092 | 15 | 1.3575 | 800512 |
| 1.5197 | 0.0122 | 20 | 1.3236 | 1064184 |
| 1.5008 | 0.0153 | 25 | 1.2770 | 1334504 |
| 1.4642 | 0.0184 | 30 | 1.2405 | 1598368 |
| 1.3173 | 0.0214 | 35 | 1.2116 | 1865016 |
| 1.2765 | 0.0245 | 40 | 1.1888 | 2130184 |
| 1.1355 | 0.0275 | 45 | 1.1997 | 2392176 |
| 0.9961 | 0.0306 | 50 | 1.2201 | 2660744 |
| 0.8782 | 0.0337 | 55 | 1.2437 | 2923264 |
| 0.7621 | 0.0367 | 60 | 1.2934 | 3195320 |
| 0.6572 | 0.0398 | 65 | 1.3146 | 3453944 |
| 0.5348 | 0.0428 | 70 | 1.3128 | 3720680 |
| 0.4188 | 0.0459 | 75 | 1.3216 | 3980496 |
| 0.3863 | 0.0490 | 80 | 1.2874 | 4244816 |
| 0.3435 | 0.0520 | 85 | 1.2704 | 4511552 |
| 0.3193 | 0.0551 | 90 | 1.2561 | 4779816 |
| 0.2899 | 0.0582 | 95 | 1.2202 | 5048168 |
| 0.3168 | 0.0612 | 100 | 1.2220 | 5310096 |
| 0.3101 | 0.0643 | 105 | 1.2154 | 5578536 |
| 0.207 | 0.0673 | 110 | 1.2125 | 5837528 |
| 0.2417 | 0.0704 | 115 | 1.2159 | 6104280 |
| 0.1956 | 0.0735 | 120 | 1.2037 | 6372312 |
| 0.2684 | 0.0765 | 125 | 1.1965 | 6644320 |
| 0.2592 | 0.0796 | 130 | 1.1988 | 6914696 |
| 0.2384 | 0.0826 | 135 | 1.1921 | 7180960 |
| 0.1748 | 0.0857 | 140 | 1.1950 | 7449344 |
| 0.1348 | 0.0888 | 145 | 1.1829 | 7716736 |
| 0.2548 | 0.0918 | 150 | 1.1921 | 7984600 |
| 0.1626 | 0.0949 | 155 | 1.1844 | 8252752 |
| 0.1477 | 0.0979 | 160 | 1.1824 | 8525784 |
| 0.1524 | 0.1010 | 165 | 1.1955 | 8792920 |
| 0.1921 | 0.1041 | 170 | 1.1847 | 9057544 |
| 0.187 | 0.1071 | 175 | 1.1757 | 9319032 |
| 0.1175 | 0.1102 | 180 | 1.1792 | 9582576 |
| 0.1303 | 0.1132 | 185 | 1.1772 | 9842240 |
| 0.1539 | 0.1163 | 190 | 1.1800 | 10106616 |
| 0.1817 | 0.1194 | 195 | 1.1812 | 10366152 |
| 0.1275 | 0.1224 | 200 | 1.1754 | 10632048 |
| 0.1479 | 0.1255 | 205 | 1.1681 | 10900880 |
| 0.1898 | 0.1285 | 210 | 1.1748 | 11163456 |
| 0.1901 | 0.1316 | 215 | 1.1713 | 11432112 |
| 0.214 | 0.1347 | 220 | 1.1706 | 11688032 |
| 0.0697 | 0.1377 | 225 | 1.1671 | 11960096 |
| 0.1767 | 0.1408 | 230 | 1.1716 | 12221352 |
| 0.175 | 0.1439 | 235 | 1.1650 | 12487832 |
| 0.1209 | 0.1469 | 240 | 1.1666 | 12751592 |
| 0.0838 | 0.1500 | 245 | 1.1678 | 13017624 |
| 0.1519 | 0.1530 | 250 | 1.1689 | 13280048 |
| 0.2056 | 0.1561 | 255 | 1.1635 | 13546296 |
| 0.2092 | 0.1592 | 260 | 1.1631 | 13802752 |
| 0.1651 | 0.1622 | 265 | 1.1643 | 14079416 |
| 0.093 | 0.1653 | 270 | 1.1614 | 14353632 |
| 0.1438 | 0.1683 | 275 | 1.1632 | 14616352 |
| 0.1328 | 0.1714 | 280 | 1.1582 | 14878736 |
| 0.1652 | 0.1745 | 285 | 1.1619 | 15140072 |
| 0.1245 | 0.1775 | 290 | 1.1572 | 15408216 |
| 0.1406 | 0.1806 | 295 | 1.1569 | 15682632 |
| 0.1561 | 0.1836 | 300 | 1.1568 | 15952008 |
| 0.1717 | 0.1867 | 305 | 1.1521 | 16217672 |
| 0.1311 | 0.1898 | 310 | 1.1529 | 16484608 |
| 0.1687 | 0.1928 | 315 | 1.1555 | 16747984 |
| 0.168 | 0.1959 | 320 | 1.1489 | 17016352 |
| 0.1464 | 0.1989 | 325 | 1.1528 | 17277544 |
| 0.1649 | 0.2020 | 330 | 1.1507 | 17542440 |
| 0.1339 | 0.2051 | 335 | 1.1530 | 17811352 |
| 0.1711 | 0.2081 | 340 | 1.1490 | 18078216 |
| 0.1412 | 0.2112 | 345 | 1.1445 | 18348264 |
| 0.1326 | 0.2142 | 350 | 1.1508 | 18617696 |
| 0.1726 | 0.2173 | 355 | 1.1485 | 18883920 |
| 0.1026 | 0.2204 | 360 | 1.1450 | 19154632 |
| 0.1248 | 0.2234 | 365 | 1.1474 | 19417616 |
| 0.1463 | 0.2265 | 370 | 1.1485 | 19687192 |
| 0.1872 | 0.2296 | 375 | 1.1466 | 19954904 |
| 0.1521 | 0.2326 | 380 | 1.1463 | 20223352 |
| 0.0936 | 0.2357 | 385 | 1.1439 | 20486208 |
| 0.1429 | 0.2387 | 390 | 1.1446 | 20745120 |
| 0.1408 | 0.2418 | 395 | 1.1437 | 21009440 |
| 0.1188 | 0.2449 | 400 | 1.1436 | 21272016 |
| 0.1577 | 0.2479 | 405 | 1.1448 | 21535880 |
| 0.1939 | 0.2510 | 410 | 1.1425 | 21805032 |
| 0.1433 | 0.2540 | 415 | 1.1386 | 22075792 |
| 0.0973 | 0.2571 | 420 | 1.1428 | 22342136 |
| 0.1108 | 0.2602 | 425 | 1.1416 | 22611808 |
| 0.2058 | 0.2632 | 430 | 1.1390 | 22875216 |
| 0.1185 | 0.2663 | 435 | 1.1419 | 23145696 |
| 0.1693 | 0.2693 | 440 | 1.1433 | 23420280 |
| 0.1316 | 0.2724 | 445 | 1.1394 | 23689816 |
| 0.1103 | 0.2755 | 450 | 1.1368 | 23955832 |
| 0.1345 | 0.2785 | 455 | 1.1396 | 24215896 |
| 0.0988 | 0.2816 | 460 | 1.1366 | 24473400 |
| 0.1211 | 0.2846 | 465 | 1.1380 | 24733328 |
| 0.1561 | 0.2877 | 470 | 1.1344 | 24997576 |
| 0.2206 | 0.2908 | 475 | 1.1344 | 25259784 |
| 0.1184 | 0.2938 | 480 | 1.1378 | 25526128 |
| 0.1155 | 0.2969 | 485 | 1.1341 | 25796200 |
| 0.1177 | 0.2999 | 490 | 1.1347 | 26069368 |
| 0.1346 | 0.3030 | 495 | 1.1360 | 26341608 |
| 0.1147 | 0.3061 | 500 | 1.1347 | 26604328 |
| 0.1241 | 0.3091 | 505 | 1.1334 | 26868264 |
| 0.1042 | 0.3122 | 510 | 1.1364 | 27133232 |
| 0.1536 | 0.3152 | 515 | 1.1347 | 27399864 |
| 0.134 | 0.3183 | 520 | 1.1341 | 27667104 |
| 0.0689 | 0.3214 | 525 | 1.1363 | 27933048 |
| 0.1324 | 0.3244 | 530 | 1.1339 | 28199608 |
| 0.1085 | 0.3275 | 535 | 1.1330 | 28459712 |
| 0.0924 | 0.3306 | 540 | 1.1319 | 28719840 |
| 0.1082 | 0.3336 | 545 | 1.1297 | 28975272 |
| 0.218 | 0.3367 | 550 | 1.1334 | 29238216 |
| 0.192 | 0.3397 | 555 | 1.1313 | 29511704 |
| 0.103 | 0.3428 | 560 | 1.1290 | 29776544 |
| 0.085 | 0.3459 | 565 | 1.1340 | 30039688 |
| 0.1047 | 0.3489 | 570 | 1.1331 | 30301328 |
| 0.0752 | 0.3520 | 575 | 1.1278 | 30569216 |
| 0.1234 | 0.3550 | 580 | 1.1292 | 30836616 |
| 0.2161 | 0.3581 | 585 | 1.1293 | 31102976 |
| 0.1606 | 0.3612 | 590 | 1.1296 | 31364480 |
| 0.094 | 0.3642 | 595 | 1.1297 | 31625000 |
| 0.0934 | 0.3673 | 600 | 1.1292 | 31895944 |
| 0.1047 | 0.3703 | 605 | 1.1297 | 32153856 |
| 0.1532 | 0.3734 | 610 | 1.1273 | 32422696 |
| 0.1137 | 0.3765 | 615 | 1.1277 | 32693648 |
| 0.115 | 0.3795 | 620 | 1.1299 | 32963072 |
| 0.1122 | 0.3826 | 625 | 1.1298 | 33232352 |
| 0.0994 | 0.3856 | 630 | 1.1272 | 33502176 |
| 0.0812 | 0.3887 | 635 | 1.1266 | 33770184 |
| 0.0799 | 0.3918 | 640 | 1.1273 | 34035544 |
| 0.1559 | 0.3948 | 645 | 1.1258 | 34301784 |
| 0.1696 | 0.3979 | 650 | 1.1239 | 34576416 |
| 0.1574 | 0.4009 | 655 | 1.1236 | 34841800 |
| 0.166 | 0.4040 | 660 | 1.1258 | 35104296 |
| 0.1426 | 0.4071 | 665 | 1.1252 | 35364256 |
| 0.1787 | 0.4101 | 670 | 1.1238 | 35633016 |
| 0.1669 | 0.4132 | 675 | 1.1205 | 35903688 |
| 0.0938 | 0.4163 | 680 | 1.1223 | 36167328 |
| 0.1578 | 0.4193 | 685 | 1.1232 | 36438144 |
| 0.2471 | 0.4224 | 690 | 1.1215 | 36700480 |
| 0.1151 | 0.4254 | 695 | 1.1214 | 36959904 |
| 0.1338 | 0.4285 | 700 | 1.1231 | 37227496 |
| 0.1072 | 0.4316 | 705 | 1.1234 | 37496520 |
| 0.1114 | 0.4346 | 710 | 1.1220 | 37761112 |
| 0.1691 | 0.4377 | 715 | 1.1199 | 38028096 |
| 0.1388 | 0.4407 | 720 | 1.1210 | 38304528 |
| 0.1454 | 0.4438 | 725 | 1.1213 | 38571176 |
| 0.1207 | 0.4469 | 730 | 1.1213 | 38837496 |
| 0.1251 | 0.4499 | 735 | 1.1188 | 39108440 |
| 0.1334 | 0.4530 | 740 | 1.1195 | 39371328 |
| 0.1171 | 0.4560 | 745 | 1.1232 | 39637232 |
| 0.1342 | 0.4591 | 750 | 1.1218 | 39903960 |
| 0.1377 | 0.4622 | 755 | 1.1177 | 40168456 |
| 0.0995 | 0.4652 | 760 | 1.1191 | 40433488 |
| 0.1187 | 0.4683 | 765 | 1.1227 | 40704248 |
| 0.1524 | 0.4713 | 770 | 1.1209 | 40970656 |
| 0.1129 | 0.4744 | 775 | 1.1181 | 41236624 |
| 0.1314 | 0.4775 | 780 | 1.1165 | 41507808 |
| 0.1474 | 0.4805 | 785 | 1.1158 | 41774728 |
| 0.1375 | 0.4836 | 790 | 1.1183 | 42038728 |
| 0.1421 | 0.4866 | 795 | 1.1184 | 42299824 |
| 0.1142 | 0.4897 | 800 | 1.1183 | 42568312 |
| 0.1382 | 0.4928 | 805 | 1.1176 | 42830576 |
| 0.1142 | 0.4958 | 810 | 1.1159 | 43098768 |
| 0.1176 | 0.4989 | 815 | 1.1158 | 43372456 |
| 0.1674 | 0.5020 | 820 | 1.1168 | 43636912 |
| 0.0789 | 0.5050 | 825 | 1.1169 | 43895672 |
| 0.1103 | 0.5081 | 830 | 1.1170 | 44167352 |
| 0.109 | 0.5111 | 835 | 1.1132 | 44431656 |
| 0.1452 | 0.5142 | 840 | 1.1144 | 44700856 |
| 0.1738 | 0.5173 | 845 | 1.1159 | 44960856 |
| 0.0978 | 0.5203 | 850 | 1.1165 | 45231720 |
| 0.081 | 0.5234 | 855 | 1.1131 | 45499328 |
| 0.1327 | 0.5264 | 860 | 1.1128 | 45760104 |
| 0.0918 | 0.5295 | 865 | 1.1154 | 46026552 |
| 0.1794 | 0.5326 | 870 | 1.1159 | 46296792 |
| 0.2049 | 0.5356 | 875 | 1.1146 | 46563896 |
| 0.1396 | 0.5387 | 880 | 1.1159 | 46822144 |
| 0.1811 | 0.5417 | 885 | 1.1167 | 47086504 |
| 0.1165 | 0.5448 | 890 | 1.1135 | 47352712 |
| 0.1405 | 0.5479 | 895 | 1.1118 | 47620280 |
| 0.1283 | 0.5509 | 900 | 1.1159 | 47886392 |
| 0.1107 | 0.5540 | 905 | 1.1167 | 48153720 |
| 0.1643 | 0.5570 | 910 | 1.1120 | 48417448 |
| 0.1235 | 0.5601 | 915 | 1.1115 | 48680400 |
| 0.0968 | 0.5632 | 920 | 1.1140 | 48949848 |
| 0.0832 | 0.5662 | 925 | 1.1150 | 49212664 |
| 0.1014 | 0.5693 | 930 | 1.1129 | 49476528 |
| 0.0891 | 0.5723 | 935 | 1.1135 | 49742536 |
| 0.1393 | 0.5754 | 940 | 1.1152 | 50013944 |
| 0.1324 | 0.5785 | 945 | 1.1136 | 50287560 |
| 0.165 | 0.5815 | 950 | 1.1119 | 50549992 |
| 0.0819 | 0.5846 | 955 | 1.1119 | 50817648 |
| 0.1272 | 0.5877 | 960 | 1.1132 | 51083160 |
| 0.1226 | 0.5907 | 965 | 1.1116 | 51338512 |
| 0.1346 | 0.5938 | 970 | 1.1106 | 51606280 |
| 0.1122 | 0.5968 | 975 | 1.1132 | 51879160 |
| 0.1976 | 0.5999 | 980 | 1.1137 | 52145992 |
| 0.1923 | 0.6030 | 985 | 1.1105 | 52412968 |
| 0.1159 | 0.6060 | 990 | 1.1113 | 52676296 |
| 0.1475 | 0.6091 | 995 | 1.1127 | 52943568 |
| 0.1693 | 0.6121 | 1000 | 1.1124 | 53211752 |
| 0.1253 | 0.6152 | 1005 | 1.1113 | 53477744 |
| 0.1225 | 0.6183 | 1010 | 1.1106 | 53748328 |
| 0.0958 | 0.6213 | 1015 | 1.1101 | 54010424 |
| 0.1196 | 0.6244 | 1020 | 1.1087 | 54275720 |
| 0.1315 | 0.6274 | 1025 | 1.1089 | 54542920 |
| 0.1598 | 0.6305 | 1030 | 1.1107 | 54818776 |
| 0.0691 | 0.6336 | 1035 | 1.1101 | 55085064 |
| 0.169 | 0.6366 | 1040 | 1.1083 | 55347920 |
| 0.1646 | 0.6397 | 1045 | 1.1080 | 55614456 |
| 0.1282 | 0.6427 | 1050 | 1.1119 | 55881672 |
| 0.0907 | 0.6458 | 1055 | 1.1111 | 56150248 |
| 0.1679 | 0.6489 | 1060 | 1.1094 | 56419024 |
| 0.1195 | 0.6519 | 1065 | 1.1086 | 56688040 |
| 0.0997 | 0.6550 | 1070 | 1.1096 | 56950304 |
| 0.1244 | 0.6580 | 1075 | 1.1101 | 57213240 |
| 0.125 | 0.6611 | 1080 | 1.1097 | 57482992 |
| 0.1003 | 0.6642 | 1085 | 1.1082 | 57749096 |
| 0.1012 | 0.6672 | 1090 | 1.1102 | 58016096 |
| 0.1385 | 0.6703 | 1095 | 1.1128 | 58278616 |
| 0.1107 | 0.6733 | 1100 | 1.1106 | 58550208 |
| 0.1068 | 0.6764 | 1105 | 1.1098 | 58817704 |
| 0.0849 | 0.6795 | 1110 | 1.1090 | 59083184 |
| 0.1274 | 0.6825 | 1115 | 1.1086 | 59354504 |
| 0.1851 | 0.6856 | 1120 | 1.1069 | 59616376 |
| 0.1387 | 0.6887 | 1125 | 1.1083 | 59884072 |
| 0.0868 | 0.6917 | 1130 | 1.1104 | 60139920 |
| 0.1334 | 0.6948 | 1135 | 1.1088 | 60404744 |
| 0.094 | 0.6978 | 1140 | 1.1077 | 60671000 |
| 0.1603 | 0.7009 | 1145 | 1.1078 | 60936712 |
| 0.1009 | 0.7040 | 1150 | 1.1076 | 61206824 |
| 0.1042 | 0.7070 | 1155 | 1.1082 | 61473752 |
| 0.122 | 0.7101 | 1160 | 1.1075 | 61739248 |
| 0.1634 | 0.7131 | 1165 | 1.1084 | 62003000 |
| 0.1515 | 0.7162 | 1170 | 1.1076 | 62272528 |
| 0.0904 | 0.7193 | 1175 | 1.1071 | 62536240 |
| 0.108 | 0.7223 | 1180 | 1.1080 | 62803016 |
| 0.1404 | 0.7254 | 1185 | 1.1084 | 63068376 |
| 0.0756 | 0.7284 | 1190 | 1.1074 | 63330312 |
| 0.0864 | 0.7315 | 1195 | 1.1061 | 63590464 |
| 0.1426 | 0.7346 | 1200 | 1.1081 | 63854424 |
| 0.1258 | 0.7376 | 1205 | 1.1056 | 64126304 |
| 0.1423 | 0.7407 | 1210 | 1.1043 | 64390424 |
| 0.0885 | 0.7437 | 1215 | 1.1045 | 64656304 |
| 0.1446 | 0.7468 | 1220 | 1.1050 | 64919800 |
| 0.1502 | 0.7499 | 1225 | 1.1061 | 65185752 |
| 0.1086 | 0.7529 | 1230 | 1.1054 | 65453432 |
| 0.1108 | 0.7560 | 1235 | 1.1042 | 65721720 |
| 0.1431 | 0.7590 | 1240 | 1.1051 | 65984384 |
| 0.0921 | 0.7621 | 1245 | 1.1055 | 66252896 |
| 0.0673 | 0.7652 | 1250 | 1.1048 | 66521000 |
| 0.1394 | 0.7682 | 1255 | 1.1052 | 66791376 |
| 0.1376 | 0.7713 | 1260 | 1.1058 | 67056040 |
| 0.1094 | 0.7744 | 1265 | 1.1044 | 67327944 |
| 0.1252 | 0.7774 | 1270 | 1.1023 | 67602768 |
| 0.174 | 0.7805 | 1275 | 1.1026 | 67863376 |
| 0.1244 | 0.7835 | 1280 | 1.1053 | 68128112 |
| 0.1203 | 0.7866 | 1285 | 1.1066 | 68389168 |
| 0.122 | 0.7897 | 1290 | 1.1067 | 68657288 |
| 0.1259 | 0.7927 | 1295 | 1.1040 | 68919632 |
| 0.1168 | 0.7958 | 1300 | 1.1045 | 69176216 |
| 0.1332 | 0.7988 | 1305 | 1.1047 | 69448024 |
| 0.1274 | 0.8019 | 1310 | 1.1031 | 69715216 |
| 0.1335 | 0.8050 | 1315 | 1.1024 | 69981016 |
| 0.0842 | 0.8080 | 1320 | 1.1036 | 70251888 |
| 0.064 | 0.8111 | 1325 | 1.1054 | 70519008 |
| 0.0986 | 0.8141 | 1330 | 1.1046 | 70785336 |
| 0.1131 | 0.8172 | 1335 | 1.1035 | 71052848 |
| 0.0974 | 0.8203 | 1340 | 1.1039 | 71320416 |
| 0.1494 | 0.8233 | 1345 | 1.1049 | 71582848 |
| 0.0684 | 0.8264 | 1350 | 1.1044 | 71854240 |
| 0.1464 | 0.8294 | 1355 | 1.1037 | 72118376 |
| 0.1046 | 0.8325 | 1360 | 1.1028 | 72385656 |
| 0.09 | 0.8356 | 1365 | 1.1033 | 72647632 |
| 0.1081 | 0.8386 | 1370 | 1.1035 | 72908208 |
| 0.1112 | 0.8417 | 1375 | 1.1048 | 73169752 |
| 0.1062 | 0.8447 | 1380 | 1.1055 | 73431688 |
| 0.1695 | 0.8478 | 1385 | 1.1036 | 73693528 |
| 0.092 | 0.8509 | 1390 | 1.1028 | 73966912 |
| 0.0773 | 0.8539 | 1395 | 1.1036 | 74234992 |
| 0.087 | 0.8570 | 1400 | 1.1041 | 74501672 |
| 0.1277 | 0.8601 | 1405 | 1.1027 | 74769040 |
| 0.1034 | 0.8631 | 1410 | 1.1013 | 75035016 |
| 0.1369 | 0.8662 | 1415 | 1.1025 | 75303368 |
| 0.097 | 0.8692 | 1420 | 1.1045 | 75577208 |
| 0.1432 | 0.8723 | 1425 | 1.1045 | 75843792 |
| 0.1016 | 0.8754 | 1430 | 1.1030 | 76109448 |
| 0.171 | 0.8784 | 1435 | 1.1017 | 76387328 |
| 0.1054 | 0.8815 | 1440 | 1.1008 | 76658816 |
| 0.1378 | 0.8845 | 1445 | 1.1013 | 76921392 |
| 0.1313 | 0.8876 | 1450 | 1.1008 | 77187424 |
| 0.1399 | 0.8907 | 1455 | 1.1003 | 77451896 |
| 0.1271 | 0.8937 | 1460 | 1.1012 | 77718632 |
| 0.0674 | 0.8968 | 1465 | 1.0999 | 77983160 |
| 0.1293 | 0.8998 | 1470 | 1.0992 | 78252208 |
| 0.0977 | 0.9029 | 1475 | 1.1001 | 78520256 |
| 0.0814 | 0.9060 | 1480 | 1.0998 | 78788256 |
| 0.1247 | 0.9090 | 1485 | 1.1001 | 79057400 |
| 0.0882 | 0.9121 | 1490 | 1.0999 | 79314976 |
| 0.0836 | 0.9151 | 1495 | 1.1003 | 79577216 |
| 0.0888 | 0.9182 | 1500 | 1.1002 | 79844960 |
| 0.0883 | 0.9213 | 1505 | 1.1015 | 80117296 |
| 0.0711 | 0.9243 | 1510 | 1.1010 | 80380040 |
| 0.1289 | 0.9274 | 1515 | 1.0990 | 80645280 |
| 0.1078 | 0.9304 | 1520 | 1.1004 | 80904752 |
| 0.1234 | 0.9335 | 1525 | 1.1023 | 81170816 |
| 0.1693 | 0.9366 | 1530 | 1.1016 | 81438696 |
| 0.1576 | 0.9396 | 1535 | 1.1001 | 81700296 |
| 0.1521 | 0.9427 | 1540 | 1.0984 | 81965672 |
| 0.096 | 0.9457 | 1545 | 1.0993 | 82241664 |
| 0.0927 | 0.9488 | 1550 | 1.1007 | 82504096 |
| 0.1097 | 0.9519 | 1555 | 1.0996 | 82778112 |
| 0.1 | 0.9549 | 1560 | 1.0992 | 83038176 |
| 0.0914 | 0.9580 | 1565 | 1.1000 | 83300784 |
| 0.08 | 0.9611 | 1570 | 1.0991 | 83566568 |
| 0.1005 | 0.9641 | 1575 | 1.0983 | 83831192 |
| 0.1275 | 0.9672 | 1580 | 1.0992 | 84098456 |
| 0.0894 | 0.9702 | 1585 | 1.1005 | 84359792 |
| 0.1184 | 0.9733 | 1590 | 1.1011 | 84624056 |
| 0.0719 | 0.9764 | 1595 | 1.0991 | 84887528 |
| 0.1142 | 0.9794 | 1600 | 1.0978 | 85150200 |
| 0.119 | 0.9825 | 1605 | 1.0985 | 85413608 |
| 0.1301 | 0.9855 | 1610 | 1.0999 | 85675128 |
| 0.1025 | 0.9886 | 1615 | 1.1003 | 85940824 |
| 0.1423 | 0.9917 | 1620 | 1.0983 | 86213512 |
| 0.0789 | 0.9947 | 1625 | 1.0973 | 86485776 |
| 0.0738 | 0.9978 | 1630 | 1.0985 | 86750264 |


### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
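
As a quick sanity check on the numbers reported in this card, the sketch below recomputes the total train batch size from the listed per-device batch size and gradient accumulation steps, and converts the final evaluation loss into a perplexity. Two things here are assumptions rather than facts from the card: that training ran on a single device (`num_devices = 1`, which is consistent with 8 × 16 = 128 but is not stated), and that the reported loss is a mean per-token cross-entropy in nats.

```python
import math

# Values copied from the hyperparameter list and the final evaluation result.
train_batch_size = 8
gradient_accumulation_steps = 16
eval_loss = 1.0990

# Assumption: a single training device; the card does not state the device count.
num_devices = 1

# Effective number of sequences contributing to each optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"effective batch size: {effective_batch_size}")
print(f"eval perplexity: {perplexity:.3f}")
```

Under those assumptions the effective batch size matches the `total_train_batch_size` of 128 listed above, and the evaluation loss of 1.0990 corresponds to a perplexity of roughly 3.0.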