wikitext103_roberta-base_v2
This model is a fine-tuned version of roberta-base on the wikitext dataset (wikitext-103-raw-v1 configuration). It achieves the following results on the evaluation set:
- Loss: 1.0966
- Accuracy: 0.7583
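As a rough reading of the loss figure: if the reported loss is the mean cross-entropy in nats over masked tokens (an assumption; the card does not state how the loss is computed), exponentiating it gives a perplexity of about 2.99 on the masked positions. A minimal sketch of that arithmetic:

```python
import math

# Evaluation loss reported in this card.
eval_loss = 1.0966

# Assumption: the loss is mean cross-entropy in nats over masked tokens,
# so exp(loss) can be read as a perplexity on those positions.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

This number is not comparable to perplexities reported for left-to-right language models, since it is computed only over masked positions.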
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20.0
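The relationship between these values can be sketched in plain Python. The per-device batch size times the number of devices gives the reported total batch size, and a linear scheduler decays the learning rate from its peak to zero over the run. The steps-per-epoch figure is estimated from the results table, and the absence of warmup is an assumption, since the list above does not mention warmup steps.

```python
# Values taken from the hyperparameter list in this card.
learning_rate = 5e-05
train_batch_size = 8   # per device
num_devices = 4
num_epochs = 20.0

# Effective batch size across devices. Assumes no gradient accumulation,
# which is consistent with the reported total_train_batch_size of 32.
total_train_batch_size = train_batch_size * num_devices

# Assumption: steps per epoch estimated from the results table, where
# step 133000 is logged at epoch 18.0 (epoch values there are rounded).
steps_per_epoch = round(133000 / 18.0)
total_steps = int(steps_per_epoch * num_epochs)


def linear_lr(step, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size, total_steps)
print(linear_lr(0), linear_lr(total_steps // 2), linear_lr(total_steps))
```

Under these assumptions the learning rate at the midpoint of training is half the peak value, and it reaches zero at the final step.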
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
1.4212 | 0.07 | 500 | 1.3008 | 0.7236 |
1.3933 | 0.14 | 1000 | 1.2827 | 0.7227 |
1.3917 | 0.2 | 1500 | 1.2816 | 0.7266 |
1.3824 | 0.27 | 2000 | 1.2947 | 0.7251 |
1.3835 | 0.34 | 2500 | 1.2555 | 0.7289 |
1.3758 | 0.41 | 3000 | 1.2612 | 0.7279 |
1.3745 | 0.47 | 3500 | 1.2791 | 0.7245 |
1.3761 | 0.54 | 4000 | 1.2622 | 0.7286 |
1.3735 | 0.61 | 4500 | 1.2318 | 0.7360 |
1.3717 | 0.68 | 5000 | 1.2777 | 0.7260 |
1.3675 | 0.74 | 5500 | 1.2590 | 0.7309 |
1.3585 | 0.81 | 6000 | 1.2839 | 0.7254 |
1.3579 | 0.88 | 6500 | 1.2341 | 0.7347 |
1.3588 | 0.95 | 7000 | 1.2413 | 0.7327 |
1.351 | 1.01 | 7500 | 1.2459 | 0.7317 |
1.3394 | 1.08 | 8000 | 1.2422 | 0.7314 |
1.3429 | 1.15 | 8500 | 1.2285 | 0.7349 |
1.3393 | 1.22 | 9000 | 1.2405 | 0.7324 |
1.3421 | 1.29 | 9500 | 1.2255 | 0.7354 |
1.3426 | 1.35 | 10000 | 1.2296 | 0.7334 |
1.3326 | 1.42 | 10500 | 1.2158 | 0.7351 |
1.3355 | 1.49 | 11000 | 1.2256 | 0.7364 |
1.3324 | 1.56 | 11500 | 1.2208 | 0.7356 |
1.3331 | 1.62 | 12000 | 1.2230 | 0.7347 |
1.3326 | 1.69 | 12500 | 1.2505 | 0.7316 |
1.3339 | 1.76 | 13000 | 1.2471 | 0.7322 |
1.3286 | 1.83 | 13500 | 1.2185 | 0.7359 |
1.3314 | 1.89 | 14000 | 1.2333 | 0.7363 |
1.325 | 1.96 | 14500 | 1.2384 | 0.7320 |
1.3251 | 2.03 | 15000 | 1.2142 | 0.7333 |
1.3136 | 2.1 | 15500 | 1.2162 | 0.7346 |
1.3202 | 2.17 | 16000 | 1.2207 | 0.7369 |
1.3168 | 2.23 | 16500 | 1.1931 | 0.7391 |
1.3134 | 2.3 | 17000 | 1.1857 | 0.7398 |
1.3085 | 2.37 | 17500 | 1.2112 | 0.7383 |
1.3165 | 2.44 | 18000 | 1.2284 | 0.7365 |
1.3144 | 2.5 | 18500 | 1.2013 | 0.7388 |
1.319 | 2.57 | 19000 | 1.2173 | 0.7356 |
1.3147 | 2.64 | 19500 | 1.1786 | 0.7404 |
1.311 | 2.71 | 20000 | 1.2009 | 0.7373 |
1.3131 | 2.77 | 20500 | 1.1992 | 0.7366 |
1.3036 | 2.84 | 21000 | 1.2167 | 0.7370 |
1.3122 | 2.91 | 21500 | 1.2139 | 0.7379 |
1.3091 | 2.98 | 22000 | 1.2197 | 0.7365 |
1.304 | 3.04 | 22500 | 1.1864 | 0.7372 |
1.3015 | 3.11 | 23000 | 1.2046 | 0.7355 |
1.2916 | 3.18 | 23500 | 1.2312 | 0.7345 |
1.2966 | 3.25 | 24000 | 1.2116 | 0.7373 |
1.2991 | 3.32 | 24500 | 1.2263 | 0.7378 |
1.3003 | 3.38 | 25000 | 1.1844 | 0.7413 |
1.2942 | 3.45 | 25500 | 1.1959 | 0.7369 |
1.2988 | 3.52 | 26000 | 1.2018 | 0.7381 |
1.2936 | 3.59 | 26500 | 1.1993 | 0.7388 |
1.2937 | 3.65 | 27000 | 1.2155 | 0.7358 |
1.3021 | 3.72 | 27500 | 1.1794 | 0.7396 |
1.2937 | 3.79 | 28000 | 1.1983 | 0.7401 |
1.291 | 3.86 | 28500 | 1.1695 | 0.7448 |
1.2932 | 3.92 | 29000 | 1.1981 | 0.7410 |
1.2938 | 3.99 | 29500 | 1.2000 | 0.7383 |
1.2789 | 4.06 | 30000 | 1.1918 | 0.7402 |
1.2806 | 4.13 | 30500 | 1.2065 | 0.7368 |
1.2799 | 4.19 | 31000 | 1.2036 | 0.7374 |
1.2851 | 4.26 | 31500 | 1.2056 | 0.7375 |
1.2789 | 4.33 | 32000 | 1.1857 | 0.7415 |
1.2847 | 4.4 | 32500 | 1.1947 | 0.7376 |
1.2843 | 4.47 | 33000 | 1.1869 | 0.7399 |
1.2822 | 4.53 | 33500 | 1.1963 | 0.7386 |
1.2755 | 4.6 | 34000 | 1.1897 | 0.7424 |
1.283 | 4.67 | 34500 | 1.1673 | 0.7438 |
1.2765 | 4.74 | 35000 | 1.1855 | 0.7419 |
1.2762 | 4.8 | 35500 | 1.1773 | 0.7412 |
1.2776 | 4.87 | 36000 | 1.1898 | 0.7408 |
1.2847 | 4.94 | 36500 | 1.1625 | 0.7438 |
1.2732 | 5.01 | 37000 | 1.1947 | 0.7397 |
1.2667 | 5.07 | 37500 | 1.2097 | 0.7385 |
1.2678 | 5.14 | 38000 | 1.1873 | 0.7398 |
1.2681 | 5.21 | 38500 | 1.1682 | 0.7468 |
1.2699 | 5.28 | 39000 | 1.1740 | 0.7457 |
1.2675 | 5.35 | 39500 | 1.2123 | 0.7379 |
1.2604 | 5.41 | 40000 | 1.1953 | 0.7396 |
1.2688 | 5.48 | 40500 | 1.1849 | 0.7398 |
1.2698 | 5.55 | 41000 | 1.1709 | 0.7414 |
1.2689 | 5.62 | 41500 | 1.1764 | 0.7438 |
1.269 | 5.68 | 42000 | 1.1824 | 0.7409 |
1.2715 | 5.75 | 42500 | 1.1785 | 0.7409 |
1.2628 | 5.82 | 43000 | 1.1739 | 0.7434 |
1.2617 | 5.89 | 43500 | 1.1815 | 0.7406 |
1.2565 | 5.95 | 44000 | 1.1885 | 0.7415 |
1.2639 | 6.02 | 44500 | 1.1782 | 0.7420 |
1.2557 | 6.09 | 45000 | 1.2061 | 0.7382 |
1.2503 | 6.16 | 45500 | 1.1741 | 0.7397 |
1.2514 | 6.22 | 46000 | 1.1673 | 0.7436 |
1.254 | 6.29 | 46500 | 1.1829 | 0.7400 |
1.2583 | 6.36 | 47000 | 1.1777 | 0.7391 |
1.2518 | 6.43 | 47500 | 1.1893 | 0.7412 |
1.2519 | 6.5 | 48000 | 1.1775 | 0.7411 |
1.2477 | 6.56 | 48500 | 1.1809 | 0.7452 |
1.2546 | 6.63 | 49000 | 1.1652 | 0.7455 |
1.2564 | 6.7 | 49500 | 1.1730 | 0.7435 |
1.254 | 6.77 | 50000 | 1.1741 | 0.7427 |
1.2495 | 6.83 | 50500 | 1.1540 | 0.7476 |
1.2502 | 6.9 | 51000 | 1.1454 | 0.7488 |
1.2527 | 6.97 | 51500 | 1.1705 | 0.7429 |
1.2418 | 7.04 | 52000 | 1.1714 | 0.7441 |
1.2386 | 7.1 | 52500 | 1.1619 | 0.7455 |
1.2407 | 7.17 | 53000 | 1.1703 | 0.7428 |
1.2429 | 7.24 | 53500 | 1.1597 | 0.7437 |
1.2398 | 7.31 | 54000 | 1.1802 | 0.7411 |
1.2507 | 7.37 | 54500 | 1.1539 | 0.7465 |
1.2369 | 7.44 | 55000 | 1.1711 | 0.7421 |
1.2463 | 7.51 | 55500 | 1.1849 | 0.7409 |
1.2389 | 7.58 | 56000 | 1.1720 | 0.7447 |
1.2395 | 7.65 | 56500 | 1.1614 | 0.7456 |
1.2429 | 7.71 | 57000 | 1.1604 | 0.7460 |
1.2384 | 7.78 | 57500 | 1.1852 | 0.7408 |
1.2419 | 7.85 | 58000 | 1.1593 | 0.7461 |
1.2381 | 7.92 | 58500 | 1.1618 | 0.7454 |
1.2384 | 7.98 | 59000 | 1.1551 | 0.7446 |
1.2314 | 8.05 | 59500 | 1.1474 | 0.7451 |
1.2277 | 8.12 | 60000 | 1.1636 | 0.7435 |
1.23 | 8.19 | 60500 | 1.1545 | 0.7482 |
1.2292 | 8.25 | 61000 | 1.1694 | 0.7457 |
1.2337 | 8.32 | 61500 | 1.1682 | 0.7437 |
1.2274 | 8.39 | 62000 | 1.1519 | 0.7484 |
1.232 | 8.46 | 62500 | 1.1693 | 0.7435 |
1.2315 | 8.53 | 63000 | 1.1638 | 0.7434 |
1.2293 | 8.59 | 63500 | 1.1640 | 0.7461 |
1.2287 | 8.66 | 64000 | 1.1464 | 0.7519 |
1.2283 | 8.73 | 64500 | 1.1439 | 0.7481 |
1.2279 | 8.8 | 65000 | 1.1496 | 0.7477 |
1.2276 | 8.86 | 65500 | 1.1545 | 0.7449 |
1.2301 | 8.93 | 66000 | 1.1312 | 0.7487 |
1.2248 | 9.0 | 66500 | 1.1444 | 0.7465 |
1.2266 | 9.07 | 67000 | 1.1525 | 0.7430 |
1.2198 | 9.13 | 67500 | 1.1551 | 0.7462 |
1.219 | 9.2 | 68000 | 1.1434 | 0.7479 |
1.2212 | 9.27 | 68500 | 1.1707 | 0.7416 |
1.2265 | 9.34 | 69000 | 1.1744 | 0.7422 |
1.2216 | 9.4 | 69500 | 1.1818 | 0.7393 |
1.2226 | 9.47 | 70000 | 1.1662 | 0.7454 |
1.2224 | 9.54 | 70500 | 1.1346 | 0.7460 |
1.2186 | 9.61 | 71000 | 1.1534 | 0.7463 |
1.2179 | 9.68 | 71500 | 1.1399 | 0.7478 |
1.2177 | 9.74 | 72000 | 1.1545 | 0.7442 |
1.2154 | 9.81 | 72500 | 1.1711 | 0.7427 |
1.2179 | 9.88 | 73000 | 1.1349 | 0.7514 |
1.2184 | 9.95 | 73500 | 1.1427 | 0.7495 |
1.2193 | 10.01 | 74000 | 1.1223 | 0.7495 |
1.2063 | 10.08 | 74500 | 1.1357 | 0.7488 |
1.2025 | 10.15 | 75000 | 1.1476 | 0.7486 |
1.2097 | 10.22 | 75500 | 1.1382 | 0.7493 |
1.2106 | 10.28 | 76000 | 1.1414 | 0.7500 |
1.2146 | 10.35 | 76500 | 1.1138 | 0.7533 |
1.2129 | 10.42 | 77000 | 1.1447 | 0.7478 |
1.2078 | 10.49 | 77500 | 1.1557 | 0.7509 |
1.204 | 10.55 | 78000 | 1.1243 | 0.7538 |
1.2101 | 10.62 | 78500 | 1.1352 | 0.7507 |
1.207 | 10.69 | 79000 | 1.1366 | 0.7526 |
1.2067 | 10.76 | 79500 | 1.1450 | 0.7482 |
1.1997 | 10.83 | 80000 | 1.1334 | 0.7504 |
1.2114 | 10.89 | 80500 | 1.1348 | 0.7524 |
1.2087 | 10.96 | 81000 | 1.1221 | 0.7508 |
1.2065 | 11.03 | 81500 | 1.1306 | 0.7486 |
1.1985 | 11.1 | 82000 | 1.1648 | 0.7471 |
1.205 | 11.16 | 82500 | 1.1088 | 0.7527 |
1.2026 | 11.23 | 83000 | 1.1253 | 0.7513 |
1.2 | 11.3 | 83500 | 1.1330 | 0.7474 |
1.1997 | 11.37 | 84000 | 1.1424 | 0.7494 |
1.1989 | 11.43 | 84500 | 1.1289 | 0.7478 |
1.1956 | 11.5 | 85000 | 1.1163 | 0.7525 |
1.1997 | 11.57 | 85500 | 1.1354 | 0.7502 |
1.2011 | 11.64 | 86000 | 1.1371 | 0.7488 |
1.1998 | 11.71 | 86500 | 1.1276 | 0.7525 |
1.1957 | 11.77 | 87000 | 1.1078 | 0.7558 |
1.2027 | 11.84 | 87500 | 1.1626 | 0.7454 |
1.2013 | 11.91 | 88000 | 1.1228 | 0.7527 |
1.1944 | 11.98 | 88500 | 1.1413 | 0.7478 |
1.1946 | 12.04 | 89000 | 1.1250 | 0.7514 |
1.196 | 12.11 | 89500 | 1.1448 | 0.7468 |
1.1893 | 12.18 | 90000 | 1.1357 | 0.7478 |
1.1865 | 12.25 | 90500 | 1.1209 | 0.7525 |
1.1921 | 12.31 | 91000 | 1.1200 | 0.7517 |
1.1928 | 12.38 | 91500 | 1.1145 | 0.7512 |
1.1904 | 12.45 | 92000 | 1.1108 | 0.7546 |
1.1955 | 12.52 | 92500 | 1.1062 | 0.7541 |
1.1898 | 12.58 | 93000 | 1.1264 | 0.7520 |
1.1917 | 12.65 | 93500 | 1.1129 | 0.7536 |
1.1895 | 12.72 | 94000 | 1.1288 | 0.7494 |
1.1966 | 12.79 | 94500 | 1.1436 | 0.7474 |
1.1887 | 12.86 | 95000 | 1.1220 | 0.7530 |
1.1856 | 12.92 | 95500 | 1.1442 | 0.7500 |
1.1934 | 12.99 | 96000 | 1.1348 | 0.7487 |
1.1848 | 13.06 | 96500 | 1.1172 | 0.7521 |
1.1821 | 13.13 | 97000 | 1.1042 | 0.7566 |
1.1817 | 13.19 | 97500 | 1.1273 | 0.7495 |
1.1773 | 13.26 | 98000 | 1.0958 | 0.7540 |
1.1774 | 13.33 | 98500 | 1.1140 | 0.7511 |
1.1841 | 13.4 | 99000 | 1.1086 | 0.7535 |
1.1825 | 13.46 | 99500 | 1.0903 | 0.7576 |
1.1845 | 13.53 | 100000 | 1.1291 | 0.7486 |
1.1853 | 13.6 | 100500 | 1.1318 | 0.7486 |
1.1761 | 13.67 | 101000 | 1.1218 | 0.7553 |
1.1825 | 13.73 | 101500 | 1.1307 | 0.7485 |
1.1849 | 13.8 | 102000 | 1.1273 | 0.7504 |
1.1792 | 13.87 | 102500 | 1.1291 | 0.7497 |
1.1852 | 13.94 | 103000 | 1.1134 | 0.7521 |
1.1745 | 14.01 | 103500 | 1.1252 | 0.7511 |
1.1746 | 14.07 | 104000 | 1.1148 | 0.7509 |
1.1765 | 14.14 | 104500 | 1.1202 | 0.7499 |
1.1762 | 14.21 | 105000 | 1.1134 | 0.7527 |
1.1752 | 14.28 | 105500 | 1.1171 | 0.7551 |
1.176 | 14.34 | 106000 | 1.1155 | 0.7527 |
1.1732 | 14.41 | 106500 | 1.1333 | 0.7481 |
1.1753 | 14.48 | 107000 | 1.0982 | 0.7574 |
1.1713 | 14.55 | 107500 | 1.1343 | 0.7491 |
1.1692 | 14.61 | 108000 | 1.1021 | 0.7549 |
1.17 | 14.68 | 108500 | 1.1107 | 0.7504 |
1.1699 | 14.75 | 109000 | 1.1227 | 0.7505 |
1.1763 | 14.82 | 109500 | 1.1152 | 0.7524 |
1.1729 | 14.88 | 110000 | 1.0939 | 0.7563 |
1.1731 | 14.95 | 110500 | 1.1531 | 0.7446 |
1.1744 | 15.02 | 111000 | 1.1451 | 0.7489 |
1.169 | 15.09 | 111500 | 1.1211 | 0.7527 |
1.1644 | 15.16 | 112000 | 1.1135 | 0.7553 |
1.1726 | 15.22 | 112500 | 1.0904 | 0.7551 |
1.1653 | 15.29 | 113000 | 1.0807 | 0.7586 |
1.1651 | 15.36 | 113500 | 1.1386 | 0.7487 |
1.1663 | 15.43 | 114000 | 1.1115 | 0.7531 |
1.1635 | 15.49 | 114500 | 1.1272 | 0.7504 |
1.1646 | 15.56 | 115000 | 1.0982 | 0.7541 |
1.1639 | 15.63 | 115500 | 1.1104 | 0.7545 |
1.1598 | 15.7 | 116000 | 1.1335 | 0.7493 |
1.1612 | 15.76 | 116500 | 1.1088 | 0.7536 |
1.159 | 15.83 | 117000 | 1.0896 | 0.7554 |
1.1686 | 15.9 | 117500 | 1.1212 | 0.7522 |
1.158 | 15.97 | 118000 | 1.1104 | 0.7528 |
1.1633 | 16.04 | 118500 | 1.0980 | 0.7538 |
1.1622 | 16.1 | 119000 | 1.1275 | 0.7509 |
1.1625 | 16.17 | 119500 | 1.1065 | 0.7546 |
1.1582 | 16.24 | 120000 | 1.1181 | 0.7515 |
1.1568 | 16.31 | 120500 | 1.1020 | 0.7558 |
1.1573 | 16.37 | 121000 | 1.1156 | 0.7533 |
1.1549 | 16.44 | 121500 | 1.1206 | 0.7508 |
1.1592 | 16.51 | 122000 | 1.0985 | 0.7543 |
1.1584 | 16.58 | 122500 | 1.1171 | 0.7532 |
1.1589 | 16.64 | 123000 | 1.0686 | 0.7612 |
1.1566 | 16.71 | 123500 | 1.0948 | 0.7564 |
1.157 | 16.78 | 124000 | 1.0896 | 0.7568 |
1.1598 | 16.85 | 124500 | 1.0865 | 0.7582 |
1.1567 | 16.91 | 125000 | 1.1091 | 0.7566 |
1.1643 | 16.98 | 125500 | 1.1232 | 0.7522 |
1.1536 | 17.05 | 126000 | 1.0931 | 0.7583 |
1.1486 | 17.12 | 126500 | 1.1100 | 0.7540 |
1.1551 | 17.19 | 127000 | 1.1019 | 0.7538 |
1.1491 | 17.25 | 127500 | 1.0965 | 0.7546 |
1.152 | 17.32 | 128000 | 1.0725 | 0.7591 |
1.1521 | 17.39 | 128500 | 1.1246 | 0.7527 |
1.1518 | 17.46 | 129000 | 1.1025 | 0.7570 |
1.1525 | 17.52 | 129500 | 1.1028 | 0.7553 |
1.1509 | 17.59 | 130000 | 1.1141 | 0.7540 |
1.1522 | 17.66 | 130500 | 1.1236 | 0.7523 |
1.1488 | 17.73 | 131000 | 1.0938 | 0.7590 |
1.1477 | 17.79 | 131500 | 1.1070 | 0.7520 |
1.1498 | 17.86 | 132000 | 1.0886 | 0.7561 |
1.1489 | 17.93 | 132500 | 1.0874 | 0.7579 |
1.1462 | 18.0 | 133000 | 1.1016 | 0.7557 |
1.1448 | 18.06 | 133500 | 1.0938 | 0.7546 |
1.1425 | 18.13 | 134000 | 1.0959 | 0.7552 |
1.1414 | 18.2 | 134500 | 1.0867 | 0.7559 |
1.1453 | 18.27 | 135000 | 1.0756 | 0.7592 |
1.1448 | 18.34 | 135500 | 1.0937 | 0.7545 |
1.1471 | 18.4 | 136000 | 1.1154 | 0.7538 |
1.1484 | 18.47 | 136500 | 1.1114 | 0.7538 |
1.1463 | 18.54 | 137000 | 1.1002 | 0.7514 |
1.1512 | 18.61 | 137500 | 1.0664 | 0.7587 |
1.1464 | 18.67 | 138000 | 1.0736 | 0.7584 |
1.1457 | 18.74 | 138500 | 1.0802 | 0.7604 |
1.1464 | 18.81 | 139000 | 1.1091 | 0.7542 |
1.1415 | 18.88 | 139500 | 1.0856 | 0.7595 |
1.149 | 18.94 | 140000 | 1.0959 | 0.7557 |
1.1445 | 19.01 | 140500 | 1.0714 | 0.7600 |
1.1378 | 19.08 | 141000 | 1.1179 | 0.7529 |
1.143 | 19.15 | 141500 | 1.0850 | 0.7609 |
1.1412 | 19.22 | 142000 | 1.1089 | 0.7572 |
1.1393 | 19.28 | 142500 | 1.0955 | 0.7580 |
1.1492 | 19.35 | 143000 | 1.0983 | 0.7559 |
1.1455 | 19.42 | 143500 | 1.1248 | 0.7541 |
1.1442 | 19.49 | 144000 | 1.1034 | 0.7567 |
1.1385 | 19.55 | 144500 | 1.0718 | 0.7599 |
1.1393 | 19.62 | 145000 | 1.1188 | 0.7512 |
1.1408 | 19.69 | 145500 | 1.0967 | 0.7571 |
1.1443 | 19.76 | 146000 | 1.1152 | 0.7525 |
1.1495 | 19.82 | 146500 | 1.1064 | 0.7535 |
1.1397 | 19.89 | 147000 | 1.0800 | 0.7603 |
1.1399 | 19.96 | 147500 | 1.0812 | 0.7567 |
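Because validation loss fluctuates from one evaluation to the next, it can help to parse the table and pick the checkpoint with the lowest validation loss instead of simply taking the last row. The sketch below does this for a three-row excerpt copied from the end of the table; the `parse_rows` helper and the field names are illustrative, not part of any library.

```python
# Excerpt: the last three rows of the results table above.
ROWS = """\
1.1495 | 19.82 | 146500 | 1.1064 | 0.7535 |
1.1397 | 19.89 | 147000 | 1.0800 | 0.7603 |
1.1399 | 19.96 | 147500 | 1.0812 | 0.7567 |
"""


def parse_rows(text):
    """Parse pipe-separated table rows into a list of dicts (hypothetical helper)."""
    records = []
    for line in text.strip().splitlines():
        # Drop the empty cell produced by the trailing pipe.
        cells = [c.strip() for c in line.split("|") if c.strip()]
        train_loss, epoch, step, val_loss, accuracy = cells
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(accuracy),
        })
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"], best["accuracy"])
```

Pasting the full table into `ROWS` would apply the same selection to every logged evaluation step.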
Framework versions
- Transformers 4.21.3
- PyTorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.12.1
Dataset used to train liuyanchen1015/wikitext103_roberta-base
Evaluation results
- Accuracy on wikitext (wikitext-103-raw-v1), self-reported: 0.758