
wikitext103_roberta-base_v2

This model is a fine-tuned version of roberta-base on the wikitext-103-raw-v1 configuration of the wikitext dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0966
  • Accuracy: 0.7583
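
For readers who prefer perplexity to raw loss, the conversion is a single exponential. The sketch below assumes the reported loss is a mean cross-entropy in nats (natural logarithm), which is the usual convention for language-model fine-tuning but is not stated in this card; the function name is illustrative only.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


eval_loss = 1.0966  # evaluation loss reported above
perplexity = loss_to_perplexity(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")  # prints: perplexity ~ 2.99
```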

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
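
The batch-size and scheduler entries above fit together as follows. This is a minimal sketch, not the training script itself: it assumes no warmup (none is listed), and `TOTAL_STEPS` is an estimate inferred from the step and epoch columns of the training results (roughly 7,390 steps per epoch over 20 epochs), not a value stated in this card.

```python
PER_DEVICE_TRAIN_BATCH_SIZE = 8  # train_batch_size from the list above
NUM_DEVICES = 4                  # num_devices from the list above
BASE_LR = 5e-05                  # learning_rate from the list above
TOTAL_STEPS = 147_800            # assumption: estimated from the results log, not reported


def total_batch_size(per_device: int, num_devices: int) -> int:
    """Effective batch size per optimizer step across all devices."""
    return per_device * num_devices


def linear_lr(step: int, total_steps: int = TOTAL_STEPS, base_lr: float = BASE_LR) -> float:
    """Linear decay from base_lr at step 0 to zero at total_steps (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


print(total_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES))  # 32
print(linear_lr(TOTAL_STEPS // 2))  # about half of the base learning rate
```

Under these assumptions the learning rate is about 2.5e-05 halfway through training and reaches zero at the final step.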

Training results

Columns (left to right): Training Loss, Epoch, Step, Validation Loss, Accuracy
1.4212 0.07 500 1.3008 0.7236
1.3933 0.14 1000 1.2827 0.7227
1.3917 0.2 1500 1.2816 0.7266
1.3824 0.27 2000 1.2947 0.7251
1.3835 0.34 2500 1.2555 0.7289
1.3758 0.41 3000 1.2612 0.7279
1.3745 0.47 3500 1.2791 0.7245
1.3761 0.54 4000 1.2622 0.7286
1.3735 0.61 4500 1.2318 0.7360
1.3717 0.68 5000 1.2777 0.7260
1.3675 0.74 5500 1.2590 0.7309
1.3585 0.81 6000 1.2839 0.7254
1.3579 0.88 6500 1.2341 0.7347
1.3588 0.95 7000 1.2413 0.7327
1.351 1.01 7500 1.2459 0.7317
1.3394 1.08 8000 1.2422 0.7314
1.3429 1.15 8500 1.2285 0.7349
1.3393 1.22 9000 1.2405 0.7324
1.3421 1.29 9500 1.2255 0.7354
1.3426 1.35 10000 1.2296 0.7334
1.3326 1.42 10500 1.2158 0.7351
1.3355 1.49 11000 1.2256 0.7364
1.3324 1.56 11500 1.2208 0.7356
1.3331 1.62 12000 1.2230 0.7347
1.3326 1.69 12500 1.2505 0.7316
1.3339 1.76 13000 1.2471 0.7322
1.3286 1.83 13500 1.2185 0.7359
1.3314 1.89 14000 1.2333 0.7363
1.325 1.96 14500 1.2384 0.7320
1.3251 2.03 15000 1.2142 0.7333
1.3136 2.1 15500 1.2162 0.7346
1.3202 2.17 16000 1.2207 0.7369
1.3168 2.23 16500 1.1931 0.7391
1.3134 2.3 17000 1.1857 0.7398
1.3085 2.37 17500 1.2112 0.7383
1.3165 2.44 18000 1.2284 0.7365
1.3144 2.5 18500 1.2013 0.7388
1.319 2.57 19000 1.2173 0.7356
1.3147 2.64 19500 1.1786 0.7404
1.311 2.71 20000 1.2009 0.7373
1.3131 2.77 20500 1.1992 0.7366
1.3036 2.84 21000 1.2167 0.7370
1.3122 2.91 21500 1.2139 0.7379
1.3091 2.98 22000 1.2197 0.7365
1.304 3.04 22500 1.1864 0.7372
1.3015 3.11 23000 1.2046 0.7355
1.2916 3.18 23500 1.2312 0.7345
1.2966 3.25 24000 1.2116 0.7373
1.2991 3.32 24500 1.2263 0.7378
1.3003 3.38 25000 1.1844 0.7413
1.2942 3.45 25500 1.1959 0.7369
1.2988 3.52 26000 1.2018 0.7381
1.2936 3.59 26500 1.1993 0.7388
1.2937 3.65 27000 1.2155 0.7358
1.3021 3.72 27500 1.1794 0.7396
1.2937 3.79 28000 1.1983 0.7401
1.291 3.86 28500 1.1695 0.7448
1.2932 3.92 29000 1.1981 0.7410
1.2938 3.99 29500 1.2000 0.7383
1.2789 4.06 30000 1.1918 0.7402
1.2806 4.13 30500 1.2065 0.7368
1.2799 4.19 31000 1.2036 0.7374
1.2851 4.26 31500 1.2056 0.7375
1.2789 4.33 32000 1.1857 0.7415
1.2847 4.4 32500 1.1947 0.7376
1.2843 4.47 33000 1.1869 0.7399
1.2822 4.53 33500 1.1963 0.7386
1.2755 4.6 34000 1.1897 0.7424
1.283 4.67 34500 1.1673 0.7438
1.2765 4.74 35000 1.1855 0.7419
1.2762 4.8 35500 1.1773 0.7412
1.2776 4.87 36000 1.1898 0.7408
1.2847 4.94 36500 1.1625 0.7438
1.2732 5.01 37000 1.1947 0.7397
1.2667 5.07 37500 1.2097 0.7385
1.2678 5.14 38000 1.1873 0.7398
1.2681 5.21 38500 1.1682 0.7468
1.2699 5.28 39000 1.1740 0.7457
1.2675 5.35 39500 1.2123 0.7379
1.2604 5.41 40000 1.1953 0.7396
1.2688 5.48 40500 1.1849 0.7398
1.2698 5.55 41000 1.1709 0.7414
1.2689 5.62 41500 1.1764 0.7438
1.269 5.68 42000 1.1824 0.7409
1.2715 5.75 42500 1.1785 0.7409
1.2628 5.82 43000 1.1739 0.7434
1.2617 5.89 43500 1.1815 0.7406
1.2565 5.95 44000 1.1885 0.7415
1.2639 6.02 44500 1.1782 0.7420
1.2557 6.09 45000 1.2061 0.7382
1.2503 6.16 45500 1.1741 0.7397
1.2514 6.22 46000 1.1673 0.7436
1.254 6.29 46500 1.1829 0.7400
1.2583 6.36 47000 1.1777 0.7391
1.2518 6.43 47500 1.1893 0.7412
1.2519 6.5 48000 1.1775 0.7411
1.2477 6.56 48500 1.1809 0.7452
1.2546 6.63 49000 1.1652 0.7455
1.2564 6.7 49500 1.1730 0.7435
1.254 6.77 50000 1.1741 0.7427
1.2495 6.83 50500 1.1540 0.7476
1.2502 6.9 51000 1.1454 0.7488
1.2527 6.97 51500 1.1705 0.7429
1.2418 7.04 52000 1.1714 0.7441
1.2386 7.1 52500 1.1619 0.7455
1.2407 7.17 53000 1.1703 0.7428
1.2429 7.24 53500 1.1597 0.7437
1.2398 7.31 54000 1.1802 0.7411
1.2507 7.37 54500 1.1539 0.7465
1.2369 7.44 55000 1.1711 0.7421
1.2463 7.51 55500 1.1849 0.7409
1.2389 7.58 56000 1.1720 0.7447
1.2395 7.65 56500 1.1614 0.7456
1.2429 7.71 57000 1.1604 0.7460
1.2384 7.78 57500 1.1852 0.7408
1.2419 7.85 58000 1.1593 0.7461
1.2381 7.92 58500 1.1618 0.7454
1.2384 7.98 59000 1.1551 0.7446
1.2314 8.05 59500 1.1474 0.7451
1.2277 8.12 60000 1.1636 0.7435
1.23 8.19 60500 1.1545 0.7482
1.2292 8.25 61000 1.1694 0.7457
1.2337 8.32 61500 1.1682 0.7437
1.2274 8.39 62000 1.1519 0.7484
1.232 8.46 62500 1.1693 0.7435
1.2315 8.53 63000 1.1638 0.7434
1.2293 8.59 63500 1.1640 0.7461
1.2287 8.66 64000 1.1464 0.7519
1.2283 8.73 64500 1.1439 0.7481
1.2279 8.8 65000 1.1496 0.7477
1.2276 8.86 65500 1.1545 0.7449
1.2301 8.93 66000 1.1312 0.7487
1.2248 9.0 66500 1.1444 0.7465
1.2266 9.07 67000 1.1525 0.7430
1.2198 9.13 67500 1.1551 0.7462
1.219 9.2 68000 1.1434 0.7479
1.2212 9.27 68500 1.1707 0.7416
1.2265 9.34 69000 1.1744 0.7422
1.2216 9.4 69500 1.1818 0.7393
1.2226 9.47 70000 1.1662 0.7454
1.2224 9.54 70500 1.1346 0.7460
1.2186 9.61 71000 1.1534 0.7463
1.2179 9.68 71500 1.1399 0.7478
1.2177 9.74 72000 1.1545 0.7442
1.2154 9.81 72500 1.1711 0.7427
1.2179 9.88 73000 1.1349 0.7514
1.2184 9.95 73500 1.1427 0.7495
1.2193 10.01 74000 1.1223 0.7495
1.2063 10.08 74500 1.1357 0.7488
1.2025 10.15 75000 1.1476 0.7486
1.2097 10.22 75500 1.1382 0.7493
1.2106 10.28 76000 1.1414 0.7500
1.2146 10.35 76500 1.1138 0.7533
1.2129 10.42 77000 1.1447 0.7478
1.2078 10.49 77500 1.1557 0.7509
1.204 10.55 78000 1.1243 0.7538
1.2101 10.62 78500 1.1352 0.7507
1.207 10.69 79000 1.1366 0.7526
1.2067 10.76 79500 1.1450 0.7482
1.1997 10.83 80000 1.1334 0.7504
1.2114 10.89 80500 1.1348 0.7524
1.2087 10.96 81000 1.1221 0.7508
1.2065 11.03 81500 1.1306 0.7486
1.1985 11.1 82000 1.1648 0.7471
1.205 11.16 82500 1.1088 0.7527
1.2026 11.23 83000 1.1253 0.7513
1.2 11.3 83500 1.1330 0.7474
1.1997 11.37 84000 1.1424 0.7494
1.1989 11.43 84500 1.1289 0.7478
1.1956 11.5 85000 1.1163 0.7525
1.1997 11.57 85500 1.1354 0.7502
1.2011 11.64 86000 1.1371 0.7488
1.1998 11.71 86500 1.1276 0.7525
1.1957 11.77 87000 1.1078 0.7558
1.2027 11.84 87500 1.1626 0.7454
1.2013 11.91 88000 1.1228 0.7527
1.1944 11.98 88500 1.1413 0.7478
1.1946 12.04 89000 1.1250 0.7514
1.196 12.11 89500 1.1448 0.7468
1.1893 12.18 90000 1.1357 0.7478
1.1865 12.25 90500 1.1209 0.7525
1.1921 12.31 91000 1.1200 0.7517
1.1928 12.38 91500 1.1145 0.7512
1.1904 12.45 92000 1.1108 0.7546
1.1955 12.52 92500 1.1062 0.7541
1.1898 12.58 93000 1.1264 0.7520
1.1917 12.65 93500 1.1129 0.7536
1.1895 12.72 94000 1.1288 0.7494
1.1966 12.79 94500 1.1436 0.7474
1.1887 12.86 95000 1.1220 0.7530
1.1856 12.92 95500 1.1442 0.7500
1.1934 12.99 96000 1.1348 0.7487
1.1848 13.06 96500 1.1172 0.7521
1.1821 13.13 97000 1.1042 0.7566
1.1817 13.19 97500 1.1273 0.7495
1.1773 13.26 98000 1.0958 0.7540
1.1774 13.33 98500 1.1140 0.7511
1.1841 13.4 99000 1.1086 0.7535
1.1825 13.46 99500 1.0903 0.7576
1.1845 13.53 100000 1.1291 0.7486
1.1853 13.6 100500 1.1318 0.7486
1.1761 13.67 101000 1.1218 0.7553
1.1825 13.73 101500 1.1307 0.7485
1.1849 13.8 102000 1.1273 0.7504
1.1792 13.87 102500 1.1291 0.7497
1.1852 13.94 103000 1.1134 0.7521
1.1745 14.01 103500 1.1252 0.7511
1.1746 14.07 104000 1.1148 0.7509
1.1765 14.14 104500 1.1202 0.7499
1.1762 14.21 105000 1.1134 0.7527
1.1752 14.28 105500 1.1171 0.7551
1.176 14.34 106000 1.1155 0.7527
1.1732 14.41 106500 1.1333 0.7481
1.1753 14.48 107000 1.0982 0.7574
1.1713 14.55 107500 1.1343 0.7491
1.1692 14.61 108000 1.1021 0.7549
1.17 14.68 108500 1.1107 0.7504
1.1699 14.75 109000 1.1227 0.7505
1.1763 14.82 109500 1.1152 0.7524
1.1729 14.88 110000 1.0939 0.7563
1.1731 14.95 110500 1.1531 0.7446
1.1744 15.02 111000 1.1451 0.7489
1.169 15.09 111500 1.1211 0.7527
1.1644 15.16 112000 1.1135 0.7553
1.1726 15.22 112500 1.0904 0.7551
1.1653 15.29 113000 1.0807 0.7586
1.1651 15.36 113500 1.1386 0.7487
1.1663 15.43 114000 1.1115 0.7531
1.1635 15.49 114500 1.1272 0.7504
1.1646 15.56 115000 1.0982 0.7541
1.1639 15.63 115500 1.1104 0.7545
1.1598 15.7 116000 1.1335 0.7493
1.1612 15.76 116500 1.1088 0.7536
1.159 15.83 117000 1.0896 0.7554
1.1686 15.9 117500 1.1212 0.7522
1.158 15.97 118000 1.1104 0.7528
1.1633 16.04 118500 1.0980 0.7538
1.1622 16.1 119000 1.1275 0.7509
1.1625 16.17 119500 1.1065 0.7546
1.1582 16.24 120000 1.1181 0.7515
1.1568 16.31 120500 1.1020 0.7558
1.1573 16.37 121000 1.1156 0.7533
1.1549 16.44 121500 1.1206 0.7508
1.1592 16.51 122000 1.0985 0.7543
1.1584 16.58 122500 1.1171 0.7532
1.1589 16.64 123000 1.0686 0.7612
1.1566 16.71 123500 1.0948 0.7564
1.157 16.78 124000 1.0896 0.7568
1.1598 16.85 124500 1.0865 0.7582
1.1567 16.91 125000 1.1091 0.7566
1.1643 16.98 125500 1.1232 0.7522
1.1536 17.05 126000 1.0931 0.7583
1.1486 17.12 126500 1.1100 0.7540
1.1551 17.19 127000 1.1019 0.7538
1.1491 17.25 127500 1.0965 0.7546
1.152 17.32 128000 1.0725 0.7591
1.1521 17.39 128500 1.1246 0.7527
1.1518 17.46 129000 1.1025 0.7570
1.1525 17.52 129500 1.1028 0.7553
1.1509 17.59 130000 1.1141 0.7540
1.1522 17.66 130500 1.1236 0.7523
1.1488 17.73 131000 1.0938 0.7590
1.1477 17.79 131500 1.1070 0.7520
1.1498 17.86 132000 1.0886 0.7561
1.1489 17.93 132500 1.0874 0.7579
1.1462 18.0 133000 1.1016 0.7557
1.1448 18.06 133500 1.0938 0.7546
1.1425 18.13 134000 1.0959 0.7552
1.1414 18.2 134500 1.0867 0.7559
1.1453 18.27 135000 1.0756 0.7592
1.1448 18.34 135500 1.0937 0.7545
1.1471 18.4 136000 1.1154 0.7538
1.1484 18.47 136500 1.1114 0.7538
1.1463 18.54 137000 1.1002 0.7514
1.1512 18.61 137500 1.0664 0.7587
1.1464 18.67 138000 1.0736 0.7584
1.1457 18.74 138500 1.0802 0.7604
1.1464 18.81 139000 1.1091 0.7542
1.1415 18.88 139500 1.0856 0.7595
1.149 18.94 140000 1.0959 0.7557
1.1445 19.01 140500 1.0714 0.7600
1.1378 19.08 141000 1.1179 0.7529
1.143 19.15 141500 1.0850 0.7609
1.1412 19.22 142000 1.1089 0.7572
1.1393 19.28 142500 1.0955 0.7580
1.1492 19.35 143000 1.0983 0.7559
1.1455 19.42 143500 1.1248 0.7541
1.1442 19.49 144000 1.1034 0.7567
1.1385 19.55 144500 1.0718 0.7599
1.1393 19.62 145000 1.1188 0.7512
1.1408 19.69 145500 1.0967 0.7571
1.1443 19.76 146000 1.1152 0.7525
1.1495 19.82 146500 1.1064 0.7535
1.1397 19.89 147000 1.0800 0.7603
1.1399 19.96 147500 1.0812 0.7567
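
Because the log above is plain whitespace-separated text, it is easy to scan programmatically. The sketch below parses rows in the column order of the log (training loss, epoch, step, validation loss, accuracy) and picks the row with the lowest validation loss. Only four rows copied from the log are embedded to keep the example short; `EvalRow`, `parse_rows`, and `best_by_val_loss` are illustrative names, not part of any library.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Four rows copied verbatim from the log above (a sample, not the full table).
SAMPLE = """\
1.4212 0.07 500 1.3008 0.7236
1.1589 16.64 123000 1.0686 0.7612
1.1512 18.61 137500 1.0664 0.7587
1.1399 19.96 147500 1.0812 0.7567
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated log rows, skipping lines without five fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        rows.append(
            EvalRow(
                train_loss=float(parts[0]),
                epoch=float(parts[1]),
                step=int(parts[2]),
                val_loss=float(parts[3]),
                accuracy=float(parts[4]),
            )
        )
    return rows


def best_by_val_loss(rows: List[EvalRow]) -> EvalRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


rows = parse_rows(SAMPLE)
best = best_by_val_loss(rows)
print(best.step, best.val_loss)  # 137500 1.0664
```

Pasting the full log into `SAMPLE` works the same way; swapping the key to `r.accuracy` with `max` would select the most accurate evaluation instead.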

Framework versions

  • Transformers 4.21.3
  • PyTorch 1.13.0+cu117
  • Datasets 2.7.1
  • Tokenizers 0.12.1
