
output

This model is a fine-tuned version of /home/woody/b114cb/b114cb10/zymCTRL/gpt2-large/config.json on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 0.3014
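
If the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning with Transformers but is not stated in this card, it converts to perplexity by exponentiation. A minimal sketch under that assumption:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.3014

# Assumption: the loss is mean token-level cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.3f}")
```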

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-05
  • train_batch_size: 1
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
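
The total_train_batch_size above follows from the per-device batch size and gradient accumulation, and a linear scheduler decays the learning rate toward zero over the run. The sketch below reproduces both relationships in plain Python. It assumes a single training device and zero warmup steps (neither is listed in this card), and it takes the total of 4990 optimizer steps from the last row of the training results table. `linear_lr` is an illustrative helper, not a Transformers function.

```python
# Values copied from the hyperparameter list.
learning_rate = 8e-05
train_batch_size = 1
gradient_accumulation_steps = 4

# Assumptions (not stated in the card): one device, zero warmup steps.
num_devices = 1
warmup_steps = 0

# Final step in the training results table.
total_steps = 4990

# Effective batch size per optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
for step in (0, 2495, 4990):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate is halved at the midpoint of training and reaches zero at the final step.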

Training results

Training Loss Epoch Step Validation Loss
3.2882 0.02 10 2.9581
2.5059 0.04 20 2.3844
2.3368 0.06 30 2.3644
2.3476 0.08 40 2.3494
2.3185 0.1 50 2.3697
2.3468 0.12 60 2.3255
2.262 0.14 70 2.2512
2.1646 0.16 80 2.1945
2.1558 0.18 90 2.1885
2.1934 0.2 100 2.1483
2.0855 0.22 110 2.1152
2.0844 0.24 120 2.0839
2.0647 0.26 130 2.0615
1.9665 0.28 140 2.0330
1.9761 0.3 150 2.0068
1.9428 0.32 160 1.9914
1.9351 0.34 170 1.9369
1.9366 0.36 180 1.9139
1.9548 0.38 190 1.8789
1.9625 0.4 200 1.8486
1.8584 0.42 210 1.8198
1.8857 0.44 220 1.8118
1.7574 0.46 230 1.7603
1.8114 0.48 240 1.7370
1.7303 0.5 250 1.7205
1.7535 0.52 260 1.7124
1.7775 0.54 270 1.7013
1.685 0.56 280 1.6612
1.5898 0.58 290 1.6578
1.7875 0.6 300 1.6458
1.628 0.62 310 1.6253
1.6186 0.64 320 1.6195
1.6899 0.66 330 1.6102
1.5908 0.68 340 1.5907
1.6514 0.7 350 1.6104
1.6027 0.72 360 1.5766
1.6319 0.74 370 1.5623
1.6103 0.76 380 1.5764
1.4518 0.78 390 1.5449
1.498 0.8 400 1.5345
1.5266 0.82 410 1.5413
1.5622 0.84 420 1.5229
1.4863 0.86 430 1.5208
1.5492 0.88 440 1.4996
1.5515 0.9 450 1.4857
1.4799 0.92 460 1.4935
1.4514 0.94 470 1.4745
1.5462 0.96 480 1.4784
1.6032 0.98 490 1.4911
1.7418 1.0 500 1.4733
1.4983 1.02 510 1.4646
1.5383 1.04 520 1.4442
1.3454 1.06 530 1.4332
1.3128 1.08 540 1.4261
1.5472 1.1 550 1.4232
1.252 1.12 560 1.3924
1.3538 1.14 570 1.3975
1.5448 1.16 580 1.3915
1.4016 1.18 590 1.4025
1.3041 1.2 600 1.3837
1.3857 1.22 610 1.3890
1.2923 1.24 620 1.3452
1.28 1.26 630 1.3492
1.4052 1.28 640 1.3254
1.3992 1.3 650 1.3670
1.5044 1.32 660 1.3153
1.2274 1.34 670 1.3142
1.2392 1.36 680 1.3150
1.365 1.38 690 1.2966
1.3024 1.4 700 1.2688
1.347 1.42 710 1.2874
1.3898 1.44 720 1.2543
1.4256 1.46 730 1.2397
1.2566 1.48 740 1.2430
1.2473 1.5 750 1.2135
1.1466 1.52 760 1.2171
1.3065 1.54 770 1.1897
1.3033 1.56 780 1.1646
1.1166 1.58 790 1.1723
1.0874 1.6 800 1.1511
1.017 1.62 810 1.1396
1.0437 1.64 820 1.1016
1.2206 1.66 830 1.0841
0.9738 1.68 840 1.0760
1.1351 1.7 850 1.0562
1.0697 1.72 860 1.0556
1.0296 1.74 870 1.0342
1.0904 1.76 880 1.0047
1.01 1.78 890 1.0184
0.951 1.8 900 0.9845
1.0111 1.82 910 0.9675
1.0824 1.84 920 0.9759
0.9745 1.86 930 0.9336
0.8632 1.88 940 0.9347
0.9959 1.9 950 0.9395
0.8906 1.92 960 0.8965
1.0552 1.94 970 0.8892
0.8387 1.96 980 0.8822
1.0068 1.98 990 0.8805
1.083 2.0 1000 0.8490
0.8407 2.02 1010 0.8457
0.7468 2.04 1020 0.8285
0.8421 2.06 1030 0.8055
0.8407 2.08 1040 0.8160
0.8126 2.1 1050 0.8266
0.7318 2.12 1060 0.8151
0.9142 2.14 1070 0.7876
0.6483 2.16 1080 0.7866
0.8092 2.18 1090 0.7818
0.8235 2.2 1100 0.7708
0.7062 2.22 1110 0.7693
0.7348 2.24 1120 0.7875
0.7507 2.26 1130 0.7567
0.7588 2.28 1140 0.7565
0.605 2.3 1150 0.7298
0.8721 2.32 1160 0.7254
0.6988 2.34 1170 0.7072
0.6294 2.36 1180 0.7082
0.7117 2.38 1190 0.7113
0.8558 2.4 1200 0.6991
0.6187 2.42 1210 0.6905
0.6791 2.44 1220 0.6875
0.5447 2.46 1230 0.6869
0.7299 2.48 1240 0.6777
0.5829 2.5 1250 0.6658
0.6435 2.52 1260 0.6603
0.7303 2.54 1270 0.6578
0.7244 2.56 1280 0.6594
0.6463 2.58 1290 0.6409
0.7766 2.6 1300 0.6417
0.6012 2.62 1310 0.6461
0.5974 2.64 1320 0.6365
0.556 2.66 1330 0.6301
0.6369 2.68 1340 0.6247
0.5699 2.7 1350 0.6163
0.624 2.72 1360 0.6138
0.6774 2.74 1370 0.6135
0.5553 2.76 1380 0.6076
0.604 2.78 1390 0.5938
0.6087 2.8 1400 0.5956
0.5935 2.82 1410 0.5933
0.6042 2.84 1420 0.5911
0.6425 2.86 1430 0.5844
0.6316 2.88 1440 0.5745
0.597 2.9 1450 0.5695
0.5754 2.92 1460 0.5704
0.5197 2.94 1470 0.5697
0.6256 2.96 1480 0.5596
0.5818 2.98 1490 0.5599
0.5464 3.01 1500 0.5565
0.4616 3.03 1510 0.5629
0.6482 3.05 1520 0.5529
0.5356 3.07 1530 0.5526
0.5688 3.09 1540 0.5528
0.6018 3.11 1550 0.5408
0.5794 3.13 1560 0.5371
0.5443 3.15 1570 0.5375
0.4435 3.17 1580 0.5345
0.5087 3.19 1590 0.5293
0.518 3.21 1600 0.5336
0.5914 3.23 1610 0.5316
0.5667 3.25 1620 0.5254
0.5218 3.27 1630 0.5207
0.4267 3.29 1640 0.5270
0.5839 3.31 1650 0.5199
0.5095 3.33 1660 0.5268
0.4616 3.35 1670 0.5192
0.5027 3.37 1680 0.5106
0.441 3.39 1690 0.5150
0.4416 3.41 1700 0.5156
0.4411 3.43 1710 0.5103
0.47 3.45 1720 0.5038
0.5079 3.47 1730 0.5048
0.3913 3.49 1740 0.5082
0.4977 3.51 1750 0.4976
0.5905 3.53 1760 0.4975
0.4362 3.55 1770 0.4962
0.4309 3.57 1780 0.5008
0.4477 3.59 1790 0.4988
0.4826 3.61 1800 0.4886
0.6181 3.63 1810 0.4885
0.4738 3.65 1820 0.4879
0.4932 3.67 1830 0.4818
0.4684 3.69 1840 0.4812
0.5484 3.71 1850 0.4767
0.5086 3.73 1860 0.4791
0.3548 3.75 1870 0.4793
0.5229 3.77 1880 0.4765
0.4578 3.79 1890 0.4704
0.5277 3.81 1900 0.4691
0.4683 3.83 1910 0.4649
0.448 3.85 1920 0.4684
0.3752 3.87 1930 0.4697
0.4631 3.89 1940 0.4678
0.4277 3.91 1950 0.4608
0.3646 3.93 1960 0.4609
0.5276 3.95 1970 0.4543
0.431 3.97 1980 0.4539
0.5465 3.99 1990 0.4550
0.4954 4.01 2000 0.4523
0.4886 4.03 2010 0.4499
0.4898 4.05 2020 0.4462
0.4072 4.07 2030 0.4479
0.4565 4.09 2040 0.4458
0.3739 4.11 2050 0.4475
0.4211 4.13 2060 0.4486
0.4048 4.15 2070 0.4393
0.5064 4.17 2080 0.4351
0.4652 4.19 2090 0.4379
0.4061 4.21 2100 0.4341
0.3784 4.23 2110 0.4390
0.4142 4.25 2120 0.4354
0.3625 4.27 2130 0.4415
0.3807 4.29 2140 0.4403
0.4154 4.31 2150 0.4308
0.4509 4.33 2160 0.4298
0.4254 4.35 2170 0.4239
0.4323 4.37 2180 0.4214
0.4359 4.39 2190 0.4291
0.3759 4.41 2200 0.4224
0.4534 4.43 2210 0.4225
0.4013 4.45 2220 0.4262
0.4331 4.47 2230 0.4214
0.4373 4.49 2240 0.4198
0.4975 4.51 2250 0.4236
0.423 4.53 2260 0.4189
0.4503 4.55 2270 0.4171
0.3796 4.57 2280 0.4172
0.4063 4.59 2290 0.4125
0.3841 4.61 2300 0.4119
0.2956 4.63 2310 0.4147
0.3486 4.65 2320 0.4246
0.3585 4.67 2330 0.4117
0.4496 4.69 2340 0.4091
0.399 4.71 2350 0.4049
0.3885 4.73 2360 0.4004
0.3728 4.75 2370 0.4003
0.2698 4.77 2380 0.4009
0.3799 4.79 2390 0.4003
0.4888 4.81 2400 0.3974
0.3795 4.83 2410 0.3995
0.4249 4.85 2420 0.3968
0.4635 4.87 2430 0.4001
0.4965 4.89 2440 0.3934
0.3745 4.91 2450 0.3987
0.3601 4.93 2460 0.3986
0.2878 4.95 2470 0.3941
0.4297 4.97 2480 0.3890
0.278 4.99 2490 0.3975
0.4509 5.01 2500 0.3907
0.3202 5.03 2510 0.3872
0.3047 5.05 2520 0.3956
0.2931 5.07 2530 0.3925
0.3487 5.09 2540 0.3910
0.2792 5.11 2550 0.3901
0.3446 5.13 2560 0.3873
0.3482 5.15 2570 0.3840
0.3464 5.17 2580 0.3835
0.3212 5.19 2590 0.3846
0.3847 5.21 2600 0.3819
0.3212 5.23 2610 0.3897
0.358 5.25 2620 0.3811
0.3471 5.27 2630 0.3805
0.3348 5.29 2640 0.3868
0.342 5.31 2650 0.3769
0.4504 5.33 2660 0.3774
0.2713 5.35 2670 0.3803
0.3848 5.37 2680 0.3776
0.354 5.39 2690 0.3758
0.3796 5.41 2700 0.3760
0.3654 5.43 2710 0.3737
0.3448 5.45 2720 0.3812
0.355 5.47 2730 0.3759
0.288 5.49 2740 0.3711
0.2991 5.51 2750 0.3691
0.3443 5.53 2760 0.3708
0.3374 5.55 2770 0.3659
0.4078 5.57 2780 0.3709
0.2967 5.59 2790 0.3683
0.3532 5.61 2800 0.3638
0.4123 5.63 2810 0.3642
0.3195 5.65 2820 0.3655
0.3161 5.67 2830 0.3599
0.4152 5.69 2840 0.3621
0.2802 5.71 2850 0.3648
0.2909 5.73 2860 0.3604
0.3105 5.75 2870 0.3604
0.3291 5.77 2880 0.3553
0.3916 5.79 2890 0.3603
0.3657 5.81 2900 0.3544
0.3745 5.83 2910 0.3559
0.3281 5.85 2920 0.3517
0.2892 5.87 2930 0.3551
0.4121 5.89 2940 0.3489
0.2908 5.91 2950 0.3532
0.3677 5.93 2960 0.3469
0.341 5.95 2970 0.3503
0.2319 5.97 2980 0.3497
0.2624 5.99 2990 0.3468
0.3324 6.01 3000 0.3480
0.2114 6.03 3010 0.3530
0.256 6.05 3020 0.3501
0.2716 6.07 3030 0.3490
0.2921 6.09 3040 0.3466
0.2924 6.11 3050 0.3531
0.3267 6.13 3060 0.3455
0.3488 6.15 3070 0.3428
0.301 6.17 3080 0.3455
0.2656 6.19 3090 0.3450
0.2377 6.21 3100 0.3474
0.2344 6.23 3110 0.3461
0.2816 6.25 3120 0.3489
0.2675 6.27 3130 0.3427
0.3315 6.29 3140 0.3393
0.335 6.31 3150 0.3406
0.2418 6.33 3160 0.3385
0.215 6.35 3170 0.3393
0.2279 6.37 3180 0.3427
0.2907 6.39 3190 0.3379
0.2184 6.41 3200 0.3438
0.3484 6.43 3210 0.3364
0.2327 6.45 3220 0.3406
0.2571 6.47 3230 0.3400
0.2864 6.49 3240 0.3367
0.2383 6.51 3250 0.3377
0.187 6.53 3260 0.3346
0.2453 6.55 3270 0.3349
0.296 6.57 3280 0.3339
0.2601 6.59 3290 0.3335
0.2927 6.61 3300 0.3340
0.2796 6.63 3310 0.3303
0.2393 6.65 3320 0.3351
0.2764 6.67 3330 0.3288
0.2547 6.69 3340 0.3327
0.3247 6.71 3350 0.3279
0.3217 6.73 3360 0.3283
0.2881 6.75 3370 0.3307
0.2897 6.77 3380 0.3281
0.3096 6.79 3390 0.3257
0.2463 6.81 3400 0.3244
0.2404 6.83 3410 0.3254
0.2907 6.85 3420 0.3227
0.2749 6.87 3430 0.3226
0.2262 6.89 3440 0.3226
0.2799 6.91 3450 0.3233
0.2764 6.93 3460 0.3198
0.2644 6.95 3470 0.3231
0.2733 6.97 3480 0.3188
0.2861 6.99 3490 0.3192
0.1757 7.01 3500 0.3243
0.2588 7.03 3510 0.3238
0.2132 7.05 3520 0.3207
0.2787 7.07 3530 0.3272
0.2786 7.09 3540 0.3229
0.2854 7.11 3550 0.3232
0.1982 7.13 3560 0.3237
0.2022 7.15 3570 0.3254
0.2592 7.17 3580 0.3258
0.2299 7.19 3590 0.3207
0.2054 7.21 3600 0.3197
0.208 7.23 3610 0.3216
0.2432 7.25 3620 0.3228
0.2452 7.27 3630 0.3181
0.264 7.29 3640 0.3238
0.2019 7.31 3650 0.3178
0.2299 7.33 3660 0.3218
0.2465 7.35 3670 0.3172
0.2466 7.37 3680 0.3167
0.2824 7.39 3690 0.3143
0.2314 7.41 3700 0.3143
0.2822 7.43 3710 0.3143
0.2254 7.45 3720 0.3139
0.2454 7.47 3730 0.3218
0.2656 7.49 3740 0.3116
0.2172 7.51 3750 0.3154
0.2408 7.53 3760 0.3127
0.1761 7.55 3770 0.3149
0.2232 7.57 3780 0.3114
0.2902 7.59 3790 0.3136
0.2485 7.61 3800 0.3146
0.1901 7.63 3810 0.3094
0.2962 7.65 3820 0.3120
0.2093 7.67 3830 0.3133
0.368 7.69 3840 0.3064
0.2849 7.71 3850 0.3091
0.1948 7.73 3860 0.3075
0.2241 7.75 3870 0.3078
0.1935 7.77 3880 0.3045
0.2045 7.79 3890 0.3065
0.159 7.81 3900 0.3082
0.1714 7.83 3910 0.3057
0.1984 7.85 3920 0.3059
0.2397 7.87 3930 0.3037
0.1884 7.89 3940 0.3054
0.2585 7.91 3950 0.3030
0.2476 7.93 3960 0.3058
0.2525 7.95 3970 0.3033
0.2001 7.97 3980 0.3062
0.1985 7.99 3990 0.3039
0.1984 8.02 4000 0.3139
0.2008 8.04 4010 0.3099
0.2159 8.06 4020 0.3085
0.2305 8.08 4030 0.3108
0.2007 8.1 4040 0.3050
0.2124 8.12 4050 0.3115
0.1435 8.14 4060 0.3084
0.1968 8.16 4070 0.3087
0.2507 8.18 4080 0.3084
0.1703 8.2 4090 0.3061
0.2511 8.22 4100 0.3106
0.1698 8.24 4110 0.3134
0.2518 8.26 4120 0.3101
0.1489 8.28 4130 0.3090
0.1759 8.3 4140 0.3098
0.1939 8.32 4150 0.3056
0.2168 8.34 4160 0.3106
0.2119 8.36 4170 0.3051
0.1793 8.38 4180 0.3056
0.2434 8.4 4190 0.3050
0.2601 8.42 4200 0.3065
0.1791 8.44 4210 0.3051
0.1404 8.46 4220 0.3058
0.222 8.48 4230 0.3059
0.1809 8.5 4240 0.3070
0.1745 8.52 4250 0.3066
0.2236 8.54 4260 0.3012
0.1965 8.56 4270 0.3037
0.1836 8.58 4280 0.3051
0.1912 8.6 4290 0.3017
0.2207 8.62 4300 0.3025
0.2481 8.64 4310 0.2997
0.1506 8.66 4320 0.3003
0.2216 8.68 4330 0.3035
0.1866 8.7 4340 0.3014
0.2025 8.72 4350 0.3035
0.1521 8.74 4360 0.2992
0.1598 8.76 4370 0.3034
0.185 8.78 4380 0.3017
0.2427 8.8 4390 0.2972
0.2343 8.82 4400 0.2979
0.1994 8.84 4410 0.2994
0.2671 8.86 4420 0.2986
0.1158 8.88 4430 0.2991
0.2127 8.9 4440 0.3000
0.1691 8.92 4450 0.2981
0.2103 8.94 4460 0.2979
0.1392 8.96 4470 0.2982
0.1712 8.98 4480 0.2943
0.2435 9.0 4490 0.2958
0.1715 9.02 4500 0.3055
0.1641 9.04 4510 0.3048
0.1529 9.06 4520 0.3029
0.1566 9.08 4530 0.3047
0.1382 9.1 4540 0.3027
0.1605 9.12 4550 0.3023
0.2167 9.14 4560 0.3055
0.1506 9.16 4570 0.3037
0.192 9.18 4580 0.3039
0.139 9.2 4590 0.3030
0.1974 9.22 4600 0.3038
0.167 9.24 4610 0.3037
0.2409 9.26 4620 0.3034
0.1494 9.28 4630 0.3048
0.1762 9.3 4640 0.3037
0.183 9.32 4650 0.3042
0.1773 9.34 4660 0.3043
0.1509 9.36 4670 0.3053
0.1994 9.38 4680 0.3045
0.1928 9.4 4690 0.3036
0.1158 9.42 4700 0.3038
0.1503 9.44 4710 0.3019
0.1556 9.46 4720 0.3029
0.1327 9.48 4730 0.3050
0.1772 9.5 4740 0.3057
0.1555 9.52 4750 0.3028
0.1363 9.54 4760 0.3014
0.139 9.56 4770 0.3010
0.1639 9.58 4780 0.3013
0.1669 9.6 4790 0.3015
0.144 9.62 4800 0.3023
0.1925 9.64 4810 0.3034
0.1615 9.66 4820 0.3025
0.1625 9.68 4830 0.3019
0.1355 9.7 4840 0.3023
0.1671 9.72 4850 0.3019
0.1447 9.74 4860 0.3021
0.1465 9.76 4870 0.3024
0.1794 9.78 4880 0.3021
0.156 9.8 4890 0.3011
0.1018 9.82 4900 0.3005
0.1403 9.84 4910 0.3011
0.1126 9.86 4920 0.3006
0.1595 9.88 4930 0.3007
0.1415 9.9 4940 0.3012
0.1651 9.92 4950 0.3015
0.1558 9.94 4960 0.3015
0.1734 9.96 4970 0.3014
0.1909 9.98 4980 0.3014
0.1246 10.0 4990 0.3014
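
The lowest validation loss in the table is 0.2943 at step 4480 (epoch 8.98), slightly below the final value of 0.3014. The sketch below shows one way rows in this layout could be parsed to locate that step. The `rows` string holds only a few lines copied from the table, and `LogRow` and `parse_rows` are illustrative names.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_rows(text: str) -> list[LogRow]:
    """Parse whitespace-separated rows: train loss, epoch, step, validation loss."""
    parsed = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        parsed.append(LogRow(float(train_loss), float(epoch), int(step), float(val_loss)))
    return parsed


# A handful of rows copied from the table above (not the full log).
rows = """
3.2882 0.02 10 2.9581
1.7418 1.0 500 1.4733
0.4509 5.01 2500 0.3907
0.1712 8.98 4480 0.2943
0.2435 9.0 4490 0.2958
0.1246 10.0 4990 0.3014
"""

log = parse_rows(rows)
best = min(log, key=lambda r: r.val_loss)  # row with the lowest validation loss
final = log[-1]                            # last logged row
print(f"best step {best.step}: {best.val_loss:.4f}")
print(f"final step {final.step}: {final.val_loss:.4f}")
```

Pasting the full table into `rows` gives the same best step, since no later row falls below 0.2943.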

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.12.1+cu116
  • Datasets 2.10.0
  • Tokenizers 0.12.1
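
To compare a local environment against the versions listed above, something like the following could be used. It is a sketch that assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and it compares version strings exactly, so a PyTorch build with a different local suffix than `+cu116` would be reported as a mismatch. `installed_version` and `find_mismatches` are illustrative helpers.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
expected = {
    "transformers": "4.26.1",
    "torch": "1.12.1+cu116",
    "datasets": "2.10.0",
    "tokenizers": "0.12.1",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected_versions, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected_versions.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in find_mismatches(expected).items():
    print(f"{package}: card lists {wanted}, environment has {found}")
```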