# gpt2-finetuned-justification-v1
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.4104
## Model description
More information needed
## Intended uses & limitations
More information needed
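The card does not state intended uses, but as a causal language model fine-tuned from gpt2 it can be queried through the standard `transformers` text-generation pipeline. A minimal sketch; the repo id and the prompt below are placeholders, not from the card:

```python
from transformers import pipeline

# Hypothetical repo id -- substitute the actual hub id or a local checkpoint path.
generator = pipeline(
    "text-generation",
    model="your-username/gpt2-finetuned-justification-v1",
)

# Illustrative prompt; the model's actual input format is not documented.
result = generator(
    "The request was approved because",
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
)
print(result[0]["generated_text"])
```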
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
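For reference, these hyperparameters map onto the Hugging Face `TrainingArguments` API roughly as follows. This is a reconstruction, not the original training script; `output_dir` and the per-epoch evaluation strategy are assumptions (though the results table does report one validation loss per epoch):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-finetuned-justification-v1",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",     # assumed: the table evaluates once per epoch
)
```

Note that the Trainer's default optimizer in Transformers 4.36 is AdamW; the card reports plain Adam, so the exact optimizer class used may differ.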
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.2403 | 1.0 | 676 | 0.1991 |
0.1824 | 2.0 | 1352 | 0.1990 |
0.1366 | 3.0 | 2028 | 0.2091 |
0.1098 | 4.0 | 2704 | 0.2222 |
0.0997 | 5.0 | 3380 | 0.2386 |
0.0724 | 6.0 | 4056 | 0.2535 |
0.0608 | 7.0 | 4732 | 0.2694 |
0.0516 | 8.0 | 5408 | 0.2861 |
0.0409 | 9.0 | 6084 | 0.2941 |
0.0356 | 10.0 | 6760 | 0.3040 |
0.0319 | 11.0 | 7436 | 0.3124 |
0.0265 | 12.0 | 8112 | 0.3184 |
0.0242 | 13.0 | 8788 | 0.3235 |
0.0225 | 14.0 | 9464 | 0.3261 |
0.0197 | 15.0 | 10140 | 0.3330 |
0.0183 | 16.0 | 10816 | 0.3372 |
0.0185 | 17.0 | 11492 | 0.3410 |
0.0157 | 18.0 | 12168 | 0.3394 |
0.0155 | 19.0 | 12844 | 0.3468 |
0.0147 | 20.0 | 13520 | 0.3522 |
0.0135 | 21.0 | 14196 | 0.3532 |
0.0135 | 22.0 | 14872 | 0.3538 |
0.0125 | 23.0 | 15548 | 0.3605 |
0.0123 | 24.0 | 16224 | 0.3594 |
0.012 | 25.0 | 16900 | 0.3635 |
0.0116 | 26.0 | 17576 | 0.3649 |
0.0114 | 27.0 | 18252 | 0.3665 |
0.011 | 28.0 | 18928 | 0.3685 |
0.0108 | 29.0 | 19604 | 0.3689 |
0.0108 | 30.0 | 20280 | 0.3724 |
0.0103 | 31.0 | 20956 | 0.3719 |
0.0102 | 32.0 | 21632 | 0.3717 |
0.01 | 33.0 | 22308 | 0.3764 |
0.0102 | 34.0 | 22984 | 0.3751 |
0.0094 | 35.0 | 23660 | 0.3787 |
0.0099 | 36.0 | 24336 | 0.3789 |
0.0096 | 37.0 | 25012 | 0.3857 |
0.0094 | 38.0 | 25688 | 0.3825 |
0.0093 | 39.0 | 26364 | 0.3831 |
0.0091 | 40.0 | 27040 | 0.3878 |
0.0091 | 41.0 | 27716 | 0.3857 |
0.0089 | 42.0 | 28392 | 0.3863 |
0.0089 | 43.0 | 29068 | 0.3878 |
0.0089 | 44.0 | 29744 | 0.3895 |
0.0087 | 45.0 | 30420 | 0.3885 |
0.0088 | 46.0 | 31096 | 0.3900 |
0.0084 | 47.0 | 31772 | 0.3930 |
0.0087 | 48.0 | 32448 | 0.3916 |
0.0084 | 49.0 | 33124 | 0.3907 |
0.0083 | 50.0 | 33800 | 0.3922 |
0.0083 | 51.0 | 34476 | 0.3937 |
0.0082 | 52.0 | 35152 | 0.3934 |
0.0082 | 53.0 | 35828 | 0.3976 |
0.0081 | 54.0 | 36504 | 0.3959 |
0.008 | 55.0 | 37180 | 0.3996 |
0.0079 | 56.0 | 37856 | 0.3999 |
0.0079 | 57.0 | 38532 | 0.3997 |
0.0079 | 58.0 | 39208 | 0.4024 |
0.0078 | 59.0 | 39884 | 0.4027 |
0.0079 | 60.0 | 40560 | 0.3980 |
0.0077 | 61.0 | 41236 | 0.4019 |
0.0077 | 62.0 | 41912 | 0.4019 |
0.0078 | 63.0 | 42588 | 0.4020 |
0.0076 | 64.0 | 43264 | 0.4062 |
0.0077 | 65.0 | 43940 | 0.4041 |
0.0077 | 66.0 | 44616 | 0.4011 |
0.0076 | 67.0 | 45292 | 0.4029 |
0.0075 | 68.0 | 45968 | 0.4046 |
0.0074 | 69.0 | 46644 | 0.4043 |
0.0075 | 70.0 | 47320 | 0.4066 |
0.0075 | 71.0 | 47996 | 0.4055 |
0.0074 | 72.0 | 48672 | 0.4064 |
0.0075 | 73.0 | 49348 | 0.4089 |
0.0074 | 74.0 | 50024 | 0.4089 |
0.0072 | 75.0 | 50700 | 0.4087 |
0.0073 | 76.0 | 51376 | 0.4066 |
0.0073 | 77.0 | 52052 | 0.4035 |
0.0072 | 78.0 | 52728 | 0.4050 |
0.0072 | 79.0 | 53404 | 0.4059 |
0.0071 | 80.0 | 54080 | 0.4104 |
0.0071 | 81.0 | 54756 | 0.4095 |
0.0072 | 82.0 | 55432 | 0.4081 |
0.0072 | 83.0 | 56108 | 0.4095 |
0.0071 | 84.0 | 56784 | 0.4092 |
0.007 | 85.0 | 57460 | 0.4099 |
0.007 | 86.0 | 58136 | 0.4070 |
0.007 | 87.0 | 58812 | 0.4070 |
0.007 | 88.0 | 59488 | 0.4057 |
0.0069 | 89.0 | 60164 | 0.4090 |
0.0069 | 90.0 | 60840 | 0.4106 |
0.007 | 91.0 | 61516 | 0.4096 |
0.0069 | 92.0 | 62192 | 0.4106 |
0.0069 | 93.0 | 62868 | 0.4101 |
0.0069 | 94.0 | 63544 | 0.4099 |
0.0068 | 95.0 | 64220 | 0.4104 |
0.0068 | 96.0 | 64896 | 0.4106 |
0.0068 | 97.0 | 65572 | 0.4102 |
0.0067 | 98.0 | 66248 | 0.4102 |
0.0067 | 99.0 | 66924 | 0.4104 |
0.0067 | 100.0 | 67600 | 0.4104 |
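Validation loss bottoms out at 0.1990 at epoch 2 and climbs steadily for the remaining 98 epochs while training loss keeps shrinking, so the final checkpoint is heavily overfit relative to the epoch-2 one. If retraining, an early-stopping setup along these lines would halt near the optimum; this is a hypothetical sketch, not part of the original run:

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Illustrative values: keep the best checkpoint by validation loss and stop
# once it fails to improve for 3 consecutive evaluations.
args = TrainingArguments(
    output_dir="gpt2-finetuned-justification-v1",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
early_stop = EarlyStoppingCallback(early_stopping_patience=3)
# pass callbacks=[early_stop] to the Trainer along with these arguments
```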
### Framework versions
- Transformers 4.36.2
- Pytorch 2.2.2+cu121
- Datasets 2.16.0
- Tokenizers 0.15.2