# MariaZafar/gpt2-finetuned-wikitext2
This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results at the final epoch:
- Train Loss: 0.7785
- Validation Loss: 3.7004
- Epoch: 49
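For a language model, the reported validation loss can be read as a perplexity, assuming (as is standard for Keras-trained causal LMs) that the logged loss is the mean per-token cross-entropy in nats. A minimal sketch of the conversion:

```python
import math

# Assuming the Keras-reported validation loss is mean per-token
# cross-entropy (in nats), perplexity is its exponential.
val_loss = 3.7004
perplexity = math.exp(val_loss)
print(f"validation perplexity ≈ {perplexity:.2f}")  # ≈ 40.46
```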
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
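The optimizer name and keys above match `AdamWeightDecay`, the Adam variant with decoupled weight decay that ships with transformers' TensorFlow support. A hedged sketch of how the logged configuration might be reconstructed; the guarded import is only so the configuration itself can be inspected when TensorFlow is unavailable:

```python
# Optimizer configuration as logged above (the 'name' and 'decay'
# keys are metadata, not constructor arguments).
optimizer_config = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}

try:
    # Assumed to correspond to the logged optimizer; requires a
    # TensorFlow-enabled transformers install.
    from transformers import AdamWeightDecay
    optimizer = AdamWeightDecay(**optimizer_config)
except ImportError:
    optimizer = None  # transformers/TensorFlow not available
```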
### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 5.8858     | 7.5655          | 0     |
| 4.0619     | 5.8193          | 1     |
| 3.3766     | 4.9585          | 2     |
| 3.0686     | 4.5764          | 3     |
| 2.9022     | 4.3847          | 4     |
| 2.7838     | 4.2249          | 5     |
| 2.6997     | 4.1060          | 6     |
| 2.6154     | 4.0100          | 7     |
| 2.5575     | 3.9412          | 8     |
| 2.4933     | 3.8447          | 9     |
| 2.4397     | 3.7619          | 10    |
| 2.3835     | 3.7510          | 11    |
| 2.3403     | 3.6810          | 12    |
| 2.2924     | 3.6716          | 13    |
| 2.2513     | 3.6335          | 14    |
| 2.2031     | 3.6208          | 15    |
| 2.1619     | 3.5915          | 16    |
| 2.1234     | 3.5497          | 17    |
| 2.0792     | 3.5540          | 18    |
| 2.0398     | 3.5461          | 19    |
| 1.9976     | 3.5282          | 20    |
| 1.9577     | 3.5260          | 21    |
| 1.9176     | 3.5041          | 22    |
| 1.8745     | 3.4994          | 23    |
| 1.8304     | 3.5250          | 24    |
| 1.7881     | 3.4864          | 25    |
| 1.7423     | 3.4718          | 26    |
| 1.6993     | 3.5194          | 27    |
| 1.6503     | 3.5019          | 28    |
| 1.6025     | 3.5055          | 29    |
| 1.5500     | 3.5109          | 30    |
| 1.4964     | 3.5389          | 31    |
| 1.4448     | 3.5393          | 32    |
| 1.3954     | 3.5363          | 33    |
| 1.3464     | 3.5446          | 34    |
| 1.2978     | 3.5117          | 35    |
| 1.2494     | 3.5225          | 36    |
| 1.2004     | 3.5443          | 37    |
| 1.1534     | 3.5909          | 38    |
| 1.1124     | 3.5380          | 39    |
| 1.0709     | 3.6162          | 40    |
| 1.0265     | 3.6758          | 41    |
| 0.9936     | 3.6168          | 42    |
| 0.9590     | 3.6243          | 43    |
| 0.9238     | 3.6308          | 44    |
| 0.8886     | 3.6429          | 45    |
| 0.8635     | 3.7137          | 46    |
| 0.8352     | 3.6512          | 47    |
| 0.8050     | 3.7033          | 48    |
| 0.7785     | 3.7004          | 49    |
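Validation loss bottoms out at 3.4718 around epoch 26 and drifts upward afterward while train loss keeps falling, which suggests the later epochs overfit. A minimal sketch of picking the best checkpoint, using a few `(epoch, validation_loss)` rows from the table around the minimum:

```python
# A few (epoch, validation_loss) rows from the table above,
# taken from around the minimum.
history = [
    (24, 3.5250),
    (25, 3.4864),
    (26, 3.4718),
    (27, 3.5194),
    (28, 3.5019),
]

# Select the epoch with the lowest validation loss.
best_epoch, best_loss = min(history, key=lambda row: row[1])
print(best_epoch, best_loss)  # 26 3.4718
```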
### Framework versions
- Transformers 4.19.2
- TensorFlow 2.8.0
- Datasets 2.2.1
- Tokenizers 0.12.1