
distilgpt_oscarth_0080

This model is a fine-tuned version of distilgpt2 on an unknown dataset. After the final training epoch it reaches the following results:

  • Train Loss: 2.8143
  • Validation Loss: 2.7051
  • Epoch: 79
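For a causal language model, cross-entropy loss converts to perplexity via `exp(loss)`, which is often easier to interpret. A minimal sketch using the final validation loss reported above:

```python
import math

# Perplexity = exp(cross-entropy loss) for a causal language model.
val_loss = 2.7051            # final validation loss from the card
perplexity = math.exp(val_loss)
print(round(perplexity, 2))  # → 14.96
```

So the model assigns the held-out text an average per-token perplexity of roughly 15.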

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
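The `AdamWeightDecay` optimizer above is the decoupled-weight-decay variant of Adam (AdamW). A minimal pure-Python sketch of a single parameter update with the hyperparameters listed; the function and variable names here are illustrative, not the library's API:

```python
import math

# Hyperparameters from the training configuration above.
LR, BETA_1, BETA_2, EPS, WD = 2e-05, 0.9, 0.999, 1e-07, 0.01

def adamw_step(param, grad, m, v, t):
    """Return updated (param, m, v) after one AdamW step at 1-based timestep t."""
    m = BETA_1 * m + (1 - BETA_1) * grad       # first-moment running average
    v = BETA_2 * v + (1 - BETA_2) * grad ** 2  # second-moment running average
    m_hat = m / (1 - BETA_1 ** t)              # bias correction
    v_hat = v / (1 - BETA_2 ** t)
    # Decoupled weight decay: applied directly to the weights rather than
    # folded into the gradient as in plain L2 regularization.
    param -= LR * (m_hat / (math.sqrt(v_hat) + EPS) + WD * param)
    return param, m, v

# Toy usage: minimize f(x) = x**2 (gradient 2x) starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 1001):
    x, m, v = adamw_step(x, 2 * x, m, v, t)
# x has moved from 1.0 toward the minimum at 0.
```

The key design point of AdamW is the final `WD * param` term: the decay shrinks the weights independently of the adaptive gradient scaling, which is what distinguishes `weight_decay_rate` here from classic L2 regularization.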

Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 5.6021 | 4.5759 | 0 |
| 4.4536 | 4.1235 | 1 |
| 4.1386 | 3.9013 | 2 |
| 3.9546 | 3.7563 | 3 |
| 3.8255 | 3.6477 | 4 |
| 3.7271 | 3.5617 | 5 |
| 3.6488 | 3.4936 | 6 |
| 3.5844 | 3.4379 | 7 |
| 3.5301 | 3.3891 | 8 |
| 3.4833 | 3.3448 | 9 |
| 3.4427 | 3.3098 | 10 |
| 3.4068 | 3.2750 | 11 |
| 3.3749 | 3.2425 | 12 |
| 3.3462 | 3.2211 | 13 |
| 3.3202 | 3.1941 | 14 |
| 3.2964 | 3.1720 | 15 |
| 3.2749 | 3.1512 | 16 |
| 3.2548 | 3.1322 | 17 |
| 3.2363 | 3.1141 | 18 |
| 3.2188 | 3.0982 | 19 |
| 3.2025 | 3.0818 | 20 |
| 3.1871 | 3.0678 | 21 |
| 3.1724 | 3.0533 | 22 |
| 3.1583 | 3.0376 | 23 |
| 3.1446 | 3.0256 | 24 |
| 3.1318 | 3.0122 | 25 |
| 3.1195 | 3.0016 | 26 |
| 3.1079 | 2.9901 | 27 |
| 3.0968 | 2.9826 | 28 |
| 3.0863 | 2.9711 | 29 |
| 3.0761 | 2.9593 | 30 |
| 3.0665 | 2.9514 | 31 |
| 3.0572 | 2.9432 | 32 |
| 3.0483 | 2.9347 | 33 |
| 3.0396 | 2.9250 | 34 |
| 3.0313 | 2.9160 | 35 |
| 3.0232 | 2.9095 | 36 |
| 3.0153 | 2.9028 | 37 |
| 3.0078 | 2.8949 | 38 |
| 3.0004 | 2.8864 | 39 |
| 2.9932 | 2.8799 | 40 |
| 2.9864 | 2.8722 | 41 |
| 2.9797 | 2.8657 | 42 |
| 2.9731 | 2.8605 | 43 |
| 2.9668 | 2.8544 | 44 |
| 2.9606 | 2.8481 | 45 |
| 2.9545 | 2.8421 | 46 |
| 2.9487 | 2.8368 | 47 |
| 2.9429 | 2.8288 | 48 |
| 2.9374 | 2.8257 | 49 |
| 2.9319 | 2.8204 | 50 |
| 2.9266 | 2.8152 | 51 |
| 2.9215 | 2.8093 | 52 |
| 2.9164 | 2.8054 | 53 |
| 2.9114 | 2.7998 | 54 |
| 2.9066 | 2.7980 | 55 |
| 2.9019 | 2.7913 | 56 |
| 2.8972 | 2.7865 | 57 |
| 2.8927 | 2.7816 | 58 |
| 2.8883 | 2.7784 | 59 |
| 2.8840 | 2.7741 | 60 |
| 2.8798 | 2.7696 | 61 |
| 2.8755 | 2.7650 | 62 |
| 2.8714 | 2.7608 | 63 |
| 2.8674 | 2.7571 | 64 |
| 2.8634 | 2.7532 | 65 |
| 2.8593 | 2.7482 | 66 |
| 2.8555 | 2.7466 | 67 |
| 2.8519 | 2.7417 | 68 |
| 2.8481 | 2.7378 | 69 |
| 2.8445 | 2.7345 | 70 |
| 2.8410 | 2.7309 | 71 |
| 2.8373 | 2.7273 | 72 |
| 2.8339 | 2.7233 | 73 |
| 2.8305 | 2.7225 | 74 |
| 2.8272 | 2.7169 | 75 |
| 2.8238 | 2.7133 | 76 |
| 2.8206 | 2.7117 | 77 |
| 2.8174 | 2.7090 | 78 |
| 2.8143 | 2.7051 | 79 |

Framework versions

  • Transformers 4.20.1
  • TensorFlow 2.8.2
  • Datasets 2.3.2
  • Tokenizers 0.12.1