---
tags:
- generated_from_keras_callback
model-index:
- name: distilgpt_new2_0060
  results: []
---

# distilgpt_new2_0060

This model is a fine-tuned version of the local checkpoint `/content/drive/MyDrive/Colab Notebooks/oscar/trybackup/new_backup_008585` on an unknown dataset. It achieves the following results at the final epoch:

- Train Loss: 2.6610
- Validation Loss: 2.5503
- Epoch: 59
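
The original card does not include a usage example. The snippet below is a minimal loading-and-generation sketch using the TensorFlow classes matching the framework versions listed at the end of this card; the repo id is an assumption inferred from this repository's name and should be adjusted if the model is hosted elsewhere.

```python
# Minimal sketch, not part of the original card.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

model_id = "bigmorning/distilgpt_new2_0060"  # assumed repo id; adjust if different
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The weather today is", return_tensors="tf")
# Greedy generation up to 30 total tokens; tune max_length as needed.
output_ids = model.generate(inputs["input_ids"], max_length=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```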

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
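
For reference, the configuration above can be reconstructed with the `AdamWeightDecay` class shipped in `transformers`. This is an illustrative sketch built from the listed values, not the original training script:

```python
# Illustrative reconstruction of the optimizer settings listed above;
# not the original training code.
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=2e-05,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
    weight_decay_rate=0.01,
)
# Typically passed to model.compile(optimizer=optimizer) in a Keras fine-tuning setup.
```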

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 2.7786     | 2.6701          | 0     |
| 2.7759     | 2.6662          | 1     |
| 2.7730     | 2.6642          | 2     |
| 2.7704     | 2.6602          | 3     |
| 2.7679     | 2.6590          | 4     |
| 2.7654     | 2.6565          | 5     |
| 2.7629     | 2.6538          | 6     |
| 2.7604     | 2.6525          | 7     |
| 2.7579     | 2.6489          | 8     |
| 2.7556     | 2.6452          | 9     |
| 2.7531     | 2.6454          | 10    |
| 2.7505     | 2.6411          | 11    |
| 2.7483     | 2.6387          | 12    |
| 2.7458     | 2.6370          | 13    |
| 2.7435     | 2.6365          | 14    |
| 2.7412     | 2.6308          | 15    |
| 2.7391     | 2.6289          | 16    |
| 2.7368     | 2.6288          | 17    |
| 2.7347     | 2.6257          | 18    |
| 2.7325     | 2.6238          | 19    |
| 2.7303     | 2.6228          | 20    |
| 2.7282     | 2.6189          | 21    |
| 2.7262     | 2.6170          | 22    |
| 2.7240     | 2.6145          | 23    |
| 2.7221     | 2.6137          | 24    |
| 2.7201     | 2.6108          | 25    |
| 2.7181     | 2.6082          | 26    |
| 2.7161     | 2.6057          | 27    |
| 2.7141     | 2.6039          | 28    |
| 2.7122     | 2.6031          | 29    |
| 2.7103     | 2.6018          | 30    |
| 2.7083     | 2.5984          | 31    |
| 2.7064     | 2.5975          | 32    |
| 2.7045     | 2.5941          | 33    |
| 2.7026     | 2.5945          | 34    |
| 2.7008     | 2.5930          | 35    |
| 2.6989     | 2.5874          | 36    |
| 2.6972     | 2.5864          | 37    |
| 2.6953     | 2.5858          | 38    |
| 2.6935     | 2.5831          | 39    |
| 2.6917     | 2.5809          | 40    |
| 2.6900     | 2.5791          | 41    |
| 2.6881     | 2.5780          | 42    |
| 2.6865     | 2.5765          | 43    |
| 2.6848     | 2.5761          | 44    |
| 2.6832     | 2.5720          | 45    |
| 2.6814     | 2.5709          | 46    |
| 2.6797     | 2.5689          | 47    |
| 2.6782     | 2.5692          | 48    |
| 2.6766     | 2.5672          | 49    |
| 2.6750     | 2.5646          | 50    |
| 2.6733     | 2.5635          | 51    |
| 2.6719     | 2.5623          | 52    |
| 2.6701     | 2.5594          | 53    |
| 2.6684     | 2.5586          | 54    |
| 2.6670     | 2.5584          | 55    |
| 2.6655     | 2.5556          | 56    |
| 2.6640     | 2.5542          | 57    |
| 2.6626     | 2.5521          | 58    |
| 2.6610     | 2.5503          | 59    |

### Framework versions

- Transformers 4.20.1
- TensorFlow 2.8.2
- Datasets 2.3.2
- Tokenizers 0.12.1