metadata
tags:
  - generated_from_keras_callback
model-index:
  - name: distilgpt_new2_0040
    results: []

distilgpt_new2_0040

This model is a fine-tuned version of a checkpoint stored at the local path `/content/drive/MyDrive/Colab Notebooks/oscar/trybackup/new_backup_008585`, trained on an unknown dataset. After the final epoch (epoch 39, counting from 0), it reports the following losses:

  • Train Loss: 2.6935
  • Validation Loss: 2.5831
  • Epoch: 39
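
For a language model, a loss value is often easier to read as a perplexity. The sketch below converts the two final losses, assuming they are mean per-token cross-entropy values in nats (the usual Keras default for causal language modeling, but not stated in this card). The function name `perplexity` is chosen here for illustration.

```python
import math

# Final-epoch losses reported in this card.
TRAIN_LOSS = 2.6935
VALIDATION_LOSS = 2.5831


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss is a natural-log cross-entropy averaged per token.
    """
    return math.exp(mean_cross_entropy_nats)


if __name__ == "__main__":
    print(f"train perplexity:      {perplexity(TRAIN_LOSS):.2f}")
    print(f"validation perplexity: {perplexity(VALIDATION_LOSS):.2f}")
```

Under that assumption, the validation loss of 2.5831 corresponds to a perplexity of roughly 13.2.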

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
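
The logged optimizer dictionary can be turned back into constructor arguments. The following is a minimal sketch, not the original training script: it assumes the optimizer is the `AdamWeightDecay` class from Transformers, that the `name` entry only identifies the class, and that the legacy `decay` entry can be dropped because it is 0.0. The helper names `optimizer_kwargs` and `build_optimizer` are hypothetical.

```python
# Optimizer config exactly as logged in this card.
LOGGED_OPTIMIZER = {
    "name": "AdamWeightDecay",
    "learning_rate": 2e-05,
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def optimizer_kwargs(config: dict) -> dict:
    """Return constructor keyword arguments from a logged optimizer config.

    Assumptions: `name` only identifies the class, and the legacy Keras
    `decay` entry is safe to drop when it is 0.0 (no learning-rate decay).
    """
    kwargs = {key: value for key, value in config.items() if key != "name"}
    if kwargs.get("decay") == 0.0:
        kwargs.pop("decay")
    return kwargs


def build_optimizer(config: dict):
    """Instantiate the optimizer; requires TensorFlow and Transformers."""
    # Imported lazily so optimizer_kwargs() works without these packages.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**optimizer_kwargs(config))
```

Calling `build_optimizer(LOGGED_OPTIMIZER)` should give an optimizer that can be passed to `model.compile(optimizer=...)`, provided the installed library versions accept these arguments.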

Training results

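| Train Loss | Validation Loss | Epoch |
|------------|-----------------|-------|
| 2.7786 | 2.6701 | 0 |
| 2.7759 | 2.6662 | 1 |
| 2.7730 | 2.6642 | 2 |
| 2.7704 | 2.6602 | 3 |
| 2.7679 | 2.6590 | 4 |
| 2.7654 | 2.6565 | 5 |
| 2.7629 | 2.6538 | 6 |
| 2.7604 | 2.6525 | 7 |
| 2.7579 | 2.6489 | 8 |
| 2.7556 | 2.6452 | 9 |
| 2.7531 | 2.6454 | 10 |
| 2.7505 | 2.6411 | 11 |
| 2.7483 | 2.6387 | 12 |
| 2.7458 | 2.6370 | 13 |
| 2.7435 | 2.6365 | 14 |
| 2.7412 | 2.6308 | 15 |
| 2.7391 | 2.6289 | 16 |
| 2.7368 | 2.6288 | 17 |
| 2.7347 | 2.6257 | 18 |
| 2.7325 | 2.6238 | 19 |
| 2.7303 | 2.6228 | 20 |
| 2.7282 | 2.6189 | 21 |
| 2.7262 | 2.6170 | 22 |
| 2.7240 | 2.6145 | 23 |
| 2.7221 | 2.6137 | 24 |
| 2.7201 | 2.6108 | 25 |
| 2.7181 | 2.6082 | 26 |
| 2.7161 | 2.6057 | 27 |
| 2.7141 | 2.6039 | 28 |
| 2.7122 | 2.6031 | 29 |
| 2.7103 | 2.6018 | 30 |
| 2.7083 | 2.5984 | 31 |
| 2.7064 | 2.5975 | 32 |
| 2.7045 | 2.5941 | 33 |
| 2.7026 | 2.5945 | 34 |
| 2.7008 | 2.5930 | 35 |
| 2.6989 | 2.5874 | 36 |
| 2.6972 | 2.5864 | 37 |
| 2.6953 | 2.5858 | 38 |
| 2.6935 | 2.5831 | 39 |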
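
The per-epoch history above can be checked programmatically. The sketch below parses the rows (train loss, validation loss, epoch) and reports the total loss decrease and any epochs where the validation loss rose relative to the previous epoch. The helper names are illustrative and are not part of any library.

```python
# Rows copied from the training results: train loss, validation loss, epoch.
HISTORY = """
2.7786 2.6701 0
2.7759 2.6662 1
2.7730 2.6642 2
2.7704 2.6602 3
2.7679 2.6590 4
2.7654 2.6565 5
2.7629 2.6538 6
2.7604 2.6525 7
2.7579 2.6489 8
2.7556 2.6452 9
2.7531 2.6454 10
2.7505 2.6411 11
2.7483 2.6387 12
2.7458 2.6370 13
2.7435 2.6365 14
2.7412 2.6308 15
2.7391 2.6289 16
2.7368 2.6288 17
2.7347 2.6257 18
2.7325 2.6238 19
2.7303 2.6228 20
2.7282 2.6189 21
2.7262 2.6170 22
2.7240 2.6145 23
2.7221 2.6137 24
2.7201 2.6108 25
2.7181 2.6082 26
2.7161 2.6057 27
2.7141 2.6039 28
2.7122 2.6031 29
2.7103 2.6018 30
2.7083 2.5984 31
2.7064 2.5975 32
2.7045 2.5941 33
2.7026 2.5945 34
2.7008 2.5930 35
2.6989 2.5874 36
2.6972 2.5864 37
2.6953 2.5858 38
2.6935 2.5831 39
"""


def parse_history(text: str) -> list:
    """Parse rows into (epoch, train_loss, validation_loss) tuples."""
    rows = []
    for line in text.strip().splitlines():
        train, validation, epoch = line.split()
        rows.append((int(epoch), float(train), float(validation)))
    return rows


def validation_upticks(rows: list) -> list:
    """Epochs whose validation loss is higher than the previous epoch's."""
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[2] > prev[2]]


def total_drop(rows: list, column: int) -> float:
    """Loss decrease from the first to the last row (1=train, 2=validation)."""
    return rows[0][column] - rows[-1][column]


if __name__ == "__main__":
    rows = parse_history(HISTORY)
    print(f"train loss drop:      {total_drop(rows, 1):.4f}")
    print(f"validation loss drop: {total_drop(rows, 2):.4f}")
    print(f"validation upticks at epochs: {validation_upticks(rows)}")
```

Both losses fall by less than 0.09 over the 40 logged epochs, and the validation loss rises only marginally at two epochs, so the run shows slow, steady improvement with no sign of overfitting in this window.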

Framework versions

  • Transformers 4.20.1
  • TensorFlow 2.8.2
  • Datasets 2.3.2
  • Tokenizers 0.12.1
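
To reproduce the environment, it can help to compare installed packages against the versions listed above. This is a minimal sketch using only the standard library. It assumes the lowercase names below are the PyPI distribution names, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.20.1",
    "tensorflow": "2.8.2",
    "datasets": "2.3.2",
    "tokenizers": "0.12.1",
}


def version_mismatches(expected: dict, get_version=metadata.version) -> dict:
    """Map package name -> (expected, installed) for every mismatch.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {installed}")
```

An empty result means the local environment matches the versions in this card.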