---
tags:
- generated_from_keras_callback
model-index:
- name: distilgpt_new2_0060
  results: []
---

# distilgpt_new2_0060

This model is a fine-tuned version of the local checkpoint `/content/drive/MyDrive/Colab Notebooks/oscar/trybackup/new_backup_008585` on an unknown dataset. It achieves the following results at the final epoch:

- Train Loss: 2.6610
- Validation Loss: 2.5503
- Epoch: 59
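
The original card does not include a usage example. The snippet below is a minimal loading-and-generation sketch using the TensorFlow classes matching the framework versions listed at the end of this card; the repo id is an assumption inferred from this repository's name and should be adjusted if the model is hosted elsewhere.

```python
# Minimal sketch, not part of the original card.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

model_id = "bigmorning/distilgpt_new2_0060"  # assumed repo id; adjust if different
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The weather today is", return_tensors="tf")
# Greedy generation up to 30 total tokens; tune max_length as needed.
output_ids = model.generate(inputs["input_ids"], max_length=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```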

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
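
For reference, the configuration above can be reconstructed with the `AdamWeightDecay` class shipped in `transformers`. This is an illustrative sketch built from the listed values, not the original training script:

```python
# Illustrative reconstruction of the optimizer settings listed above;
# not the original training code.
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=2e-05,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
    weight_decay_rate=0.01,
)
# Typically passed to model.compile(optimizer=optimizer) in a Keras fine-tuning setup.
```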

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 2.7786     | 2.6701          | 0     |
| 2.7759     | 2.6662          | 1     |
| 2.7730     | 2.6642          | 2     |
| 2.7704     | 2.6602          | 3     |
| 2.7679     | 2.6590          | 4     |
| 2.7654     | 2.6565          | 5     |
| 2.7629     | 2.6538          | 6     |
| 2.7604     | 2.6525          | 7     |
| 2.7579     | 2.6489          | 8     |
| 2.7556     | 2.6452          | 9     |
| 2.7531     | 2.6454          | 10    |
| 2.7505     | 2.6411          | 11    |
| 2.7483     | 2.6387          | 12    |
| 2.7458     | 2.6370          | 13    |
| 2.7435     | 2.6365          | 14    |
| 2.7412     | 2.6308          | 15    |
| 2.7391     | 2.6289          | 16    |
| 2.7368     | 2.6288          | 17    |
| 2.7347     | 2.6257          | 18    |
| 2.7325     | 2.6238          | 19    |
| 2.7303     | 2.6228          | 20    |
| 2.7282     | 2.6189          | 21    |
| 2.7262     | 2.6170          | 22    |
| 2.7240     | 2.6145          | 23    |
| 2.7221     | 2.6137          | 24    |
| 2.7201     | 2.6108          | 25    |
| 2.7181     | 2.6082          | 26    |
| 2.7161     | 2.6057          | 27    |
| 2.7141     | 2.6039          | 28    |
| 2.7122     | 2.6031          | 29    |
| 2.7103     | 2.6018          | 30    |
| 2.7083     | 2.5984          | 31    |
| 2.7064     | 2.5975          | 32    |
| 2.7045     | 2.5941          | 33    |
| 2.7026     | 2.5945          | 34    |
| 2.7008     | 2.5930          | 35    |
| 2.6989     | 2.5874          | 36    |
| 2.6972     | 2.5864          | 37    |
| 2.6953     | 2.5858          | 38    |
| 2.6935     | 2.5831          | 39    |
| 2.6917     | 2.5809          | 40    |
| 2.6900     | 2.5791          | 41    |
| 2.6881     | 2.5780          | 42    |
| 2.6865     | 2.5765          | 43    |
| 2.6848     | 2.5761          | 44    |
| 2.6832     | 2.5720          | 45    |
| 2.6814     | 2.5709          | 46    |
| 2.6797     | 2.5689          | 47    |
| 2.6782     | 2.5692          | 48    |
| 2.6766     | 2.5672          | 49    |
| 2.6750     | 2.5646          | 50    |
| 2.6733     | 2.5635          | 51    |
| 2.6719     | 2.5623          | 52    |
| 2.6701     | 2.5594          | 53    |
| 2.6684     | 2.5586          | 54    |
| 2.6670     | 2.5584          | 55    |
| 2.6655     | 2.5556          | 56    |
| 2.6640     | 2.5542          | 57    |
| 2.6626     | 2.5521          | 58    |
| 2.6610     | 2.5503          | 59    |

### Framework versions

- Transformers 4.20.1
- TensorFlow 2.8.2
- Datasets 2.3.2
- Tokenizers 0.12.1