distilgpt_new5_0040

This model was trained from scratch on an unknown dataset. It achieves the following results after the final epoch:

  • Train Loss: 2.4633
  • Validation Loss: 2.3432
  • Epoch: 39

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
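The optimizer above is Adam with decoupled weight decay (AdamW). As a minimal sketch of what one update step does with the listed hyperparameters (this illustrates the math only; training itself used the `AdamWeightDecay` optimizer from Transformers):

```python
# Sketch of a single decoupled-weight-decay Adam ("AdamW") update step,
# using the hyperparameters listed above. Scalar parameter for simplicity.

def adamw_step(param, grad, m, v, t,
               lr=2e-05, beta_1=0.9, beta_2=0.999,
               epsilon=1e-07, weight_decay_rate=0.01):
    """Return updated (param, m, v) after one AdamW step; t is 1-based."""
    m = beta_1 * m + (1.0 - beta_1) * grad          # first-moment estimate
    v = beta_2 * v + (1.0 - beta_2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta_1 ** t)                 # bias correction
    v_hat = v / (1.0 - beta_2 ** t)
    # Decoupled weight decay: applied directly to the parameter rather than
    # folded into the gradient (the key difference from plain Adam + L2).
    param = param - lr * (m_hat / (v_hat ** 0.5 + epsilon)
                          + weight_decay_rate * param)
    return param, m, v

p, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

With `decay: 0.0` the learning rate stays constant at 2e-05 throughout training, and `amsgrad: False` means the plain (non-AMSGrad) second-moment estimate is used.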

Training results

Train Loss | Validation Loss | Epoch
2.4839     | 2.3639          | 0
2.4833     | 2.3630          | 1
2.4827     | 2.3620          | 2
2.4821     | 2.3632          | 3
2.4816     | 2.3617          | 4
2.4811     | 2.3614          | 5
2.4805     | 2.3613          | 6
2.4799     | 2.3613          | 7
2.4794     | 2.3600          | 8
2.4788     | 2.3589          | 9
2.4784     | 2.3582          | 10
2.4779     | 2.3563          | 11
2.4774     | 2.3579          | 12
2.4768     | 2.3563          | 13
2.4762     | 2.3561          | 14
2.4756     | 2.3554          | 15
2.4751     | 2.3539          | 16
2.4746     | 2.3550          | 17
2.4741     | 2.3534          | 18
2.4736     | 2.3530          | 19
2.4731     | 2.3522          | 20
2.4725     | 2.3522          | 21
2.4719     | 2.3525          | 22
2.4714     | 2.3519          | 23
2.4709     | 2.3505          | 24
2.4705     | 2.3489          | 25
2.4699     | 2.3488          | 26
2.4694     | 2.3498          | 27
2.4689     | 2.3472          | 28
2.4683     | 2.3476          | 29
2.4679     | 2.3477          | 30
2.4675     | 2.3468          | 31
2.4668     | 2.3454          | 32
2.4665     | 2.3455          | 33
2.4659     | 2.3456          | 34
2.4655     | 2.3436          | 35
2.4649     | 2.3433          | 36
2.4644     | 2.3437          | 37
2.4638     | 2.3428          | 38
2.4633     | 2.3432          | 39
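Assuming the reported losses are mean per-token cross-entropy in nats (the usual convention for causal language models), the final losses translate to perplexity as follows:

```python
import math

# Perplexity is exp(mean cross-entropy loss in nats).
train_ppl = math.exp(2.4633)   # final train loss
val_ppl = math.exp(2.3432)     # final validation loss
print(f"train perplexity ~= {train_ppl:.2f}, validation perplexity ~= {val_ppl:.2f}")
```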

Framework versions

  • Transformers 4.20.1
  • TensorFlow 2.8.2
  • Datasets 2.4.0
  • Tokenizers 0.12.1