bigmorning/distilgpt_new5_0020

Tags: Text Generation, Transformers, TensorFlow, gpt2, generated_from_keras_callback, Inference Endpoints, text-generation-inference

distilgpt_new5_0020

This model was trained from scratch on an unknown dataset. After the final training epoch (epoch 19, counting from 0), it reports the following training and validation losses:

Train Loss: 2.4736
Validation Loss: 2.3530
Epoch: 19

Model description

More information needed

Intended uses & limitations

More information needed
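
The card does not document intended uses. Going only by the repository tags (TensorFlow, gpt2, text generation), loading the checkpoint for generation might look like the sketch below. It assumes the repository also ships tokenizer files, which the card does not confirm; `generate_text` and its `max_new_tokens` default are hypothetical names and values chosen for illustration.

```python
# Repository id taken from the card header.
MODEL_ID = "bigmorning/distilgpt_new5_0020"


def generate_text(prompt: str, max_new_tokens: int = 40) -> str:
    """Hypothetical helper: greedy continuation of `prompt` with this checkpoint."""
    # Imported lazily so the module can be read without transformers/TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForCausalLM

    # Assumption: tokenizer files are present in the same repository.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="tf")
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        # GPT-2 style tokenizers define no pad token, so reuse the end-of-text id.
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time")` downloads the weights on first use. Since the training data is unknown, the output quality and domain cannot be predicted from the card alone.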

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
training_precision: float32
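
To reuse these settings, the logged dictionary can be parsed and passed to the `AdamWeightDecay` class in Transformers. This is a minimal sketch, not the author's training script: `parse_optimizer_config` and `build_optimizer` are hypothetical helpers, and dropping the zero-valued `decay` entry is an assumption made to avoid a legacy Keras argument that has no effect at 0.0.

```python
import ast

# Optimizer settings copied verbatim from the card.
OPTIMIZER_REPR = (
    "{'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, "
    "'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, "
    "'weight_decay_rate': 0.01}"
)


def parse_optimizer_config(text: str) -> dict:
    """Safely turn the logged dict literal into a Python dict."""
    return ast.literal_eval(text)


def build_optimizer(config: dict):
    """Hypothetical helper: rebuild the optimizer from the parsed config."""
    # Imported lazily; requires transformers with TensorFlow installed.
    from transformers import AdamWeightDecay

    kwargs = dict(config)
    kwargs.pop("name", None)  # the class already defaults to this name
    if kwargs.get("decay") == 0.0:
        kwargs.pop("decay")  # assumption: a zero legacy decay can be omitted
    return AdamWeightDecay(**kwargs)


config = parse_optimizer_config(OPTIMIZER_REPR)
print(config["learning_rate"], config["weight_decay_rate"])
```

The learning rate is logged as a constant 2e-05, so no schedule is reconstructed here.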

Training results

Train Loss	Validation Loss	Epoch
2.4839	2.3639	0
2.4833	2.3630	1
2.4827	2.3620	2
2.4821	2.3632	3
2.4816	2.3617	4
2.4811	2.3614	5
2.4805	2.3613	6
2.4799	2.3613	7
2.4794	2.3600	8
2.4788	2.3589	9
2.4784	2.3582	10
2.4779	2.3563	11
2.4774	2.3579	12
2.4768	2.3563	13
2.4762	2.3561	14
2.4756	2.3554	15
2.4751	2.3539	16
2.4746	2.3550	17
2.4741	2.3534	18
2.4736	2.3530	19
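
For a language model, loss is often easier to read as perplexity. The sketch below converts the final-epoch values, assuming the reported losses are mean per-token cross-entropy in nats (the usual convention for a Keras causal language modeling loss, but not stated in the card). It also computes how much the validation loss changed across the logged epochs.

```python
import math

# Final-epoch losses copied from the results table (epoch 19).
train_loss = 2.4736
val_loss = 2.3530

# Validation loss at epoch 0, also from the table.
first_val_loss = 2.3639


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood in nats to perplexity."""
    return math.exp(mean_nll_nats)


print(f"train perplexity:      {perplexity(train_loss):.2f}")
print(f"validation perplexity: {perplexity(val_loss):.2f}")

# Total change in validation loss between epoch 0 and epoch 19.
val_loss_drop = first_val_loss - val_loss
print(f"validation loss drop:  {val_loss_drop:.4f}")
```

Under that assumption, the validation perplexity is roughly 10.5, and the validation loss falls by only about 0.011 over the 20 logged epochs.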

Framework versions

Transformers 4.20.1
TensorFlow 2.8.2
Datasets 2.4.0
Tokenizers 0.12.1
