
distilgpt_new_0060

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Train Loss: 1.1173
  • Validation Loss: 1.0714
  • Epoch: 59

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
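
As a minimal sketch of how this configuration could be reproduced, assuming the optimizer named above is the AdamWeightDecay class shipped with transformers for TF/Keras training (the card does not state this explicitly, and it reports a constant learning rate with no schedule):

```python
from transformers import AdamWeightDecay

# Rebuilds the optimizer configuration listed above.
optimizer = AdamWeightDecay(
    learning_rate=2e-5,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
    weight_decay_rate=0.01,
)

# Training precision was float32, i.e. the default Keras policy,
# so no mixed-precision policy needs to be set.
# model.compile(optimizer=optimizer)  # `model` would be the TF model being trained
```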

Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 3.5889     | 2.6197          | 0     |
| 2.4784     | 2.2040          | 1     |
| 2.1855     | 1.9980          | 2     |
| 2.0181     | 1.8643          | 3     |
| 1.9031     | 1.7652          | 4     |
| 1.8166     | 1.6924          | 5     |
| 1.7467     | 1.6360          | 6     |
| 1.6904     | 1.5843          | 7     |
| 1.6430     | 1.5421          | 8     |
| 1.6021     | 1.5059          | 9     |
| 1.5668     | 1.4761          | 10    |
| 1.5359     | 1.4481          | 11    |
| 1.5071     | 1.4220          | 12    |
| 1.4841     | 1.4020          | 13    |
| 1.4608     | 1.3797          | 14    |
| 1.4399     | 1.3595          | 15    |
| 1.4213     | 1.3426          | 16    |
| 1.4031     | 1.3266          | 17    |
| 1.3875     | 1.3113          | 18    |
| 1.3735     | 1.3024          | 19    |
| 1.3600     | 1.2871          | 20    |
| 1.3456     | 1.2753          | 21    |
| 1.3336     | 1.2648          | 22    |
| 1.3214     | 1.2539          | 23    |
| 1.3103     | 1.2451          | 24    |
| 1.3005     | 1.2335          | 25    |
| 1.2905     | 1.2258          | 26    |
| 1.2815     | 1.2179          | 27    |
| 1.2728     | 1.2123          | 28    |
| 1.2643     | 1.2029          | 29    |
| 1.2564     | 1.1980          | 30    |
| 1.2494     | 1.1877          | 31    |
| 1.2414     | 1.1806          | 32    |
| 1.2348     | 1.1788          | 33    |
| 1.2290     | 1.1699          | 34    |
| 1.2209     | 1.1654          | 35    |
| 1.2156     | 1.1575          | 36    |
| 1.2110     | 1.1537          | 37    |
| 1.2046     | 1.1499          | 38    |
| 1.1986     | 1.1436          | 39    |
| 1.1940     | 1.1408          | 40    |
| 1.1877     | 1.1356          | 41    |
| 1.1830     | 1.1314          | 42    |
| 1.1779     | 1.1278          | 43    |
| 1.1737     | 1.1211          | 44    |
| 1.1692     | 1.1192          | 45    |
| 1.1647     | 1.1163          | 46    |
| 1.1611     | 1.1107          | 47    |
| 1.1560     | 1.1066          | 48    |
| 1.1521     | 1.1060          | 49    |
| 1.1489     | 1.1002          | 50    |
| 1.1440     | 1.0960          | 51    |
| 1.1406     | 1.0931          | 52    |
| 1.1373     | 1.0897          | 53    |
| 1.1329     | 1.0855          | 54    |
| 1.1302     | 1.0842          | 55    |
| 1.1265     | 1.0818          | 56    |
| 1.1237     | 1.0784          | 57    |
| 1.1204     | 1.0737          | 58    |
| 1.1173     | 1.0714          | 59    |

Framework versions

  • Transformers 4.20.1
  • TensorFlow 2.8.2
  • Datasets 2.3.2
  • Tokenizers 0.12.1
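
A minimal usage sketch with these versions, assuming the checkpoint is a distilgpt2-style causal language model as its name suggests; "your-username/distilgpt_new_0060" is a placeholder repository id and should be replaced with the actual Hub path or a local directory:

```python
from transformers import AutoTokenizer, TFAutoModelForCausalLM

# Placeholder repo id; point this at the real checkpoint location.
model_id = "your-username/distilgpt_new_0060"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation of a prompt.
inputs = tokenizer("Hello, my name is", return_tensors="tf")
output_ids = model.generate(**inputs, max_length=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```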