metadata

library_name: transformers
license: mit
base_model: gpt2
tags:
  - generated_from_keras_callback
model-index:
  - name: turkishElectrick-mini-model
    results: []

turkishElectrick-mini-model

This model is a fine-tuned version of gpt2 on an unknown dataset. After the final training epoch (epoch 99, counting from 0), it reports the following results on the training and evaluation sets:

Train Loss: 0.6456
Validation Loss: 1.7437
Epoch: 99
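
The losses above are easier to compare across models when expressed as perplexity. The snippet below is a minimal sketch that assumes the reported values are mean per-token cross-entropy in nats (the usual causal language modeling loss, though this card does not state it). The helper name `perplexity` is illustrative only.

```python
import math

# Final-epoch losses copied from the results listed above.
train_loss = 0.6456
validation_loss = 1.7437


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


train_ppl = perplexity(train_loss)
validation_ppl = perplexity(validation_loss)

print(f"train perplexity:      {train_ppl:.2f}")
print(f"validation perplexity: {validation_ppl:.2f}")
```

Under that assumption, the gap between the two perplexities is another view of the gap between train and validation loss.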

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {
  'name': 'AdamWeightDecay',
  'learning_rate': {
    'module': 'transformers.optimization_tf',
    'class_name': 'WarmUp',
    'config': {
      'initial_learning_rate': 5e-05,
      'decay_schedule_fn': {
        'module': 'keras.optimizers.schedules',
        'class_name': 'PolynomialDecay',
        'config': {'initial_learning_rate': 5e-05, 'decay_steps': -981, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None},
        'registered_name': None
      },
      'warmup_steps': 1000,
      'power': 1.0,
      'name': None
    },
    'registered_name': 'WarmUp'
  },
  'decay': 0.0,
  'beta_1': 0.9,
  'beta_2': 0.999,
  'epsilon': 1e-08,
  'amsgrad': False,
  'weight_decay_rate': 0.01
}
training_precision: mixed_float16
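
The optimizer configuration combines a warmup phase with a polynomial decay whose decay_steps value is negative (-981). The sketch below reimplements the schedule in plain Python to show what learning rate that configuration would yield at a given step. It is an approximation written from the general shape of the `WarmUp` and `PolynomialDecay` schedules (power-law warmup, then a non-cycling decay that clamps the step at decay_steps). It does not call the libraries, and the function names are illustrative.

```python
# Values copied from the optimizer configuration above.
INITIAL_LR = 5e-05
END_LR = 0.0
WARMUP_STEPS = 1000
DECAY_STEPS = -981
POWER = 1.0


def polynomial_decay(step: int) -> float:
    """Non-cycling polynomial decay: clamp the step at DECAY_STEPS, then interpolate."""
    clamped = min(step, DECAY_STEPS)
    fraction = clamped / DECAY_STEPS
    return (INITIAL_LR - END_LR) * (1.0 - fraction) ** POWER + END_LR


def learning_rate(step: int) -> float:
    """Warmup for the first WARMUP_STEPS steps, then hand over to the decay schedule."""
    if step < WARMUP_STEPS:
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    return polynomial_decay(step - WARMUP_STEPS)


for step in (0, 500, 999, 1000, 2000):
    print(step, learning_rate(step))
```

In this sketch the learning rate rises linearly during warmup and then drops to end_learning_rate (0.0) as soon as warmup ends, because the negative decay_steps makes the clamped decay fraction equal to 1. If the real schedule behaves the same way, that could explain why the validation loss stays at 1.7437 from epoch 52 onward, but this card does not confirm it.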

Training results

Train Loss	Validation Loss	Epoch
7.8609	7.6497	0
7.4033	6.9102	1
6.7940	6.4910	2
6.4110	6.1667	3
6.1566	5.9352	4
5.9535	5.7224	5
5.7576	5.5135	6
5.5523	5.2730	7
5.3273	5.0157	8
5.0893	4.7472	9
4.8421	4.4614	10
4.5883	4.1934	11
4.3480	3.9637	12
4.1266	3.7447	13
3.9195	3.5359	14
3.7044	3.3124	15
3.5097	3.1111	16
3.3371	2.9532	17
3.1614	2.7941	18
3.0044	2.6662	19
2.8511	2.5749	20
2.7244	2.4281	21
2.5806	2.3450	22
2.4819	2.2632	23
2.3593	2.1921	24
2.2577	2.1169	25
2.1563	2.0540	26
2.0613	2.0063	27
1.9667	1.9627	28
1.8827	1.9393	29
1.8151	1.8864	30
1.7214	1.8717	31
1.6412	1.8502	32
1.5774	1.7942	33
1.5114	1.7909	34
1.4588	1.7749	35
1.4006	1.7770	36
1.3340	1.7404	37
1.2674	1.7468	38
1.2138	1.7298	39
1.1611	1.7218	40
1.1231	1.7275	41
1.0758	1.7187	42
1.0199	1.7249	43
0.9813	1.6946	44
0.9286	1.7022	45
0.8793	1.7378	46
0.8404	1.6809	47
0.8028	1.7204	48
0.7706	1.7212	49
0.7406	1.7010	50
0.6994	1.7265	51
0.6785	1.7437	52
0.6438	1.7437	53
0.6456	1.7437	54
0.6406	1.7437	55
0.6422	1.7437	56
0.6453	1.7437	57
0.6428	1.7437	58
0.6454	1.7437	59
0.6477	1.7437	60
0.6438	1.7437	61
0.6477	1.7437	62
0.6462	1.7437	63
0.6461	1.7437	64
0.6469	1.7437	65
0.6448	1.7437	66
0.6450	1.7437	67
0.6469	1.7437	68
0.6407	1.7437	69
0.6492	1.7437	70
0.6410	1.7437	71
0.6445	1.7437	72
0.6385	1.7437	73
0.6413	1.7437	74
0.6397	1.7437	75
0.6456	1.7437	76
0.6403	1.7437	77
0.6439	1.7437	78
0.6398	1.7437	79
0.6415	1.7437	80
0.6431	1.7437	81
0.6421	1.7437	82
0.6423	1.7437	83
0.6454	1.7437	84
0.6406	1.7437	85
0.6440	1.7437	86
0.6423	1.7437	87
0.6431	1.7437	88
0.6448	1.7437	89
0.6436	1.7437	90
0.6362	1.7437	91
0.6445	1.7437	92
0.6407	1.7437	93
0.6410	1.7437	94
0.6431	1.7437	95
0.6434	1.7437	96
0.6415	1.7437	97
0.6438	1.7437	98
0.6456	1.7437	99
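
The final epoch is not the one with the lowest validation loss in the table above. The sketch below picks the best epoch from a slice of the table (epochs 44 to 53, copied verbatim) and is an illustration of checkpoint selection by validation loss, not a description of how this model was actually saved.

```python
# (epoch, train_loss, validation_loss) rows copied from the table above, epochs 44-53.
rows = [
    (44, 0.9813, 1.6946),
    (45, 0.9286, 1.7022),
    (46, 0.8793, 1.7378),
    (47, 0.8404, 1.6809),
    (48, 0.8028, 1.7204),
    (49, 0.7706, 1.7212),
    (50, 0.7406, 1.7010),
    (51, 0.6994, 1.7265),
    (52, 0.6785, 1.7437),
    (53, 0.6438, 1.7437),
]

# Select the row with the smallest validation loss.
best_epoch, best_train_loss, best_val_loss = min(rows, key=lambda row: row[2])

# Compare against the final validation loss reported for epoch 99.
final_val_loss = 1.7437
improvement = final_val_loss - best_val_loss

print(f"best epoch: {best_epoch}, validation loss: {best_val_loss}")
print(f"validation loss difference vs. final epoch: {improvement:.4f}")
```

Train loss keeps falling after epoch 47 while validation loss does not improve, which is the usual sign of overfitting. Under that reading, a checkpoint from around epoch 47 would be the better candidate.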

Framework versions

Transformers 4.44.2
TensorFlow 2.17.0
Datasets 3.0.0
Tokenizers 0.19.1