
deneme_spor

This model is a fine-tuned version of gpt2 on an unknown dataset. At the end of training it achieves the following results:

  • Train Loss: 4.9093
  • Validation Loss: 5.9538
  • Epoch: 149
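
Assuming these figures are mean token-level cross-entropy losses in nats (the Keras default for causal language modelling), they correspond to perplexities of roughly exp(4.9093) ≈ 136 on the training data and exp(5.9538) ≈ 385 on the validation data.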

Model description

More information needed

Intended uses & limitations

More information needed
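
As with the base gpt2 checkpoint, the model can be loaded for text generation. The snippet below is a minimal sketch; the repository path is a placeholder, since the card does not state where the checkpoint is hosted.

```python
# Minimal generation sketch (TF/Keras, matching the framework versions below).
# "<user>/deneme_spor" is a hypothetical hub path -- substitute the real one.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

model_id = "<user>/deneme_spor"  # placeholder repository path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Example prompt", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```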

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of how to rebuild this optimizer in code follows the list:

  • optimizer: AdamWeightDecay
      ◦ learning_rate: WarmUp (initial_learning_rate: 5e-05, warmup_steps: 1000, power: 1.0) wrapping PolynomialDecay (initial_learning_rate: 5e-05, decay_steps: -963, end_learning_rate: 0.0, power: 1.0, cycle: False)
      ◦ weight_decay_rate: 0.01
      ◦ beta_1: 0.9, beta_2: 0.999, epsilon: 1e-08, amsgrad: False
      ◦ decay: 0.0
  • training_precision: float32
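
Read together, these settings describe transformers' standard TF optimizer: AdamWeightDecay driven by a linear warmup over 1000 steps that hands off to a linear polynomial decay. The negative decay_steps (-963) implies the schedule was created for fewer total steps (1000 - 963 = 37) than the warmup itself. A sketch of how the same configuration can be rebuilt with transformers.create_optimizer, under that reading:

```python
# Rebuilds the logged optimizer with transformers' TF helper. num_train_steps
# is inferred from the log: decay_steps = num_train_steps - num_warmup_steps,
# so -963 + 1000 = 37 total steps. Treat the numbers as illustrative.
from transformers import create_optimizer

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,            # initial_learning_rate
    num_train_steps=37,      # inferred: 1000 warmup + (-963) decay steps
    num_warmup_steps=1000,   # linear warmup (power=1.0)
    weight_decay_rate=0.01,  # decoupled weight decay of AdamWeightDecay
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```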

Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 9.1978 | 8.9070 | 0 |
| 8.7400 | 8.5517 | 1 |
| 8.4947 | 8.3909 | 2 |
| 8.3502 | 8.2608 | 3 |
| 8.2126 | 8.1241 | 4 |
| 8.0688 | 7.9827 | 5 |
| 7.9232 | 7.8449 | 6 |
| 7.7844 | 7.7107 | 7 |
| 7.6446 | 7.5719 | 8 |
| 7.4919 | 7.4263 | 9 |
| 7.3429 | 7.2975 | 10 |
| 7.2042 | 7.1774 | 11 |
| 7.0643 | 7.0685 | 12 |
| 6.9229 | 6.9668 | 13 |
| 6.7836 | 6.8770 | 14 |
| 6.6425 | 6.7752 | 15 |
| 6.4982 | 6.6895 | 16 |
| 6.3539 | 6.5963 | 17 |
| 6.2035 | 6.5170 | 18 |
| 6.0612 | 6.4285 | 19 |
| 5.9164 | 6.3429 | 20 |
| 5.7708 | 6.2664 | 21 |
| 5.6249 | 6.1997 | 22 |
| 5.4822 | 6.1348 | 23 |
| 5.3368 | 6.0659 | 24 |
| 5.1959 | 6.0042 | 25 |
| 5.0527 | 5.9525 | 26 |
| 4.9070 | 5.9538 | 27 |
| 4.9062 | 5.9538 | 28 |
| 4.9095 | 5.9538 | 29 |
| 4.9056 | 5.9538 | 30 |
| 4.9111 | 5.9538 | 31 |
| 4.9080 | 5.9538 | 32 |
| 4.9072 | 5.9538 | 33 |
| 4.9063 | 5.9538 | 34 |
| 4.9086 | 5.9538 | 35 |
| 4.9081 | 5.9538 | 36 |
| 4.9115 | 5.9538 | 37 |
| 4.9052 | 5.9538 | 38 |
| 4.9073 | 5.9538 | 39 |
| 4.9064 | 5.9538 | 40 |
| 4.9096 | 5.9538 | 41 |
| 4.9093 | 5.9538 | 42 |
| 4.9077 | 5.9538 | 43 |
| 4.9078 | 5.9538 | 44 |
| 4.9073 | 5.9538 | 45 |
| 4.9076 | 5.9538 | 46 |
| 4.9096 | 5.9538 | 47 |
| 4.9093 | 5.9538 | 48 |
| 4.9093 | 5.9538 | 49 |
| 4.9082 | 5.9538 | 50 |
| 4.9106 | 5.9538 | 51 |
| 4.9076 | 5.9538 | 52 |
| 4.9079 | 5.9538 | 53 |
| 4.9093 | 5.9538 | 54 |
| 4.9096 | 5.9538 | 55 |
| 4.9063 | 5.9538 | 56 |
| 4.9071 | 5.9538 | 57 |
| 4.9122 | 5.9538 | 58 |
| 4.9108 | 5.9538 | 59 |
| 4.9072 | 5.9538 | 60 |
| 4.9073 | 5.9538 | 61 |
| 4.9085 | 5.9538 | 62 |
| 4.9080 | 5.9538 | 63 |
| 4.9092 | 5.9538 | 64 |
| 4.9077 | 5.9538 | 65 |
| 4.9087 | 5.9538 | 66 |
| 4.9073 | 5.9538 | 67 |
| 4.9078 | 5.9538 | 68 |
| 4.9102 | 5.9538 | 69 |
| 4.9095 | 5.9538 | 70 |
| 4.9099 | 5.9538 | 71 |
| 4.9081 | 5.9538 | 72 |
| 4.9089 | 5.9538 | 73 |
| 4.9068 | 5.9538 | 74 |
| 4.9091 | 5.9538 | 75 |
| 4.9078 | 5.9538 | 76 |
| 4.9083 | 5.9538 | 77 |
| 4.9067 | 5.9538 | 78 |
| 4.9077 | 5.9538 | 79 |
| 4.9111 | 5.9538 | 80 |
| 4.9088 | 5.9538 | 81 |
| 4.9085 | 5.9538 | 82 |
| 4.9093 | 5.9538 | 83 |
| 4.9086 | 5.9538 | 84 |
| 4.9088 | 5.9538 | 85 |
| 4.9057 | 5.9538 | 86 |
| 4.9104 | 5.9538 | 87 |
| 4.9081 | 5.9538 | 88 |
| 4.9070 | 5.9538 | 89 |
| 4.9076 | 5.9538 | 90 |
| 4.9078 | 5.9538 | 91 |
| 4.9097 | 5.9538 | 92 |
| 4.9082 | 5.9538 | 93 |
| 4.9061 | 5.9538 | 94 |
| 4.9111 | 5.9538 | 95 |
| 4.9067 | 5.9538 | 96 |
| 4.9070 | 5.9538 | 97 |
| 4.9089 | 5.9538 | 98 |
| 4.9051 | 5.9538 | 99 |
| 4.9072 | 5.9538 | 100 |
| 4.9110 | 5.9538 | 101 |
| 4.9094 | 5.9538 | 102 |
| 4.9089 | 5.9538 | 103 |
| 4.9072 | 5.9538 | 104 |
| 4.9072 | 5.9538 | 105 |
| 4.9055 | 5.9538 | 106 |
| 4.9079 | 5.9538 | 107 |
| 4.9075 | 5.9538 | 108 |
| 4.9100 | 5.9538 | 109 |
| 4.9106 | 5.9538 | 110 |
| 4.9081 | 5.9538 | 111 |
| 4.9094 | 5.9538 | 112 |
| 4.9108 | 5.9538 | 113 |
| 4.9082 | 5.9538 | 114 |
| 4.9089 | 5.9538 | 115 |
| 4.9099 | 5.9538 | 116 |
| 4.9063 | 5.9538 | 117 |
| 4.9094 | 5.9538 | 118 |
| 4.9059 | 5.9538 | 119 |
| 4.9096 | 5.9538 | 120 |
| 4.9065 | 5.9538 | 121 |
| 4.9092 | 5.9538 | 122 |
| 4.9092 | 5.9538 | 123 |
| 4.9107 | 5.9538 | 124 |
| 4.9061 | 5.9538 | 125 |
| 4.9117 | 5.9538 | 126 |
| 4.9087 | 5.9538 | 127 |
| 4.9062 | 5.9538 | 128 |
| 4.9105 | 5.9538 | 129 |
| 4.9093 | 5.9538 | 130 |
| 4.9078 | 5.9538 | 131 |
| 4.9067 | 5.9538 | 132 |
| 4.9104 | 5.9538 | 133 |
| 4.9065 | 5.9538 | 134 |
| 4.9077 | 5.9538 | 135 |
| 4.9101 | 5.9538 | 136 |
| 4.9063 | 5.9538 | 137 |
| 4.9091 | 5.9538 | 138 |
| 4.9100 | 5.9538 | 139 |
| 4.9101 | 5.9538 | 140 |
| 4.9057 | 5.9538 | 141 |
| 4.9080 | 5.9538 | 142 |
| 4.9076 | 5.9538 | 143 |
| 4.9085 | 5.9538 | 144 |
| 4.9071 | 5.9538 | 145 |
| 4.9107 | 5.9538 | 146 |
| 4.9102 | 5.9538 | 147 |
| 4.9071 | 5.9538 | 148 |
| 4.9093 | 5.9538 | 149 |
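
The card does not include the training script, but the hyperparameters and framework versions pin down its shape: a TFGPT2LMHeadModel compiled with the optimizer sketched earlier and fitted for 150 epochs. A hypothetical reconstruction, with a dummy dataset standing in for the undocumented one:

```python
# Hypothetical reconstruction of the training loop implied by this card.
# The dataset below is a random stand-in; the real data is undocumented.
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, create_optimizer

model = TFGPT2LMHeadModel.from_pretrained("gpt2")
optimizer, _ = create_optimizer(
    init_lr=5e-5, num_train_steps=37, num_warmup_steps=1000, weight_decay_rate=0.01
)
model.compile(optimizer=optimizer)  # no loss argument: uses the model's internal LM loss

# Causal-LM batches: labels are the input ids, shifted internally by the model.
ids = tf.random.uniform((8, 64), maxval=model.config.vocab_size, dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices({"input_ids": ids, "labels": ids}).batch(4)

model.fit(ds, validation_data=ds, epochs=150)
```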

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2