---
license: mit
base_model: gpt2
tags:
- generated_from_keras_callback
model-index:
- name: deneme_linux
  results: []
---
# deneme_linux
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Train Loss: 0.6315
- Validation Loss: 7.0703
- Epoch: 99
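A minimal usage sketch with the TensorFlow classes in `transformers`; the repo id `deneme_linux` is an assumption here and should be replaced with the actual Hub path or a local checkpoint directory:

```python
# Minimal sketch: load the fine-tuned checkpoint for text generation.
# "deneme_linux" is an assumed repo id; substitute the real Hub path.
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("deneme_linux")
model = TFGPT2LMHeadModel.from_pretrained("deneme_linux")

inputs = tokenizer("Hello, world", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```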
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'transformers.optimization_tf', 'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-05, 'decay_schedule_fn': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': -995, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}, 'registered_name': 'WarmUp'}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
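The serialized optimizer above matches the `AdamWeightDecay`/`WarmUp` combination that `transformers.create_optimizer` builds for TensorFlow. A sketch of how it could be reconstructed; note that `num_train_steps` is not recorded directly and is inferred here (the helper sets `decay_steps = num_train_steps - num_warmup_steps`, and the config shows `decay_steps = -995` with 1000 warmup steps), so treat that value as an assumption:

```python
# Sketch reconstructing the optimizer from the config above using the
# TensorFlow helper in transformers. num_train_steps is inferred, not
# recorded: decay_steps = num_train_steps - num_warmup_steps, and the
# serialized decay_steps is -995 with 1000 warmup steps.
from transformers import create_optimizer

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,            # initial_learning_rate
    num_train_steps=5,       # assumption implied by decay_steps = -995
    num_warmup_steps=1000,   # warmup_steps
    weight_decay_rate=0.01,  # weight_decay_rate
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    power=1.0,               # linear polynomial decay
)
```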
### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.6331 | 7.0703 | 0 |
| 0.6319 | 7.0703 | 1 |
| 0.6283 | 7.0703 | 2 |
| 0.6276 | 7.0703 | 3 |
| 0.6295 | 7.0703 | 4 |
| 0.6356 | 7.0703 | 5 |
| 0.6282 | 7.0703 | 6 |
| 0.6287 | 7.0703 | 7 |
| 0.6309 | 7.0703 | 8 |
| 0.6291 | 7.0703 | 9 |
| 0.6320 | 7.0703 | 10 |
| 0.6284 | 7.0703 | 11 |
| 0.6333 | 7.0703 | 12 |
| 0.6302 | 7.0703 | 13 |
| 0.6346 | 7.0703 | 14 |
| 0.6285 | 7.0703 | 15 |
| 0.6248 | 7.0703 | 16 |
| 0.6317 | 7.0703 | 17 |
| 0.6291 | 7.0703 | 18 |
| 0.6305 | 7.0703 | 19 |
| 0.6321 | 7.0703 | 20 |
| 0.6317 | 7.0703 | 21 |
| 0.6274 | 7.0703 | 22 |
| 0.6283 | 7.0703 | 23 |
| 0.6359 | 7.0703 | 24 |
| 0.6334 | 7.0703 | 25 |
| 0.6306 | 7.0703 | 26 |
| 0.6375 | 7.0703 | 27 |
| 0.6267 | 7.0703 | 28 |
| 0.6349 | 7.0703 | 29 |
| 0.6298 | 7.0703 | 30 |
| 0.6314 | 7.0703 | 31 |
| 0.6347 | 7.0703 | 32 |
| 0.6284 | 7.0703 | 33 |
| 0.6300 | 7.0703 | 34 |
| 0.6287 | 7.0703 | 35 |
| 0.6337 | 7.0703 | 36 |
| 0.6348 | 7.0703 | 37 |
| 0.6297 | 7.0703 | 38 |
| 0.6376 | 7.0703 | 39 |
| 0.6340 | 7.0703 | 40 |
| 0.6311 | 7.0703 | 41 |
| 0.6327 | 7.0703 | 42 |
| 0.6343 | 7.0703 | 43 |
| 0.6297 | 7.0703 | 44 |
| 0.6316 | 7.0703 | 45 |
| 0.6302 | 7.0703 | 46 |
| 0.6324 | 7.0703 | 47 |
| 0.6355 | 7.0703 | 48 |
| 0.6278 | 7.0703 | 49 |
| 0.6324 | 7.0703 | 50 |
| 0.6332 | 7.0703 | 51 |
| 0.6294 | 7.0703 | 52 |
| 0.6348 | 7.0703 | 53 |
| 0.6288 | 7.0703 | 54 |
| 0.6332 | 7.0703 | 55 |
| 0.6334 | 7.0703 | 56 |
| 0.6302 | 7.0703 | 57 |
| 0.6287 | 7.0703 | 58 |
| 0.6274 | 7.0703 | 59 |
| 0.6272 | 7.0703 | 60 |
| 0.6264 | 7.0703 | 61 |
| 0.6298 | 7.0703 | 62 |
| 0.6275 | 7.0703 | 63 |
| 0.6315 | 7.0703 | 64 |
| 0.6293 | 7.0703 | 65 |
| 0.6325 | 7.0703 | 66 |
| 0.6277 | 7.0703 | 67 |
| 0.6292 | 7.0703 | 68 |
| 0.6254 | 7.0703 | 69 |
| 0.6351 | 7.0703 | 70 |
| 0.6362 | 7.0703 | 71 |
| 0.6312 | 7.0703 | 72 |
| 0.6307 | 7.0703 | 73 |
| 0.6260 | 7.0703 | 74 |
| 0.6289 | 7.0703 | 75 |
| 0.6333 | 7.0703 | 76 |
| 0.6259 | 7.0703 | 77 |
| 0.6270 | 7.0703 | 78 |
| 0.6300 | 7.0703 | 79 |
| 0.6321 | 7.0703 | 80 |
| 0.6352 | 7.0703 | 81 |
| 0.6283 | 7.0703 | 82 |
| 0.6377 | 7.0703 | 83 |
| 0.6291 | 7.0703 | 84 |
| 0.6263 | 7.0703 | 85 |
| 0.6302 | 7.0703 | 86 |
| 0.6336 | 7.0703 | 87 |
| 0.6326 | 7.0703 | 88 |
| 0.6365 | 7.0703 | 89 |
| 0.6328 | 7.0703 | 90 |
| 0.6281 | 7.0703 | 91 |
| 0.6360 | 7.0703 | 92 |
| 0.6347 | 7.0703 | 93 |
| 0.6318 | 7.0703 | 94 |
| 0.6334 | 7.0703 | 95 |
| 0.6349 | 7.0703 | 96 |
| 0.6274 | 7.0703 | 97 |
| 0.6266 | 7.0703 | 98 |
| 0.6315 | 7.0703 | 99 |
### Framework versions
- Transformers 4.38.2
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2
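To reproduce this environment, the pinned versions above can be installed together, for example:

```bash
pip install "transformers==4.38.2" "tensorflow==2.15.0" "datasets==2.18.0" "tokenizers==0.15.2"
```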