
deneme

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results at the final epoch:

  • Train Loss: 4.9458
  • Validation Loss: 6.8594
  • Epoch: 29

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: AdamWeightDecay
    • learning_rate: WarmUp schedule (transformers.optimization_tf)
      • initial_learning_rate: 5e-05
      • warmup_steps: 1000
      • power: 1.0
      • decay_schedule_fn: keras.optimizers.schedules.PolynomialDecay (initial_learning_rate: 5e-05, decay_steps: -958, end_learning_rate: 0.0, power: 1.0, cycle: False)
    • weight_decay_rate: 0.01
    • decay: 0.0
    • beta_1: 0.9
    • beta_2: 0.999
    • epsilon: 1e-08
    • amsgrad: False
  • training_precision: float32
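The schedule above is a linear warmup over 1,000 steps followed by polynomial (here linear, power = 1.0) decay. Note that the logged decay_steps of -958 suggests the total number of training steps was smaller than the warmup length, so the decay phase never behaves meaningfully. As a minimal sketch of the warmup-then-decay arithmetic, assuming a positive decay horizon of 2,000 steps (`lr_at_step` is an illustrative helper, not a `transformers` API):

```python
def lr_at_step(step, initial_lr=5e-5, warmup_steps=1000,
               decay_steps=2000, end_lr=0.0, power=1.0):
    """Learning rate after `step` optimizer updates: linear warmup
    from 0 to `initial_lr`, then polynomial decay toward `end_lr`.

    NOTE: decay_steps=2000 is an assumed positive horizon; the
    logged config contains -958, which breaks the decay phase.
    """
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to progress through warmup
        return initial_lr * (step / warmup_steps) ** power
    # Decay phase: polynomial decay over the post-warmup steps
    done = min(step - warmup_steps, decay_steps) / decay_steps
    return (initial_lr - end_lr) * (1.0 - done) ** power + end_lr
```

For example, halfway through warmup the rate is 2.5e-5, it peaks at 5e-5 when warmup ends, and it reaches `end_lr` once the decay horizon is exhausted.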

Training results

| Train Loss | Validation Loss | Epoch |
|-----------:|----------------:|------:|
| 9.6934 | 9.6704 | 0 |
| 9.4105 | 9.4160 | 1 |
| 9.1023 | 9.1460 | 2 |
| 8.7843 | 8.8812 | 3 |
| 8.4798 | 8.6399 | 4 |
| 8.2009 | 8.4204 | 5 |
| 7.9445 | 8.2215 | 6 |
| 7.7117 | 8.0412 | 7 |
| 7.4965 | 7.8807 | 8 |
| 7.2929 | 7.7610 | 9 |
| 7.1028 | 7.6524 | 10 |
| 6.9102 | 7.5430 | 11 |
| 6.7196 | 7.4545 | 12 |
| 6.5149 | 7.3501 | 13 |
| 6.3265 | 7.2721 | 14 |
| 6.1167 | 7.1747 | 15 |
| 5.9178 | 7.0857 | 16 |
| 5.7101 | 7.0159 | 17 |
| 5.5075 | 6.9416 | 18 |
| 5.2990 | 6.8977 | 19 |
| 5.0966 | 6.8594 | 20 |
| 4.9436 | 6.8594 | 21 |
| 4.9450 | 6.8594 | 22 |
| 4.9446 | 6.8594 | 23 |
| 4.9422 | 6.8594 | 24 |
| 4.9412 | 6.8594 | 25 |
| 4.9436 | 6.8594 | 26 |
| 4.9455 | 6.8594 | 27 |
| 4.9467 | 6.8594 | 28 |
| 4.9458 | 6.8594 | 29 |
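The validation loss stops improving entirely after epoch 20 while the train loss holds near 4.94, so the final ten epochs add no value. A patience-based early-stopping check is one standard way to cut such runs short; the sketch below is an illustration against the values in the table above, not part of the original training setup (`early_stop_epoch` is a hypothetical helper):

```python
# Validation losses per epoch, copied from the table above
val_losses = [9.6704, 9.4160, 9.1460, 8.8812, 8.6399, 8.4204,
              8.2215, 8.0412, 7.8807, 7.7610, 7.6524, 7.5430,
              7.4545, 7.3501, 7.2721, 7.1747, 7.0857, 7.0159,
              6.9416, 6.8977, 6.8594, 6.8594, 6.8594, 6.8594,
              6.8594, 6.8594, 6.8594, 6.8594, 6.8594, 6.8594]

def early_stop_epoch(losses, patience=3, min_delta=1e-4):
    """Return the epoch at which patience-based early stopping would
    halt training, or None if it never triggers."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            # Improvement: record the new best and reset the counter
            best = loss
            stale = 0
        else:
            # No improvement beyond min_delta this epoch
            stale += 1
            if stale >= patience:
                return epoch
    return None
```

With a patience of 3 epochs, this run would have stopped at epoch 23 rather than continuing to 29. Keras provides the same behavior out of the box via the `tf.keras.callbacks.EarlyStopping` callback.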

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2