
deneme_spor

This model is a fine-tuned version of gpt2 on an unknown dataset. At the end of training it achieves the following results:

  • Train Loss: 4.9093
  • Validation Loss: 5.9538
  • Epoch: 149
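
Assuming these figures are mean token-level cross-entropy losses in nats (the Keras default for causal language modelling), they correspond to perplexities of roughly exp(4.9093) ≈ 136 on the training data and exp(5.9538) ≈ 385 on the validation data.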

Model description

More information needed

Intended uses & limitations

More information needed
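
As with the base gpt2 checkpoint, the model can be loaded for text generation. The snippet below is a minimal sketch; the repository path is a placeholder, since the card does not state where the checkpoint is hosted.

```python
# Minimal generation sketch (TF/Keras, matching the framework versions below).
# "<user>/deneme_spor" is a hypothetical hub path -- substitute the real one.
from transformers import AutoTokenizer, TFAutoModelForCausalLM

model_id = "<user>/deneme_spor"  # placeholder repository path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Example prompt", return_tensors="tf")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```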

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of how to rebuild this optimizer in code follows the list:

  • optimizer: AdamWeightDecay
      ◦ learning_rate: WarmUp (initial_learning_rate: 5e-05, warmup_steps: 1000, power: 1.0) wrapping PolynomialDecay (initial_learning_rate: 5e-05, decay_steps: -963, end_learning_rate: 0.0, power: 1.0, cycle: False)
      ◦ weight_decay_rate: 0.01
      ◦ beta_1: 0.9, beta_2: 0.999, epsilon: 1e-08, amsgrad: False
      ◦ decay: 0.0
  • training_precision: float32
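
Read together, these settings describe transformers' standard TF optimizer: AdamWeightDecay driven by a linear warmup over 1000 steps that hands off to a linear polynomial decay. The negative decay_steps (-963) implies the schedule was created for fewer total steps (1000 - 963 = 37) than the warmup itself. A sketch of how the same configuration can be rebuilt with transformers.create_optimizer, under that reading:

```python
# Rebuilds the logged optimizer with transformers' TF helper. num_train_steps
# is inferred from the log: decay_steps = num_train_steps - num_warmup_steps,
# so -963 + 1000 = 37 total steps. Treat the numbers as illustrative.
from transformers import create_optimizer

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,            # initial_learning_rate
    num_train_steps=37,      # inferred: 1000 warmup + (-963) decay steps
    num_warmup_steps=1000,   # linear warmup (power=1.0)
    weight_decay_rate=0.01,  # decoupled weight decay of AdamWeightDecay
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```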

Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 9.1978 | 8.9070 | 0 |
| 8.7400 | 8.5517 | 1 |
| 8.4947 | 8.3909 | 2 |
| 8.3502 | 8.2608 | 3 |
| 8.2126 | 8.1241 | 4 |
| 8.0688 | 7.9827 | 5 |
| 7.9232 | 7.8449 | 6 |
| 7.7844 | 7.7107 | 7 |
| 7.6446 | 7.5719 | 8 |
| 7.4919 | 7.4263 | 9 |
| 7.3429 | 7.2975 | 10 |
| 7.2042 | 7.1774 | 11 |
| 7.0643 | 7.0685 | 12 |
| 6.9229 | 6.9668 | 13 |
| 6.7836 | 6.8770 | 14 |
| 6.6425 | 6.7752 | 15 |
| 6.4982 | 6.6895 | 16 |
| 6.3539 | 6.5963 | 17 |
| 6.2035 | 6.5170 | 18 |
| 6.0612 | 6.4285 | 19 |
| 5.9164 | 6.3429 | 20 |
| 5.7708 | 6.2664 | 21 |
| 5.6249 | 6.1997 | 22 |
| 5.4822 | 6.1348 | 23 |
| 5.3368 | 6.0659 | 24 |
| 5.1959 | 6.0042 | 25 |
| 5.0527 | 5.9525 | 26 |
| 4.9070 | 5.9538 | 27 |
| 4.9062 | 5.9538 | 28 |
| 4.9095 | 5.9538 | 29 |
| 4.9056 | 5.9538 | 30 |
| 4.9111 | 5.9538 | 31 |
| 4.9080 | 5.9538 | 32 |
| 4.9072 | 5.9538 | 33 |
| 4.9063 | 5.9538 | 34 |
| 4.9086 | 5.9538 | 35 |
| 4.9081 | 5.9538 | 36 |
| 4.9115 | 5.9538 | 37 |
| 4.9052 | 5.9538 | 38 |
| 4.9073 | 5.9538 | 39 |
| 4.9064 | 5.9538 | 40 |
| 4.9096 | 5.9538 | 41 |
| 4.9093 | 5.9538 | 42 |
| 4.9077 | 5.9538 | 43 |
| 4.9078 | 5.9538 | 44 |
| 4.9073 | 5.9538 | 45 |
| 4.9076 | 5.9538 | 46 |
| 4.9096 | 5.9538 | 47 |
| 4.9093 | 5.9538 | 48 |
| 4.9093 | 5.9538 | 49 |
| 4.9082 | 5.9538 | 50 |
| 4.9106 | 5.9538 | 51 |
| 4.9076 | 5.9538 | 52 |
| 4.9079 | 5.9538 | 53 |
| 4.9093 | 5.9538 | 54 |
| 4.9096 | 5.9538 | 55 |
| 4.9063 | 5.9538 | 56 |
| 4.9071 | 5.9538 | 57 |
| 4.9122 | 5.9538 | 58 |
| 4.9108 | 5.9538 | 59 |
| 4.9072 | 5.9538 | 60 |
| 4.9073 | 5.9538 | 61 |
| 4.9085 | 5.9538 | 62 |
| 4.9080 | 5.9538 | 63 |
| 4.9092 | 5.9538 | 64 |
| 4.9077 | 5.9538 | 65 |
| 4.9087 | 5.9538 | 66 |
| 4.9073 | 5.9538 | 67 |
| 4.9078 | 5.9538 | 68 |
| 4.9102 | 5.9538 | 69 |
| 4.9095 | 5.9538 | 70 |
| 4.9099 | 5.9538 | 71 |
| 4.9081 | 5.9538 | 72 |
| 4.9089 | 5.9538 | 73 |
| 4.9068 | 5.9538 | 74 |
| 4.9091 | 5.9538 | 75 |
| 4.9078 | 5.9538 | 76 |
| 4.9083 | 5.9538 | 77 |
| 4.9067 | 5.9538 | 78 |
| 4.9077 | 5.9538 | 79 |
| 4.9111 | 5.9538 | 80 |
| 4.9088 | 5.9538 | 81 |
| 4.9085 | 5.9538 | 82 |
| 4.9093 | 5.9538 | 83 |
| 4.9086 | 5.9538 | 84 |
| 4.9088 | 5.9538 | 85 |
| 4.9057 | 5.9538 | 86 |
| 4.9104 | 5.9538 | 87 |
| 4.9081 | 5.9538 | 88 |
| 4.9070 | 5.9538 | 89 |
| 4.9076 | 5.9538 | 90 |
| 4.9078 | 5.9538 | 91 |
| 4.9097 | 5.9538 | 92 |
| 4.9082 | 5.9538 | 93 |
| 4.9061 | 5.9538 | 94 |
| 4.9111 | 5.9538 | 95 |
| 4.9067 | 5.9538 | 96 |
| 4.9070 | 5.9538 | 97 |
| 4.9089 | 5.9538 | 98 |
| 4.9051 | 5.9538 | 99 |
| 4.9072 | 5.9538 | 100 |
| 4.9110 | 5.9538 | 101 |
| 4.9094 | 5.9538 | 102 |
| 4.9089 | 5.9538 | 103 |
| 4.9072 | 5.9538 | 104 |
| 4.9072 | 5.9538 | 105 |
| 4.9055 | 5.9538 | 106 |
| 4.9079 | 5.9538 | 107 |
| 4.9075 | 5.9538 | 108 |
| 4.9100 | 5.9538 | 109 |
| 4.9106 | 5.9538 | 110 |
| 4.9081 | 5.9538 | 111 |
| 4.9094 | 5.9538 | 112 |
| 4.9108 | 5.9538 | 113 |
| 4.9082 | 5.9538 | 114 |
| 4.9089 | 5.9538 | 115 |
| 4.9099 | 5.9538 | 116 |
| 4.9063 | 5.9538 | 117 |
| 4.9094 | 5.9538 | 118 |
| 4.9059 | 5.9538 | 119 |
| 4.9096 | 5.9538 | 120 |
| 4.9065 | 5.9538 | 121 |
| 4.9092 | 5.9538 | 122 |
| 4.9092 | 5.9538 | 123 |
| 4.9107 | 5.9538 | 124 |
| 4.9061 | 5.9538 | 125 |
| 4.9117 | 5.9538 | 126 |
| 4.9087 | 5.9538 | 127 |
| 4.9062 | 5.9538 | 128 |
| 4.9105 | 5.9538 | 129 |
| 4.9093 | 5.9538 | 130 |
| 4.9078 | 5.9538 | 131 |
| 4.9067 | 5.9538 | 132 |
| 4.9104 | 5.9538 | 133 |
| 4.9065 | 5.9538 | 134 |
| 4.9077 | 5.9538 | 135 |
| 4.9101 | 5.9538 | 136 |
| 4.9063 | 5.9538 | 137 |
| 4.9091 | 5.9538 | 138 |
| 4.9100 | 5.9538 | 139 |
| 4.9101 | 5.9538 | 140 |
| 4.9057 | 5.9538 | 141 |
| 4.9080 | 5.9538 | 142 |
| 4.9076 | 5.9538 | 143 |
| 4.9085 | 5.9538 | 144 |
| 4.9071 | 5.9538 | 145 |
| 4.9107 | 5.9538 | 146 |
| 4.9102 | 5.9538 | 147 |
| 4.9071 | 5.9538 | 148 |
| 4.9093 | 5.9538 | 149 |
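
The card does not include the training script, but the hyperparameters and framework versions pin down its shape: a TFGPT2LMHeadModel compiled with the optimizer sketched earlier and fitted for 150 epochs. A hypothetical reconstruction, with a dummy dataset standing in for the undocumented one:

```python
# Hypothetical reconstruction of the training loop implied by this card.
# The dataset below is a random stand-in; the real data is undocumented.
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, create_optimizer

model = TFGPT2LMHeadModel.from_pretrained("gpt2")
optimizer, _ = create_optimizer(
    init_lr=5e-5, num_train_steps=37, num_warmup_steps=1000, weight_decay_rate=0.01
)
model.compile(optimizer=optimizer)  # no loss argument: uses the model's internal LM loss

# Causal-LM batches: labels are the input ids, shifted internally by the model.
ids = tf.random.uniform((8, 64), maxval=model.config.vocab_size, dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices({"input_ids": ids, "labels": ids}).batch(4)

model.fit(ds, validation_data=ds, epochs=150)
```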

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2