e-hossam96 committed
Commit 326d518
1 Parent(s): 626fd1a

End of training

Files changed (3)
1. README.md +159 -21
2. model.safetensors +1 -1
3. training_args.bin +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.9499
+- Loss: 3.2854
 
 ## Model description
 
@@ -35,35 +35,173 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0006
-- train_batch_size: 32
+- learning_rate: 0.001
+- train_batch_size: 64
 - eval_batch_size: 64
 - seed: 42
-- gradient_accumulation_steps: 8
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.01
-- num_epochs: 2
+- num_epochs: 24
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 7.6152 | 0.1422 | 100 | 6.9246 |
-| 6.6089 | 0.2844 | 200 | 6.3326 |
-| 6.1811 | 0.4266 | 300 | 5.9524 |
-| 5.8677 | 0.5688 | 400 | 5.6719 |
-| 5.6433 | 0.7110 | 500 | 5.4863 |
-| 5.503 | 0.8532 | 600 | 5.3572 |
-| 5.3964 | 0.9954 | 700 | 5.2521 |
-| 5.2963 | 1.1376 | 800 | 5.1742 |
-| 5.2239 | 1.2798 | 900 | 5.1095 |
-| 5.1744 | 1.4220 | 1000 | 5.0590 |
-| 5.1376 | 1.5642 | 1100 | 5.0150 |
-| 5.1061 | 1.7064 | 1200 | 4.9836 |
-| 5.0786 | 1.8486 | 1300 | 4.9605 |
-| 5.0725 | 1.9908 | 1400 | 4.9499 |
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:------:|:------:|:---------------:|
+| 5.62 | 0.0585 | 1000 | 5.3754 |
+| 4.6527 | 0.1170 | 2000 | 4.4918 |
+| 4.2818 | 0.1755 | 3000 | 4.1137 |
+| 4.1289 | 0.2340 | 4000 | 3.9388 |
+| 4.0021 | 0.2924 | 5000 | 3.8274 |
+| 3.9301 | 0.3509 | 6000 | 3.7534 |
+| 3.8822 | 0.4094 | 7000 | 3.6986 |
+| 3.8375 | 0.4679 | 8000 | 3.6557 |
+| 3.7918 | 0.5264 | 9000 | 3.6266 |
+| 3.7723 | 0.5849 | 10000 | 3.5994 |
+| 3.7549 | 0.6434 | 11000 | 3.5787 |
+| 3.7324 | 0.7019 | 12000 | 3.5612 |
+| 3.7249 | 0.7604 | 13000 | 3.5436 |
+| 3.6989 | 0.8188 | 14000 | 3.5323 |
+| 3.7003 | 0.8773 | 15000 | 3.5169 |
+| 3.6919 | 0.9358 | 16000 | 3.5055 |
+| 3.6717 | 0.9943 | 17000 | 3.4966 |
+| 3.6612 | 1.0528 | 18000 | 3.4868 |
+| 3.6467 | 1.1113 | 19000 | 3.4787 |
+| 3.6497 | 1.1698 | 20000 | 3.4707 |
+| 3.6193 | 1.2283 | 21000 | 3.4639 |
+| 3.6302 | 1.2868 | 22000 | 3.4572 |
+| 3.6225 | 1.3452 | 23000 | 3.4516 |
+| 3.635 | 1.4037 | 24000 | 3.4458 |
+| 3.6115 | 1.4622 | 25000 | 3.4416 |
+| 3.6162 | 1.5207 | 26000 | 3.4348 |
+| 3.6142 | 1.5792 | 27000 | 3.4329 |
+| 3.5956 | 1.6377 | 28000 | 3.4293 |
+| 3.5885 | 1.6962 | 29000 | 3.4226 |
+| 3.603 | 1.7547 | 30000 | 3.4195 |
+| 3.5947 | 1.8132 | 31000 | 3.4142 |
+| 3.588 | 1.8716 | 32000 | 3.4113 |
+| 3.5803 | 1.9301 | 33000 | 3.4065 |
+| 3.5891 | 1.9886 | 34000 | 3.4044 |
+| 3.5801 | 2.0471 | 35000 | 3.4032 |
+| 3.5739 | 2.1056 | 36000 | 3.3988 |
+| 3.5661 | 2.1641 | 37000 | 3.3981 |
+| 3.5657 | 2.2226 | 38000 | 3.3934 |
+| 3.5727 | 2.2811 | 39000 | 3.3907 |
+| 3.5617 | 2.3396 | 40000 | 3.3885 |
+| 3.5579 | 2.3980 | 41000 | 3.3855 |
+| 3.5553 | 2.4565 | 42000 | 3.3816 |
+| 3.5647 | 2.5150 | 43000 | 3.3803 |
+| 3.5531 | 2.5735 | 44000 | 3.3799 |
+| 3.5494 | 2.6320 | 45000 | 3.3777 |
+| 3.5525 | 2.6905 | 46000 | 3.3759 |
+| 3.5487 | 2.7490 | 47000 | 3.3725 |
+| 3.5551 | 2.8075 | 48000 | 3.3711 |
+| 3.5511 | 2.8660 | 49000 | 3.3681 |
+| 3.5463 | 2.9244 | 50000 | 3.3695 |
+| 3.5419 | 2.9829 | 51000 | 3.3660 |
+| 3.5414 | 3.0414 | 52000 | 3.3648 |
+| 3.5388 | 3.0999 | 53000 | 3.3605 |
+| 3.5333 | 3.1584 | 54000 | 3.3619 |
+| 3.525 | 3.2169 | 55000 | 3.3588 |
+| 3.5361 | 3.2754 | 56000 | 3.3572 |
+| 3.5302 | 3.3339 | 57000 | 3.3540 |
+| 3.5355 | 3.3924 | 58000 | 3.3553 |
+| 3.5391 | 3.4508 | 59000 | 3.3504 |
+| 3.531 | 3.5093 | 60000 | 3.3495 |
+| 3.5293 | 3.5678 | 61000 | 3.3483 |
+| 3.5269 | 3.6263 | 62000 | 3.3489 |
+| 3.5181 | 3.6848 | 63000 | 3.3494 |
+| 3.5205 | 3.7433 | 64000 | 3.3480 |
+| 3.5237 | 3.8018 | 65000 | 3.3440 |
+| 3.5316 | 3.8603 | 66000 | 3.3417 |
+| 3.5222 | 3.9188 | 67000 | 3.3433 |
+| 3.5174 | 3.9772 | 68000 | 3.3418 |
+| 3.518 | 4.0357 | 69000 | 3.3414 |
+| 3.5036 | 4.0942 | 70000 | 3.3365 |
+| 3.5101 | 4.1527 | 71000 | 3.3367 |
+| 3.5145 | 4.2112 | 72000 | 3.3361 |
+| 3.5053 | 4.2697 | 73000 | 3.3355 |
+| 3.5153 | 4.3282 | 74000 | 3.3334 |
+| 3.5003 | 4.3867 | 75000 | 3.3334 |
+| 3.5001 | 4.4452 | 76000 | 3.3326 |
+| 3.5114 | 4.5036 | 77000 | 3.3298 |
+| 3.5108 | 4.5621 | 78000 | 3.3292 |
+| 3.4985 | 4.6206 | 79000 | 3.3288 |
+| 3.497 | 4.6791 | 80000 | 3.3303 |
+| 3.4982 | 4.7376 | 81000 | 3.3291 |
+| 3.5068 | 4.7961 | 82000 | 3.3272 |
+| 3.4915 | 4.8546 | 83000 | 3.3244 |
+| 3.5036 | 4.9131 | 84000 | 3.3214 |
+| 3.5027 | 4.9716 | 85000 | 3.3214 |
+| 3.5078 | 5.0300 | 86000 | 3.3225 |
+| 3.5112 | 5.0885 | 87000 | 3.3243 |
+| 3.5049 | 5.1470 | 88000 | 3.3216 |
+| 3.4917 | 5.2055 | 89000 | 3.3192 |
+| 3.4802 | 5.2640 | 90000 | 3.3188 |
+| 3.4971 | 5.3225 | 91000 | 3.3201 |
+| 3.4941 | 5.3810 | 92000 | 3.3175 |
+| 3.4998 | 5.4395 | 93000 | 3.3179 |
+| 3.5011 | 5.4980 | 94000 | 3.3164 |
+| 3.4912 | 5.5564 | 95000 | 3.3180 |
+| 3.4961 | 5.6149 | 96000 | 3.3168 |
+| 3.4833 | 5.6734 | 97000 | 3.3148 |
+| 3.498 | 5.7319 | 98000 | 3.3133 |
+| 3.4892 | 5.7904 | 99000 | 3.3142 |
+| 3.4967 | 5.8489 | 100000 | 3.3142 |
+| 3.4847 | 5.9074 | 101000 | 3.3094 |
+| 3.4899 | 5.9659 | 102000 | 3.3102 |
+| 3.4774 | 6.0244 | 103000 | 3.3110 |
+| 3.4854 | 6.0828 | 104000 | 3.3106 |
+| 3.4873 | 6.1413 | 105000 | 3.3087 |
+| 3.4869 | 6.1998 | 106000 | 3.3102 |
+| 3.4833 | 6.2583 | 107000 | 3.3063 |
+| 3.491 | 6.3168 | 108000 | 3.3082 |
+| 3.4776 | 6.3753 | 109000 | 3.3075 |
+| 3.4924 | 6.4338 | 110000 | 3.3068 |
+| 3.4804 | 6.4923 | 111000 | 3.3050 |
+| 3.4805 | 6.5508 | 112000 | 3.3041 |
+| 3.4892 | 6.6093 | 113000 | 3.3031 |
+| 3.4775 | 6.6677 | 114000 | 3.3032 |
+| 3.481 | 6.7262 | 115000 | 3.3036 |
+| 3.4782 | 6.7847 | 116000 | 3.3025 |
+| 3.4804 | 6.8432 | 117000 | 3.3017 |
+| 3.4841 | 6.9017 | 118000 | 3.2999 |
+| 3.4784 | 6.9602 | 119000 | 3.3008 |
+| 3.4821 | 7.0187 | 120000 | 3.3001 |
+| 3.4671 | 7.0772 | 121000 | 3.3008 |
+| 3.485 | 7.1357 | 122000 | 3.2976 |
+| 3.4737 | 7.1941 | 123000 | 3.2985 |
+| 3.4793 | 7.2526 | 124000 | 3.2979 |
+| 3.4651 | 7.3111 | 125000 | 3.2968 |
+| 3.4847 | 7.3696 | 126000 | 3.2974 |
+| 3.474 | 7.4281 | 127000 | 3.2973 |
+| 3.4769 | 7.4866 | 128000 | 3.2955 |
+| 3.486 | 7.5451 | 129000 | 3.2953 |
+| 3.4684 | 7.6036 | 130000 | 3.2944 |
+| 3.4826 | 7.6621 | 131000 | 3.2949 |
+| 3.4685 | 7.7205 | 132000 | 3.2944 |
+| 3.4608 | 7.7790 | 133000 | 3.2931 |
+| 3.4655 | 7.8375 | 134000 | 3.2953 |
+| 3.4648 | 7.8960 | 135000 | 3.2928 |
+| 3.4632 | 7.9545 | 136000 | 3.2936 |
+| 3.4666 | 8.0130 | 137000 | 3.2902 |
+| 3.4663 | 8.0715 | 138000 | 3.2939 |
+| 3.4713 | 8.1300 | 139000 | 3.2904 |
+| 3.4654 | 8.1885 | 140000 | 3.2917 |
+| 3.466 | 8.2469 | 141000 | 3.2913 |
+| 3.4724 | 8.3054 | 142000 | 3.2889 |
+| 3.4695 | 8.3639 | 143000 | 3.2890 |
+| 3.4729 | 8.4224 | 144000 | 3.2876 |
+| 3.4551 | 8.4809 | 145000 | 3.2898 |
+| 3.4652 | 8.5394 | 146000 | 3.2885 |
+| 3.4689 | 8.5979 | 147000 | 3.2854 |
+| 3.4647 | 8.6564 | 148000 | 3.2857 |
+| 3.4653 | 8.7149 | 149000 | 3.2857 |
+| 3.4552 | 8.7733 | 150000 | 3.2861 |
+| 3.47 | 8.8318 | 151000 | 3.2868 |
+| 3.4627 | 8.8903 | 152000 | 3.2854 |
 
 
 ### Framework versions
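The updated hyperparameters are internally consistent, which can be checked with a few lines of Python: the effective batch size (per-device batch size times gradient accumulation steps) matches the reported `total_train_batch_size`, and the perplexity implied by the final evaluation loss can be derived from it (the perplexity figure below is computed here, not reported in the card):

```python
import math

# Effective batch size from the new hyperparameters.
train_batch_size = 64
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 256, matching the card

# Perplexity implied by the final eval loss (cross-entropy in nats).
eval_loss = 3.2854
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))  # 26.7
```

For comparison, the previous run's final loss of 4.9499 corresponds to a perplexity of roughly 141, so the new configuration is a substantial improvement.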
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8aea299f4f7e52d3140e17c3e44315e4867c88c09826f7d688d7736005ead2be
+oid sha256:f05d284de06f966859d50432ebaa80cf6d6bd6b9485a9984695ea86e6fc9dbda
 size 22080496
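The `model.safetensors` entry is a Git LFS pointer file rather than the weights themselves: the `oid` is the SHA-256 of the actual blob and `size` is its length in bytes, so the pointer changes whenever the weights change even though the file size stays constant. A minimal sketch of how such a pointer is derived (the `blob` contents here are a made-up stand-in for the real 22 MB safetensors file):

```python
import hashlib

def lfs_pointer(data: bytes) -> str:
    """Build a Git LFS v1 pointer file for the given blob contents."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )

# Hypothetical blob standing in for the real model.safetensors contents.
blob = b"example weights"
print(lfs_pointer(blob))
```

Comparing the `oid` of a downloaded file against the pointer is a quick integrity check.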
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9816fd657dc3ca74aba11ed7ddcfb22cd9813231085213afa94832c8f93cd28d
+oid sha256:a5c749c246d14a02ab2c5b292a6312faf35f70556ea7daf54af9e1303e431065
 size 5240
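The `lr_scheduler_type: linear` with `lr_scheduler_warmup_ratio: 0.01` stored in `training_args.bin` describes a schedule that ramps the learning rate linearly from zero to its peak over the first 1% of steps, then decays it linearly back to zero. A minimal sketch of that shape, assuming the standard warmup-then-decay form; the 400,000-step horizon is an illustrative estimate from the logged ~17,000 steps per epoch times the 24 scheduled epochs (the log above stops near epoch 9):

```python
def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 1e-3, warmup_ratio: float = 0.01) -> float:
    """Linear warmup to peak_lr over warmup_ratio of training, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total = 400_000  # illustrative planned horizon, not a logged value
print(linear_schedule_lr(0, total))      # 0.0 at the first step
print(linear_schedule_lr(4_000, total))  # 0.001, the peak, at the end of warmup
print(linear_schedule_lr(total, total))  # 0.0 at the scheduled end
```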