jcmc committed on
Commit b932980
1 Parent(s): f890238

End of training

all_results.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "epoch": 100.0,
+ "eval_loss": NaN,
+ "eval_runtime": 35.5402,
+ "eval_samples": 509,
+ "eval_samples_per_second": 14.322,
+ "eval_steps_per_second": 14.322,
+ "eval_wer": 1.0,
+ "train_loss": 2.8321772269315497,
+ "train_runtime": 30413.0752,
+ "train_samples": 1035,
+ "train_samples_per_second": 3.403,
+ "train_steps_per_second": 0.848
+ }
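
The summary above reports a NaN evaluation loss and a word error rate of 1.0. As a minimal sketch (not part of the commit), the following Python snippet shows one way such a results file could be screened automatically. The helper name `find_problems` is hypothetical, and the JSON text is copied from the diff rather than read from disk. Note that `NaN` is not valid strict JSON; Python's `json` module accepts it by default.

```python
import json
import math

# Contents of all_results.json, copied from the diff above.
ALL_RESULTS = """
{
  "epoch": 100.0,
  "eval_loss": NaN,
  "eval_runtime": 35.5402,
  "eval_samples": 509,
  "eval_samples_per_second": 14.322,
  "eval_steps_per_second": 14.322,
  "eval_wer": 1.0,
  "train_loss": 2.8321772269315497,
  "train_runtime": 30413.0752,
  "train_samples": 1035,
  "train_samples_per_second": 3.403,
  "train_steps_per_second": 0.848
}
"""


def find_problems(metrics):
    """Hypothetical helper: return warnings for suspicious metric values."""
    problems = []
    for key, value in metrics.items():
        # NaN or infinite values usually indicate a diverged run.
        if isinstance(value, float) and not math.isfinite(value):
            problems.append(f"{key} is not finite ({value})")
    # A WER of 1.0 means as many word errors as there are reference words.
    if metrics.get("eval_wer", 0.0) >= 1.0:
        problems.append("eval_wer is 1.0 or higher")
    return problems


# json.loads accepts the non-standard NaN literal by default.
metrics = json.loads(ALL_RESULTS)
problems = find_problems(metrics)
for problem in problems:
    print(problem)
```

A check like this could run before a checkpoint is pushed, so that a diverged run is flagged instead of being uploaded silently.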
eval_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+ "epoch": 100.0,
+ "eval_loss": NaN,
+ "eval_runtime": 35.5402,
+ "eval_samples": 509,
+ "eval_samples_per_second": 14.322,
+ "eval_steps_per_second": 14.322,
+ "eval_wer": 1.0
+ }
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b017d326f26e2c3c653d1dba80f5105bad74c0be8d2384b77f9f8016f8f544a5
+ oid sha256:15b9232dd41b0e32cf9d8505d0e93c67d60b373700db56e4580434ac7cc12895
  size 3850486961
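
The change to pytorch_model.bin touches only a Git LFS pointer: the object hash changes while the size stays the same. As a minimal sketch, assuming the pointer is a short text file of space-separated key/value lines as shown in the diff, it could be parsed like this. The function name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling.

```python
# New pointer contents, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:15b9232dd41b0e32cf9d8505d0e93c67d60b373700db56e4580434ac7cc12895
size 3850486961
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split a Git LFS pointer into its fields."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

Comparing the parsed `digest` of the old and new pointers is enough to confirm that the weights changed even though `size_bytes` did not.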
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 100.0,
+ "train_loss": 2.8321772269315497,
+ "train_runtime": 30413.0752,
+ "train_samples": 1035,
+ "train_samples_per_second": 3.403,
+ "train_steps_per_second": 0.848
+ }
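
The throughput figures in train_results.json can be cross-checked with simple arithmetic. The sketch below assumes that throughput is reported as total samples (or optimizer steps) processed divided by wall-clock runtime. That is an assumption about how the numbers were produced, not something the commit states. The step count of 25800 is taken from the `global_step` field of trainer_state.json in this same commit.

```python
# Values copied from train_results.json in the diff above.
train = {
    "epoch": 100.0,
    "train_runtime": 30413.0752,  # seconds
    "train_samples": 1035,
    "train_samples_per_second": 3.403,
    "train_steps_per_second": 0.848,
}

# Taken from "global_step" in trainer_state.json (same commit).
GLOBAL_STEP = 25800

# Assumption: every sample is seen once per epoch.
samples_per_second = train["train_samples"] * train["epoch"] / train["train_runtime"]
steps_per_second = GLOBAL_STEP / train["train_runtime"]
runtime_hours = train["train_runtime"] / 3600

print(f"samples/s: {samples_per_second:.3f}")
print(f"steps/s:   {steps_per_second:.3f}")
print(f"runtime:   {runtime_hours:.2f} h")
```

Under that assumption both recomputed rates round to the reported values, which suggests the summary files are internally consistent even though the run itself diverged.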
trainer_state.json ADDED
@@ -0,0 +1,2032 @@
1
+ {
2
+ "best_metric": null,
3
+ "best_model_checkpoint": null,
4
+ "epoch": 99.99710144927536,
5
+ "global_step": 25800,
6
+ "is_hyper_param_search": false,
7
+ "is_local_process_zero": true,
8
+ "is_world_process_zero": true,
9
+ "log_history": [
10
+ {
11
+ "epoch": 0.39,
12
+ "learning_rate": 0.000194,
13
+ "loss": 4.664,
14
+ "step": 100
15
+ },
16
+ {
17
+ "epoch": 0.77,
18
+ "learning_rate": 0.00039400000000000004,
19
+ "loss": 3.2967,
20
+ "step": 200
21
+ },
22
+ {
23
+ "epoch": 1.16,
24
+ "learning_rate": 0.000594,
25
+ "loss": 3.0785,
26
+ "step": 300
27
+ },
28
+ {
29
+ "epoch": 1.55,
30
+ "learning_rate": 0.0007940000000000001,
31
+ "loss": 3.1121,
32
+ "step": 400
33
+ },
34
+ {
35
+ "epoch": 1.94,
36
+ "learning_rate": 0.000994,
37
+ "loss": 3.0395,
38
+ "step": 500
39
+ },
40
+ {
41
+ "epoch": 1.94,
42
+ "eval_loss": 3.0830917358398438,
43
+ "eval_runtime": 36.9824,
44
+ "eval_samples_per_second": 13.763,
45
+ "eval_steps_per_second": 13.763,
46
+ "eval_wer": 1.0,
47
+ "step": 500
48
+ },
49
+ {
50
+ "epoch": 2.32,
51
+ "learning_rate": 0.0009961660079051383,
52
+ "loss": 3.0808,
53
+ "step": 600
54
+ },
55
+ {
56
+ "epoch": 2.71,
57
+ "learning_rate": 0.0009922134387351778,
58
+ "loss": 3.0233,
59
+ "step": 700
60
+ },
61
+ {
62
+ "epoch": 3.1,
63
+ "learning_rate": 0.0009882608695652175,
64
+ "loss": 3.0452,
65
+ "step": 800
66
+ },
67
+ {
68
+ "epoch": 3.49,
69
+ "learning_rate": 0.000984308300395257,
70
+ "loss": 3.0196,
71
+ "step": 900
72
+ },
73
+ {
74
+ "epoch": 3.87,
75
+ "learning_rate": 0.0009803557312252965,
76
+ "loss": 3.0126,
77
+ "step": 1000
78
+ },
79
+ {
80
+ "epoch": 3.87,
81
+ "eval_loss": 2.993453025817871,
82
+ "eval_runtime": 36.8132,
83
+ "eval_samples_per_second": 13.827,
84
+ "eval_steps_per_second": 13.827,
85
+ "eval_wer": 1.0,
86
+ "step": 1000
87
+ },
88
+ {
89
+ "epoch": 4.26,
90
+ "learning_rate": 0.000976403162055336,
91
+ "loss": 3.0203,
92
+ "step": 1100
93
+ },
94
+ {
95
+ "epoch": 4.65,
96
+ "learning_rate": 0.0009724505928853755,
97
+ "loss": 2.9963,
98
+ "step": 1200
99
+ },
100
+ {
101
+ "epoch": 5.04,
102
+ "learning_rate": 0.0009684980237154151,
103
+ "loss": 2.9925,
104
+ "step": 1300
105
+ },
106
+ {
107
+ "epoch": 5.43,
108
+ "learning_rate": 0.0009645454545454546,
109
+ "loss": 2.9327,
110
+ "step": 1400
111
+ },
112
+ {
113
+ "epoch": 5.81,
114
+ "learning_rate": 0.0009605928853754941,
115
+ "loss": 2.9259,
116
+ "step": 1500
117
+ },
118
+ {
119
+ "epoch": 5.81,
120
+ "eval_loss": 2.991508722305298,
121
+ "eval_runtime": 35.6023,
122
+ "eval_samples_per_second": 14.297,
123
+ "eval_steps_per_second": 14.297,
124
+ "eval_wer": 1.0,
125
+ "step": 1500
126
+ },
127
+ {
128
+ "epoch": 6.2,
129
+ "learning_rate": 0.0009566403162055336,
130
+ "loss": 2.9407,
131
+ "step": 1600
132
+ },
133
+ {
134
+ "epoch": 6.59,
135
+ "learning_rate": 0.0009526877470355732,
136
+ "loss": 2.9415,
137
+ "step": 1700
138
+ },
139
+ {
140
+ "epoch": 6.97,
141
+ "learning_rate": 0.0009487351778656127,
142
+ "loss": 2.9128,
143
+ "step": 1800
144
+ },
145
+ {
146
+ "epoch": 7.36,
147
+ "learning_rate": 0.0009447826086956523,
148
+ "loss": 2.9412,
149
+ "step": 1900
150
+ },
151
+ {
152
+ "epoch": 7.75,
153
+ "learning_rate": 0.0009408300395256917,
154
+ "loss": 2.9109,
155
+ "step": 2000
156
+ },
157
+ {
158
+ "epoch": 7.75,
159
+ "eval_loss": 2.900594472885132,
160
+ "eval_runtime": 36.5216,
161
+ "eval_samples_per_second": 13.937,
162
+ "eval_steps_per_second": 13.937,
163
+ "eval_wer": 1.0,
164
+ "step": 2000
165
+ },
166
+ {
167
+ "epoch": 8.14,
168
+ "learning_rate": 0.0009368774703557313,
169
+ "loss": 2.9261,
170
+ "step": 2100
171
+ },
172
+ {
173
+ "epoch": 8.53,
174
+ "learning_rate": 0.00093300395256917,
175
+ "loss": 2.9683,
176
+ "step": 2200
177
+ },
178
+ {
179
+ "epoch": 8.91,
180
+ "learning_rate": 0.0009290513833992095,
181
+ "loss": 2.9236,
182
+ "step": 2300
183
+ },
184
+ {
185
+ "epoch": 9.3,
186
+ "learning_rate": 0.000925098814229249,
187
+ "loss": 2.9302,
188
+ "step": 2400
189
+ },
190
+ {
191
+ "epoch": 9.69,
192
+ "learning_rate": 0.0009211462450592886,
193
+ "loss": 2.8934,
194
+ "step": 2500
195
+ },
196
+ {
197
+ "epoch": 9.69,
198
+ "eval_loss": 2.926596164703369,
199
+ "eval_runtime": 36.21,
200
+ "eval_samples_per_second": 14.057,
201
+ "eval_steps_per_second": 14.057,
202
+ "eval_wer": 1.0,
203
+ "step": 2500
204
+ },
205
+ {
206
+ "epoch": 10.08,
207
+ "learning_rate": 0.0009171936758893281,
208
+ "loss": 2.9226,
209
+ "step": 2600
210
+ },
211
+ {
212
+ "epoch": 10.46,
213
+ "learning_rate": 0.0009132411067193676,
214
+ "loss": 2.9074,
215
+ "step": 2700
216
+ },
217
+ {
218
+ "epoch": 10.85,
219
+ "learning_rate": 0.0009092885375494071,
220
+ "loss": 2.9047,
221
+ "step": 2800
222
+ },
223
+ {
224
+ "epoch": 11.24,
225
+ "learning_rate": 0.0009053359683794467,
226
+ "loss": 2.959,
227
+ "step": 2900
228
+ },
229
+ {
230
+ "epoch": 11.63,
231
+ "learning_rate": 0.0009013833992094862,
232
+ "loss": 2.9014,
233
+ "step": 3000
234
+ },
235
+ {
236
+ "epoch": 11.63,
237
+ "eval_loss": 2.8970346450805664,
238
+ "eval_runtime": 36.0674,
239
+ "eval_samples_per_second": 14.112,
240
+ "eval_steps_per_second": 14.112,
241
+ "eval_wer": 1.0,
242
+ "step": 3000
243
+ },
244
+ {
245
+ "epoch": 12.02,
246
+ "learning_rate": 0.0008974308300395258,
247
+ "loss": 2.9198,
248
+ "step": 3100
249
+ },
250
+ {
251
+ "epoch": 12.4,
252
+ "learning_rate": 0.0008934782608695652,
253
+ "loss": 2.9008,
254
+ "step": 3200
255
+ },
256
+ {
257
+ "epoch": 12.79,
258
+ "learning_rate": 0.0008895256916996048,
259
+ "loss": 2.8744,
260
+ "step": 3300
261
+ },
262
+ {
263
+ "epoch": 13.18,
264
+ "learning_rate": 0.0008855731225296443,
265
+ "loss": 2.9067,
266
+ "step": 3400
267
+ },
268
+ {
269
+ "epoch": 13.56,
270
+ "learning_rate": 0.0008816205533596839,
271
+ "loss": 2.8932,
272
+ "step": 3500
273
+ },
274
+ {
275
+ "epoch": 13.56,
276
+ "eval_loss": 2.8873867988586426,
277
+ "eval_runtime": 35.7145,
278
+ "eval_samples_per_second": 14.252,
279
+ "eval_steps_per_second": 14.252,
280
+ "eval_wer": 1.0,
281
+ "step": 3500
282
+ },
283
+ {
284
+ "epoch": 13.95,
285
+ "learning_rate": 0.0008776679841897234,
286
+ "loss": 2.8955,
287
+ "step": 3600
288
+ },
289
+ {
290
+ "epoch": 14.34,
291
+ "learning_rate": 0.0008752173913043478,
292
+ "loss": 5.6005,
293
+ "step": 3700
294
+ },
295
+ {
296
+ "epoch": 14.73,
297
+ "learning_rate": 0.0008749802371541502,
298
+ "loss": 183.2638,
299
+ "step": 3800
300
+ },
301
+ {
302
+ "epoch": 15.12,
303
+ "learning_rate": 0.0008722134387351779,
304
+ "loss": 433.3232,
305
+ "step": 3900
306
+ },
307
+ {
308
+ "epoch": 15.5,
309
+ "learning_rate": 0.0008682608695652174,
310
+ "loss": 0.0,
311
+ "step": 4000
312
+ },
313
+ {
314
+ "epoch": 15.5,
315
+ "eval_loss": NaN,
316
+ "eval_runtime": 35.8122,
317
+ "eval_samples_per_second": 14.213,
318
+ "eval_steps_per_second": 14.213,
319
+ "eval_wer": 1.0,
320
+ "step": 4000
321
+ },
322
+ {
323
+ "epoch": 15.89,
324
+ "learning_rate": 0.000864308300395257,
325
+ "loss": 0.0,
326
+ "step": 4100
327
+ },
328
+ {
329
+ "epoch": 16.28,
330
+ "learning_rate": 0.0008603557312252964,
331
+ "loss": 0.0,
332
+ "step": 4200
333
+ },
334
+ {
335
+ "epoch": 16.66,
336
+ "learning_rate": 0.000856403162055336,
337
+ "loss": 0.0,
338
+ "step": 4300
339
+ },
340
+ {
341
+ "epoch": 17.05,
342
+ "learning_rate": 0.0008524505928853755,
343
+ "loss": 0.0,
344
+ "step": 4400
345
+ },
346
+ {
347
+ "epoch": 17.44,
348
+ "learning_rate": 0.0008484980237154151,
349
+ "loss": 0.0,
350
+ "step": 4500
351
+ },
352
+ {
353
+ "epoch": 17.44,
354
+ "eval_loss": NaN,
355
+ "eval_runtime": 35.7216,
356
+ "eval_samples_per_second": 14.249,
357
+ "eval_steps_per_second": 14.249,
358
+ "eval_wer": 1.0,
359
+ "step": 4500
360
+ },
361
+ {
362
+ "epoch": 17.83,
363
+ "learning_rate": 0.0008445454545454546,
364
+ "loss": 0.0,
365
+ "step": 4600
366
+ },
367
+ {
368
+ "epoch": 18.22,
369
+ "learning_rate": 0.0008405928853754941,
370
+ "loss": 0.0,
371
+ "step": 4700
372
+ },
373
+ {
374
+ "epoch": 18.6,
375
+ "learning_rate": 0.0008366403162055336,
376
+ "loss": 0.0,
377
+ "step": 4800
378
+ },
379
+ {
380
+ "epoch": 18.99,
381
+ "learning_rate": 0.0008326877470355732,
382
+ "loss": 0.0,
383
+ "step": 4900
384
+ },
385
+ {
386
+ "epoch": 19.38,
387
+ "learning_rate": 0.0008287351778656127,
388
+ "loss": 0.0,
389
+ "step": 5000
390
+ },
391
+ {
392
+ "epoch": 19.38,
393
+ "eval_loss": NaN,
394
+ "eval_runtime": 36.4137,
395
+ "eval_samples_per_second": 13.978,
396
+ "eval_steps_per_second": 13.978,
397
+ "eval_wer": 1.0,
398
+ "step": 5000
399
+ },
400
+ {
401
+ "epoch": 19.77,
402
+ "learning_rate": 0.0008247826086956522,
403
+ "loss": 0.0,
404
+ "step": 5100
405
+ },
406
+ {
407
+ "epoch": 20.15,
408
+ "learning_rate": 0.0008208300395256917,
409
+ "loss": 0.0,
410
+ "step": 5200
411
+ },
412
+ {
413
+ "epoch": 20.54,
414
+ "learning_rate": 0.0008168774703557313,
415
+ "loss": 0.0,
416
+ "step": 5300
417
+ },
418
+ {
419
+ "epoch": 20.93,
420
+ "learning_rate": 0.0008129249011857708,
421
+ "loss": 0.0,
422
+ "step": 5400
423
+ },
424
+ {
425
+ "epoch": 21.32,
426
+ "learning_rate": 0.0008089723320158103,
427
+ "loss": 0.0,
428
+ "step": 5500
429
+ },
430
+ {
431
+ "epoch": 21.32,
432
+ "eval_loss": NaN,
433
+ "eval_runtime": 36.0046,
434
+ "eval_samples_per_second": 14.137,
435
+ "eval_steps_per_second": 14.137,
436
+ "eval_wer": 1.0,
437
+ "step": 5500
438
+ },
439
+ {
440
+ "epoch": 21.7,
441
+ "learning_rate": 0.0008050197628458498,
442
+ "loss": 0.0,
443
+ "step": 5600
444
+ },
445
+ {
446
+ "epoch": 22.09,
447
+ "learning_rate": 0.0008010671936758893,
448
+ "loss": 0.0,
449
+ "step": 5700
450
+ },
451
+ {
452
+ "epoch": 22.48,
453
+ "learning_rate": 0.0007971146245059289,
454
+ "loss": 0.0,
455
+ "step": 5800
456
+ },
457
+ {
458
+ "epoch": 22.87,
459
+ "learning_rate": 0.0007931620553359684,
460
+ "loss": 0.0,
461
+ "step": 5900
462
+ },
463
+ {
464
+ "epoch": 23.26,
465
+ "learning_rate": 0.000789209486166008,
466
+ "loss": 0.0,
467
+ "step": 6000
468
+ },
469
+ {
470
+ "epoch": 23.26,
471
+ "eval_loss": NaN,
472
+ "eval_runtime": 38.3031,
473
+ "eval_samples_per_second": 13.289,
474
+ "eval_steps_per_second": 13.289,
475
+ "eval_wer": 1.0,
476
+ "step": 6000
477
+ },
478
+ {
479
+ "epoch": 23.64,
480
+ "learning_rate": 0.0007852569169960474,
481
+ "loss": 0.0,
482
+ "step": 6100
483
+ },
484
+ {
485
+ "epoch": 24.03,
486
+ "learning_rate": 0.000781304347826087,
487
+ "loss": 0.0,
488
+ "step": 6200
489
+ },
490
+ {
491
+ "epoch": 24.42,
492
+ "learning_rate": 0.0007773517786561265,
493
+ "loss": 0.0,
494
+ "step": 6300
495
+ },
496
+ {
497
+ "epoch": 24.8,
498
+ "learning_rate": 0.0007733992094861661,
499
+ "loss": 0.0,
500
+ "step": 6400
501
+ },
502
+ {
503
+ "epoch": 25.19,
504
+ "learning_rate": 0.0007694466403162056,
505
+ "loss": 0.0,
506
+ "step": 6500
507
+ },
508
+ {
509
+ "epoch": 25.19,
510
+ "eval_loss": NaN,
511
+ "eval_runtime": 37.5153,
512
+ "eval_samples_per_second": 13.568,
513
+ "eval_steps_per_second": 13.568,
514
+ "eval_wer": 1.0,
515
+ "step": 6500
516
+ },
517
+ {
518
+ "epoch": 25.58,
519
+ "learning_rate": 0.0007654940711462451,
520
+ "loss": 0.0,
521
+ "step": 6600
522
+ },
523
+ {
524
+ "epoch": 25.97,
525
+ "learning_rate": 0.0007615415019762846,
526
+ "loss": 0.0,
527
+ "step": 6700
528
+ },
529
+ {
530
+ "epoch": 26.36,
531
+ "learning_rate": 0.0007575889328063242,
532
+ "loss": 0.0,
533
+ "step": 6800
534
+ },
535
+ {
536
+ "epoch": 26.74,
537
+ "learning_rate": 0.0007536363636363637,
538
+ "loss": 0.0,
539
+ "step": 6900
540
+ },
541
+ {
542
+ "epoch": 27.13,
543
+ "learning_rate": 0.0007496837944664033,
544
+ "loss": 0.0,
545
+ "step": 7000
546
+ },
547
+ {
548
+ "epoch": 27.13,
549
+ "eval_loss": NaN,
550
+ "eval_runtime": 36.3783,
551
+ "eval_samples_per_second": 13.992,
552
+ "eval_steps_per_second": 13.992,
553
+ "eval_wer": 1.0,
554
+ "step": 7000
555
+ },
556
+ {
557
+ "epoch": 27.52,
558
+ "learning_rate": 0.0007457312252964427,
559
+ "loss": 0.0,
560
+ "step": 7100
561
+ },
562
+ {
563
+ "epoch": 27.9,
564
+ "learning_rate": 0.0007417786561264823,
565
+ "loss": 0.0,
566
+ "step": 7200
567
+ },
568
+ {
569
+ "epoch": 28.29,
570
+ "learning_rate": 0.0007378260869565218,
571
+ "loss": 0.0,
572
+ "step": 7300
573
+ },
574
+ {
575
+ "epoch": 28.68,
576
+ "learning_rate": 0.0007338735177865614,
577
+ "loss": 0.0,
578
+ "step": 7400
579
+ },
580
+ {
581
+ "epoch": 29.07,
582
+ "learning_rate": 0.0007299209486166009,
583
+ "loss": 0.0,
584
+ "step": 7500
585
+ },
586
+ {
587
+ "epoch": 29.07,
588
+ "eval_loss": NaN,
589
+ "eval_runtime": 36.3152,
590
+ "eval_samples_per_second": 14.016,
591
+ "eval_steps_per_second": 14.016,
592
+ "eval_wer": 1.0,
593
+ "step": 7500
594
+ },
595
+ {
596
+ "epoch": 29.46,
597
+ "learning_rate": 0.0007259683794466402,
598
+ "loss": 0.0,
599
+ "step": 7600
600
+ },
601
+ {
602
+ "epoch": 29.84,
603
+ "learning_rate": 0.0007220158102766799,
604
+ "loss": 0.0,
605
+ "step": 7700
606
+ },
607
+ {
608
+ "epoch": 30.23,
609
+ "learning_rate": 0.0007180632411067193,
610
+ "loss": 0.0,
611
+ "step": 7800
612
+ },
613
+ {
614
+ "epoch": 30.62,
615
+ "learning_rate": 0.000714110671936759,
616
+ "loss": 0.0,
617
+ "step": 7900
618
+ },
619
+ {
620
+ "epoch": 31.01,
621
+ "learning_rate": 0.0007101581027667984,
622
+ "loss": 0.0,
623
+ "step": 8000
624
+ },
625
+ {
626
+ "epoch": 31.01,
627
+ "eval_loss": NaN,
628
+ "eval_runtime": 38.4864,
629
+ "eval_samples_per_second": 13.225,
630
+ "eval_steps_per_second": 13.225,
631
+ "eval_wer": 1.0,
632
+ "step": 8000
633
+ },
634
+ {
635
+ "epoch": 31.39,
636
+ "learning_rate": 0.0007062055335968379,
637
+ "loss": 0.0,
638
+ "step": 8100
639
+ },
640
+ {
641
+ "epoch": 31.78,
642
+ "learning_rate": 0.0007022529644268774,
643
+ "loss": 0.0,
644
+ "step": 8200
645
+ },
646
+ {
647
+ "epoch": 32.17,
648
+ "learning_rate": 0.000698300395256917,
649
+ "loss": 0.0,
650
+ "step": 8300
651
+ },
652
+ {
653
+ "epoch": 32.56,
654
+ "learning_rate": 0.0006943478260869565,
655
+ "loss": 0.0,
656
+ "step": 8400
657
+ },
658
+ {
659
+ "epoch": 32.94,
660
+ "learning_rate": 0.0006903952569169961,
661
+ "loss": 0.0,
662
+ "step": 8500
663
+ },
664
+ {
665
+ "epoch": 32.94,
666
+ "eval_loss": NaN,
667
+ "eval_runtime": 35.9416,
668
+ "eval_samples_per_second": 14.162,
669
+ "eval_steps_per_second": 14.162,
670
+ "eval_wer": 1.0,
671
+ "step": 8500
672
+ },
673
+ {
674
+ "epoch": 33.33,
675
+ "learning_rate": 0.0006864426877470355,
676
+ "loss": 0.0,
677
+ "step": 8600
678
+ },
679
+ {
680
+ "epoch": 33.72,
681
+ "learning_rate": 0.0006824901185770751,
682
+ "loss": 0.0,
683
+ "step": 8700
684
+ },
685
+ {
686
+ "epoch": 34.11,
687
+ "learning_rate": 0.0006785375494071146,
688
+ "loss": 0.0,
689
+ "step": 8800
690
+ },
691
+ {
692
+ "epoch": 34.49,
693
+ "learning_rate": 0.0006745849802371542,
694
+ "loss": 0.0,
695
+ "step": 8900
696
+ },
697
+ {
698
+ "epoch": 34.88,
699
+ "learning_rate": 0.0006706324110671936,
700
+ "loss": 0.0,
701
+ "step": 9000
702
+ },
703
+ {
704
+ "epoch": 34.88,
705
+ "eval_loss": NaN,
706
+ "eval_runtime": 37.9216,
707
+ "eval_samples_per_second": 13.422,
708
+ "eval_steps_per_second": 13.422,
709
+ "eval_wer": 1.0,
710
+ "step": 9000
711
+ },
712
+ {
713
+ "epoch": 35.27,
714
+ "learning_rate": 0.0006666798418972332,
715
+ "loss": 0.0,
716
+ "step": 9100
717
+ },
718
+ {
719
+ "epoch": 35.66,
720
+ "learning_rate": 0.0006627272727272727,
721
+ "loss": 0.0,
722
+ "step": 9200
723
+ },
724
+ {
725
+ "epoch": 36.05,
726
+ "learning_rate": 0.0006587747035573123,
727
+ "loss": 0.0,
728
+ "step": 9300
729
+ },
730
+ {
731
+ "epoch": 36.43,
732
+ "learning_rate": 0.0006548221343873518,
733
+ "loss": 0.0,
734
+ "step": 9400
735
+ },
736
+ {
737
+ "epoch": 36.82,
738
+ "learning_rate": 0.0006508695652173912,
739
+ "loss": 0.0,
740
+ "step": 9500
741
+ },
742
+ {
743
+ "epoch": 36.82,
744
+ "eval_loss": NaN,
745
+ "eval_runtime": 35.4774,
746
+ "eval_samples_per_second": 14.347,
747
+ "eval_steps_per_second": 14.347,
748
+ "eval_wer": 1.0,
749
+ "step": 9500
750
+ },
751
+ {
752
+ "epoch": 37.21,
753
+ "learning_rate": 0.0006469169960474308,
754
+ "loss": 0.0,
755
+ "step": 9600
756
+ },
757
+ {
758
+ "epoch": 37.6,
759
+ "learning_rate": 0.0006429644268774703,
760
+ "loss": 0.0,
761
+ "step": 9700
762
+ },
763
+ {
764
+ "epoch": 37.98,
765
+ "learning_rate": 0.0006390118577075099,
766
+ "loss": 0.0,
767
+ "step": 9800
768
+ },
769
+ {
770
+ "epoch": 38.37,
771
+ "learning_rate": 0.0006350592885375494,
772
+ "loss": 0.0,
773
+ "step": 9900
774
+ },
775
+ {
776
+ "epoch": 38.76,
777
+ "learning_rate": 0.0006311067193675889,
778
+ "loss": 0.0,
779
+ "step": 10000
780
+ },
781
+ {
782
+ "epoch": 38.76,
783
+ "eval_loss": NaN,
784
+ "eval_runtime": 35.6029,
785
+ "eval_samples_per_second": 14.297,
786
+ "eval_steps_per_second": 14.297,
787
+ "eval_wer": 1.0,
788
+ "step": 10000
789
+ },
790
+ {
791
+ "epoch": 39.15,
792
+ "learning_rate": 0.0006271541501976284,
793
+ "loss": 0.0,
794
+ "step": 10100
795
+ },
796
+ {
797
+ "epoch": 39.53,
798
+ "learning_rate": 0.000623201581027668,
799
+ "loss": 0.0,
800
+ "step": 10200
801
+ },
802
+ {
803
+ "epoch": 39.92,
804
+ "learning_rate": 0.0006192490118577075,
805
+ "loss": 0.0,
806
+ "step": 10300
807
+ },
808
+ {
809
+ "epoch": 40.31,
810
+ "learning_rate": 0.0006152964426877471,
811
+ "loss": 0.0,
812
+ "step": 10400
813
+ },
814
+ {
815
+ "epoch": 40.7,
816
+ "learning_rate": 0.0006113438735177865,
817
+ "loss": 0.0,
818
+ "step": 10500
819
+ },
820
+ {
821
+ "epoch": 40.7,
822
+ "eval_loss": NaN,
823
+ "eval_runtime": 35.5775,
824
+ "eval_samples_per_second": 14.307,
825
+ "eval_steps_per_second": 14.307,
826
+ "eval_wer": 1.0,
827
+ "step": 10500
828
+ },
829
+ {
830
+ "epoch": 41.09,
831
+ "learning_rate": 0.0006073913043478261,
832
+ "loss": 0.0,
833
+ "step": 10600
834
+ },
835
+ {
836
+ "epoch": 41.47,
837
+ "learning_rate": 0.0006034387351778656,
838
+ "loss": 0.0,
839
+ "step": 10700
840
+ },
841
+ {
842
+ "epoch": 41.86,
843
+ "learning_rate": 0.0005994861660079052,
844
+ "loss": 0.0,
845
+ "step": 10800
846
+ },
847
+ {
848
+ "epoch": 42.25,
849
+ "learning_rate": 0.0005955335968379447,
850
+ "loss": 0.0,
851
+ "step": 10900
852
+ },
853
+ {
854
+ "epoch": 42.63,
855
+ "learning_rate": 0.0005915810276679842,
856
+ "loss": 0.0,
857
+ "step": 11000
858
+ },
859
+ {
860
+ "epoch": 42.63,
861
+ "eval_loss": NaN,
862
+ "eval_runtime": 36.4216,
863
+ "eval_samples_per_second": 13.975,
864
+ "eval_steps_per_second": 13.975,
865
+ "eval_wer": 1.0,
866
+ "step": 11000
867
+ },
868
+ {
869
+ "epoch": 43.02,
870
+ "learning_rate": 0.0005876284584980237,
871
+ "loss": 0.0,
872
+ "step": 11100
873
+ },
874
+ {
875
+ "epoch": 43.41,
876
+ "learning_rate": 0.0005836758893280633,
877
+ "loss": 0.0,
878
+ "step": 11200
879
+ },
880
+ {
881
+ "epoch": 43.8,
882
+ "learning_rate": 0.0005797233201581028,
883
+ "loss": 0.0,
884
+ "step": 11300
885
+ },
886
+ {
887
+ "epoch": 44.19,
888
+ "learning_rate": 0.0005757707509881423,
889
+ "loss": 0.0,
890
+ "step": 11400
891
+ },
892
+ {
893
+ "epoch": 44.57,
894
+ "learning_rate": 0.0005718181818181818,
895
+ "loss": 0.0,
896
+ "step": 11500
897
+ },
898
+ {
899
+ "epoch": 44.57,
900
+ "eval_loss": NaN,
901
+ "eval_runtime": 36.0689,
902
+ "eval_samples_per_second": 14.112,
903
+ "eval_steps_per_second": 14.112,
904
+ "eval_wer": 1.0,
905
+ "step": 11500
906
+ },
907
+ {
908
+ "epoch": 44.96,
909
+ "learning_rate": 0.0005678656126482213,
910
+ "loss": 0.0,
911
+ "step": 11600
912
+ },
913
+ {
914
+ "epoch": 45.35,
915
+ "learning_rate": 0.0005639130434782609,
916
+ "loss": 0.0,
917
+ "step": 11700
918
+ },
919
+ {
920
+ "epoch": 45.73,
921
+ "learning_rate": 0.0005599604743083004,
922
+ "loss": 0.0,
923
+ "step": 11800
924
+ },
925
+ {
926
+ "epoch": 46.12,
927
+ "learning_rate": 0.00055600790513834,
928
+ "loss": 0.0,
929
+ "step": 11900
930
+ },
931
+ {
932
+ "epoch": 46.51,
933
+ "learning_rate": 0.0005520553359683794,
934
+ "loss": 0.0,
935
+ "step": 12000
936
+ },
937
+ {
938
+ "epoch": 46.51,
939
+ "eval_loss": NaN,
940
+ "eval_runtime": 36.0236,
941
+ "eval_samples_per_second": 14.13,
942
+ "eval_steps_per_second": 14.13,
943
+ "eval_wer": 1.0,
944
+ "step": 12000
945
+ },
946
+ {
947
+ "epoch": 46.9,
948
+ "learning_rate": 0.000548102766798419,
949
+ "loss": 0.0,
950
+ "step": 12100
951
+ },
952
+ {
953
+ "epoch": 47.29,
954
+ "learning_rate": 0.0005441501976284585,
955
+ "loss": 0.0,
956
+ "step": 12200
957
+ },
958
+ {
959
+ "epoch": 47.67,
960
+ "learning_rate": 0.0005401976284584981,
961
+ "loss": 0.0,
962
+ "step": 12300
963
+ },
964
+ {
965
+ "epoch": 48.06,
966
+ "learning_rate": 0.0005362450592885375,
967
+ "loss": 0.0,
968
+ "step": 12400
969
+ },
970
+ {
971
+ "epoch": 48.45,
972
+ "learning_rate": 0.0005322924901185771,
973
+ "loss": 0.0,
974
+ "step": 12500
975
+ },
976
+ {
977
+ "epoch": 48.45,
978
+ "eval_loss": NaN,
979
+ "eval_runtime": 34.6724,
980
+ "eval_samples_per_second": 14.68,
981
+ "eval_steps_per_second": 14.68,
982
+ "eval_wer": 1.0,
983
+ "step": 12500
984
+ },
985
+ {
986
+ "epoch": 48.83,
987
+ "learning_rate": 0.0005283399209486166,
988
+ "loss": 0.0,
989
+ "step": 12600
990
+ },
991
+ {
992
+ "epoch": 49.22,
993
+ "learning_rate": 0.0005243873517786562,
994
+ "loss": 0.0,
995
+ "step": 12700
996
+ },
997
+ {
998
+ "epoch": 49.61,
999
+ "learning_rate": 0.0005204347826086957,
1000
+ "loss": 0.0,
1001
+ "step": 12800
1002
+ },
1003
+ {
1004
+ "epoch": 50.0,
1005
+ "learning_rate": 0.0005164822134387352,
1006
+ "loss": 0.0,
1007
+ "step": 12900
1008
+ },
1009
+ {
1010
+ "epoch": 50.39,
1011
+ "learning_rate": 0.0005125296442687747,
1012
+ "loss": 0.0,
1013
+ "step": 13000
1014
+ },
1015
+ {
1016
+ "epoch": 50.39,
1017
+ "eval_loss": NaN,
1018
+ "eval_runtime": 35.7412,
1019
+ "eval_samples_per_second": 14.241,
1020
+ "eval_steps_per_second": 14.241,
1021
+ "eval_wer": 1.0,
1022
+ "step": 13000
1023
+ },
1024
+ {
1025
+ "epoch": 50.77,
1026
+ "learning_rate": 0.0005085770750988143,
1027
+ "loss": 0.0,
1028
+ "step": 13100
1029
+ },
1030
+ {
1031
+ "epoch": 51.16,
1032
+ "learning_rate": 0.0005046245059288538,
1033
+ "loss": 0.0,
1034
+ "step": 13200
1035
+ },
1036
+ {
1037
+ "epoch": 51.55,
1038
+ "learning_rate": 0.0005006719367588933,
1039
+ "loss": 0.0,
1040
+ "step": 13300
1041
+ },
1042
+ {
1043
+ "epoch": 51.94,
1044
+ "learning_rate": 0.0004967193675889328,
1045
+ "loss": 0.0,
1046
+ "step": 13400
1047
+ },
1048
+ {
1049
+ "epoch": 52.32,
1050
+ "learning_rate": 0.0004927667984189723,
1051
+ "loss": 0.0,
1052
+ "step": 13500
1053
+ },
1054
+ {
1055
+ "epoch": 52.32,
1056
+ "eval_loss": NaN,
1057
+ "eval_runtime": 35.9753,
1058
+ "eval_samples_per_second": 14.149,
1059
+ "eval_steps_per_second": 14.149,
1060
+ "eval_wer": 1.0,
1061
+ "step": 13500
1062
+ },
1063
+ {
1064
+ "epoch": 52.71,
1065
+ "learning_rate": 0.0004888142292490119,
1066
+ "loss": 0.0,
1067
+ "step": 13600
1068
+ },
1069
+ {
1070
+ "epoch": 53.1,
1071
+ "learning_rate": 0.00048486166007905143,
1072
+ "loss": 0.0,
1073
+ "step": 13700
1074
+ },
1075
+ {
1076
+ "epoch": 53.49,
1077
+ "learning_rate": 0.0004809090909090909,
1078
+ "loss": 0.0,
1079
+ "step": 13800
1080
+ },
1081
+ {
1082
+ "epoch": 53.87,
1083
+ "learning_rate": 0.0004769565217391305,
1084
+ "loss": 0.0,
1085
+ "step": 13900
1086
+ },
1087
+ {
1088
+ "epoch": 54.26,
1089
+ "learning_rate": 0.00047300395256916997,
1090
+ "loss": 0.0,
1091
+ "step": 14000
1092
+ },
1093
+ {
1094
+ "epoch": 54.26,
1095
+ "eval_loss": NaN,
1096
+ "eval_runtime": 35.9589,
1097
+ "eval_samples_per_second": 14.155,
1098
+ "eval_steps_per_second": 14.155,
1099
+ "eval_wer": 1.0,
1100
+ "step": 14000
1101
+ },
1102
+ {
1103
+ "epoch": 54.65,
1104
+ "learning_rate": 0.0004690513833992095,
1105
+ "loss": 0.0,
1106
+ "step": 14100
1107
+ },
1108
+ {
1109
+ "epoch": 55.04,
1110
+ "learning_rate": 0.00046509881422924907,
1111
+ "loss": 0.0,
1112
+ "step": 14200
1113
+ },
1114
+ {
1115
+ "epoch": 55.43,
1116
+ "learning_rate": 0.0004611462450592885,
1117
+ "loss": 0.0,
1118
+ "step": 14300
1119
+ },
1120
+ {
1121
+ "epoch": 55.81,
1122
+ "learning_rate": 0.00045719367588932807,
1123
+ "loss": 0.0,
1124
+ "step": 14400
1125
+ },
1126
+ {
1127
+ "epoch": 56.2,
1128
+ "learning_rate": 0.00045324110671936756,
1129
+ "loss": 0.0,
1130
+ "step": 14500
1131
+ },
1132
+ {
1133
+ "epoch": 56.2,
1134
+ "eval_loss": NaN,
1135
+ "eval_runtime": 36.7278,
1136
+ "eval_samples_per_second": 13.859,
1137
+ "eval_steps_per_second": 13.859,
1138
+ "eval_wer": 1.0,
1139
+ "step": 14500
1140
+ },
1141
+ {
1142
+ "epoch": 56.59,
1143
+ "learning_rate": 0.0004492885375494071,
1144
+ "loss": 0.0,
1145
+ "step": 14600
1146
+ },
1147
+ {
1148
+ "epoch": 56.97,
1149
+ "learning_rate": 0.00044533596837944666,
1150
+ "loss": 0.0,
1151
+ "step": 14700
1152
+ },
1153
+ {
1154
+ "epoch": 57.36,
1155
+ "learning_rate": 0.00044138339920948616,
1156
+ "loss": 0.0,
1157
+ "step": 14800
1158
+ },
1159
+ {
1160
+ "epoch": 57.75,
1161
+ "learning_rate": 0.0004374308300395257,
1162
+ "loss": 0.0,
1163
+ "step": 14900
1164
+ },
1165
+ {
1166
+ "epoch": 58.14,
1167
+ "learning_rate": 0.0004334782608695652,
1168
+ "loss": 0.0,
1169
+ "step": 15000
1170
+ },
1171
+ {
1172
+ "epoch": 58.14,
1173
+ "eval_loss": NaN,
1174
+ "eval_runtime": 36.3759,
1175
+ "eval_samples_per_second": 13.993,
1176
+ "eval_steps_per_second": 13.993,
1177
+ "eval_wer": 1.0,
1178
+ "step": 15000
1179
+ },
1180
+ {
1181
+ "epoch": 58.53,
1182
+ "learning_rate": 0.00042952569169960476,
1183
+ "loss": 0.0,
1184
+ "step": 15100
1185
+ },
1186
+ {
1187
+ "epoch": 58.91,
1188
+ "learning_rate": 0.00042557312252964425,
1189
+ "loss": 0.0,
1190
+ "step": 15200
1191
+ },
1192
+ {
1193
+ "epoch": 59.3,
1194
+ "learning_rate": 0.0004216205533596838,
1195
+ "loss": 0.0,
1196
+ "step": 15300
1197
+ },
1198
+ {
1199
+ "epoch": 59.69,
1200
+ "learning_rate": 0.00041766798418972336,
1201
+ "loss": 0.0,
1202
+ "step": 15400
1203
+ },
1204
+ {
1205
+ "epoch": 60.08,
1206
+ "learning_rate": 0.00041371541501976285,
1207
+ "loss": 0.0,
1208
+ "step": 15500
1209
+ },
1210
+ {
1211
+ "epoch": 60.08,
1212
+ "eval_loss": NaN,
1213
+ "eval_runtime": 38.2781,
1214
+ "eval_samples_per_second": 13.297,
1215
+ "eval_steps_per_second": 13.297,
1216
+ "eval_wer": 1.0,
1217
+ "step": 15500
1218
+ },
1219
+ {
1220
+ "epoch": 60.46,
1221
+ "learning_rate": 0.0004097628458498024,
1222
+ "loss": 0.0,
1223
+ "step": 15600
1224
+ },
1225
+ {
1226
+ "epoch": 60.85,
1227
+ "learning_rate": 0.0004058102766798419,
1228
+ "loss": 0.0,
1229
+ "step": 15700
1230
+ },
1231
+ {
1232
+ "epoch": 61.24,
1233
+ "learning_rate": 0.00040185770750988145,
1234
+ "loss": 0.0,
1235
+ "step": 15800
1236
+ },
1237
+ {
1238
+ "epoch": 61.63,
1239
+ "learning_rate": 0.000397905138339921,
1240
+ "loss": 0.0,
1241
+ "step": 15900
1242
+ },
1243
+ {
1244
+ "epoch": 62.02,
1245
+ "learning_rate": 0.0003939525691699605,
1246
+ "loss": 0.0,
1247
+ "step": 16000
1248
+ },
1249
+ {
1250
+ "epoch": 62.02,
1251
+ "eval_loss": NaN,
1252
+ "eval_runtime": 35.3802,
1253
+ "eval_samples_per_second": 14.387,
1254
+ "eval_steps_per_second": 14.387,
1255
+ "eval_wer": 1.0,
1256
+ "step": 16000
1257
+ },
1258
+ {
1259
+ "epoch": 62.4,
1260
+ "learning_rate": 0.00039000000000000005,
1261
+ "loss": 0.0,
1262
+ "step": 16100
1263
+ },
1264
+ {
1265
+ "epoch": 62.79,
1266
+ "learning_rate": 0.0003860474308300395,
1267
+ "loss": 0.0,
1268
+ "step": 16200
1269
+ },
1270
+ {
1271
+ "epoch": 63.18,
1272
+ "learning_rate": 0.00038209486166007904,
1273
+ "loss": 0.0,
1274
+ "step": 16300
1275
+ },
1276
+ {
1277
+ "epoch": 63.56,
1278
+ "learning_rate": 0.0003781422924901186,
1279
+ "loss": 0.0,
1280
+ "step": 16400
1281
+ },
1282
+ {
1283
+ "epoch": 63.95,
1284
+ "learning_rate": 0.0003741897233201581,
1285
+ "loss": 0.0,
1286
+ "step": 16500
1287
+ },
1288
+ {
1289
+ "epoch": 63.95,
1290
+ "eval_loss": NaN,
1291
+ "eval_runtime": 34.3726,
1292
+ "eval_samples_per_second": 14.808,
1293
+ "eval_steps_per_second": 14.808,
1294
+ "eval_wer": 1.0,
1295
+ "step": 16500
1296
+ },
1297
+ {
1298
+ "epoch": 64.34,
1299
+ "learning_rate": 0.00037023715415019764,
1300
+ "loss": 0.0,
1301
+ "step": 16600
1302
+ },
1303
+ {
1304
+ "epoch": 64.73,
1305
+ "learning_rate": 0.00036628458498023713,
1306
+ "loss": 0.0,
1307
+ "step": 16700
1308
+ },
1309
+ {
1310
+ "epoch": 65.12,
1311
+ "learning_rate": 0.0003623320158102767,
1312
+ "loss": 0.0,
1313
+ "step": 16800
1314
+ },
1315
+ {
1316
+ "epoch": 65.5,
1317
+ "learning_rate": 0.0003583794466403162,
1318
+ "loss": 0.0,
1319
+ "step": 16900
1320
+ },
1321
+ {
1322
+ "epoch": 65.89,
1323
+ "learning_rate": 0.00035442687747035573,
1324
+ "loss": 0.0,
1325
+ "step": 17000
1326
+ },
1327
+ {
1328
+ "epoch": 65.89,
1329
+ "eval_loss": NaN,
1330
+ "eval_runtime": 36.2073,
1331
+ "eval_samples_per_second": 14.058,
1332
+ "eval_steps_per_second": 14.058,
1333
+ "eval_wer": 1.0,
1334
+ "step": 17000
1335
+ },
1336
+ {
1337
+ "epoch": 66.28,
1338
+ "learning_rate": 0.0003504743083003953,
1339
+ "loss": 0.0,
1340
+ "step": 17100
1341
+ },
1342
+ {
1343
+ "epoch": 66.66,
1344
+ "learning_rate": 0.0003465217391304348,
1345
+ "loss": 0.0,
1346
+ "step": 17200
1347
+ },
1348
+ {
1349
+ "epoch": 67.05,
1350
+ "learning_rate": 0.00034256916996047433,
1351
+ "loss": 0.0,
1352
+ "step": 17300
1353
+ },
1354
+ {
1355
+ "epoch": 67.44,
1356
+ "learning_rate": 0.0003386166007905138,
1357
+ "loss": 0.0,
1358
+ "step": 17400
1359
+ },
1360
+ {
1361
+ "epoch": 67.83,
1362
+ "learning_rate": 0.0003346640316205534,
1363
+ "loss": 0.0,
1364
+ "step": 17500
1365
+ },
1366
+ {
1367
+ "epoch": 67.83,
1368
+ "eval_loss": NaN,
1369
+ "eval_runtime": 35.6467,
1370
+ "eval_samples_per_second": 14.279,
1371
+ "eval_steps_per_second": 14.279,
1372
+ "eval_wer": 1.0,
1373
+ "step": 17500
1374
+ },
1375
+ {
1376
+ "epoch": 68.22,
1377
+ "learning_rate": 0.00033071146245059293,
1378
+ "loss": 0.0,
1379
+ "step": 17600
1380
+ },
1381
+ {
1382
+ "epoch": 68.6,
1383
+ "learning_rate": 0.0003267588932806324,
1384
+ "loss": 0.0,
1385
+ "step": 17700
1386
+ },
1387
+ {
1388
+ "epoch": 68.99,
1389
+ "learning_rate": 0.000322806324110672,
1390
+ "loss": 0.0,
1391
+ "step": 17800
1392
+ },
1393
+ {
1394
+ "epoch": 69.38,
1395
+ "learning_rate": 0.00031885375494071147,
1396
+ "loss": 0.0,
1397
+ "step": 17900
1398
+ },
1399
+ {
1400
+ "epoch": 69.77,
1401
+ "learning_rate": 0.000314901185770751,
1402
+ "loss": 0.0,
1403
+ "step": 18000
1404
+ },
1405
+ {
1406
+ "epoch": 69.77,
1407
+ "eval_loss": NaN,
1408
+ "eval_runtime": 35.7988,
1409
+ "eval_samples_per_second": 14.218,
1410
+ "eval_steps_per_second": 14.218,
1411
+ "eval_wer": 1.0,
1412
+ "step": 18000
1413
+ },
1414
+ {
1415
+ "epoch": 70.15,
1416
+ "learning_rate": 0.00031094861660079057,
1417
+ "loss": 0.0,
1418
+ "step": 18100
1419
+ },
1420
+ {
1421
+ "epoch": 70.54,
1422
+ "learning_rate": 0.00030699604743083,
1423
+ "loss": 0.0,
1424
+ "step": 18200
1425
+ },
1426
+ {
1427
+ "epoch": 70.93,
1428
+ "learning_rate": 0.00030304347826086957,
1429
+ "loss": 0.0,
1430
+ "step": 18300
1431
+ },
1432
+ {
1433
+ "epoch": 71.32,
1434
+ "learning_rate": 0.00029909090909090906,
1435
+ "loss": 0.0,
1436
+ "step": 18400
1437
+ },
1438
+ {
1439
+ "epoch": 71.7,
1440
+ "learning_rate": 0.0002951383399209486,
1441
+ "loss": 0.0,
1442
+ "step": 18500
1443
+ },
1444
+ {
1445
+ "epoch": 71.7,
1446
+ "eval_loss": NaN,
1447
+ "eval_runtime": 36.0176,
1448
+ "eval_samples_per_second": 14.132,
1449
+ "eval_steps_per_second": 14.132,
1450
+ "eval_wer": 1.0,
1451
+ "step": 18500
1452
+ },
1453
+ {
1454
+ "epoch": 72.09,
1455
+ "learning_rate": 0.0002911857707509881,
1456
+ "loss": 0.0,
1457
+ "step": 18600
1458
+ },
1459
+ {
1460
+ "epoch": 72.48,
1461
+ "learning_rate": 0.00028723320158102766,
1462
+ "loss": 0.0,
1463
+ "step": 18700
1464
+ },
1465
+ {
1466
+ "epoch": 72.87,
1467
+ "learning_rate": 0.0002832806324110672,
1468
+ "loss": 0.0,
1469
+ "step": 18800
1470
+ },
1471
+ {
1472
+ "epoch": 73.26,
1473
+ "learning_rate": 0.0002793280632411067,
1474
+ "loss": 0.0,
1475
+ "step": 18900
1476
+ },
1477
+ {
1478
+ "epoch": 73.64,
1479
+ "learning_rate": 0.00027537549407114626,
1480
+ "loss": 0.0,
1481
+ "step": 19000
1482
+ },
1483
+ {
1484
+ "epoch": 73.64,
1485
+ "eval_loss": NaN,
1486
+ "eval_runtime": 37.4296,
1487
+ "eval_samples_per_second": 13.599,
1488
+ "eval_steps_per_second": 13.599,
1489
+ "eval_wer": 1.0,
1490
+ "step": 19000
1491
+ },
1492
+ {
1493
+ "epoch": 74.03,
1494
+ "learning_rate": 0.00027142292490118575,
1495
+ "loss": 0.0,
1496
+ "step": 19100
1497
+ },
1498
+ {
1499
+ "epoch": 74.42,
1500
+ "learning_rate": 0.0002674703557312253,
1501
+ "loss": 0.0,
1502
+ "step": 19200
1503
+ },
1504
+ {
1505
+ "epoch": 74.8,
1506
+ "learning_rate": 0.00026351778656126486,
1507
+ "loss": 0.0,
1508
+ "step": 19300
1509
+ },
1510
+ {
1511
+ "epoch": 75.19,
1512
+ "learning_rate": 0.00025956521739130435,
1513
+ "loss": 0.0,
1514
+ "step": 19400
1515
+ },
1516
+ {
1517
+ "epoch": 75.58,
1518
+ "learning_rate": 0.0002556126482213439,
1519
+ "loss": 0.0,
1520
+ "step": 19500
1521
+ },
1522
+ {
1523
+ "epoch": 75.58,
1524
+ "eval_loss": NaN,
1525
+ "eval_runtime": 35.6339,
1526
+ "eval_samples_per_second": 14.284,
1527
+ "eval_steps_per_second": 14.284,
1528
+ "eval_wer": 1.0,
1529
+ "step": 19500
1530
+ },
1531
+ {
1532
+ "epoch": 75.97,
1533
+ "learning_rate": 0.0002516600790513834,
1534
+ "loss": 0.0,
1535
+ "step": 19600
1536
+ },
1537
+ {
1538
+ "epoch": 76.36,
1539
+ "learning_rate": 0.00024770750988142295,
1540
+ "loss": 0.0,
1541
+ "step": 19700
1542
+ },
1543
+ {
1544
+ "epoch": 76.74,
1545
+ "learning_rate": 0.00024375494071146245,
1546
+ "loss": 0.0,
1547
+ "step": 19800
1548
+ },
1549
+ {
1550
+ "epoch": 77.13,
1551
+ "learning_rate": 0.00023980237154150197,
1552
+ "loss": 0.0,
1553
+ "step": 19900
1554
+ },
1555
+ {
1556
+ "epoch": 77.52,
1557
+ "learning_rate": 0.0002358498023715415,
1558
+ "loss": 0.0,
1559
+ "step": 20000
1560
+ },
1561
+ {
1562
+ "epoch": 77.52,
1563
+ "eval_loss": NaN,
1564
+ "eval_runtime": 36.2228,
1565
+ "eval_samples_per_second": 14.052,
1566
+ "eval_steps_per_second": 14.052,
1567
+ "eval_wer": 1.0,
1568
+ "step": 20000
1569
+ },
1570
+ {
1571
+ "epoch": 77.9,
1572
+ "learning_rate": 0.00023189723320158104,
1573
+ "loss": 0.0,
1574
+ "step": 20100
1575
+ },
1576
+ {
1577
+ "epoch": 78.29,
1578
+ "learning_rate": 0.00022794466403162057,
1579
+ "loss": 0.0,
1580
+ "step": 20200
1581
+ },
1582
+ {
1583
+ "epoch": 78.68,
1584
+ "learning_rate": 0.0002239920948616601,
1585
+ "loss": 0.0,
1586
+ "step": 20300
1587
+ },
1588
+ {
1589
+ "epoch": 79.07,
1590
+ "learning_rate": 0.00022003952569169962,
1591
+ "loss": 0.0,
1592
+ "step": 20400
1593
+ },
1594
+ {
1595
+ "epoch": 79.46,
1596
+ "learning_rate": 0.00021608695652173914,
1597
+ "loss": 0.0,
1598
+ "step": 20500
1599
+ },
1600
+ {
1601
+ "epoch": 79.46,
1602
+ "eval_loss": NaN,
1603
+ "eval_runtime": 36.0208,
1604
+ "eval_samples_per_second": 14.131,
1605
+ "eval_steps_per_second": 14.131,
1606
+ "eval_wer": 1.0,
1607
+ "step": 20500
1608
+ },
1609
+ {
1610
+ "epoch": 79.84,
1611
+ "learning_rate": 0.00021213438735177866,
1612
+ "loss": 0.0,
1613
+ "step": 20600
1614
+ },
1615
+ {
1616
+ "epoch": 80.23,
1617
+ "learning_rate": 0.00020818181818181819,
1618
+ "loss": 0.0,
1619
+ "step": 20700
1620
+ },
1621
+ {
1622
+ "epoch": 80.62,
1623
+ "learning_rate": 0.0002042292490118577,
1624
+ "loss": 0.0,
1625
+ "step": 20800
1626
+ },
1627
+ {
1628
+ "epoch": 81.01,
1629
+ "learning_rate": 0.00020027667984189723,
1630
+ "loss": 0.0,
1631
+ "step": 20900
1632
+ },
1633
+ {
1634
+ "epoch": 81.39,
1635
+ "learning_rate": 0.00019632411067193676,
1636
+ "loss": 0.0,
1637
+ "step": 21000
1638
+ },
1639
+ {
1640
+ "epoch": 81.39,
1641
+ "eval_loss": NaN,
1642
+ "eval_runtime": 36.0233,
1643
+ "eval_samples_per_second": 14.13,
1644
+ "eval_steps_per_second": 14.13,
1645
+ "eval_wer": 1.0,
1646
+ "step": 21000
1647
+ },
1648
+ {
1649
+ "epoch": 81.78,
1650
+ "learning_rate": 0.00019237154150197628,
1651
+ "loss": 0.0,
1652
+ "step": 21100
1653
+ },
1654
+ {
1655
+ "epoch": 82.17,
1656
+ "learning_rate": 0.00018841897233201583,
1657
+ "loss": 0.0,
1658
+ "step": 21200
1659
+ },
1660
+ {
1661
+ "epoch": 82.56,
1662
+ "learning_rate": 0.00018446640316205535,
1663
+ "loss": 0.0,
1664
+ "step": 21300
1665
+ },
1666
+ {
1667
+ "epoch": 82.94,
1668
+ "learning_rate": 0.00018051383399209488,
1669
+ "loss": 0.0,
1670
+ "step": 21400
1671
+ },
1672
+ {
1673
+ "epoch": 83.33,
1674
+ "learning_rate": 0.00017656126482213437,
1675
+ "loss": 0.0,
1676
+ "step": 21500
1677
+ },
1678
+ {
1679
+ "epoch": 83.33,
1680
+ "eval_loss": NaN,
1681
+ "eval_runtime": 35.7944,
1682
+ "eval_samples_per_second": 14.22,
1683
+ "eval_steps_per_second": 14.22,
1684
+ "eval_wer": 1.0,
1685
+ "step": 21500
1686
+ },
1687
+ {
1688
+ "epoch": 83.72,
1689
+ "learning_rate": 0.0001726086956521739,
1690
+ "loss": 0.0,
1691
+ "step": 21600
1692
+ },
1693
+ {
1694
+ "epoch": 84.11,
1695
+ "learning_rate": 0.00016865612648221342,
1696
+ "loss": 0.0,
1697
+ "step": 21700
1698
+ },
1699
+ {
1700
+ "epoch": 84.49,
1701
+ "learning_rate": 0.00016470355731225297,
1702
+ "loss": 0.0,
1703
+ "step": 21800
1704
+ },
1705
+ {
1706
+ "epoch": 84.88,
1707
+ "learning_rate": 0.0001607509881422925,
1708
+ "loss": 0.0,
1709
+ "step": 21900
1710
+ },
1711
+ {
1712
+ "epoch": 85.27,
1713
+ "learning_rate": 0.00015679841897233202,
1714
+ "loss": 0.0,
1715
+ "step": 22000
1716
+ },
1717
+ {
1718
+ "epoch": 85.27,
1719
+ "eval_loss": NaN,
1720
+ "eval_runtime": 37.0305,
1721
+ "eval_samples_per_second": 13.745,
1722
+ "eval_steps_per_second": 13.745,
1723
+ "eval_wer": 1.0,
1724
+ "step": 22000
1725
+ },
1726
+ {
1727
+ "epoch": 85.66,
1728
+ "learning_rate": 0.00015284584980237154,
1729
+ "loss": 0.0,
1730
+ "step": 22100
1731
+ },
1732
+ {
1733
+ "epoch": 86.05,
1734
+ "learning_rate": 0.00014889328063241107,
1735
+ "loss": 0.0,
1736
+ "step": 22200
1737
+ },
1738
+ {
1739
+ "epoch": 86.43,
1740
+ "learning_rate": 0.00014494071146245062,
1741
+ "loss": 0.0,
1742
+ "step": 22300
1743
+ },
1744
+ {
1745
+ "epoch": 86.82,
1746
+ "learning_rate": 0.00014098814229249014,
1747
+ "loss": 0.0,
1748
+ "step": 22400
1749
+ },
1750
+ {
1751
+ "epoch": 87.21,
1752
+ "learning_rate": 0.00013703557312252964,
1753
+ "loss": 0.0,
1754
+ "step": 22500
1755
+ },
1756
+ {
1757
+ "epoch": 87.21,
1758
+ "eval_loss": NaN,
1759
+ "eval_runtime": 36.3702,
1760
+ "eval_samples_per_second": 13.995,
1761
+ "eval_steps_per_second": 13.995,
1762
+ "eval_wer": 1.0,
1763
+ "step": 22500
1764
+ },
1765
+ {
1766
+ "epoch": 87.6,
1767
+ "learning_rate": 0.00013308300395256916,
1768
+ "loss": 0.0,
1769
+ "step": 22600
1770
+ },
1771
+ {
1772
+ "epoch": 87.98,
1773
+ "learning_rate": 0.00012913043478260868,
1774
+ "loss": 0.0,
1775
+ "step": 22700
1776
+ },
1777
+ {
1778
+ "epoch": 88.37,
1779
+ "learning_rate": 0.0001251778656126482,
1780
+ "loss": 0.0,
1781
+ "step": 22800
1782
+ },
1783
+ {
1784
+ "epoch": 88.76,
1785
+ "learning_rate": 0.00012122529644268775,
1786
+ "loss": 0.0,
1787
+ "step": 22900
1788
+ },
1789
+ {
1790
+ "epoch": 89.15,
1791
+ "learning_rate": 0.00011727272727272728,
1792
+ "loss": 0.0,
1793
+ "step": 23000
1794
+ },
1795
+ {
1796
+ "epoch": 89.15,
1797
+ "eval_loss": NaN,
1798
+ "eval_runtime": 35.6045,
1799
+ "eval_samples_per_second": 14.296,
1800
+ "eval_steps_per_second": 14.296,
1801
+ "eval_wer": 1.0,
1802
+ "step": 23000
1803
+ },
1804
+ {
1805
+ "epoch": 89.53,
1806
+ "learning_rate": 0.0001133201581027668,
1807
+ "loss": 0.0,
1808
+ "step": 23100
1809
+ },
1810
+ {
1811
+ "epoch": 89.92,
1812
+ "learning_rate": 0.00010936758893280632,
1813
+ "loss": 0.0,
1814
+ "step": 23200
1815
+ },
1816
+ {
1817
+ "epoch": 90.31,
1818
+ "learning_rate": 0.00010541501976284585,
1819
+ "loss": 0.0,
1820
+ "step": 23300
1821
+ },
1822
+ {
1823
+ "epoch": 90.7,
1824
+ "learning_rate": 0.00010146245059288538,
1825
+ "loss": 0.0,
1826
+ "step": 23400
1827
+ },
1828
+ {
1829
+ "epoch": 91.09,
1830
+ "learning_rate": 9.75098814229249e-05,
1831
+ "loss": 0.0,
1832
+ "step": 23500
1833
+ },
1834
+ {
1835
+ "epoch": 91.09,
1836
+ "eval_loss": NaN,
1837
+ "eval_runtime": 37.9737,
1838
+ "eval_samples_per_second": 13.404,
1839
+ "eval_steps_per_second": 13.404,
1840
+ "eval_wer": 1.0,
1841
+ "step": 23500
1842
+ },
1843
+ {
1844
+ "epoch": 91.47,
1845
+ "learning_rate": 9.355731225296444e-05,
1846
+ "loss": 0.0,
1847
+ "step": 23600
1848
+ },
1849
+ {
1850
+ "epoch": 91.86,
1851
+ "learning_rate": 8.960474308300395e-05,
1852
+ "loss": 0.0,
1853
+ "step": 23700
1854
+ },
1855
+ {
1856
+ "epoch": 92.25,
1857
+ "learning_rate": 8.565217391304347e-05,
1858
+ "loss": 0.0,
1859
+ "step": 23800
1860
+ },
1861
+ {
1862
+ "epoch": 92.63,
1863
+ "learning_rate": 8.169960474308301e-05,
1864
+ "loss": 0.0,
1865
+ "step": 23900
1866
+ },
1867
+ {
1868
+ "epoch": 93.02,
1869
+ "learning_rate": 7.774703557312253e-05,
1870
+ "loss": 0.0,
1871
+ "step": 24000
1872
+ },
1873
+ {
1874
+ "epoch": 93.02,
1875
+ "eval_loss": NaN,
1876
+ "eval_runtime": 36.1592,
1877
+ "eval_samples_per_second": 14.077,
1878
+ "eval_steps_per_second": 14.077,
1879
+ "eval_wer": 1.0,
1880
+ "step": 24000
1881
+ },
1882
+ {
1883
+ "epoch": 93.41,
1884
+ "learning_rate": 7.379446640316206e-05,
1885
+ "loss": 0.0,
1886
+ "step": 24100
1887
+ },
1888
+ {
1889
+ "epoch": 93.8,
1890
+ "learning_rate": 6.984189723320158e-05,
1891
+ "loss": 0.0,
1892
+ "step": 24200
1893
+ },
1894
+ {
1895
+ "epoch": 94.19,
1896
+ "learning_rate": 6.58893280632411e-05,
1897
+ "loss": 0.0,
1898
+ "step": 24300
1899
+ },
1900
+ {
1901
+ "epoch": 94.57,
1902
+ "learning_rate": 6.193675889328064e-05,
1903
+ "loss": 0.0,
1904
+ "step": 24400
1905
+ },
1906
+ {
1907
+ "epoch": 94.96,
1908
+ "learning_rate": 5.798418972332016e-05,
1909
+ "loss": 0.0,
1910
+ "step": 24500
1911
+ },
1912
+ {
1913
+ "epoch": 94.96,
1914
+ "eval_loss": NaN,
1915
+ "eval_runtime": 34.809,
1916
+ "eval_samples_per_second": 14.623,
1917
+ "eval_steps_per_second": 14.623,
1918
+ "eval_wer": 1.0,
1919
+ "step": 24500
1920
+ },
1921
+ {
1922
+ "epoch": 95.35,
1923
+ "learning_rate": 5.4031620553359686e-05,
1924
+ "loss": 0.0,
1925
+ "step": 24600
1926
+ },
1927
+ {
1928
+ "epoch": 95.73,
1929
+ "learning_rate": 5.007905138339921e-05,
1930
+ "loss": 0.0,
1931
+ "step": 24700
1932
+ },
1933
+ {
1934
+ "epoch": 96.12,
1935
+ "learning_rate": 4.6126482213438734e-05,
1936
+ "loss": 0.0,
1937
+ "step": 24800
1938
+ },
1939
+ {
1940
+ "epoch": 96.51,
1941
+ "learning_rate": 4.2173913043478264e-05,
1942
+ "loss": 0.0,
1943
+ "step": 24900
1944
+ },
1945
+ {
1946
+ "epoch": 96.9,
1947
+ "learning_rate": 3.822134387351779e-05,
1948
+ "loss": 0.0,
1949
+ "step": 25000
1950
+ },
1951
+ {
1952
+ "epoch": 96.9,
1953
+ "eval_loss": NaN,
1954
+ "eval_runtime": 37.5858,
1955
+ "eval_samples_per_second": 13.542,
1956
+ "eval_steps_per_second": 13.542,
1957
+ "eval_wer": 1.0,
1958
+ "step": 25000
1959
+ },
1960
+ {
1961
+ "epoch": 97.29,
1962
+ "learning_rate": 3.426877470355731e-05,
1963
+ "loss": 0.0,
1964
+ "step": 25100
1965
+ },
1966
+ {
1967
+ "epoch": 97.67,
1968
+ "learning_rate": 3.031620553359684e-05,
1969
+ "loss": 0.0,
1970
+ "step": 25200
1971
+ },
1972
+ {
1973
+ "epoch": 98.06,
1974
+ "learning_rate": 2.6363636363636365e-05,
1975
+ "loss": 0.0,
1976
+ "step": 25300
1977
+ },
1978
+ {
1979
+ "epoch": 98.45,
1980
+ "learning_rate": 2.241106719367589e-05,
1981
+ "loss": 0.0,
1982
+ "step": 25400
1983
+ },
1984
+ {
1985
+ "epoch": 98.83,
1986
+ "learning_rate": 1.8458498023715416e-05,
1987
+ "loss": 0.0,
1988
+ "step": 25500
1989
+ },
1990
+ {
1991
+ "epoch": 98.83,
1992
+ "eval_loss": NaN,
1993
+ "eval_runtime": 37.0473,
1994
+ "eval_samples_per_second": 13.739,
1995
+ "eval_steps_per_second": 13.739,
1996
+ "eval_wer": 1.0,
1997
+ "step": 25500
1998
+ },
1999
+ {
2000
+ "epoch": 99.22,
2001
+ "learning_rate": 1.450592885375494e-05,
2002
+ "loss": 0.0,
2003
+ "step": 25600
2004
+ },
2005
+ {
2006
+ "epoch": 99.61,
2007
+ "learning_rate": 1.0553359683794468e-05,
2008
+ "loss": 0.0,
2009
+ "step": 25700
2010
+ },
2011
+ {
2012
+ "epoch": 100.0,
2013
+ "learning_rate": 6.600790513833992e-06,
2014
+ "loss": 0.0,
2015
+ "step": 25800
2016
+ },
2017
+ {
2018
+ "epoch": 100.0,
2019
+ "step": 25800,
2020
+ "total_flos": 3.5491537261798506e+19,
2021
+ "train_loss": 2.8321772269315497,
2022
+ "train_runtime": 30413.0752,
2023
+ "train_samples_per_second": 3.403,
2024
+ "train_steps_per_second": 0.848
2025
+ }
2026
+ ],
2027
+ "max_steps": 25800,
2028
+ "num_train_epochs": 100,
2029
+ "total_flos": 3.5491537261798506e+19,
2030
+ "trial_name": null,
2031
+ "trial_params": null
2032
+ }