Rolv-Arild committed on
Commit
a40988b
1 Parent(s): 43c0931

End of training

all_results.json ADDED
@@ -0,0 +1,14 @@
+ {
+     "epoch": 20.0,
+     "eval_loss": 0.19566693902015686,
+     "eval_runtime": 441.5982,
+     "eval_samples": 6208,
+     "eval_samples_per_second": 14.058,
+     "eval_steps_per_second": 0.879,
+     "eval_wer": 0.16972918437695592,
+     "train_loss": 1.2649466082470566,
+     "train_runtime": 115228.5602,
+     "train_samples": 56673,
+     "train_samples_per_second": 9.837,
+     "train_steps_per_second": 0.154
+ }
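In this commit, the keys of all_results.json appear to be the union of train_results.json and eval_results.json, with the shared "epoch" key agreeing. The sketch below checks that reading on the values shown in the diff. The helper name `merge_results` is hypothetical and is not taken from any library.

```python
# Values copied from the three JSON diffs in this commit.
train_results = {
    "epoch": 20.0,
    "train_loss": 1.2649466082470566,
    "train_runtime": 115228.5602,
    "train_samples": 56673,
    "train_samples_per_second": 9.837,
    "train_steps_per_second": 0.154,
}
eval_results = {
    "epoch": 20.0,
    "eval_loss": 0.19566693902015686,
    "eval_runtime": 441.5982,
    "eval_samples": 6208,
    "eval_samples_per_second": 14.058,
    "eval_steps_per_second": 0.879,
    "eval_wer": 0.16972918437695592,
}
all_results = {
    "epoch": 20.0,
    "eval_loss": 0.19566693902015686,
    "eval_runtime": 441.5982,
    "eval_samples": 6208,
    "eval_samples_per_second": 14.058,
    "eval_steps_per_second": 0.879,
    "eval_wer": 0.16972918437695592,
    "train_loss": 1.2649466082470566,
    "train_runtime": 115228.5602,
    "train_samples": 56673,
    "train_samples_per_second": 9.837,
    "train_steps_per_second": 0.154,
}


def merge_results(*parts):
    """Merge metric dicts (hypothetical helper); refuse conflicting values."""
    merged = {}
    for part in parts:
        for key, value in part.items():
            if key in merged and merged[key] != value:
                raise ValueError(f"conflicting values for {key!r}")
            merged[key] = value
    return merged


if __name__ == "__main__":
    merged = merge_results(train_results, eval_results)
    print("all_results matches the merge:", merged == all_results)
```

Raising on a conflicting key, rather than letting the later dict win, is a design choice made here so that a mismatch between the per-phase files and the combined file would be noticed.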
eval_results.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "epoch": 20.0,
+     "eval_loss": 0.19566693902015686,
+     "eval_runtime": 441.5982,
+     "eval_samples": 6208,
+     "eval_samples_per_second": 14.058,
+     "eval_steps_per_second": 0.879,
+     "eval_wer": 0.16972918437695592
+ }
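The "eval_wer" field is a word error rate: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, so 0.1697 reads as roughly 17 errors per 100 reference words. The sketch below computes it for a single sentence pair with a standard dynamic-programming edit distance. It is an illustration only: the metric implementation used for this run is not shown in the commit, and the function name and example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the").
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # An empty hypothesis deletes every reference word.
    print(word_error_rate("a b", ""))
```

An empty hypothesis gives a rate of exactly 1.0, which is one possible reading of the 1.0 values logged at the first evaluation steps in trainer_state.json, although the log itself does not say what the model emitted.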
runs/Feb03_09-23-17_ficino/events.out.tfevents.1643992561.ficino.366652.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:95127f5c15a6d2e58e0c0d1196729c98d009426e214db10185819378cf36e9fe
+ size 364
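The three added lines are a Git LFS pointer rather than the TensorBoard event file itself: the repository stores the spec version, the object's SHA-256 digest, and its size in bytes. A minimal sketch of reading such a pointer follows. The function name `parse_lfs_pointer` and the returned key names are illustrative assumptions, not part of the Git LFS tooling.

```python
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:95127f5c15a6d2e58e0c0d1196729c98d009426e214db10185819378cf36e9fe
size 364
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into its key/value lines (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid value has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["hash_algorithm"], pointer["size"], "bytes")
```

The sketch only parses the text; it does not download the object or verify the digest against the file contents.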
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "epoch": 20.0,
+     "train_loss": 1.2649466082470566,
+     "train_runtime": 115228.5602,
+     "train_samples": 56673,
+     "train_samples_per_second": 9.837,
+     "train_steps_per_second": 0.154
+ }
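The throughput fields can be cross-checked against the raw counts. The sketch below assumes that every epoch processes all "train_samples" and takes the total step count from the "global_step" value (17700) in trainer_state.json from the same commit. The function names are illustrative, and the samples-per-step figure is only an estimate of the effective batch size, which the commit does not state.

```python
# Values copied from train_results.json in this commit.
train_results = {
    "epoch": 20.0,
    "train_runtime": 115228.5602,  # seconds
    "train_samples": 56673,
    "train_samples_per_second": 9.837,
    "train_steps_per_second": 0.154,
}
GLOBAL_STEP = 17700  # "global_step" in trainer_state.json


def samples_per_second(results):
    """Samples seen over all epochs divided by wall-clock runtime."""
    return results["train_samples"] * results["epoch"] / results["train_runtime"]


def steps_per_second(results, total_steps):
    """Optimizer steps divided by wall-clock runtime."""
    return total_steps / results["train_runtime"]


def samples_per_step(results, total_steps):
    """Rough effective batch size: samples seen per optimizer step."""
    return results["train_samples"] * results["epoch"] / total_steps


if __name__ == "__main__":
    print("runtime in hours:", train_results["train_runtime"] / 3600)
    print("samples/s:", round(samples_per_second(train_results), 3))
    print("steps/s:", round(steps_per_second(train_results, GLOBAL_STEP), 3))
    print("samples/step:", round(samples_per_step(train_results, GLOBAL_STEP), 1))
```

Under these assumptions the recomputed rates agree with the logged 9.837 and 0.154 to three decimals, and the samples-per-step estimate comes out at about 64.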
trainer_state.json ADDED
@@ -0,0 +1,2779 @@
+ {
+     "best_metric": null,
+     "best_model_checkpoint": null,
+     "epoch": 19.999153259949196,
+     "global_step": 17700,
+     "is_hyper_param_search": false,
+     "is_local_process_zero": true,
+     "is_world_process_zero": true,
+     "log_history": [
10
+ {
11
+ "epoch": 0.06,
12
+ "learning_rate": 1.8749999999999998e-06,
13
+ "loss": 9.7932,
14
+ "step": 50
15
+ },
16
+ {
17
+ "epoch": 0.11,
18
+ "learning_rate": 3.7125e-06,
19
+ "loss": 9.6218,
20
+ "step": 100
21
+ },
22
+ {
23
+ "epoch": 0.17,
24
+ "learning_rate": 5.549999999999999e-06,
25
+ "loss": 7.6384,
26
+ "step": 150
27
+ },
28
+ {
29
+ "epoch": 0.23,
30
+ "learning_rate": 7.425e-06,
31
+ "loss": 5.5724,
32
+ "step": 200
33
+ },
34
+ {
35
+ "epoch": 0.28,
36
+ "learning_rate": 9.299999999999999e-06,
37
+ "loss": 4.4527,
38
+ "step": 250
39
+ },
40
+ {
41
+ "epoch": 0.28,
42
+ "eval_loss": 4.014413833618164,
43
+ "eval_runtime": 432.9366,
44
+ "eval_samples_per_second": 14.339,
45
+ "eval_steps_per_second": 0.896,
46
+ "eval_wer": 1.0,
47
+ "step": 250
48
+ },
49
+ {
50
+ "epoch": 0.34,
51
+ "learning_rate": 1.1174999999999999e-05,
52
+ "loss": 3.996,
53
+ "step": 300
54
+ },
55
+ {
56
+ "epoch": 0.4,
57
+ "learning_rate": 1.3049999999999999e-05,
58
+ "loss": 3.6961,
59
+ "step": 350
60
+ },
61
+ {
62
+ "epoch": 0.45,
63
+ "learning_rate": 1.4925e-05,
64
+ "loss": 3.4442,
65
+ "step": 400
66
+ },
67
+ {
68
+ "epoch": 0.51,
69
+ "learning_rate": 1.68e-05,
70
+ "loss": 3.3442,
71
+ "step": 450
72
+ },
73
+ {
74
+ "epoch": 0.56,
75
+ "learning_rate": 1.8675e-05,
76
+ "loss": 3.1828,
77
+ "step": 500
78
+ },
79
+ {
80
+ "epoch": 0.56,
81
+ "eval_loss": 3.136876106262207,
82
+ "eval_runtime": 432.521,
83
+ "eval_samples_per_second": 14.353,
84
+ "eval_steps_per_second": 0.897,
85
+ "eval_wer": 1.0,
86
+ "step": 500
87
+ },
88
+ {
89
+ "epoch": 0.62,
90
+ "learning_rate": 2.055e-05,
91
+ "loss": 3.1052,
92
+ "step": 550
93
+ },
94
+ {
95
+ "epoch": 0.68,
96
+ "learning_rate": 2.2424999999999996e-05,
97
+ "loss": 3.0545,
98
+ "step": 600
99
+ },
100
+ {
101
+ "epoch": 0.73,
102
+ "learning_rate": 2.4299999999999998e-05,
103
+ "loss": 3.0155,
104
+ "step": 650
105
+ },
106
+ {
107
+ "epoch": 0.79,
108
+ "learning_rate": 2.6174999999999996e-05,
109
+ "loss": 3.0148,
110
+ "step": 700
111
+ },
112
+ {
113
+ "epoch": 0.85,
114
+ "learning_rate": 2.8049999999999997e-05,
115
+ "loss": 2.9927,
116
+ "step": 750
117
+ },
118
+ {
119
+ "epoch": 0.85,
120
+ "eval_loss": 3.0182671546936035,
121
+ "eval_runtime": 435.1578,
122
+ "eval_samples_per_second": 14.266,
123
+ "eval_steps_per_second": 0.892,
124
+ "eval_wer": 1.0,
125
+ "step": 750
126
+ },
127
+ {
128
+ "epoch": 0.9,
129
+ "learning_rate": 2.9925e-05,
130
+ "loss": 2.9829,
131
+ "step": 800
132
+ },
133
+ {
134
+ "epoch": 0.96,
135
+ "learning_rate": 3.1799999999999994e-05,
136
+ "loss": 2.9876,
137
+ "step": 850
138
+ },
139
+ {
140
+ "epoch": 1.02,
141
+ "learning_rate": 3.3675e-05,
142
+ "loss": 3.0304,
143
+ "step": 900
144
+ },
145
+ {
146
+ "epoch": 1.07,
147
+ "learning_rate": 3.555e-05,
148
+ "loss": 2.9783,
149
+ "step": 950
150
+ },
151
+ {
152
+ "epoch": 1.13,
153
+ "learning_rate": 3.7424999999999995e-05,
154
+ "loss": 2.9591,
155
+ "step": 1000
156
+ },
157
+ {
158
+ "epoch": 1.13,
159
+ "eval_loss": 2.999102830886841,
160
+ "eval_runtime": 430.4872,
161
+ "eval_samples_per_second": 14.421,
162
+ "eval_steps_per_second": 0.901,
163
+ "eval_wer": 1.0,
164
+ "step": 1000
165
+ },
166
+ {
167
+ "epoch": 1.19,
168
+ "learning_rate": 3.93e-05,
169
+ "loss": 2.9559,
170
+ "step": 1050
171
+ },
172
+ {
173
+ "epoch": 1.24,
174
+ "learning_rate": 4.1175e-05,
175
+ "loss": 2.9388,
176
+ "step": 1100
177
+ },
178
+ {
179
+ "epoch": 1.3,
180
+ "learning_rate": 4.3049999999999996e-05,
181
+ "loss": 2.9321,
182
+ "step": 1150
183
+ },
184
+ {
185
+ "epoch": 1.36,
186
+ "learning_rate": 4.4924999999999994e-05,
187
+ "loss": 2.9205,
188
+ "step": 1200
189
+ },
190
+ {
191
+ "epoch": 1.41,
192
+ "learning_rate": 4.68e-05,
193
+ "loss": 2.8989,
194
+ "step": 1250
195
+ },
196
+ {
197
+ "epoch": 1.41,
198
+ "eval_loss": 2.9000213146209717,
199
+ "eval_runtime": 431.5059,
200
+ "eval_samples_per_second": 14.387,
201
+ "eval_steps_per_second": 0.899,
202
+ "eval_wer": 0.999990658308967,
203
+ "step": 1250
204
+ },
205
+ {
206
+ "epoch": 1.47,
207
+ "learning_rate": 4.8675e-05,
208
+ "loss": 2.8682,
209
+ "step": 1300
210
+ },
211
+ {
212
+ "epoch": 1.52,
213
+ "learning_rate": 5.055e-05,
214
+ "loss": 2.8476,
215
+ "step": 1350
216
+ },
217
+ {
218
+ "epoch": 1.58,
219
+ "learning_rate": 5.2424999999999994e-05,
220
+ "loss": 2.7956,
221
+ "step": 1400
222
+ },
223
+ {
224
+ "epoch": 1.64,
225
+ "learning_rate": 5.429999999999999e-05,
226
+ "loss": 2.6754,
227
+ "step": 1450
228
+ },
229
+ {
230
+ "epoch": 1.69,
231
+ "learning_rate": 5.6175e-05,
232
+ "loss": 2.4286,
233
+ "step": 1500
234
+ },
235
+ {
236
+ "epoch": 1.69,
237
+ "eval_loss": 1.7688498497009277,
238
+ "eval_runtime": 430.3663,
239
+ "eval_samples_per_second": 14.425,
240
+ "eval_steps_per_second": 0.902,
241
+ "eval_wer": 0.9550384410586005,
242
+ "step": 1500
243
+ },
244
+ {
245
+ "epoch": 1.75,
246
+ "learning_rate": 5.8049999999999995e-05,
247
+ "loss": 2.218,
248
+ "step": 1550
249
+ },
250
+ {
251
+ "epoch": 1.81,
252
+ "learning_rate": 5.9925e-05,
253
+ "loss": 2.0095,
254
+ "step": 1600
255
+ },
256
+ {
257
+ "epoch": 1.86,
258
+ "learning_rate": 6.18e-05,
259
+ "loss": 1.8416,
260
+ "step": 1650
261
+ },
262
+ {
263
+ "epoch": 1.92,
264
+ "learning_rate": 6.367499999999999e-05,
265
+ "loss": 1.7642,
266
+ "step": 1700
267
+ },
268
+ {
269
+ "epoch": 1.98,
270
+ "learning_rate": 6.555e-05,
271
+ "loss": 1.6765,
272
+ "step": 1750
273
+ },
274
+ {
275
+ "epoch": 1.98,
276
+ "eval_loss": 0.6841917037963867,
277
+ "eval_runtime": 433.019,
278
+ "eval_samples_per_second": 14.337,
279
+ "eval_steps_per_second": 0.896,
280
+ "eval_wer": 0.48551570805347183,
281
+ "step": 1750
282
+ },
283
+ {
284
+ "epoch": 2.03,
285
+ "learning_rate": 6.7425e-05,
286
+ "loss": 1.5994,
287
+ "step": 1800
288
+ },
289
+ {
290
+ "epoch": 2.09,
291
+ "learning_rate": 6.93e-05,
292
+ "loss": 1.5522,
293
+ "step": 1850
294
+ },
295
+ {
296
+ "epoch": 2.15,
297
+ "learning_rate": 7.1175e-05,
298
+ "loss": 1.52,
299
+ "step": 1900
300
+ },
301
+ {
302
+ "epoch": 2.2,
303
+ "learning_rate": 7.304999999999999e-05,
304
+ "loss": 1.5086,
305
+ "step": 1950
306
+ },
307
+ {
308
+ "epoch": 2.26,
309
+ "learning_rate": 7.492499999999999e-05,
310
+ "loss": 1.4521,
311
+ "step": 2000
312
+ },
313
+ {
314
+ "epoch": 2.26,
315
+ "eval_loss": 0.5096011757850647,
316
+ "eval_runtime": 431.8266,
317
+ "eval_samples_per_second": 14.376,
318
+ "eval_steps_per_second": 0.899,
319
+ "eval_wer": 0.3735835660971349,
320
+ "step": 2000
321
+ },
322
+ {
323
+ "epoch": 2.32,
324
+ "learning_rate": 7.477070063694266e-05,
325
+ "loss": 1.4457,
326
+ "step": 2050
327
+ },
328
+ {
329
+ "epoch": 2.37,
330
+ "learning_rate": 7.453184713375795e-05,
331
+ "loss": 1.4276,
332
+ "step": 2100
333
+ },
334
+ {
335
+ "epoch": 2.43,
336
+ "learning_rate": 7.429299363057323e-05,
337
+ "loss": 1.4028,
338
+ "step": 2150
339
+ },
340
+ {
341
+ "epoch": 2.49,
342
+ "learning_rate": 7.405414012738853e-05,
343
+ "loss": 1.3887,
344
+ "step": 2200
345
+ },
346
+ {
347
+ "epoch": 2.54,
348
+ "learning_rate": 7.38152866242038e-05,
349
+ "loss": 1.3589,
350
+ "step": 2250
351
+ },
352
+ {
353
+ "epoch": 2.54,
354
+ "eval_loss": 0.44788965582847595,
355
+ "eval_runtime": 430.2855,
356
+ "eval_samples_per_second": 14.428,
357
+ "eval_steps_per_second": 0.902,
358
+ "eval_wer": 0.3335450783300793,
359
+ "step": 2250
360
+ },
361
+ {
362
+ "epoch": 2.6,
363
+ "learning_rate": 7.35764331210191e-05,
364
+ "loss": 1.3935,
365
+ "step": 2300
366
+ },
367
+ {
368
+ "epoch": 2.65,
369
+ "learning_rate": 7.333757961783438e-05,
370
+ "loss": 1.3425,
371
+ "step": 2350
372
+ },
373
+ {
374
+ "epoch": 2.71,
375
+ "learning_rate": 7.309872611464967e-05,
376
+ "loss": 1.3657,
377
+ "step": 2400
378
+ },
379
+ {
380
+ "epoch": 2.77,
381
+ "learning_rate": 7.285987261146495e-05,
382
+ "loss": 1.3645,
383
+ "step": 2450
384
+ },
385
+ {
386
+ "epoch": 2.82,
387
+ "learning_rate": 7.262101910828025e-05,
388
+ "loss": 1.3136,
389
+ "step": 2500
390
+ },
391
+ {
392
+ "epoch": 2.82,
393
+ "eval_loss": 0.40564054250717163,
394
+ "eval_runtime": 428.4501,
395
+ "eval_samples_per_second": 14.489,
396
+ "eval_steps_per_second": 0.906,
397
+ "eval_wer": 0.3123020729212402,
398
+ "step": 2500
399
+ },
400
+ {
401
+ "epoch": 2.88,
402
+ "learning_rate": 7.238216560509553e-05,
403
+ "loss": 1.3415,
404
+ "step": 2550
405
+ },
406
+ {
407
+ "epoch": 2.94,
408
+ "learning_rate": 7.214331210191082e-05,
409
+ "loss": 1.3345,
410
+ "step": 2600
411
+ },
412
+ {
413
+ "epoch": 2.99,
414
+ "learning_rate": 7.19044585987261e-05,
415
+ "loss": 1.3283,
416
+ "step": 2650
417
+ },
418
+ {
419
+ "epoch": 3.05,
420
+ "learning_rate": 7.16656050955414e-05,
421
+ "loss": 1.2788,
422
+ "step": 2700
423
+ },
424
+ {
425
+ "epoch": 3.11,
426
+ "learning_rate": 7.142675159235667e-05,
427
+ "loss": 1.2856,
428
+ "step": 2750
429
+ },
430
+ {
431
+ "epoch": 3.11,
432
+ "eval_loss": 0.38699424266815186,
433
+ "eval_runtime": 430.1514,
434
+ "eval_samples_per_second": 14.432,
435
+ "eval_steps_per_second": 0.902,
436
+ "eval_wer": 0.29870991246835504,
437
+ "step": 2750
438
+ },
439
+ {
440
+ "epoch": 3.16,
441
+ "learning_rate": 7.118789808917197e-05,
442
+ "loss": 1.2817,
443
+ "step": 2800
444
+ },
445
+ {
446
+ "epoch": 3.22,
447
+ "learning_rate": 7.094904458598725e-05,
448
+ "loss": 1.2502,
449
+ "step": 2850
450
+ },
451
+ {
452
+ "epoch": 3.28,
453
+ "learning_rate": 7.071019108280254e-05,
454
+ "loss": 1.2623,
455
+ "step": 2900
456
+ },
457
+ {
458
+ "epoch": 3.33,
459
+ "learning_rate": 7.047133757961782e-05,
460
+ "loss": 1.2302,
461
+ "step": 2950
462
+ },
463
+ {
464
+ "epoch": 3.39,
465
+ "learning_rate": 7.023248407643311e-05,
466
+ "loss": 1.2283,
467
+ "step": 3000
468
+ },
469
+ {
470
+ "epoch": 3.39,
471
+ "eval_loss": 0.3645668029785156,
472
+ "eval_runtime": 430.0013,
473
+ "eval_samples_per_second": 14.437,
474
+ "eval_steps_per_second": 0.902,
475
+ "eval_wer": 0.2828290377124067,
476
+ "step": 3000
477
+ },
478
+ {
479
+ "epoch": 3.45,
480
+ "learning_rate": 6.99936305732484e-05,
481
+ "loss": 1.1993,
482
+ "step": 3050
483
+ },
484
+ {
485
+ "epoch": 3.5,
486
+ "learning_rate": 6.975477707006369e-05,
487
+ "loss": 1.2627,
488
+ "step": 3100
489
+ },
490
+ {
491
+ "epoch": 3.56,
492
+ "learning_rate": 6.951592356687897e-05,
493
+ "loss": 1.1969,
494
+ "step": 3150
495
+ },
496
+ {
497
+ "epoch": 3.62,
498
+ "learning_rate": 6.927707006369426e-05,
499
+ "loss": 1.2054,
500
+ "step": 3200
501
+ },
502
+ {
503
+ "epoch": 3.67,
504
+ "learning_rate": 6.903821656050954e-05,
505
+ "loss": 1.2053,
506
+ "step": 3250
507
+ },
508
+ {
509
+ "epoch": 3.67,
510
+ "eval_loss": 0.3499177098274231,
511
+ "eval_runtime": 429.329,
512
+ "eval_samples_per_second": 14.46,
513
+ "eval_steps_per_second": 0.904,
514
+ "eval_wer": 0.2747578166599718,
515
+ "step": 3250
516
+ },
517
+ {
518
+ "epoch": 3.73,
519
+ "learning_rate": 6.879936305732483e-05,
520
+ "loss": 1.2144,
521
+ "step": 3300
522
+ },
523
+ {
524
+ "epoch": 3.78,
525
+ "learning_rate": 6.856050955414011e-05,
526
+ "loss": 1.1882,
527
+ "step": 3350
528
+ },
529
+ {
530
+ "epoch": 3.84,
531
+ "learning_rate": 6.832165605095541e-05,
532
+ "loss": 1.1901,
533
+ "step": 3400
534
+ },
535
+ {
536
+ "epoch": 3.9,
537
+ "learning_rate": 6.808280254777069e-05,
538
+ "loss": 1.2064,
539
+ "step": 3450
540
+ },
541
+ {
542
+ "epoch": 3.95,
543
+ "learning_rate": 6.784394904458598e-05,
544
+ "loss": 1.2087,
545
+ "step": 3500
546
+ },
547
+ {
548
+ "epoch": 3.95,
549
+ "eval_loss": 0.3345482349395752,
550
+ "eval_runtime": 430.4222,
551
+ "eval_samples_per_second": 14.423,
552
+ "eval_steps_per_second": 0.901,
553
+ "eval_wer": 0.2602781955589601,
554
+ "step": 3500
555
+ },
556
+ {
557
+ "epoch": 4.01,
558
+ "learning_rate": 6.760509554140126e-05,
559
+ "loss": 1.1945,
560
+ "step": 3550
561
+ },
562
+ {
563
+ "epoch": 4.07,
564
+ "learning_rate": 6.736624203821655e-05,
565
+ "loss": 1.1674,
566
+ "step": 3600
567
+ },
568
+ {
569
+ "epoch": 4.12,
570
+ "learning_rate": 6.712738853503183e-05,
571
+ "loss": 1.2197,
572
+ "step": 3650
573
+ },
574
+ {
575
+ "epoch": 4.18,
576
+ "learning_rate": 6.688853503184713e-05,
577
+ "loss": 1.1832,
578
+ "step": 3700
579
+ },
580
+ {
581
+ "epoch": 4.24,
582
+ "learning_rate": 6.664968152866241e-05,
583
+ "loss": 1.2002,
584
+ "step": 3750
585
+ },
586
+ {
587
+ "epoch": 4.24,
588
+ "eval_loss": 0.3320307731628418,
589
+ "eval_runtime": 429.9654,
590
+ "eval_samples_per_second": 14.438,
591
+ "eval_steps_per_second": 0.902,
592
+ "eval_wer": 0.25228170803478844,
593
+ "step": 3750
594
+ },
595
+ {
596
+ "epoch": 4.29,
597
+ "learning_rate": 6.64108280254777e-05,
598
+ "loss": 1.1655,
599
+ "step": 3800
600
+ },
601
+ {
602
+ "epoch": 4.35,
603
+ "learning_rate": 6.617197452229298e-05,
604
+ "loss": 1.1387,
605
+ "step": 3850
606
+ },
607
+ {
608
+ "epoch": 4.41,
609
+ "learning_rate": 6.593312101910828e-05,
610
+ "loss": 1.1344,
611
+ "step": 3900
612
+ },
613
+ {
614
+ "epoch": 4.46,
615
+ "learning_rate": 6.569426751592356e-05,
616
+ "loss": 1.169,
617
+ "step": 3950
618
+ },
619
+ {
620
+ "epoch": 4.52,
621
+ "learning_rate": 6.545541401273885e-05,
622
+ "loss": 1.1383,
623
+ "step": 4000
624
+ },
625
+ {
626
+ "epoch": 4.52,
627
+ "eval_loss": 0.31172633171081543,
628
+ "eval_runtime": 428.4618,
629
+ "eval_samples_per_second": 14.489,
630
+ "eval_steps_per_second": 0.906,
631
+ "eval_wer": 0.24393957794239912,
632
+ "step": 4000
633
+ },
634
+ {
635
+ "epoch": 4.58,
636
+ "learning_rate": 6.521656050955413e-05,
637
+ "loss": 1.1241,
638
+ "step": 4050
639
+ },
640
+ {
641
+ "epoch": 4.63,
642
+ "learning_rate": 6.497770700636942e-05,
643
+ "loss": 1.1505,
644
+ "step": 4100
645
+ },
646
+ {
647
+ "epoch": 4.69,
648
+ "learning_rate": 6.47388535031847e-05,
649
+ "loss": 1.1309,
650
+ "step": 4150
651
+ },
652
+ {
653
+ "epoch": 4.75,
654
+ "learning_rate": 6.45e-05,
655
+ "loss": 1.1368,
656
+ "step": 4200
657
+ },
658
+ {
659
+ "epoch": 4.8,
660
+ "learning_rate": 6.426114649681528e-05,
661
+ "loss": 1.1364,
662
+ "step": 4250
663
+ },
664
+ {
665
+ "epoch": 4.8,
666
+ "eval_loss": 0.3198467195034027,
667
+ "eval_runtime": 427.239,
668
+ "eval_samples_per_second": 14.531,
669
+ "eval_steps_per_second": 0.908,
670
+ "eval_wer": 0.2382878548674881,
671
+ "step": 4250
672
+ },
673
+ {
674
+ "epoch": 4.86,
675
+ "learning_rate": 6.402229299363057e-05,
676
+ "loss": 1.1185,
677
+ "step": 4300
678
+ },
679
+ {
680
+ "epoch": 4.91,
681
+ "learning_rate": 6.378343949044585e-05,
682
+ "loss": 1.1214,
683
+ "step": 4350
684
+ },
685
+ {
686
+ "epoch": 4.97,
687
+ "learning_rate": 6.354458598726114e-05,
688
+ "loss": 1.1188,
689
+ "step": 4400
690
+ },
691
+ {
692
+ "epoch": 5.03,
693
+ "learning_rate": 6.330573248407642e-05,
694
+ "loss": 1.1327,
695
+ "step": 4450
696
+ },
697
+ {
698
+ "epoch": 5.08,
699
+ "learning_rate": 6.306687898089172e-05,
700
+ "loss": 1.158,
701
+ "step": 4500
702
+ },
703
+ {
704
+ "epoch": 5.08,
705
+ "eval_loss": 0.3070796728134155,
706
+ "eval_runtime": 427.2037,
707
+ "eval_samples_per_second": 14.532,
708
+ "eval_steps_per_second": 0.908,
709
+ "eval_wer": 0.23418685250404028,
710
+ "step": 4500
711
+ },
712
+ {
713
+ "epoch": 5.14,
714
+ "learning_rate": 6.2828025477707e-05,
715
+ "loss": 1.1221,
716
+ "step": 4550
717
+ },
718
+ {
719
+ "epoch": 5.2,
720
+ "learning_rate": 6.258917197452229e-05,
721
+ "loss": 1.1167,
722
+ "step": 4600
723
+ },
724
+ {
725
+ "epoch": 5.25,
726
+ "learning_rate": 6.235031847133757e-05,
727
+ "loss": 1.1067,
728
+ "step": 4650
729
+ },
730
+ {
731
+ "epoch": 5.31,
732
+ "learning_rate": 6.211146496815286e-05,
733
+ "loss": 1.099,
734
+ "step": 4700
735
+ },
736
+ {
737
+ "epoch": 5.37,
738
+ "learning_rate": 6.187261146496814e-05,
739
+ "loss": 1.108,
740
+ "step": 4750
741
+ },
742
+ {
743
+ "epoch": 5.37,
744
+ "eval_loss": 0.3011206090450287,
745
+ "eval_runtime": 430.4576,
746
+ "eval_samples_per_second": 14.422,
747
+ "eval_steps_per_second": 0.901,
748
+ "eval_wer": 0.23136566181210122,
749
+ "step": 4750
750
+ },
751
+ {
752
+ "epoch": 5.42,
753
+ "learning_rate": 6.163375796178344e-05,
754
+ "loss": 1.1024,
755
+ "step": 4800
756
+ },
757
+ {
758
+ "epoch": 5.48,
759
+ "learning_rate": 6.139490445859872e-05,
760
+ "loss": 1.1039,
761
+ "step": 4850
762
+ },
763
+ {
764
+ "epoch": 5.54,
765
+ "learning_rate": 6.115605095541401e-05,
766
+ "loss": 1.1082,
767
+ "step": 4900
768
+ },
769
+ {
770
+ "epoch": 5.59,
771
+ "learning_rate": 6.09171974522293e-05,
772
+ "loss": 1.0982,
773
+ "step": 4950
774
+ },
775
+ {
776
+ "epoch": 5.65,
777
+ "learning_rate": 6.0678343949044583e-05,
778
+ "loss": 1.1025,
779
+ "step": 5000
780
+ },
781
+ {
782
+ "epoch": 5.65,
783
+ "eval_loss": 0.28753861784935,
784
+ "eval_runtime": 431.3779,
785
+ "eval_samples_per_second": 14.391,
786
+ "eval_steps_per_second": 0.899,
787
+ "eval_wer": 0.2289368221435444,
788
+ "step": 5000
789
+ },
790
+ {
791
+ "epoch": 5.71,
792
+ "learning_rate": 6.043949044585987e-05,
793
+ "loss": 1.089,
794
+ "step": 5050
795
+ },
796
+ {
797
+ "epoch": 5.76,
798
+ "learning_rate": 6.020063694267516e-05,
799
+ "loss": 1.0792,
800
+ "step": 5100
801
+ },
802
+ {
803
+ "epoch": 5.82,
804
+ "learning_rate": 5.9961783439490444e-05,
805
+ "loss": 1.1054,
806
+ "step": 5150
807
+ },
808
+ {
809
+ "epoch": 5.87,
810
+ "learning_rate": 5.972770700636942e-05,
811
+ "loss": 1.078,
812
+ "step": 5200
813
+ },
814
+ {
815
+ "epoch": 5.93,
816
+ "learning_rate": 5.948885350318471e-05,
817
+ "loss": 1.0697,
818
+ "step": 5250
819
+ },
820
+ {
821
+ "epoch": 5.93,
822
+ "eval_loss": 0.29261597990989685,
823
+ "eval_runtime": 429.3286,
824
+ "eval_samples_per_second": 14.46,
825
+ "eval_steps_per_second": 0.904,
826
+ "eval_wer": 0.22559249675376236,
827
+ "step": 5250
828
+ },
829
+ {
830
+ "epoch": 5.99,
831
+ "learning_rate": 5.925e-05,
832
+ "loss": 1.1183,
833
+ "step": 5300
834
+ },
835
+ {
836
+ "epoch": 6.05,
837
+ "learning_rate": 5.9011146496815284e-05,
838
+ "loss": 1.1614,
839
+ "step": 5350
840
+ },
841
+ {
842
+ "epoch": 6.1,
843
+ "learning_rate": 5.877229299363057e-05,
844
+ "loss": 1.075,
845
+ "step": 5400
846
+ },
847
+ {
848
+ "epoch": 6.16,
849
+ "learning_rate": 5.853343949044586e-05,
850
+ "loss": 1.0901,
851
+ "step": 5450
852
+ },
853
+ {
854
+ "epoch": 6.21,
855
+ "learning_rate": 5.8294585987261144e-05,
856
+ "loss": 1.0904,
857
+ "step": 5500
858
+ },
859
+ {
860
+ "epoch": 6.21,
861
+ "eval_loss": 0.2695116698741913,
862
+ "eval_runtime": 431.1678,
863
+ "eval_samples_per_second": 14.398,
864
+ "eval_steps_per_second": 0.9,
865
+ "eval_wer": 0.22445281044774726,
866
+ "step": 5500
867
+ },
868
+ {
869
+ "epoch": 6.27,
870
+ "learning_rate": 5.805573248407643e-05,
871
+ "loss": 1.0577,
872
+ "step": 5550
873
+ },
874
+ {
875
+ "epoch": 6.33,
876
+ "learning_rate": 5.781687898089172e-05,
877
+ "loss": 1.0693,
878
+ "step": 5600
879
+ },
880
+ {
881
+ "epoch": 6.38,
882
+ "learning_rate": 5.7578025477707004e-05,
883
+ "loss": 1.0784,
884
+ "step": 5650
885
+ },
886
+ {
887
+ "epoch": 6.44,
888
+ "learning_rate": 5.733917197452229e-05,
889
+ "loss": 1.0754,
890
+ "step": 5700
891
+ },
892
+ {
893
+ "epoch": 6.5,
894
+ "learning_rate": 5.710031847133758e-05,
895
+ "loss": 1.0802,
896
+ "step": 5750
897
+ },
898
+ {
899
+ "epoch": 6.5,
900
+ "eval_loss": 0.26020729541778564,
901
+ "eval_runtime": 433.3184,
902
+ "eval_samples_per_second": 14.327,
903
+ "eval_steps_per_second": 0.895,
904
+ "eval_wer": 0.21889450428316534,
905
+ "step": 5750
906
+ },
907
+ {
908
+ "epoch": 6.55,
909
+ "learning_rate": 5.6861464968152864e-05,
910
+ "loss": 1.0459,
911
+ "step": 5800
912
+ },
913
+ {
914
+ "epoch": 6.61,
915
+ "learning_rate": 5.662261146496815e-05,
916
+ "loss": 1.0492,
917
+ "step": 5850
918
+ },
919
+ {
920
+ "epoch": 6.67,
921
+ "learning_rate": 5.638375796178344e-05,
922
+ "loss": 1.0526,
923
+ "step": 5900
924
+ },
925
+ {
926
+ "epoch": 6.72,
927
+ "learning_rate": 5.6144904458598724e-05,
928
+ "loss": 1.079,
929
+ "step": 5950
930
+ },
931
+ {
932
+ "epoch": 6.78,
933
+ "learning_rate": 5.590605095541401e-05,
934
+ "loss": 1.0882,
935
+ "step": 6000
936
+ },
937
+ {
938
+ "epoch": 6.78,
939
+ "eval_loss": 0.2602781653404236,
940
+ "eval_runtime": 434.4762,
941
+ "eval_samples_per_second": 14.288,
942
+ "eval_steps_per_second": 0.893,
943
+ "eval_wer": 0.21684867394695787,
944
+ "step": 6000
945
+ },
946
+ {
947
+ "epoch": 6.84,
948
+ "learning_rate": 5.56671974522293e-05,
949
+ "loss": 1.0691,
950
+ "step": 6050
951
+ },
952
+ {
953
+ "epoch": 6.89,
954
+ "learning_rate": 5.5428343949044585e-05,
955
+ "loss": 1.0728,
956
+ "step": 6100
957
+ },
958
+ {
959
+ "epoch": 6.95,
960
+ "learning_rate": 5.518949044585987e-05,
961
+ "loss": 1.0308,
962
+ "step": 6150
963
+ },
964
+ {
965
+ "epoch": 7.01,
966
+ "learning_rate": 5.4955414012738844e-05,
967
+ "loss": 1.0894,
968
+ "step": 6200
969
+ },
970
+ {
971
+ "epoch": 7.06,
972
+ "learning_rate": 5.471656050955413e-05,
973
+ "loss": 1.0881,
974
+ "step": 6250
975
+ },
976
+ {
977
+ "epoch": 7.06,
978
+ "eval_loss": 0.25403761863708496,
979
+ "eval_runtime": 433.991,
980
+ "eval_samples_per_second": 14.304,
981
+ "eval_steps_per_second": 0.894,
982
+ "eval_wer": 0.2292544396386634,
983
+ "step": 6250
984
+ },
985
+ {
986
+ "epoch": 7.12,
987
+ "learning_rate": 5.447770700636942e-05,
988
+ "loss": 1.0295,
989
+ "step": 6300
990
+ },
991
+ {
992
+ "epoch": 7.17,
993
+ "learning_rate": 5.4238853503184704e-05,
994
+ "loss": 1.0389,
995
+ "step": 6350
996
+ },
997
+ {
998
+ "epoch": 7.23,
999
+ "learning_rate": 5.399999999999999e-05,
1000
+ "loss": 1.0415,
1001
+ "step": 6400
1002
+ },
1003
+ {
1004
+ "epoch": 7.29,
1005
+ "learning_rate": 5.376114649681528e-05,
1006
+ "loss": 1.0492,
1007
+ "step": 6450
1008
+ },
1009
+ {
1010
+ "epoch": 7.34,
1011
+ "learning_rate": 5.3522292993630565e-05,
1012
+ "loss": 1.0378,
1013
+ "step": 6500
1014
+ },
1015
+ {
1016
+ "epoch": 7.34,
1017
+ "eval_loss": 0.2614484429359436,
1018
+ "eval_runtime": 432.0675,
1019
+ "eval_samples_per_second": 14.368,
1020
+ "eval_steps_per_second": 0.898,
1021
+ "eval_wer": 0.21932422207067923,
1022
+ "step": 6500
1023
+ },
1024
+ {
1025
+ "epoch": 7.4,
1026
+ "learning_rate": 5.328343949044585e-05,
1027
+ "loss": 1.0362,
1028
+ "step": 6550
1029
+ },
1030
+ {
1031
+ "epoch": 7.46,
1032
+ "learning_rate": 5.304458598726114e-05,
1033
+ "loss": 1.0444,
1034
+ "step": 6600
1035
+ },
1036
+ {
1037
+ "epoch": 7.51,
1038
+ "learning_rate": 5.2805732484076425e-05,
1039
+ "loss": 1.0626,
1040
+ "step": 6650
1041
+ },
1042
+ {
1043
+ "epoch": 7.57,
1044
+ "learning_rate": 5.256687898089171e-05,
1045
+ "loss": 1.0307,
1046
+ "step": 6700
1047
+ },
1048
+ {
1049
+ "epoch": 7.63,
1050
+ "learning_rate": 5.2328025477707e-05,
1051
+ "loss": 1.0397,
1052
+ "step": 6750
1053
+ },
1054
+ {
1055
+ "epoch": 7.63,
1056
+ "eval_loss": 0.27073222398757935,
1057
+ "eval_runtime": 432.0598,
1058
+ "eval_samples_per_second": 14.368,
1059
+ "eval_steps_per_second": 0.898,
1060
+ "eval_wer": 0.21041224882528237,
1061
+ "step": 6750
1062
+ },
1063
+ {
1064
+ "epoch": 7.68,
1065
+ "learning_rate": 5.2089171974522285e-05,
1066
+ "loss": 1.0481,
1067
+ "step": 6800
1068
+ },
1069
+ {
1070
+ "epoch": 7.74,
1071
+ "learning_rate": 5.185031847133757e-05,
1072
+ "loss": 1.042,
1073
+ "step": 6850
1074
+ },
1075
+ {
1076
+ "epoch": 7.8,
1077
+ "learning_rate": 5.161146496815286e-05,
1078
+ "loss": 1.0298,
1079
+ "step": 6900
1080
+ },
1081
+ {
1082
+ "epoch": 7.85,
1083
+ "learning_rate": 5.1372611464968145e-05,
1084
+ "loss": 1.0269,
1085
+ "step": 6950
1086
+ },
1087
+ {
1088
+ "epoch": 7.91,
1089
+ "learning_rate": 5.113375796178343e-05,
1090
+ "loss": 1.0296,
1091
+ "step": 7000
1092
+ },
1093
+ {
1094
+ "epoch": 7.91,
1095
+ "eval_loss": 0.248311385512352,
1096
+ "eval_runtime": 431.8203,
1097
+ "eval_samples_per_second": 14.376,
1098
+ "eval_steps_per_second": 0.899,
1099
+ "eval_wer": 0.2119256027726139,
1100
+ "step": 7000
1101
+ },
1102
+ {
1103
+ "epoch": 7.97,
1104
+ "learning_rate": 5.089490445859872e-05,
1105
+ "loss": 1.0276,
1106
+ "step": 7050
1107
+ },
1108
+ {
1109
+ "epoch": 8.02,
1110
+ "learning_rate": 5.0656050955414005e-05,
1111
+ "loss": 1.0481,
1112
+ "step": 7100
1113
+ },
1114
+ {
1115
+ "epoch": 8.08,
1116
+ "learning_rate": 5.041719745222929e-05,
1117
+ "loss": 1.006,
1118
+ "step": 7150
1119
+ },
1120
+ {
1121
+ "epoch": 8.14,
1122
+ "learning_rate": 5.017834394904458e-05,
1123
+ "loss": 1.0215,
1124
+ "step": 7200
1125
+ },
1126
+ {
1127
+ "epoch": 8.19,
1128
+ "learning_rate": 4.9939490445859866e-05,
1129
+ "loss": 1.0249,
1130
+ "step": 7250
1131
+ },
1132
+ {
1133
+ "epoch": 8.19,
1134
+ "eval_loss": 0.24828839302062988,
1135
+ "eval_runtime": 429.7696,
1136
+ "eval_samples_per_second": 14.445,
1137
+ "eval_steps_per_second": 0.903,
1138
+ "eval_wer": 0.20468579222210806,
1139
+ "step": 7250
1140
+ },
1141
+ {
1142
+ "epoch": 8.25,
1143
+ "learning_rate": 4.970063694267515e-05,
1144
+ "loss": 1.0109,
1145
+ "step": 7300
1146
+ },
1147
+ {
1148
+ "epoch": 8.3,
1149
+ "learning_rate": 4.946178343949044e-05,
1150
+ "loss": 1.0154,
1151
+ "step": 7350
1152
+ },
1153
+ {
1154
+ "epoch": 8.36,
1155
+ "learning_rate": 4.9222929936305726e-05,
1156
+ "loss": 1.0123,
1157
+ "step": 7400
1158
+ },
1159
+ {
1160
+ "epoch": 8.42,
1161
+ "learning_rate": 4.898407643312101e-05,
1162
+ "loss": 1.0126,
1163
+ "step": 7450
1164
+ },
1165
+ {
1166
+ "epoch": 8.47,
1167
+ "learning_rate": 4.87452229299363e-05,
1168
+ "loss": 1.013,
1169
+ "step": 7500
1170
+ },
1171
+ {
1172
+ "epoch": 8.47,
1173
+ "eval_loss": 0.24869437515735626,
1174
+ "eval_runtime": 430.836,
1175
+ "eval_samples_per_second": 14.409,
1176
+ "eval_steps_per_second": 0.901,
1177
+ "eval_wer": 0.20419068259736378,
1178
+ "step": 7500
1179
+ },
1180
+ {
1181
+ "epoch": 8.53,
1182
+ "learning_rate": 4.8506369426751586e-05,
1183
+ "loss": 1.0077,
1184
+ "step": 7550
1185
+ },
1186
+ {
1187
+ "epoch": 8.59,
1188
+ "learning_rate": 4.826751592356687e-05,
1189
+ "loss": 1.0256,
1190
+ "step": 7600
1191
+ },
1192
+ {
1193
+ "epoch": 8.64,
1194
+ "learning_rate": 4.802866242038216e-05,
1195
+ "loss": 1.0627,
1196
+ "step": 7650
1197
+ },
1198
+ {
1199
+ "epoch": 8.7,
1200
+ "learning_rate": 4.7789808917197446e-05,
1201
+ "loss": 0.9883,
1202
+ "step": 7700
1203
+ },
1204
+ {
1205
+ "epoch": 8.76,
1206
+ "learning_rate": 4.755095541401273e-05,
1207
+ "loss": 1.0064,
1208
+ "step": 7750
1209
+ },
1210
+ {
1211
+ "epoch": 8.76,
1212
+ "eval_loss": 0.24558775126934052,
1213
+ "eval_runtime": 430.9773,
1214
+ "eval_samples_per_second": 14.404,
1215
+ "eval_steps_per_second": 0.9,
1216
+ "eval_wer": 0.20164974263641205,
1217
+ "step": 7750
1218
+ },
1219
+ {
1220
+ "epoch": 8.81,
1221
+ "learning_rate": 4.731210191082802e-05,
1222
+ "loss": 1.0137,
1223
+ "step": 7800
1224
+ },
1225
+ {
1226
+ "epoch": 8.87,
1227
+ "learning_rate": 4.7073248407643306e-05,
1228
+ "loss": 1.0178,
1229
+ "step": 7850
1230
+ },
1231
+ {
1232
+ "epoch": 8.93,
1233
+ "learning_rate": 4.683439490445859e-05,
1234
+ "loss": 1.0035,
1235
+ "step": 7900
1236
+ },
1237
+ {
1238
+ "epoch": 8.98,
1239
+ "learning_rate": 4.659554140127388e-05,
1240
+ "loss": 1.0457,
1241
+ "step": 7950
1242
+ },
1243
+ {
1244
+ "epoch": 9.04,
1245
+ "learning_rate": 4.6356687898089167e-05,
1246
+ "loss": 1.0668,
1247
+ "step": 8000
1248
+ },
1249
+ {
1250
+ "epoch": 9.04,
1251
+ "eval_loss": 0.2397284209728241,
1252
+ "eval_runtime": 430.6925,
1253
+ "eval_samples_per_second": 14.414,
1254
+ "eval_steps_per_second": 0.901,
1255
+ "eval_wer": 0.19949181200780966,
1256
+ "step": 8000
1257
+ },
1258
+ {
1259
+ "epoch": 9.1,
1260
+ "learning_rate": 4.611783439490445e-05,
1261
+ "loss": 1.0054,
1262
+ "step": 8050
1263
+ },
1264
+ {
1265
+ "epoch": 9.15,
1266
+ "learning_rate": 4.587898089171974e-05,
1267
+ "loss": 1.0224,
1268
+ "step": 8100
1269
+ },
1270
+ {
1271
+ "epoch": 9.21,
1272
+ "learning_rate": 4.564012738853503e-05,
1273
+ "loss": 1.0019,
1274
+ "step": 8150
1275
+ },
1276
+ {
1277
+ "epoch": 9.27,
1278
+ "learning_rate": 4.5401273885350314e-05,
1279
+ "loss": 1.0033,
1280
+ "step": 8200
1281
+ },
1282
+ {
1283
+ "epoch": 9.32,
1284
+ "learning_rate": 4.51624203821656e-05,
1285
+ "loss": 1.0129,
1286
+ "step": 8250
1287
+ },
1288
+ {
1289
+ "epoch": 9.32,
1290
+ "eval_loss": 0.23742474615573883,
1291
+ "eval_runtime": 432.9935,
1292
+ "eval_samples_per_second": 14.337,
1293
+ "eval_steps_per_second": 0.896,
1294
+ "eval_wer": 0.19942642017057927,
1295
+ "step": 8250
1296
+ },
1297
+ {
1298
+ "epoch": 9.38,
1299
+ "learning_rate": 4.492356687898089e-05,
1300
+ "loss": 0.9864,
1301
+ "step": 8300
1302
+ },
1303
+ {
1304
+ "epoch": 9.43,
1305
+ "learning_rate": 4.4689490445859874e-05,
1306
+ "loss": 1.0021,
1307
+ "step": 8350
1308
+ },
1309
+ {
1310
+ "epoch": 9.49,
1311
+ "learning_rate": 4.445063694267516e-05,
1312
+ "loss": 1.0073,
1313
+ "step": 8400
1314
+ },
1315
+ {
1316
+ "epoch": 9.55,
1317
+ "learning_rate": 4.421178343949045e-05,
1318
+ "loss": 0.9999,
1319
+ "step": 8450
1320
+ },
1321
+ {
1322
+ "epoch": 9.6,
1323
+ "learning_rate": 4.3972929936305734e-05,
1324
+ "loss": 1.0164,
1325
+ "step": 8500
1326
+ },
1327
+ {
1328
+ "epoch": 9.6,
1329
+ "eval_loss": 0.2206413298845291,
1330
+ "eval_runtime": 431.5354,
1331
+ "eval_samples_per_second": 14.386,
1332
+ "eval_steps_per_second": 0.899,
1333
+ "eval_wer": 0.19915551113062488,
1334
+ "step": 8500
1335
+ },
1336
+ {
1337
+ "epoch": 9.66,
1338
+ "learning_rate": 4.373407643312102e-05,
1339
+ "loss": 0.9956,
1340
+ "step": 8550
1341
+ },
1342
+ {
1343
+ "epoch": 9.72,
1344
+ "learning_rate": 4.349522292993631e-05,
1345
+ "loss": 0.9662,
1346
+ "step": 8600
1347
+ },
1348
+ {
1349
+ "epoch": 9.77,
1350
+ "learning_rate": 4.3256369426751594e-05,
1351
+ "loss": 0.9781,
1352
+ "step": 8650
1353
+ },
1354
+ {
1355
+ "epoch": 9.83,
1356
+ "learning_rate": 4.301751592356688e-05,
1357
+ "loss": 0.9863,
1358
+ "step": 8700
1359
+ },
1360
+ {
1361
+ "epoch": 9.89,
1362
+ "learning_rate": 4.277866242038217e-05,
1363
+ "loss": 0.975,
1364
+ "step": 8750
1365
+ },
1366
+ {
1367
+ "epoch": 9.89,
1368
+ "eval_loss": 0.22473624348640442,
1369
+ "eval_runtime": 432.1534,
1370
+ "eval_samples_per_second": 14.365,
1371
+ "eval_steps_per_second": 0.898,
1372
+ "eval_wer": 0.19731519799714145,
1373
+ "step": 8750
1374
+ },
1375
+ {
1376
+ "epoch": 9.94,
1377
+ "learning_rate": 4.2539808917197454e-05,
1378
+ "loss": 0.9931,
1379
+ "step": 8800
1380
+ },
1381
+ {
1382
+ "epoch": 10.0,
1383
+ "learning_rate": 4.230095541401274e-05,
1384
+ "loss": 1.0101,
1385
+ "step": 8850
1386
+ },
1387
+ {
1388
+ "epoch": 10.06,
1389
+ "learning_rate": 4.206210191082803e-05,
1390
+ "loss": 1.0034,
1391
+ "step": 8900
1392
+ },
1393
+ {
1394
+ "epoch": 10.11,
1395
+ "learning_rate": 4.1823248407643314e-05,
1396
+ "loss": 1.0018,
1397
+ "step": 8950
1398
+ },
1399
+ {
1400
+ "epoch": 10.17,
1401
+ "learning_rate": 4.15843949044586e-05,
1402
+ "loss": 0.9849,
1403
+ "step": 9000
1404
+ },
1405
+ {
1406
+ "epoch": 10.17,
1407
+ "eval_loss": 0.23245184123516083,
1408
+ "eval_runtime": 431.4778,
1409
+ "eval_samples_per_second": 14.388,
1410
+ "eval_steps_per_second": 0.899,
1411
+ "eval_wer": 0.19526002596990108,
1412
+ "step": 9000
1413
+ },
1414
+ {
1415
+ "epoch": 10.23,
1416
+ "learning_rate": 4.134554140127389e-05,
1417
+ "loss": 0.9953,
1418
+ "step": 9050
1419
+ },
1420
+ {
1421
+ "epoch": 10.28,
1422
+ "learning_rate": 4.1106687898089175e-05,
1423
+ "loss": 0.9639,
1424
+ "step": 9100
1425
+ },
1426
+ {
1427
+ "epoch": 10.34,
1428
+ "learning_rate": 4.086783439490446e-05,
1429
+ "loss": 0.9862,
1430
+ "step": 9150
1431
+ },
1432
+ {
1433
+ "epoch": 10.4,
1434
+ "learning_rate": 4.062898089171975e-05,
1435
+ "loss": 1.0222,
1436
+ "step": 9200
1437
+ },
1438
+ {
1439
+ "epoch": 10.45,
1440
+ "learning_rate": 4.0390127388535035e-05,
1441
+ "loss": 0.9826,
1442
+ "step": 9250
1443
+ },
1444
+ {
1445
+ "epoch": 10.45,
1446
+ "eval_loss": 0.2301308959722519,
1447
+ "eval_runtime": 432.6762,
1448
+ "eval_samples_per_second": 14.348,
1449
+ "eval_steps_per_second": 0.897,
1450
+ "eval_wer": 0.1933730043812531,
1451
+ "step": 9250
1452
+ },
1453
+ {
1454
+ "epoch": 10.51,
1455
+ "learning_rate": 4.015127388535032e-05,
1456
+ "loss": 0.9867,
1457
+ "step": 9300
1458
+ },
1459
+ {
1460
+ "epoch": 10.56,
1461
+ "learning_rate": 3.991242038216561e-05,
1462
+ "loss": 0.9687,
1463
+ "step": 9350
1464
+ },
1465
+ {
1466
+ "epoch": 10.62,
1467
+ "learning_rate": 3.9673566878980895e-05,
1468
+ "loss": 0.9715,
1469
+ "step": 9400
1470
+ },
1471
+ {
1472
+ "epoch": 10.68,
1473
+ "learning_rate": 3.943471337579618e-05,
1474
+ "loss": 0.9914,
1475
+ "step": 9450
1476
+ },
1477
+ {
1478
+ "epoch": 10.73,
1479
+ "learning_rate": 3.919585987261147e-05,
1480
+ "loss": 0.9835,
1481
+ "step": 9500
1482
+ },
1483
+ {
1484
+ "epoch": 10.73,
1485
+ "eval_loss": 0.2191852629184723,
1486
+ "eval_runtime": 439.1976,
1487
+ "eval_samples_per_second": 14.135,
1488
+ "eval_steps_per_second": 0.883,
1489
+ "eval_wer": 0.19420441488318216,
1490
+ "step": 9500
1491
+ },
1492
+ {
1493
+ "epoch": 10.79,
1494
+ "learning_rate": 3.8957006369426755e-05,
1495
+ "loss": 0.9652,
1496
+ "step": 9550
1497
+ },
1498
+ {
1499
+ "epoch": 10.85,
1500
+ "learning_rate": 3.871815286624204e-05,
1501
+ "loss": 0.9614,
1502
+ "step": 9600
1503
+ },
1504
+ {
1505
+ "epoch": 10.9,
1506
+ "learning_rate": 3.847929936305733e-05,
1507
+ "loss": 0.97,
1508
+ "step": 9650
1509
+ },
1510
+ {
1511
+ "epoch": 10.96,
1512
+ "learning_rate": 3.8240445859872615e-05,
1513
+ "loss": 0.9764,
1514
+ "step": 9700
1515
+ },
1516
+ {
1517
+ "epoch": 11.02,
1518
+ "learning_rate": 3.80015923566879e-05,
1519
+ "loss": 0.9676,
1520
+ "step": 9750
1521
+ },
1522
+ {
1523
+ "epoch": 11.02,
1524
+ "eval_loss": 0.2265927493572235,
1525
+ "eval_runtime": 430.7748,
1526
+ "eval_samples_per_second": 14.411,
1527
+ "eval_steps_per_second": 0.901,
1528
+ "eval_wer": 0.19133651573607854,
1529
+ "step": 9750
1530
+ },
1531
+ {
1532
+ "epoch": 11.07,
1533
+ "learning_rate": 3.776273885350319e-05,
1534
+ "loss": 0.9609,
1535
+ "step": 9800
1536
+ },
1537
+ {
1538
+ "epoch": 11.13,
1539
+ "learning_rate": 3.7523885350318475e-05,
1540
+ "loss": 0.9721,
1541
+ "step": 9850
1542
+ },
1543
+ {
1544
+ "epoch": 11.19,
1545
+ "learning_rate": 3.7285031847133755e-05,
1546
+ "loss": 0.9669,
1547
+ "step": 9900
1548
+ },
1549
+ {
1550
+ "epoch": 11.24,
1551
+ "learning_rate": 3.704617834394904e-05,
1552
+ "loss": 0.9643,
1553
+ "step": 9950
1554
+ },
1555
+ {
1556
+ "epoch": 11.3,
1557
+ "learning_rate": 3.680732484076433e-05,
1558
+ "loss": 0.9627,
1559
+ "step": 10000
1560
+ },
1561
+ {
1562
+ "epoch": 11.3,
1563
+ "eval_loss": 0.2193416953086853,
1564
+ "eval_runtime": 432.6083,
1565
+ "eval_samples_per_second": 14.35,
1566
+ "eval_steps_per_second": 0.897,
1567
+ "eval_wer": 0.19205582594561268,
1568
+ "step": 10000
1569
+ },
1570
+ {
1571
+ "epoch": 11.36,
1572
+ "learning_rate": 3.6568471337579616e-05,
1573
+ "loss": 1.0179,
1574
+ "step": 10050
1575
+ },
1576
+ {
1577
+ "epoch": 11.41,
1578
+ "learning_rate": 3.63296178343949e-05,
1579
+ "loss": 0.9575,
1580
+ "step": 10100
1581
+ },
1582
+ {
1583
+ "epoch": 11.47,
1584
+ "learning_rate": 3.609076433121019e-05,
1585
+ "loss": 0.98,
1586
+ "step": 10150
1587
+ },
1588
+ {
1589
+ "epoch": 11.52,
1590
+ "learning_rate": 3.5851910828025476e-05,
1591
+ "loss": 0.9542,
1592
+ "step": 10200
1593
+ },
1594
+ {
1595
+ "epoch": 11.58,
1596
+ "learning_rate": 3.561305732484076e-05,
1597
+ "loss": 0.976,
1598
+ "step": 10250
1599
+ },
1600
+ {
1601
+ "epoch": 11.58,
1602
+ "eval_loss": 0.23090308904647827,
1603
+ "eval_runtime": 432.1501,
1604
+ "eval_samples_per_second": 14.365,
1605
+ "eval_steps_per_second": 0.898,
1606
+ "eval_wer": 0.1881790241669547,
1607
+ "step": 10250
1608
+ },
1609
+ {
1610
+ "epoch": 11.64,
1611
+ "learning_rate": 3.537420382165605e-05,
1612
+ "loss": 0.972,
1613
+ "step": 10300
1614
+ },
1615
+ {
1616
+ "epoch": 11.69,
1617
+ "learning_rate": 3.5135350318471336e-05,
1618
+ "loss": 0.9634,
1619
+ "step": 10350
1620
+ },
1621
+ {
1622
+ "epoch": 11.75,
1623
+ "learning_rate": 3.489649681528662e-05,
1624
+ "loss": 0.9682,
1625
+ "step": 10400
1626
+ },
1627
+ {
1628
+ "epoch": 11.81,
1629
+ "learning_rate": 3.465764331210191e-05,
1630
+ "loss": 0.9638,
1631
+ "step": 10450
1632
+ },
1633
+ {
1634
+ "epoch": 11.86,
1635
+ "learning_rate": 3.4418789808917196e-05,
1636
+ "loss": 0.969,
1637
+ "step": 10500
1638
+ },
1639
+ {
1640
+ "epoch": 11.86,
1641
+ "eval_loss": 0.2268366813659668,
1642
+ "eval_runtime": 433.2795,
1643
+ "eval_samples_per_second": 14.328,
1644
+ "eval_steps_per_second": 0.895,
1645
+ "eval_wer": 0.18859005857240277,
1646
+ "step": 10500
1647
+ },
1648
+ {
1649
+ "epoch": 11.92,
1650
+ "learning_rate": 3.417993630573248e-05,
1651
+ "loss": 0.9698,
1652
+ "step": 10550
1653
+ },
1654
+ {
1655
+ "epoch": 11.98,
1656
+ "learning_rate": 3.394108280254777e-05,
1657
+ "loss": 0.9369,
1658
+ "step": 10600
1659
+ },
1660
+ {
1661
+ "epoch": 12.03,
1662
+ "learning_rate": 3.3702229299363056e-05,
1663
+ "loss": 0.9699,
1664
+ "step": 10650
1665
+ },
1666
+ {
1667
+ "epoch": 12.09,
1668
+ "learning_rate": 3.346337579617834e-05,
1669
+ "loss": 1.0013,
1670
+ "step": 10700
1671
+ },
1672
+ {
1673
+ "epoch": 12.15,
1674
+ "learning_rate": 3.322929936305732e-05,
1675
+ "loss": 0.9611,
1676
+ "step": 10750
1677
+ },
1678
+ {
1679
+ "epoch": 12.15,
1680
+ "eval_loss": 0.2322191596031189,
1681
+ "eval_runtime": 429.3587,
1682
+ "eval_samples_per_second": 14.459,
1683
+ "eval_steps_per_second": 0.904,
1684
+ "eval_wer": 0.18626397750520798,
1685
+ "step": 10750
1686
+ },
1687
+ {
1688
+ "epoch": 12.2,
1689
+ "learning_rate": 3.299044585987261e-05,
1690
+ "loss": 0.9418,
1691
+ "step": 10800
1692
+ },
1693
+ {
1694
+ "epoch": 12.26,
1695
+ "learning_rate": 3.2751592356687896e-05,
1696
+ "loss": 0.9582,
1697
+ "step": 10850
1698
+ },
1699
+ {
1700
+ "epoch": 12.32,
1701
+ "learning_rate": 3.251273885350318e-05,
1702
+ "loss": 0.945,
1703
+ "step": 10900
1704
+ },
1705
+ {
1706
+ "epoch": 12.37,
1707
+ "learning_rate": 3.227388535031847e-05,
1708
+ "loss": 0.9386,
1709
+ "step": 10950
1710
+ },
1711
+ {
1712
+ "epoch": 12.43,
1713
+ "learning_rate": 3.2035031847133757e-05,
1714
+ "loss": 0.9397,
1715
+ "step": 11000
1716
+ },
1717
+ {
1718
+ "epoch": 12.43,
1719
+ "eval_loss": 0.21969455480575562,
1720
+ "eval_runtime": 432.2628,
1721
+ "eval_samples_per_second": 14.362,
1722
+ "eval_steps_per_second": 0.898,
1723
+ "eval_wer": 0.1843676142255271,
1724
+ "step": 11000
1725
+ },
1726
+ {
1727
+ "epoch": 12.49,
1728
+ "learning_rate": 3.179617834394904e-05,
1729
+ "loss": 0.9594,
1730
+ "step": 11050
1731
+ },
1732
+ {
1733
+ "epoch": 12.54,
1734
+ "learning_rate": 3.155732484076433e-05,
1735
+ "loss": 0.9467,
1736
+ "step": 11100
1737
+ },
1738
+ {
1739
+ "epoch": 12.6,
1740
+ "learning_rate": 3.131847133757962e-05,
1741
+ "loss": 0.9609,
1742
+ "step": 11150
1743
+ },
1744
+ {
1745
+ "epoch": 12.65,
1746
+ "learning_rate": 3.1079617834394904e-05,
1747
+ "loss": 0.9446,
1748
+ "step": 11200
1749
+ },
1750
+ {
1751
+ "epoch": 12.71,
1752
+ "learning_rate": 3.084076433121019e-05,
1753
+ "loss": 0.9601,
1754
+ "step": 11250
1755
+ },
1756
+ {
1757
+ "epoch": 12.71,
1758
+ "eval_loss": 0.22107979655265808,
1759
+ "eval_runtime": 432.9535,
1760
+ "eval_samples_per_second": 14.339,
1761
+ "eval_steps_per_second": 0.896,
1762
+ "eval_wer": 0.18711407138920289,
1763
+ "step": 11250
1764
+ },
1765
+ {
1766
+ "epoch": 12.77,
1767
+ "learning_rate": 3.060191082802548e-05,
1768
+ "loss": 0.9497,
1769
+ "step": 11300
1770
+ },
1771
+ {
1772
+ "epoch": 12.82,
1773
+ "learning_rate": 3.036305732484076e-05,
1774
+ "loss": 0.939,
1775
+ "step": 11350
1776
+ },
1777
+ {
1778
+ "epoch": 12.88,
1779
+ "learning_rate": 3.0124203821656047e-05,
1780
+ "loss": 0.9462,
1781
+ "step": 11400
1782
+ },
1783
+ {
1784
+ "epoch": 12.94,
1785
+ "learning_rate": 2.9885350318471334e-05,
1786
+ "loss": 0.9243,
1787
+ "step": 11450
1788
+ },
1789
+ {
1790
+ "epoch": 12.99,
1791
+ "learning_rate": 2.964649681528662e-05,
1792
+ "loss": 0.9718,
1793
+ "step": 11500
1794
+ },
1795
+ {
1796
+ "epoch": 12.99,
1797
+ "eval_loss": 0.20792651176452637,
1798
+ "eval_runtime": 429.7801,
1799
+ "eval_samples_per_second": 14.445,
1800
+ "eval_steps_per_second": 0.903,
1801
+ "eval_wer": 0.189823161788747,
1802
+ "step": 11500
1803
+ },
1804
+ {
1805
+ "epoch": 13.05,
1806
+ "learning_rate": 2.9407643312101907e-05,
1807
+ "loss": 0.9543,
1808
+ "step": 11550
1809
+ },
1810
+ {
1811
+ "epoch": 13.11,
1812
+ "learning_rate": 2.9168789808917194e-05,
1813
+ "loss": 0.9386,
1814
+ "step": 11600
1815
+ },
1816
+ {
1817
+ "epoch": 13.16,
1818
+ "learning_rate": 2.892993630573248e-05,
1819
+ "loss": 0.9662,
1820
+ "step": 11650
1821
+ },
1822
+ {
1823
+ "epoch": 13.22,
1824
+ "learning_rate": 2.8691082802547767e-05,
1825
+ "loss": 0.9426,
1826
+ "step": 11700
1827
+ },
1828
+ {
1829
+ "epoch": 13.28,
1830
+ "learning_rate": 2.8452229299363054e-05,
1831
+ "loss": 0.9347,
1832
+ "step": 11750
1833
+ },
1834
+ {
1835
+ "epoch": 13.28,
1836
+ "eval_loss": 0.2053879350423813,
1837
+ "eval_runtime": 427.5266,
1838
+ "eval_samples_per_second": 14.521,
1839
+ "eval_steps_per_second": 0.908,
1840
+ "eval_wer": 0.1842835390062309,
1841
+ "step": 11750
1842
+ },
1843
+ {
1844
+ "epoch": 13.33,
1845
+ "learning_rate": 2.821337579617834e-05,
1846
+ "loss": 0.9579,
1847
+ "step": 11800
1848
+ },
1849
+ {
1850
+ "epoch": 13.39,
1851
+ "learning_rate": 2.7974522292993628e-05,
1852
+ "loss": 0.9313,
1853
+ "step": 11850
1854
+ },
1855
+ {
1856
+ "epoch": 13.45,
1857
+ "learning_rate": 2.7735668789808914e-05,
1858
+ "loss": 0.9295,
1859
+ "step": 11900
1860
+ },
1861
+ {
1862
+ "epoch": 13.5,
1863
+ "learning_rate": 2.74968152866242e-05,
1864
+ "loss": 0.9437,
1865
+ "step": 11950
1866
+ },
1867
+ {
1868
+ "epoch": 13.56,
1869
+ "learning_rate": 2.7257961783439488e-05,
1870
+ "loss": 0.9377,
1871
+ "step": 12000
1872
+ },
1873
+ {
1874
+ "epoch": 13.56,
1875
+ "eval_loss": 0.20305366814136505,
1876
+ "eval_runtime": 429.8935,
1877
+ "eval_samples_per_second": 14.441,
1878
+ "eval_steps_per_second": 0.903,
1879
+ "eval_wer": 0.18423683055106635,
1880
+ "step": 12000
1881
+ },
1882
+ {
1883
+ "epoch": 13.62,
1884
+ "learning_rate": 2.7019108280254775e-05,
1885
+ "loss": 0.9273,
1886
+ "step": 12050
1887
+ },
1888
+ {
1889
+ "epoch": 13.67,
1890
+ "learning_rate": 2.678025477707006e-05,
1891
+ "loss": 0.9804,
1892
+ "step": 12100
1893
+ },
1894
+ {
1895
+ "epoch": 13.73,
1896
+ "learning_rate": 2.6541401273885348e-05,
1897
+ "loss": 0.9392,
1898
+ "step": 12150
1899
+ },
1900
+ {
1901
+ "epoch": 13.78,
1902
+ "learning_rate": 2.6302547770700635e-05,
1903
+ "loss": 0.9379,
1904
+ "step": 12200
1905
+ },
1906
+ {
1907
+ "epoch": 13.84,
1908
+ "learning_rate": 2.606369426751592e-05,
1909
+ "loss": 0.934,
1910
+ "step": 12250
1911
+ },
1912
+ {
1913
+ "epoch": 13.84,
1914
+ "eval_loss": 0.20586800575256348,
1915
+ "eval_runtime": 428.3313,
1916
+ "eval_samples_per_second": 14.493,
1917
+ "eval_steps_per_second": 0.906,
1918
+ "eval_wer": 0.18060291273926407,
1919
+ "step": 12250
1920
+ },
1921
+ {
1922
+ "epoch": 13.9,
1923
+ "learning_rate": 2.5824840764331208e-05,
1924
+ "loss": 0.9177,
1925
+ "step": 12300
1926
+ },
1927
+ {
1928
+ "epoch": 13.95,
1929
+ "learning_rate": 2.5585987261146495e-05,
1930
+ "loss": 0.9369,
1931
+ "step": 12350
1932
+ },
1933
+ {
1934
+ "epoch": 14.01,
1935
+ "learning_rate": 2.534713375796178e-05,
1936
+ "loss": 0.9438,
1937
+ "step": 12400
1938
+ },
1939
+ {
1940
+ "epoch": 14.07,
1941
+ "learning_rate": 2.510828025477707e-05,
1942
+ "loss": 0.9341,
1943
+ "step": 12450
1944
+ },
1945
+ {
1946
+ "epoch": 14.12,
1947
+ "learning_rate": 2.4869426751592355e-05,
1948
+ "loss": 0.9295,
1949
+ "step": 12500
1950
+ },
1951
+ {
1952
+ "epoch": 14.12,
1953
+ "eval_loss": 0.21221554279327393,
1954
+ "eval_runtime": 432.7246,
1955
+ "eval_samples_per_second": 14.346,
1956
+ "eval_steps_per_second": 0.897,
1957
+ "eval_wer": 0.18605846030248396,
1958
+ "step": 12500
1959
+ },
1960
+ {
1961
+ "epoch": 14.18,
1962
+ "learning_rate": 2.4630573248407642e-05,
1963
+ "loss": 0.9239,
1964
+ "step": 12550
1965
+ },
1966
+ {
1967
+ "epoch": 14.24,
1968
+ "learning_rate": 2.439171974522293e-05,
1969
+ "loss": 0.9235,
1970
+ "step": 12600
1971
+ },
1972
+ {
1973
+ "epoch": 14.29,
1974
+ "learning_rate": 2.4152866242038215e-05,
1975
+ "loss": 0.9631,
1976
+ "step": 12650
1977
+ },
1978
+ {
1979
+ "epoch": 14.35,
1980
+ "learning_rate": 2.3914012738853502e-05,
1981
+ "loss": 0.9467,
1982
+ "step": 12700
1983
+ },
1984
+ {
1985
+ "epoch": 14.41,
1986
+ "learning_rate": 2.367515923566879e-05,
1987
+ "loss": 0.935,
1988
+ "step": 12750
1989
+ },
1990
+ {
1991
+ "epoch": 14.41,
1992
+ "eval_loss": 0.20723822712898254,
1993
+ "eval_runtime": 429.493,
1994
+ "eval_samples_per_second": 14.454,
1995
+ "eval_steps_per_second": 0.903,
1996
+ "eval_wer": 0.17866918269545154,
1997
+ "step": 12750
1998
+ },
1999
+ {
2000
+ "epoch": 14.46,
2001
+ "learning_rate": 2.3436305732484076e-05,
2002
+ "loss": 0.9319,
2003
+ "step": 12800
2004
+ },
2005
+ {
2006
+ "epoch": 14.52,
2007
+ "learning_rate": 2.3197452229299362e-05,
2008
+ "loss": 0.9337,
2009
+ "step": 12850
2010
+ },
2011
+ {
2012
+ "epoch": 14.58,
2013
+ "learning_rate": 2.295859872611465e-05,
2014
+ "loss": 0.9259,
2015
+ "step": 12900
2016
+ },
2017
+ {
2018
+ "epoch": 14.63,
2019
+ "learning_rate": 2.2719745222929936e-05,
2020
+ "loss": 0.9228,
2021
+ "step": 12950
2022
+ },
2023
+ {
2024
+ "epoch": 14.69,
2025
+ "learning_rate": 2.2480891719745222e-05,
2026
+ "loss": 0.9021,
2027
+ "step": 13000
2028
+ },
2029
+ {
2030
+ "epoch": 14.69,
2031
+ "eval_loss": 0.21045178174972534,
2032
+ "eval_runtime": 428.8167,
2033
+ "eval_samples_per_second": 14.477,
2034
+ "eval_steps_per_second": 0.905,
2035
+ "eval_wer": 0.1781273646155427,
2036
+ "step": 13000
2037
+ },
2038
+ {
2039
+ "epoch": 14.75,
2040
+ "learning_rate": 2.224203821656051e-05,
2041
+ "loss": 0.9238,
2042
+ "step": 13050
2043
+ },
2044
+ {
2045
+ "epoch": 14.8,
2046
+ "learning_rate": 2.2003184713375796e-05,
2047
+ "loss": 0.9373,
2048
+ "step": 13100
2049
+ },
2050
+ {
2051
+ "epoch": 14.86,
2052
+ "learning_rate": 2.1764331210191083e-05,
2053
+ "loss": 0.9365,
2054
+ "step": 13150
2055
+ },
2056
+ {
2057
+ "epoch": 14.91,
2058
+ "learning_rate": 2.152547770700637e-05,
2059
+ "loss": 0.9656,
2060
+ "step": 13200
2061
+ },
2062
+ {
2063
+ "epoch": 14.97,
2064
+ "learning_rate": 2.1286624203821656e-05,
2065
+ "loss": 0.9193,
2066
+ "step": 13250
2067
+ },
2068
+ {
2069
+ "epoch": 14.97,
2070
+ "eval_loss": 0.20348267257213593,
2071
+ "eval_runtime": 430.2042,
2072
+ "eval_samples_per_second": 14.43,
2073
+ "eval_steps_per_second": 0.902,
2074
+ "eval_wer": 0.17860379085822115,
2075
+ "step": 13250
2076
+ },
2077
+ {
2078
+ "epoch": 15.03,
2079
+ "learning_rate": 2.1047770700636943e-05,
2080
+ "loss": 0.9366,
2081
+ "step": 13300
2082
+ },
2083
+ {
2084
+ "epoch": 15.08,
2085
+ "learning_rate": 2.080891719745223e-05,
2086
+ "loss": 0.9129,
2087
+ "step": 13350
2088
+ },
2089
+ {
2090
+ "epoch": 15.14,
2091
+ "learning_rate": 2.0570063694267513e-05,
2092
+ "loss": 0.9032,
2093
+ "step": 13400
2094
+ },
2095
+ {
2096
+ "epoch": 15.2,
2097
+ "learning_rate": 2.03312101910828e-05,
2098
+ "loss": 0.9152,
2099
+ "step": 13450
2100
+ },
2101
+ {
2102
+ "epoch": 15.25,
2103
+ "learning_rate": 2.0092356687898086e-05,
2104
+ "loss": 0.9214,
2105
+ "step": 13500
2106
+ },
2107
+ {
2108
+ "epoch": 15.25,
2109
+ "eval_loss": 0.2034832239151001,
2110
+ "eval_runtime": 432.039,
2111
+ "eval_samples_per_second": 14.369,
2112
+ "eval_steps_per_second": 0.898,
2113
+ "eval_wer": 0.17661401066821117,
2114
+ "step": 13500
2115
+ },
2116
+ {
2117
+ "epoch": 15.31,
2118
+ "learning_rate": 1.9853503184713373e-05,
2119
+ "loss": 0.9438,
2120
+ "step": 13550
2121
+ },
2122
+ {
2123
+ "epoch": 15.37,
2124
+ "learning_rate": 1.961464968152866e-05,
2125
+ "loss": 0.9262,
2126
+ "step": 13600
2127
+ },
2128
+ {
2129
+ "epoch": 15.42,
2130
+ "learning_rate": 1.9375796178343947e-05,
2131
+ "loss": 0.9157,
2132
+ "step": 13650
2133
+ },
2134
+ {
2135
+ "epoch": 15.48,
2136
+ "learning_rate": 1.9136942675159233e-05,
2137
+ "loss": 0.9299,
2138
+ "step": 13700
2139
+ },
2140
+ {
2141
+ "epoch": 15.54,
2142
+ "learning_rate": 1.889808917197452e-05,
2143
+ "loss": 0.9048,
2144
+ "step": 13750
2145
+ },
2146
+ {
2147
+ "epoch": 15.54,
2148
+ "eval_loss": 0.19639889895915985,
2149
+ "eval_runtime": 438.8483,
2150
+ "eval_samples_per_second": 14.146,
2151
+ "eval_steps_per_second": 0.884,
2152
+ "eval_wer": 0.17581062523938082,
2153
+ "step": 13750
2154
+ },
2155
+ {
2156
+ "epoch": 15.59,
2157
+ "learning_rate": 1.8659235668789807e-05,
2158
+ "loss": 0.9399,
2159
+ "step": 13800
2160
+ },
2161
+ {
2162
+ "epoch": 15.65,
2163
+ "learning_rate": 1.8420382165605094e-05,
2164
+ "loss": 0.9309,
2165
+ "step": 13850
2166
+ },
2167
+ {
2168
+ "epoch": 15.71,
2169
+ "learning_rate": 1.818152866242038e-05,
2170
+ "loss": 0.9646,
2171
+ "step": 13900
2172
+ },
2173
+ {
2174
+ "epoch": 15.76,
2175
+ "learning_rate": 1.7942675159235667e-05,
2176
+ "loss": 0.9095,
2177
+ "step": 13950
2178
+ },
2179
+ {
2180
+ "epoch": 15.82,
2181
+ "learning_rate": 1.7703821656050954e-05,
2182
+ "loss": 0.9006,
2183
+ "step": 14000
2184
+ },
2185
+ {
2186
+ "epoch": 15.82,
2187
+ "eval_loss": 0.19844159483909607,
2188
+ "eval_runtime": 435.4721,
2189
+ "eval_samples_per_second": 14.256,
2190
+ "eval_steps_per_second": 0.891,
2191
+ "eval_wer": 0.17574523340215045,
2192
+ "step": 14000
2193
+ },
2194
+ {
2195
+ "epoch": 15.87,
2196
+ "learning_rate": 1.746496815286624e-05,
2197
+ "loss": 0.8845,
2198
+ "step": 14050
2199
+ },
2200
+ {
2201
+ "epoch": 15.93,
2202
+ "learning_rate": 1.7226114649681527e-05,
2203
+ "loss": 0.8991,
2204
+ "step": 14100
2205
+ },
2206
+ {
2207
+ "epoch": 15.99,
2208
+ "learning_rate": 1.6987261146496814e-05,
2209
+ "loss": 0.9266,
2210
+ "step": 14150
2211
+ },
2212
+ {
2213
+ "epoch": 16.05,
2214
+ "learning_rate": 1.67484076433121e-05,
2215
+ "loss": 0.9535,
2216
+ "step": 14200
2217
+ },
2218
+ {
2219
+ "epoch": 16.1,
2220
+ "learning_rate": 1.6509554140127387e-05,
2221
+ "loss": 0.9027,
2222
+ "step": 14250
2223
+ },
2224
+ {
2225
+ "epoch": 16.1,
2226
+ "eval_loss": 0.20223206281661987,
2227
+ "eval_runtime": 434.737,
2228
+ "eval_samples_per_second": 14.28,
2229
+ "eval_steps_per_second": 0.892,
2230
+ "eval_wer": 0.17431595467411512,
2231
+ "step": 14250
2232
+ },
2233
+ {
2234
+ "epoch": 16.16,
2235
+ "learning_rate": 1.6270700636942674e-05,
2236
+ "loss": 0.9095,
2237
+ "step": 14300
2238
+ },
2239
+ {
2240
+ "epoch": 16.21,
2241
+ "learning_rate": 1.603184713375796e-05,
2242
+ "loss": 0.9024,
2243
+ "step": 14350
2244
+ },
2245
+ {
2246
+ "epoch": 16.27,
2247
+ "learning_rate": 1.5792993630573248e-05,
2248
+ "loss": 0.9135,
2249
+ "step": 14400
2250
+ },
2251
+ {
2252
+ "epoch": 16.33,
2253
+ "learning_rate": 1.5554140127388534e-05,
2254
+ "loss": 0.9013,
2255
+ "step": 14450
2256
+ },
2257
+ {
2258
+ "epoch": 16.38,
2259
+ "learning_rate": 1.531528662420382e-05,
2260
+ "loss": 0.9083,
2261
+ "step": 14500
2262
+ },
2263
+ {
2264
+ "epoch": 16.38,
2265
+ "eval_loss": 0.19693595170974731,
2266
+ "eval_runtime": 437.2683,
2267
+ "eval_samples_per_second": 14.197,
2268
+ "eval_steps_per_second": 0.887,
2269
+ "eval_wer": 0.1744000298934113,
2270
+ "step": 14500
2271
+ },
2272
+ {
2273
+ "epoch": 16.44,
2274
+ "learning_rate": 1.5076433121019106e-05,
2275
+ "loss": 0.9173,
2276
+ "step": 14550
2277
+ },
2278
+ {
2279
+ "epoch": 16.5,
2280
+ "learning_rate": 1.4837579617834393e-05,
2281
+ "loss": 0.9133,
2282
+ "step": 14600
2283
+ },
2284
+ {
2285
+ "epoch": 16.55,
2286
+ "learning_rate": 1.459872611464968e-05,
2287
+ "loss": 0.9161,
2288
+ "step": 14650
2289
+ },
2290
+ {
2291
+ "epoch": 16.61,
2292
+ "learning_rate": 1.4359872611464966e-05,
2293
+ "loss": 0.8844,
2294
+ "step": 14700
2295
+ },
2296
+ {
2297
+ "epoch": 16.67,
2298
+ "learning_rate": 1.4121019108280253e-05,
2299
+ "loss": 0.9761,
2300
+ "step": 14750
2301
+ },
2302
+ {
2303
+ "epoch": 16.67,
2304
+ "eval_loss": 0.19631367921829224,
2305
+ "eval_runtime": 434.2237,
2306
+ "eval_samples_per_second": 14.297,
2307
+ "eval_steps_per_second": 0.894,
2308
+ "eval_wer": 0.17276523396265192,
2309
+ "step": 14750
2310
+ },
2311
+ {
2312
+ "epoch": 16.72,
2313
+ "learning_rate": 1.388216560509554e-05,
2314
+ "loss": 0.9057,
2315
+ "step": 14800
2316
+ },
2317
+ {
2318
+ "epoch": 16.78,
2319
+ "learning_rate": 1.3643312101910826e-05,
2320
+ "loss": 0.9128,
2321
+ "step": 14850
2322
+ },
2323
+ {
2324
+ "epoch": 16.84,
2325
+ "learning_rate": 1.3404458598726113e-05,
2326
+ "loss": 0.9056,
2327
+ "step": 14900
2328
+ },
2329
+ {
2330
+ "epoch": 16.89,
2331
+ "learning_rate": 1.31656050955414e-05,
2332
+ "loss": 0.9024,
2333
+ "step": 14950
2334
+ },
2335
+ {
2336
+ "epoch": 16.95,
2337
+ "learning_rate": 1.2926751592356687e-05,
2338
+ "loss": 0.9311,
2339
+ "step": 15000
2340
+ },
2341
+ {
2342
+ "epoch": 16.95,
2343
+ "eval_loss": 0.19600756466388702,
2344
+ "eval_runtime": 438.9128,
2345
+ "eval_samples_per_second": 14.144,
2346
+ "eval_steps_per_second": 0.884,
2347
+ "eval_wer": 0.1736807196838772,
2348
+ "step": 15000
2349
+ },
2350
+ {
2351
+ "epoch": 17.01,
2352
+ "learning_rate": 1.2687898089171973e-05,
2353
+ "loss": 0.9372,
2354
+ "step": 15050
2355
+ },
2356
+ {
2357
+ "epoch": 17.06,
2358
+ "learning_rate": 1.244904458598726e-05,
2359
+ "loss": 0.8955,
2360
+ "step": 15100
2361
+ },
2362
+ {
2363
+ "epoch": 17.12,
2364
+ "learning_rate": 1.2210191082802547e-05,
2365
+ "loss": 0.909,
2366
+ "step": 15150
2367
+ },
2368
+ {
2369
+ "epoch": 17.17,
2370
+ "learning_rate": 1.1971337579617834e-05,
2371
+ "loss": 0.9092,
2372
+ "step": 15200
2373
+ },
2374
+ {
2375
+ "epoch": 17.23,
2376
+ "learning_rate": 1.173248407643312e-05,
2377
+ "loss": 0.886,
2378
+ "step": 15250
2379
+ },
2380
+ {
2381
+ "epoch": 17.23,
2382
+ "eval_loss": 0.1928754597902298,
2383
+ "eval_runtime": 438.297,
2384
+ "eval_samples_per_second": 14.164,
2385
+ "eval_steps_per_second": 0.885,
2386
+ "eval_wer": 0.17263445028819116,
2387
+ "step": 15250
2388
+ },
2389
+ {
2390
+ "epoch": 17.29,
2391
+ "learning_rate": 1.1493630573248407e-05,
2392
+ "loss": 0.9053,
2393
+ "step": 15300
2394
+ },
2395
+ {
2396
+ "epoch": 17.34,
2397
+ "learning_rate": 1.1254777070063694e-05,
2398
+ "loss": 0.9056,
2399
+ "step": 15350
2400
+ },
2401
+ {
2402
+ "epoch": 17.4,
2403
+ "learning_rate": 1.101592356687898e-05,
2404
+ "loss": 0.9219,
2405
+ "step": 15400
2406
+ },
2407
+ {
2408
+ "epoch": 17.46,
2409
+ "learning_rate": 1.0777070063694267e-05,
2410
+ "loss": 0.8967,
2411
+ "step": 15450
2412
+ },
2413
+ {
2414
+ "epoch": 17.51,
2415
+ "learning_rate": 1.0538216560509554e-05,
2416
+ "loss": 0.8969,
2417
+ "step": 15500
2418
+ },
2419
+ {
2420
+ "epoch": 17.51,
2421
+ "eval_loss": 0.1928360015153885,
2422
+ "eval_runtime": 442.1109,
2423
+ "eval_samples_per_second": 14.042,
2424
+ "eval_steps_per_second": 0.878,
2425
+ "eval_wer": 0.17337244387979112,
2426
+ "step": 15500
2427
+ },
2428
+ {
2429
+ "epoch": 17.57,
2430
+ "learning_rate": 1.029936305732484e-05,
2431
+ "loss": 0.8899,
2432
+ "step": 15550
2433
+ },
2434
+ {
2435
+ "epoch": 17.63,
2436
+ "learning_rate": 1.0060509554140127e-05,
2437
+ "loss": 0.9056,
2438
+ "step": 15600
2439
+ },
2440
+ {
2441
+ "epoch": 17.68,
2442
+ "learning_rate": 9.821656050955414e-06,
2443
+ "loss": 0.9048,
2444
+ "step": 15650
2445
+ },
2446
+ {
2447
+ "epoch": 17.74,
2448
+ "learning_rate": 9.582802547770701e-06,
2449
+ "loss": 0.9572,
2450
+ "step": 15700
2451
+ },
2452
+ {
2453
+ "epoch": 17.8,
2454
+ "learning_rate": 9.34872611464968e-06,
2455
+ "loss": 0.9084,
2456
+ "step": 15750
2457
+ },
2458
+ {
2459
+ "epoch": 17.8,
2460
+ "eval_loss": 0.19373278319835663,
2461
+ "eval_runtime": 446.3693,
2462
+ "eval_samples_per_second": 13.908,
2463
+ "eval_steps_per_second": 0.869,
2464
+ "eval_wer": 0.17133595523461656,
2465
+ "step": 15750
2466
+ },
2467
+ {
+ "epoch": 17.85,
+ "learning_rate": 9.109872611464967e-06,
+ "loss": 0.8861,
+ "step": 15800
+ },
+ {
+ "epoch": 17.91,
+ "learning_rate": 8.871019108280254e-06,
+ "loss": 0.8842,
+ "step": 15850
+ },
+ {
+ "epoch": 17.97,
+ "learning_rate": 8.63216560509554e-06,
+ "loss": 0.8949,
+ "step": 15900
+ },
+ {
+ "epoch": 18.02,
+ "learning_rate": 8.398089171974522e-06,
+ "loss": 0.8977,
+ "step": 15950
+ },
+ {
+ "epoch": 18.08,
+ "learning_rate": 8.159235668789809e-06,
+ "loss": 0.8795,
+ "step": 16000
+ },
+ {
+ "epoch": 18.08,
+ "eval_loss": 0.1977699100971222,
+ "eval_runtime": 437.7611,
+ "eval_samples_per_second": 14.181,
+ "eval_steps_per_second": 0.886,
+ "eval_wer": 0.17086887068297102,
+ "step": 16000
+ },
+ {
+ "epoch": 18.14,
+ "learning_rate": 7.920382165605094e-06,
+ "loss": 0.8984,
+ "step": 16050
+ },
+ {
+ "epoch": 18.19,
+ "learning_rate": 7.68152866242038e-06,
+ "loss": 0.9005,
+ "step": 16100
+ },
+ {
+ "epoch": 18.25,
+ "learning_rate": 7.4426751592356675e-06,
+ "loss": 0.8981,
+ "step": 16150
+ },
+ {
+ "epoch": 18.3,
+ "learning_rate": 7.203821656050954e-06,
+ "loss": 0.9029,
+ "step": 16200
+ },
+ {
+ "epoch": 18.36,
+ "learning_rate": 6.964968152866241e-06,
+ "loss": 0.8883,
+ "step": 16250
+ },
+ {
+ "epoch": 18.36,
+ "eval_loss": 0.19563348591327667,
+ "eval_runtime": 434.7761,
+ "eval_samples_per_second": 14.279,
+ "eval_steps_per_second": 0.892,
+ "eval_wer": 0.17032705260306222,
+ "step": 16250
+ },
+ {
+ "epoch": 18.42,
+ "learning_rate": 6.726114649681528e-06,
+ "loss": 0.8919,
+ "step": 16300
+ },
+ {
+ "epoch": 18.47,
+ "learning_rate": 6.4872611464968145e-06,
+ "loss": 0.8978,
+ "step": 16350
+ },
+ {
+ "epoch": 18.53,
+ "learning_rate": 6.248407643312101e-06,
+ "loss": 0.8897,
+ "step": 16400
+ },
+ {
+ "epoch": 18.59,
+ "learning_rate": 6.009554140127388e-06,
+ "loss": 0.9477,
+ "step": 16450
+ },
+ {
+ "epoch": 18.64,
+ "learning_rate": 5.770700636942675e-06,
+ "loss": 0.8901,
+ "step": 16500
+ },
+ {
+ "epoch": 18.64,
+ "eval_loss": 0.19332656264305115,
+ "eval_runtime": 439.133,
+ "eval_samples_per_second": 14.137,
+ "eval_steps_per_second": 0.884,
+ "eval_wer": 0.17053256980578624,
+ "step": 16500
+ },
+ {
+ "epoch": 18.7,
+ "learning_rate": 5.531847133757961e-06,
+ "loss": 0.8992,
+ "step": 16550
+ },
+ {
+ "epoch": 18.76,
+ "learning_rate": 5.292993630573248e-06,
+ "loss": 0.8988,
+ "step": 16600
+ },
+ {
+ "epoch": 18.81,
+ "learning_rate": 5.054140127388535e-06,
+ "loss": 0.8885,
+ "step": 16650
+ },
+ {
+ "epoch": 18.87,
+ "learning_rate": 4.815286624203822e-06,
+ "loss": 0.8837,
+ "step": 16700
+ },
+ {
+ "epoch": 18.93,
+ "learning_rate": 4.576433121019108e-06,
+ "loss": 0.8922,
+ "step": 16750
+ },
+ {
+ "epoch": 18.93,
+ "eval_loss": 0.1962379515171051,
+ "eval_runtime": 444.5287,
+ "eval_samples_per_second": 13.965,
+ "eval_steps_per_second": 0.873,
+ "eval_wer": 0.17109307126776088,
+ "step": 16750
+ },
+ {
+ "epoch": 18.98,
+ "learning_rate": 4.337579617834394e-06,
+ "loss": 0.8943,
+ "step": 16800
+ },
+ {
+ "epoch": 19.04,
+ "learning_rate": 4.098726114649681e-06,
+ "loss": 0.9171,
+ "step": 16850
+ },
+ {
+ "epoch": 19.1,
+ "learning_rate": 3.859872611464968e-06,
+ "loss": 0.9144,
+ "step": 16900
+ },
+ {
+ "epoch": 19.15,
+ "learning_rate": 3.6210191082802544e-06,
+ "loss": 0.9517,
+ "step": 16950
+ },
+ {
+ "epoch": 19.21,
+ "learning_rate": 3.382165605095541e-06,
+ "loss": 0.8765,
+ "step": 17000
+ },
+ {
+ "epoch": 19.21,
+ "eval_loss": 0.19622743129730225,
+ "eval_runtime": 445.8046,
+ "eval_samples_per_second": 13.925,
+ "eval_steps_per_second": 0.87,
+ "eval_wer": 0.17106504619466215,
+ "step": 17000
+ },
+ {
+ "epoch": 19.27,
+ "learning_rate": 3.143312101910828e-06,
+ "loss": 0.9072,
+ "step": 17050
+ },
+ {
+ "epoch": 19.32,
+ "learning_rate": 2.9044585987261146e-06,
+ "loss": 0.8897,
+ "step": 17100
+ },
+ {
+ "epoch": 19.38,
+ "learning_rate": 2.6656050955414013e-06,
+ "loss": 0.8879,
+ "step": 17150
+ },
+ {
+ "epoch": 19.43,
+ "learning_rate": 2.426751592356688e-06,
+ "loss": 0.883,
+ "step": 17200
+ },
+ {
+ "epoch": 19.49,
+ "learning_rate": 2.1878980891719744e-06,
+ "loss": 0.8992,
+ "step": 17250
+ },
+ {
+ "epoch": 19.49,
+ "eval_loss": 0.19645148515701294,
+ "eval_runtime": 447.1526,
+ "eval_samples_per_second": 13.883,
+ "eval_steps_per_second": 0.868,
+ "eval_wer": 0.17034573598512803,
+ "step": 17250
+ },
+ {
+ "epoch": 19.55,
+ "learning_rate": 1.949044585987261e-06,
+ "loss": 0.8969,
+ "step": 17300
+ },
+ {
+ "epoch": 19.6,
+ "learning_rate": 1.7101910828025476e-06,
+ "loss": 0.872,
+ "step": 17350
+ },
+ {
+ "epoch": 19.66,
+ "learning_rate": 1.4713375796178341e-06,
+ "loss": 0.8984,
+ "step": 17400
+ },
+ {
+ "epoch": 19.72,
+ "learning_rate": 1.2324840764331209e-06,
+ "loss": 0.8913,
+ "step": 17450
+ },
+ {
+ "epoch": 19.77,
+ "learning_rate": 9.936305732484076e-07,
+ "loss": 0.8778,
+ "step": 17500
+ },
+ {
+ "epoch": 19.77,
+ "eval_loss": 0.19571013748645782,
+ "eval_runtime": 442.6975,
+ "eval_samples_per_second": 14.023,
+ "eval_steps_per_second": 0.876,
+ "eval_wer": 0.16990667650658123,
+ "step": 17500
+ },
+ {
+ "epoch": 19.83,
+ "learning_rate": 7.547770700636942e-07,
+ "loss": 0.8687,
+ "step": 17550
+ },
+ {
+ "epoch": 19.89,
+ "learning_rate": 5.159235668789809e-07,
+ "loss": 0.8858,
+ "step": 17600
+ },
+ {
+ "epoch": 19.94,
+ "learning_rate": 2.770700636942675e-07,
+ "loss": 0.8854,
+ "step": 17650
+ },
+ {
+ "epoch": 20.0,
+ "learning_rate": 3.821656050955413e-08,
+ "loss": 0.8898,
+ "step": 17700
+ },
+ {
+ "epoch": 20.0,
+ "step": 17700,
+ "total_flos": 2.3221664293970497e+20,
+ "train_loss": 1.2649466082470566,
+ "train_runtime": 115228.5602,
+ "train_samples_per_second": 9.837,
+ "train_steps_per_second": 0.154
+ }
+ ],
+ "max_steps": 17700,
+ "num_train_epochs": 20,
+ "total_flos": 2.3221664293970497e+20,
+ "trial_name": null,
+ "trial_params": null
+ }