kenzheng99 committed on
Commit c458396
1 Parent(s): b6102e9

Update model

README.md CHANGED
@@ -1,3 +1,363 @@
  ---
+ tags:
+ - espnet
+ - audio
+ - automatic-speech-recognition
+ language: en
+ datasets:
+ - iam
  license: cc-by-4.0
  ---
+
+ ## ESPnet2 ASR model
+
+ ### `espnet/iam_handwriting_ocr`
+
+ This model was trained by kenzheng99 using the iam recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
+ if you haven't done that already.
+
+ ```bash
+ cd espnet
+ # Pin the ESPnet revision that the results below were produced with
+ git checkout 2169367022b8939d22005e8cf45a65bb20bc0768
+ pip install -e .
+ cd egs2/iam/ocr1
+ # Prepare the data, skip training, and download the pretrained model
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/iam_handwriting_ocr
+ ```
+
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Mon Nov 7 13:40:17 EST 2022`
+ - python version: `3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]`
+ - espnet version: `espnet 202209`
+ - pytorch version: `pytorch 1.10.0`
+ - Git hash: `2169367022b8939d22005e8cf45a65bb20bc0768`
+ - Commit date: `Thu Nov 3 20:38:03 2022 -0400`
+
+ ## asr_train_asr_conformer_extracted_en_char
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.ave/test|2915|25932|80.5|17.3|2.2|0.8|20.3|72.8|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.ave/test|2915|125616|94.0|4.2|1.8|0.7|6.7|72.8|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+
+ ## ASR config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/train_asr_conformer.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_train_asr_conformer_extracted_en_char
+ ngpu: 1
+ seed: 0
+ num_workers: 1
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 35197
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 200
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 64
+ valid_batch_size: null
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_extracted_en_char/train/speech_shape
+ - exp/asr_stats_extracted_en_char/train/text_shape.char
+ valid_shape_file:
+ - exp/asr_stats_extracted_en_char/valid/speech_shape
+ - exp/asr_stats_extracted_en_char/valid/text_shape.char
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 800
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/extracted/train/feats.scp
+     - speech
+     - kaldi_ark
+ -   - dump/extracted/train/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/extracted/valid/feats.scp
+     - speech
+     - kaldi_ark
+ -   - dump/extracted/valid/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 0.002
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 15000
+ token_list:
+ - <blank>
+ - <unk>
+ - <space>
+ - e
+ - t
+ - a
+ - o
+ - n
+ - i
+ - r
+ - s
+ - h
+ - l
+ - d
+ - c
+ - u
+ - m
+ - f
+ - p
+ - g
+ - y
+ - w
+ - b
+ - .
+ - ','
+ - v
+ - k
+ - '-'
+ - T
+ - ''''
+ - M
+ - I
+ - A
+ - '"'
+ - S
+ - P
+ - H
+ - B
+ - C
+ - W
+ - N
+ - G
+ - x
+ - R
+ - E
+ - L
+ - F
+ - '0'
+ - D
+ - '1'
+ - j
+ - O
+ - q
+ - U
+ - K
+ - '!'
+ - '3'
+ - '9'
+ - (
+ - z
+ - )
+ - ':'
+ - V
+ - ;
+ - '5'
+ - '2'
+ - J
+ - '8'
+ - Y
+ - '4'
+ - '6'
+ - '?'
+ - '#'
+ - '&'
+ - '7'
+ - /
+ - '*'
+ - Q
+ - X
+ - Z
+ - +
+ - <sos/eos>
+ init: xavier_uniform
+ input_size: 100
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: null
+     zero_infinity: true
+ joint_net_conf: null
+ use_preprocessor: true
+ token_type: char
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ short_noise_thres: 0.5
+ frontend: null
+ frontend_conf: {}
+ specaug: null
+ specaug_conf: {}
+ normalize: global_mvn
+ normalize_conf:
+     stats_file: exp/asr_stats_extracted_en_char/train/feats_stats.npz
+ model: espnet
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+ preencoder: null
+ preencoder_conf: {}
+ encoder: conformer
+ encoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: '202209'
+ distributed: true
+ ```
+
+ </details>
+
+ ### Citing ESPnet
+
+ ```bibtex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
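
The ASR config in the README sets `scheduler: warmuplr` with `warmup_steps: 15000` and a base `lr` of 0.002. As a rough illustration, and assuming this scheduler follows the usual Noam-style rule (a linear ramp up to the base rate, then inverse-square-root decay), the learning rate at a given step can be sketched as below. The function name `warmup_lr` is hypothetical and is not an ESPnet API.

```python
def warmup_lr(step: int, base_lr: float = 0.002, warmup_steps: int = 15000) -> float:
    """Noam-style warmup: linear ramp to base_lr, then 1/sqrt(step) decay.

    The defaults mirror optim_conf.lr and scheduler_conf.warmup_steps from the
    config; the formula itself is an assumption about how warmuplr behaves.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)


if __name__ == "__main__":
    # Halfway through warmup, at the peak, and well into the decay phase.
    for step in (7500, 15000, 60000):
        print(step, f"{warmup_lr(step):.6f}")
```

Under this assumption the rate peaks at exactly the base `lr` when `step == warmup_steps`, so 0.002 is the maximum learning rate reached rather than a constant one.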
exp/asr_stats_extracted_en_char/train/feats_stats.npz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4d30a0d106243130f58849bf216bcfd73a0e650e0ab1e78244aa40a3c77d4c9c
+ size 1562
exp/asr_train_asr_conformer_extracted_en_char/RESULTS.md ADDED
@@ -0,0 +1,27 @@
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Mon Nov 7 13:40:17 EST 2022`
+ - python version: `3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]`
+ - espnet version: `espnet 202209`
+ - pytorch version: `pytorch 1.10.0`
+ - Git hash: `2169367022b8939d22005e8cf45a65bb20bc0768`
+ - Commit date: `Thu Nov 3 20:38:03 2022 -0400`
+
+ ## asr_train_asr_conformer_extracted_en_char
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.ave/test|2915|25932|80.5|17.3|2.2|0.8|20.3|72.8|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.ave/test|2915|125616|94.0|4.2|1.8|0.7|6.7|72.8|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
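
The columns in the WER and CER tables are related by simple arithmetic: in the usual scoring convention, `Err` is the sum of substitutions, deletions and insertions, and `Corr`, `Sub` and `Del` together account for all reference tokens. The sketch below checks the two reported rows against that convention. The `ScoreRow` class and its tolerance are illustrative assumptions, not part of the ESPnet scoring scripts.

```python
from dataclasses import dataclass


@dataclass
class ScoreRow:
    """One row of a results table; all values are percentages."""
    corr: float
    sub: float
    dele: float
    ins: float
    err: float

    def is_consistent(self, tol: float = 0.2) -> bool:
        # Values are rounded to one decimal in the table, so allow some slack.
        err_ok = abs((self.sub + self.dele + self.ins) - self.err) <= tol
        ref_ok = abs((self.corr + self.sub + self.dele) - 100.0) <= tol
        return err_ok and ref_ok


# Rows copied from the WER and CER tables above.
wer = ScoreRow(corr=80.5, sub=17.3, dele=2.2, ins=0.8, err=20.3)
cer = ScoreRow(corr=94.0, sub=4.2, dele=1.8, ins=0.7, err=6.7)

if __name__ == "__main__":
    print("WER row consistent:", wer.is_consistent())
    print("CER row consistent:", cer.is_consistent())
```

Both rows satisfy the two identities, which suggests the tables were copied without transcription errors.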
exp/asr_train_asr_conformer_extracted_en_char/config.yaml ADDED
@@ -0,0 +1,264 @@
+ config: conf/train_asr_conformer.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_train_asr_conformer_extracted_en_char
+ ngpu: 1
+ seed: 0
+ num_workers: 1
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 35197
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 200
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 64
+ valid_batch_size: null
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_extracted_en_char/train/speech_shape
+ - exp/asr_stats_extracted_en_char/train/text_shape.char
+ valid_shape_file:
+ - exp/asr_stats_extracted_en_char/valid/speech_shape
+ - exp/asr_stats_extracted_en_char/valid/text_shape.char
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 800
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/extracted/train/feats.scp
+     - speech
+     - kaldi_ark
+ -   - dump/extracted/train/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/extracted/valid/feats.scp
+     - speech
+     - kaldi_ark
+ -   - dump/extracted/valid/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 0.002
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 15000
+ token_list:
+ - <blank>
+ - <unk>
+ - <space>
+ - e
+ - t
+ - a
+ - o
+ - n
+ - i
+ - r
+ - s
+ - h
+ - l
+ - d
+ - c
+ - u
+ - m
+ - f
+ - p
+ - g
+ - y
+ - w
+ - b
+ - .
+ - ','
+ - v
+ - k
+ - '-'
+ - T
+ - ''''
+ - M
+ - I
+ - A
+ - '"'
+ - S
+ - P
+ - H
+ - B
+ - C
+ - W
+ - N
+ - G
+ - x
+ - R
+ - E
+ - L
+ - F
+ - '0'
+ - D
+ - '1'
+ - j
+ - O
+ - q
+ - U
+ - K
+ - '!'
+ - '3'
+ - '9'
+ - (
+ - z
+ - )
+ - ':'
+ - V
+ - ;
+ - '5'
+ - '2'
+ - J
+ - '8'
+ - Y
+ - '4'
+ - '6'
+ - '?'
+ - '#'
+ - '&'
+ - '7'
+ - /
+ - '*'
+ - Q
+ - X
+ - Z
+ - +
+ - <sos/eos>
+ init: xavier_uniform
+ input_size: 100
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: null
+     zero_infinity: true
+ joint_net_conf: null
+ use_preprocessor: true
+ token_type: char
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ short_noise_thres: 0.5
+ frontend: null
+ frontend_conf: {}
+ specaug: null
+ specaug_conf: {}
+ normalize: global_mvn
+ normalize_conf:
+     stats_file: exp/asr_stats_extracted_en_char/train/feats_stats.npz
+ model: espnet
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+ preencoder: null
+ preencoder_conf: {}
+ encoder: conformer
+ encoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: '202209'
+ distributed: true
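
The `model_conf.ctc_weight: 0.3` entry in `config.yaml` controls how the CTC and attention-decoder objectives are mixed during training. Assuming the standard hybrid CTC/attention formulation, the total loss is a convex combination of the two terms, as sketched below. The function name `joint_loss` and the example loss values are hypothetical and serve only to show the weighting.

```python
def joint_loss(loss_ctc: float, loss_att: float, ctc_weight: float = 0.3) -> float:
    """Interpolate the CTC and attention losses with weight ctc_weight."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att


if __name__ == "__main__":
    # Made-up loss values, chosen only to illustrate the 0.3 / 0.7 split.
    print(joint_loss(loss_ctc=10.0, loss_att=20.0))
```

With `ctc_weight` at 0.3, the attention decoder contributes 70% of the training signal, while the CTC branch mainly encourages monotonic alignment.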
exp/asr_train_asr_conformer_extracted_en_char/images/acc.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/backward_time.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/cer.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/cer_ctc.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/forward_time.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/gpu_max_cached_mem_GB.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/iter_time.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/loss.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/loss_att.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/loss_ctc.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/optim0_lr0.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/optim_step_time.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/train_time.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/images/wer.png ADDED
exp/asr_train_asr_conformer_extracted_en_char/valid.acc.ave_10best.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:acd1e13811f40127205f038b2f074b1f17847a32de52c0debe6c959c56eb5539
+ size 123344039
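
The three lines added for `valid.acc.ave_10best.pth` are a Git LFS pointer, not the checkpoint itself: the repository stores only the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; the helper name `parse_lfs_pointer` is hypothetical, and it assumes the simple `key value` line layout shown above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer made of 'key value' lines into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:acd1e13811f40127205f038b2f074b1f17847a32de52c0debe6c959c56eb5539
size 123344039
"""

info = parse_lfs_pointer(POINTER)

if __name__ == "__main__":
    print(info["hash_algo"], f"{info['size'] / 2**20:.1f} MiB")
```

The `size` field shows that the averaged checkpoint is roughly 118 MiB, which is why it is tracked through LFS instead of being committed directly.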
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: '202209'
+ files:
+     asr_model_file: exp/asr_train_asr_conformer_extracted_en_char/valid.acc.ave_10best.pth
+ python: "3.7.13 (default, Mar 29 2022, 02:18:16) \n[GCC 7.5.0]"
+ timestamp: 1667846424.799758
+ torch: 1.10.0
+ yaml_files:
+     asr_train_config: exp/asr_train_asr_conformer_extracted_en_char/config.yaml
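
The `timestamp` field in `meta.yaml` looks like seconds since the Unix epoch (an assumption; the file does not state its unit). Under that reading, the short sketch below converts it to a UTC datetime so it can be compared with the `date` line in `RESULTS.md`. The helper name `packed_at` is hypothetical.

```python
from datetime import datetime, timezone

# Value copied from meta.yaml.
META_TIMESTAMP = 1667846424.799758


def packed_at(timestamp: float) -> datetime:
    """Interpret the timestamp as Unix epoch seconds and return a UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


if __name__ == "__main__":
    print(packed_at(META_TIMESTAMP).isoformat(timespec="seconds"))
```

The result falls on 2022-11-07 at 18:40:24 UTC, which is 13:40:24 EST, a few seconds after the `Mon Nov 7 13:40:17 EST 2022` date recorded in the results, so the two files appear to come from the same packaging run.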