dzeinali committed on
Commit
e83fc97
1 Parent(s): 817dbdc

Update model

README.md ADDED
@@ -0,0 +1,636 @@
1
+ ---
2
+ tags:
3
+ - espnet
4
+ - audio
5
+ - automatic-speech-recognition
6
+ language: fr
7
+ datasets:
8
+ - commonvoice
9
+ license: cc-by-4.0
10
+ ---
11
+
12
+ ## ESPnet2 ASR model
13
+
14
+ ### `espnet/french_commonvoice_blstm`
15
+
16
+ This model was trained by dzeinali using the commonvoice recipe in [espnet](https://github.com/espnet/espnet/).
17
+
18
+ ### Demo: How to use in ESPnet2
19
+
20
+ ```bash
21
+ cd espnet
22
+ git checkout 716eb8f92e19708acfd08ba3bd39d40890d3a84b
23
+ pip install -e .
24
+ cd egs2/commonvoice/asr1
25
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/french_commonvoice_blstm
26
+ ```
27
+
28
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
29
+ # RESULTS
30
+ ## Environments
31
+ - date: `Fri Apr 29 17:20:37 EDT 2022`
32
+ - python version: `3.9.5 (default, Jun 4 2021, 12:28:51) [GCC 7.5.0]`
33
+ - espnet version: `espnet 0.10.6a1`
34
+ - pytorch version: `pytorch 1.8.1+cu102`
35
+ - Git hash: `716eb8f92e19708acfd08ba3bd39d40890d3a84b`
36
+ - Commit date: `Thu Apr 28 19:50:59 2022 -0400`
37
+
38
+ ## asr_train_asr_rnn_raw_fr_bpe350_sp
39
+ ### WER
40
+
41
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
42
+ |---|---|---|---|---|---|---|---|---|
43
+ |decode_rnn_asr_model_valid.acc.best/test_fr|15621|151227|75.1|22.6|2.3|2.3|27.2|81.0|
44
+
45
+ ### CER
46
+
47
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
48
+ |---|---|---|---|---|---|---|---|---|
49
+ |decode_rnn_asr_model_valid.acc.best/test_fr|15621|952803|92.9|3.6|3.5|2.0|9.1|81.0|
50
+
51
+ ### TER
52
+
53
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
54
+ |---|---|---|---|---|---|---|---|---|
55
+ |decode_rnn_asr_model_valid.acc.best/test_fr|15621|730898|89.9|6.5|3.6|1.9|12.0|81.0|
56
+
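The three result tables above are internally consistent with the usual scoring convention: `Err` is the sum of `Sub`, `Del`, and `Ins`, and `Corr`, `Sub`, and `Del` together account for all reference tokens. The following is a minimal Python sketch that checks this against the reported rows. The `Row` class and `check_row` helper are hypothetical names used only for illustration; they are not part of ESPnet.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One result row; all values are percentages as printed in the tables."""
    name: str
    corr: float
    sub: float
    dele: float  # "Del" column ("del" is a Python keyword)
    ins: float
    err: float


# Values copied from the WER, CER, and TER tables for test_fr.
ROWS = [
    Row("WER", 75.1, 22.6, 2.3, 2.3, 27.2),
    Row("CER", 92.9, 3.6, 3.5, 2.0, 9.1),
    Row("TER", 89.9, 6.5, 3.6, 1.9, 12.0),
]


def check_row(row: Row, tol: float = 0.2) -> bool:
    """Return True if the row satisfies both identities within rounding.

    The tolerance allows for each column being rounded to one decimal place.
    """
    err_ok = abs(row.sub + row.dele + row.ins - row.err) <= tol
    ref_ok = abs(row.corr + row.sub + row.dele - 100.0) <= tol
    return err_ok and ref_ok


for row in ROWS:
    print(row.name, check_row(row))
```

Because insertions are counted against the reference length, `Err` can exceed `100 - Corr`, which is what happens in all three rows here.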
57
+ ## ASR config
58
+
59
+ <details><summary>expand</summary>
60
+
61
+ ```
62
+ config: conf/tuning/train_asr_rnn.yaml
63
+ print_config: false
64
+ log_level: INFO
65
+ dry_run: false
66
+ iterator_type: sequence
67
+ output_dir: exp/asr_train_asr_rnn_raw_fr_bpe350_sp
68
+ ngpu: 1
69
+ seed: 0
70
+ num_workers: 1
71
+ num_att_plot: 3
72
+ dist_backend: nccl
73
+ dist_init_method: env://
74
+ dist_world_size: null
75
+ dist_rank: null
76
+ local_rank: 0
77
+ dist_master_addr: null
78
+ dist_master_port: null
79
+ dist_launcher: null
80
+ multiprocessing_distributed: false
81
+ unused_parameters: false
82
+ sharded_ddp: false
83
+ cudnn_enabled: true
84
+ cudnn_benchmark: false
85
+ cudnn_deterministic: true
86
+ collect_stats: false
87
+ write_collected_feats: false
88
+ max_epoch: 15
89
+ patience: 3
90
+ val_scheduler_criterion:
91
+ - valid
92
+ - loss
93
+ early_stopping_criterion:
94
+ - valid
95
+ - loss
96
+ - min
97
+ best_model_criterion:
98
+ - - train
99
+ - loss
100
+ - min
101
+ - - valid
102
+ - loss
103
+ - min
104
+ - - train
105
+ - acc
106
+ - max
107
+ - - valid
108
+ - acc
109
+ - max
110
+ keep_nbest_models:
111
+ - 10
112
+ nbest_averaging_interval: 0
113
+ grad_clip: 5.0
114
+ grad_clip_type: 2.0
115
+ grad_noise: false
116
+ accum_grad: 1
117
+ no_forward_run: false
118
+ resume: true
119
+ train_dtype: float32
120
+ use_amp: false
121
+ log_interval: null
122
+ use_matplotlib: true
123
+ use_tensorboard: true
124
+ use_wandb: false
125
+ wandb_project: null
126
+ wandb_id: null
127
+ wandb_entity: null
128
+ wandb_name: null
129
+ wandb_model_log_interval: -1
130
+ detect_anomaly: false
131
+ pretrain_path: null
132
+ init_param: []
133
+ ignore_init_mismatch: false
134
+ freeze_param: []
135
+ num_iters_per_epoch: null
136
+ batch_size: 30
137
+ valid_batch_size: null
138
+ batch_bins: 1000000
139
+ valid_batch_bins: null
140
+ train_shape_file:
141
+ - exp/asr_stats_raw_fr_bpe350_sp/train/speech_shape
142
+ - exp/asr_stats_raw_fr_bpe350_sp/train/text_shape.bpe
143
+ valid_shape_file:
144
+ - exp/asr_stats_raw_fr_bpe350_sp/valid/speech_shape
145
+ - exp/asr_stats_raw_fr_bpe350_sp/valid/text_shape.bpe
146
+ batch_type: folded
147
+ valid_batch_type: null
148
+ fold_length:
149
+ - 80000
150
+ - 150
151
+ sort_in_batch: descending
152
+ sort_batch: descending
153
+ multiple_iterator: false
154
+ chunk_length: 500
155
+ chunk_shift_ratio: 0.5
156
+ num_cache_chunks: 1024
157
+ train_data_path_and_name_and_type:
158
+ - - dump/raw/train_fr_sp/wav.scp
159
+ - speech
160
+ - sound
161
+ - - dump/raw/train_fr_sp/text
162
+ - text
163
+ - text
164
+ valid_data_path_and_name_and_type:
165
+ - - dump/raw/dev_fr/wav.scp
166
+ - speech
167
+ - sound
168
+ - - dump/raw/dev_fr/text
169
+ - text
170
+ - text
171
+ allow_variable_data_keys: false
172
+ max_cache_size: 0.0
173
+ max_cache_fd: 32
174
+ valid_max_cache_size: null
175
+ optim: adadelta
176
+ optim_conf:
177
+ lr: 0.1
178
+ scheduler: null
179
+ scheduler_conf: {}
180
+ token_list:
181
+ - <blank>
182
+ - <unk>
183
+ - S
184
+ - ▁
185
+ - E
186
+ - I
187
+ - T
188
+ - A
189
+ - U
190
+ - O
191
+ - .
192
+ - L
193
+ - R
194
+ - é
195
+ - P
196
+ - C
197
+ - V
198
+ - 'ON'
199
+ - M
200
+ - ▁DE
201
+ - ','
202
+ - N
203
+ - ▁S
204
+ - D
205
+ - IN
206
+ - ''''
207
+ - OU
208
+ - ▁D
209
+ - G
210
+ - IS
211
+ - ▁P
212
+ - ER
213
+ - ▁C
214
+ - ▁L
215
+ - ▁LA
216
+ - B
217
+ - ▁"
218
+ - ▁A
219
+ - RE
220
+ - AN
221
+ - ."
222
+ - ▁M
223
+ - ▁F
224
+ - '-'
225
+ - F
226
+ - ▁T
227
+ - ES
228
+ - ENT
229
+ - ▁LE
230
+ - EN
231
+ - IT
232
+ - LE
233
+ - ▁N
234
+ - è
235
+ - H
236
+ - ’
237
+ - Y
238
+ - X
239
+ - Z
240
+ - K
241
+ - J
242
+ - ê
243
+ - '?'
244
+ - '!'
245
+ - É
246
+ - ç
247
+ - W
248
+ - à
249
+ - ô
250
+ - â
251
+ - Q
252
+ - î
253
+ - À
254
+ - '"'
255
+ - œ
256
+ - û
257
+ - ù
258
+ - ï
259
+ - ':'
260
+ - ;
261
+ - —
262
+ - È
263
+ - «
264
+ - »
265
+ - Ç
266
+ - Ê
267
+ - ë
268
+ - á
269
+ - ü
270
+ - í
271
+ - ö
272
+ - ó
273
+ - )
274
+ - Î
275
+ - Â
276
+ - ō
277
+ - ä
278
+ - –
279
+ - Ô
280
+ - ć
281
+ - š
282
+ - '&'
283
+ - ñ
284
+ - '='
285
+ - ł
286
+ - č
287
+ - Û
288
+ - ú
289
+ - ū
290
+ - ø
291
+ - ā
292
+ - ã
293
+ - ă
294
+ - /
295
+ - ń
296
+ - _
297
+ - ș
298
+ - å
299
+ - æ
300
+ - °
301
+ - ß
302
+ - “
303
+ - ”
304
+ - ž
305
+ - ı
306
+ - Œ
307
+ - Ö
308
+ - ř
309
+ - Š
310
+ - ý
311
+ - Ō
312
+ - ‘
313
+ - ş
314
+ - ·
315
+ - o
316
+ - ę
317
+ - ÿ
318
+ - Å
319
+ - ą
320
+ - ð
321
+ - ī
322
+ - ò
323
+ - ż
324
+ - ě
325
+ - ś
326
+ - '`'
327
+ - Ë
328
+ - ì
329
+ - ē
330
+ - ğ
331
+ - İ
332
+ - '*'
333
+ - Í
334
+ - ė
335
+ - Ó
336
+ - ő
337
+ - đ
338
+ - ʻ
339
+ - Ü
340
+ - õ
341
+ - Ä
342
+ - ņ
343
+ - ṣ
344
+ - '|'
345
+ - ʾ
346
+ - π
347
+ - Ā
348
+ - σ
349
+ - '%'
350
+ - ả
351
+ - κ
352
+ - ʼ
353
+ - ň
354
+ - Ú
355
+ - ļ
356
+ - ư
357
+ - '1'
358
+ - '2'
359
+ - '}'
360
+ - ĩ
361
+ - Ҫ
362
+ - ا
363
+ - ầ
364
+ - ⁄
365
+ - ṇ
366
+ - þ
367
+ - ǎ
368
+ - ο
369
+ - ′
370
+ - s
371
+ - §
372
+ - ľ
373
+ - ǹ
374
+ - Ʉ
375
+ - ː
376
+ - ̱
377
+ - γ
378
+ - ν
379
+ - ن
380
+ - ạ
381
+ - ễ
382
+ - ộ
383
+ - ≥
384
+ - 星
385
+ - ề
386
+ - ṯ
387
+ - τ
388
+ - δ
389
+ - Δ
390
+ - Ț
391
+ - Ș
392
+ - Ū
393
+ - Ř
394
+ - ∆
395
+ - →
396
+ - ệ
397
+ - Г
398
+ - ơ
399
+ - ţ
400
+ - Þ
401
+ - Ñ
402
+ - ±
403
+ - ť
404
+ - ŏ
405
+ - €
406
+ - „
407
+ - ʿ
408
+ - Ć
409
+ - £
410
+ - α
411
+ - Ż
412
+ - Ş
413
+ - β
414
+ - ź
415
+ - Đ
416
+ - Ø
417
+ - Ś
418
+ - Ž
419
+ - Æ
420
+ - $
421
+ - Ï
422
+ - Ł
423
+ - ț
424
+ - Č
425
+ - Á
426
+ - ́
427
+ - Ù
428
+ - Μ
429
+ - ι
430
+ - ρ
431
+ - ό
432
+ - И
433
+ - з
434
+ - 京
435
+ - 北
436
+ - ď
437
+ - Ġ
438
+ - Ṭ
439
+ - −
440
+ - ☉
441
+ - '~'
442
+ - ®
443
+ - Ì
444
+ - Ò
445
+ - Õ
446
+ - ×
447
+ - ħ
448
+ - ĺ
449
+ - Ľ
450
+ - ũ
451
+ - ů
452
+ - Ų
453
+ - ǃ
454
+ - ǔ
455
+ - ̠
456
+ - ̲
457
+ - Κ
458
+ - Π
459
+ - ε
460
+ - ζ
461
+ - μ
462
+ - ς
463
+ - υ
464
+ - ψ
465
+ - І
466
+ - Ј
467
+ - А
468
+ - Е
469
+ - П
470
+ - а
471
+ - е
472
+ - м
473
+ - н
474
+ - Գ
475
+ - Զ
476
+ - ب
477
+ - د
478
+ - ر
479
+ - ل
480
+ - و
481
+ - ي
482
+ - ወ
483
+ - ደ
484
+ - ḍ
485
+ - ṅ
486
+ - ṭ
487
+ - ậ
488
+ - ắ
489
+ - ẵ
490
+ - ị
491
+ - ồ
492
+ - ờ
493
+ - ợ
494
+ - ủ
495
+ - ‐
496
+ - ―
497
+ - †
498
+ - ‹
499
+ - ›
500
+ - ₽
501
+ - ∈
502
+ - ∞
503
+ - ─
504
+ - い
505
+ - う
506
+ - た
507
+ - つ
508
+ - へ
509
+ - ま
510
+ - め
511
+ - や
512
+ - ゔ
513
+ - 扬
514
+ - 术
515
+ - 美
516
+ - 貴
517
+ - 青
518
+ - 馆
519
+ - Ꝑ
520
+ - ̐
521
+ - Ω
522
+ - ử
523
+ - ỳ
524
+ - ∨
525
+ - 乃
526
+ - 杜
527
+ - (
528
+ - Ē
529
+ - ǫ
530
+ - <sos/eos>
531
+ init: null
532
+ input_size: null
533
+ ctc_conf:
534
+ dropout_rate: 0.0
535
+ ctc_type: builtin
536
+ reduce: true
537
+ ignore_nan_grad: true
538
+ joint_net_conf: null
539
+ model_conf:
540
+ ctc_weight: 0.5
541
+ use_preprocessor: true
542
+ token_type: bpe
543
+ bpemodel: data/fr_token_list/bpe_unigram350/bpe.model
544
+ non_linguistic_symbols: null
545
+ cleaner: null
546
+ g2p: null
547
+ speech_volume_normalize: null
548
+ rir_scp: null
549
+ rir_apply_prob: 1.0
550
+ noise_scp: null
551
+ noise_apply_prob: 1.0
552
+ noise_db_range: '13_15'
553
+ frontend: default
554
+ frontend_conf:
555
+ fs: 16k
556
+ specaug: specaug
557
+ specaug_conf:
558
+ apply_time_warp: true
559
+ time_warp_window: 5
560
+ time_warp_mode: bicubic
561
+ apply_freq_mask: true
562
+ freq_mask_width_range:
563
+ - 0
564
+ - 27
565
+ num_freq_mask: 2
566
+ apply_time_mask: true
567
+ time_mask_width_ratio_range:
568
+ - 0.0
569
+ - 0.05
570
+ num_time_mask: 2
571
+ normalize: global_mvn
572
+ normalize_conf:
573
+ stats_file: exp/asr_stats_raw_fr_bpe350_sp/train/feats_stats.npz
574
+ preencoder: null
575
+ preencoder_conf: {}
576
+ encoder: vgg_rnn
577
+ encoder_conf:
578
+ rnn_type: lstm
579
+ bidirectional: true
580
+ use_projection: true
581
+ num_layers: 4
582
+ hidden_size: 1024
583
+ output_size: 1024
584
+ postencoder: null
585
+ postencoder_conf: {}
586
+ decoder: rnn
587
+ decoder_conf:
588
+ num_layers: 2
589
+ hidden_size: 1024
590
+ sampling_probability: 0
591
+ att_conf:
592
+ atype: location
593
+ adim: 1024
594
+ aconv_chans: 10
595
+ aconv_filts: 100
596
+ required:
597
+ - output_dir
598
+ - token_list
599
+ version: 0.10.6a1
600
+ distributed: false
601
+ ```
602
+
603
+ </details>
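The `ctc_weight: 0.5` entry under `model_conf` in the config above means the training objective weights the CTC loss and the attention-decoder loss equally. A minimal sketch of that interpolation, assuming the standard hybrid CTC/attention formulation (the function name `joint_loss` and the example loss values are made up for illustration):

```python
def joint_loss(loss_ctc: float, loss_att: float, ctc_weight: float = 0.5) -> float:
    """Interpolate CTC and attention losses.

    ctc_weight=1.0 gives pure CTC training, ctc_weight=0.0 pure attention.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att


# Illustrative values only; these are not losses from this training run.
print(joint_loss(40.0, 20.0))
```

With the default weight the result is simply the mean of the two losses, so neither branch dominates the gradient by construction.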
604
+
605
+
606
+
607
+ ### Citing ESPnet
608
+
609
+ ```bibtex
610
+ @inproceedings{watanabe2018espnet,
611
+ author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
612
+ title={{ESPnet}: End-to-End Speech Processing Toolkit},
613
+ year={2018},
614
+ booktitle={Proceedings of Interspeech},
615
+ pages={2207--2211},
616
+ doi={10.21437/Interspeech.2018-1456},
617
+ url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
618
+ }
619
+
620
+
621
+
622
+
623
+ ```
624
+
625
+ or arXiv:
626
+
627
+ ```bibtex
628
+ @misc{watanabe2018espnet,
629
+ title={ESPnet: End-to-End Speech Processing Toolkit},
630
+ author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
631
+ year={2018},
632
+ eprint={1804.00015},
633
+ archivePrefix={arXiv},
634
+ primaryClass={cs.CL}
635
+ }
636
+ ```
data/fr_token_list/bpe_unigram350/bpe.model ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a04833c85d60af6b34afba74ca201dcb689c126996dc669069a1f839f07ed4ce
3
+ size 241595
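The three added lines are a Git LFS pointer rather than the binary model itself: a spec version, an object ID with its hash algorithm, and the size in bytes. A small Python sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not a Git LFS or Hugging Face API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the bpe.model diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a04833c85d60af6b34afba74ca201dcb689c126996dc669069a1f839f07ed4ce
size 241595
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

The same helper applies to the `4epoch.pth` pointer later in the commit, whose `size` field is 450440626 bytes.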
exp/asr_stats_raw_fr_bpe350_sp/train/feats_stats.npz ADDED
Binary file (1.4 kB). View file
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/4epoch.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:44c5190ecfeb14c51aebedf244b3c14e49a2c7b45a8cdd926c6fb8e045a93e83
3
+ size 450440626
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/RESULTS.md ADDED
@@ -0,0 +1,29 @@
1
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
2
+ # RESULTS
3
+ ## Environments
4
+ - date: `Fri Apr 29 17:20:37 EDT 2022`
5
+ - python version: `3.9.5 (default, Jun 4 2021, 12:28:51) [GCC 7.5.0]`
6
+ - espnet version: `espnet 0.10.6a1`
7
+ - pytorch version: `pytorch 1.8.1+cu102`
8
+ - Git hash: `716eb8f92e19708acfd08ba3bd39d40890d3a84b`
9
+ - Commit date: `Thu Apr 28 19:50:59 2022 -0400`
10
+
11
+ ## asr_train_asr_rnn_raw_fr_bpe350_sp
12
+ ### WER
13
+
14
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
15
+ |---|---|---|---|---|---|---|---|---|
16
+ |decode_rnn_asr_model_valid.acc.best/test_fr|15621|151227|75.1|22.6|2.3|2.3|27.2|81.0|
17
+
18
+ ### CER
19
+
20
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
21
+ |---|---|---|---|---|---|---|---|---|
22
+ |decode_rnn_asr_model_valid.acc.best/test_fr|15621|952803|92.9|3.6|3.5|2.0|9.1|81.0|
23
+
24
+ ### TER
25
+
26
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
27
+ |---|---|---|---|---|---|---|---|---|
28
+ |decode_rnn_asr_model_valid.acc.best/test_fr|15621|730898|89.9|6.5|3.6|1.9|12.0|81.0|
29
+
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/config.yaml ADDED
@@ -0,0 +1,539 @@
1
+ config: conf/tuning/train_asr_rnn.yaml
2
+ print_config: false
3
+ log_level: INFO
4
+ dry_run: false
5
+ iterator_type: sequence
6
+ output_dir: exp/asr_train_asr_rnn_raw_fr_bpe350_sp
7
+ ngpu: 1
8
+ seed: 0
9
+ num_workers: 1
10
+ num_att_plot: 3
11
+ dist_backend: nccl
12
+ dist_init_method: env://
13
+ dist_world_size: null
14
+ dist_rank: null
15
+ local_rank: 0
16
+ dist_master_addr: null
17
+ dist_master_port: null
18
+ dist_launcher: null
19
+ multiprocessing_distributed: false
20
+ unused_parameters: false
21
+ sharded_ddp: false
22
+ cudnn_enabled: true
23
+ cudnn_benchmark: false
24
+ cudnn_deterministic: true
25
+ collect_stats: false
26
+ write_collected_feats: false
27
+ max_epoch: 15
28
+ patience: 3
29
+ val_scheduler_criterion:
30
+ - valid
31
+ - loss
32
+ early_stopping_criterion:
33
+ - valid
34
+ - loss
35
+ - min
36
+ best_model_criterion:
37
+ - - train
38
+ - loss
39
+ - min
40
+ - - valid
41
+ - loss
42
+ - min
43
+ - - train
44
+ - acc
45
+ - max
46
+ - - valid
47
+ - acc
48
+ - max
49
+ keep_nbest_models:
50
+ - 10
51
+ nbest_averaging_interval: 0
52
+ grad_clip: 5.0
53
+ grad_clip_type: 2.0
54
+ grad_noise: false
55
+ accum_grad: 1
56
+ no_forward_run: false
57
+ resume: true
58
+ train_dtype: float32
59
+ use_amp: false
60
+ log_interval: null
61
+ use_matplotlib: true
62
+ use_tensorboard: true
63
+ use_wandb: false
64
+ wandb_project: null
65
+ wandb_id: null
66
+ wandb_entity: null
67
+ wandb_name: null
68
+ wandb_model_log_interval: -1
69
+ detect_anomaly: false
70
+ pretrain_path: null
71
+ init_param: []
72
+ ignore_init_mismatch: false
73
+ freeze_param: []
74
+ num_iters_per_epoch: null
75
+ batch_size: 30
76
+ valid_batch_size: null
77
+ batch_bins: 1000000
78
+ valid_batch_bins: null
79
+ train_shape_file:
80
+ - exp/asr_stats_raw_fr_bpe350_sp/train/speech_shape
81
+ - exp/asr_stats_raw_fr_bpe350_sp/train/text_shape.bpe
82
+ valid_shape_file:
83
+ - exp/asr_stats_raw_fr_bpe350_sp/valid/speech_shape
84
+ - exp/asr_stats_raw_fr_bpe350_sp/valid/text_shape.bpe
85
+ batch_type: folded
86
+ valid_batch_type: null
87
+ fold_length:
88
+ - 80000
89
+ - 150
90
+ sort_in_batch: descending
91
+ sort_batch: descending
92
+ multiple_iterator: false
93
+ chunk_length: 500
94
+ chunk_shift_ratio: 0.5
95
+ num_cache_chunks: 1024
96
+ train_data_path_and_name_and_type:
97
+ - - dump/raw/train_fr_sp/wav.scp
98
+ - speech
99
+ - sound
100
+ - - dump/raw/train_fr_sp/text
101
+ - text
102
+ - text
103
+ valid_data_path_and_name_and_type:
104
+ - - dump/raw/dev_fr/wav.scp
105
+ - speech
106
+ - sound
107
+ - - dump/raw/dev_fr/text
108
+ - text
109
+ - text
110
+ allow_variable_data_keys: false
111
+ max_cache_size: 0.0
112
+ max_cache_fd: 32
113
+ valid_max_cache_size: null
114
+ optim: adadelta
115
+ optim_conf:
116
+ lr: 0.1
117
+ scheduler: null
118
+ scheduler_conf: {}
119
+ token_list:
120
+ - <blank>
121
+ - <unk>
122
+ - S
123
+ - ▁
124
+ - E
125
+ - I
126
+ - T
127
+ - A
128
+ - U
129
+ - O
130
+ - .
131
+ - L
132
+ - R
133
+ - é
134
+ - P
135
+ - C
136
+ - V
137
+ - 'ON'
138
+ - M
139
+ - ▁DE
140
+ - ','
141
+ - N
142
+ - ▁S
143
+ - D
144
+ - IN
145
+ - ''''
146
+ - OU
147
+ - ▁D
148
+ - G
149
+ - IS
150
+ - ▁P
151
+ - ER
152
+ - ▁C
153
+ - ▁L
154
+ - ▁LA
155
+ - B
156
+ - ▁"
157
+ - ▁A
158
+ - RE
159
+ - AN
160
+ - ."
161
+ - ▁M
162
+ - ▁F
163
+ - '-'
164
+ - F
165
+ - ▁T
166
+ - ES
167
+ - ENT
168
+ - ▁LE
169
+ - EN
170
+ - IT
171
+ - LE
172
+ - ▁N
173
+ - è
174
+ - H
175
+ - ’
176
+ - Y
177
+ - X
178
+ - Z
179
+ - K
180
+ - J
181
+ - ê
182
+ - '?'
183
+ - '!'
184
+ - É
185
+ - ç
186
+ - W
187
+ - à
188
+ - ô
189
+ - â
190
+ - Q
191
+ - î
192
+ - À
193
+ - '"'
194
+ - œ
195
+ - û
196
+ - ù
197
+ - ï
198
+ - ':'
199
+ - ;
200
+ - —
201
+ - È
202
+ - «
203
+ - »
204
+ - Ç
205
+ - Ê
206
+ - ë
207
+ - á
208
+ - ü
209
+ - í
210
+ - ö
211
+ - ó
212
+ - )
213
+ - Î
214
+ - Â
215
+ - ō
216
+ - ä
217
+ - –
218
+ - Ô
219
+ - ć
220
+ - š
221
+ - '&'
222
+ - ñ
223
+ - '='
224
+ - ł
225
+ - č
226
+ - Û
227
+ - ú
228
+ - ū
229
+ - ø
230
+ - ā
231
+ - ã
232
+ - ă
233
+ - /
234
+ - ń
235
+ - _
236
+ - ș
237
+ - å
238
+ - æ
239
+ - °
240
+ - ß
241
+ - “
242
+ - ”
243
+ - ž
244
+ - ı
245
+ - Œ
246
+ - Ö
247
+ - ř
248
+ - Š
249
+ - ý
250
+ - Ō
251
+ - ‘
252
+ - ş
253
+ - ·
254
+ - o
255
+ - ę
256
+ - ÿ
257
+ - Å
258
+ - ą
259
+ - ð
260
+ - ī
261
+ - ò
262
+ - ż
263
+ - ě
264
+ - ś
265
+ - '`'
266
+ - Ë
267
+ - ì
268
+ - ē
269
+ - ğ
270
+ - İ
271
+ - '*'
272
+ - Í
273
+ - ė
274
+ - Ó
275
+ - ő
276
+ - đ
277
+ - ʻ
278
+ - Ü
279
+ - õ
280
+ - Ä
281
+ - ņ
282
+ - ṣ
283
+ - '|'
284
+ - ʾ
285
+ - π
286
+ - Ā
287
+ - σ
288
+ - '%'
289
+ - ả
290
+ - κ
291
+ - ʼ
292
+ - ň
293
+ - Ú
294
+ - ļ
295
+ - ư
296
+ - '1'
297
+ - '2'
298
+ - '}'
299
+ - ĩ
300
+ - Ҫ
301
+ - ا
302
+ - ầ
303
+ - ⁄
304
+ - ṇ
305
+ - þ
306
+ - ǎ
307
+ - ο
308
+ - ′
309
+ - s
310
+ - §
311
+ - ľ
312
+ - ǹ
313
+ - Ʉ
314
+ - ː
315
+ - ̱
316
+ - γ
317
+ - ν
318
+ - ن
319
+ - ạ
320
+ - ễ
321
+ - ộ
322
+ - ≥
323
+ - 星
324
+ - ề
325
+ - ṯ
326
+ - τ
327
+ - δ
328
+ - Δ
329
+ - Ț
330
+ - Ș
331
+ - Ū
332
+ - Ř
333
+ - ∆
334
+ - →
335
+ - ệ
336
+ - Г
337
+ - ơ
338
+ - ţ
339
+ - Þ
340
+ - Ñ
341
+ - ±
342
+ - ť
343
+ - ŏ
344
+ - €
345
+ - „
346
+ - ʿ
347
+ - Ć
348
+ - £
349
+ - α
350
+ - Ż
351
+ - Ş
352
+ - β
353
+ - ź
354
+ - Đ
355
+ - Ø
356
+ - Ś
357
+ - Ž
358
+ - Æ
359
+ - $
360
+ - Ï
361
+ - Ł
362
+ - ț
363
+ - Č
364
+ - Á
365
+ - ́
366
+ - Ù
367
+ - Μ
368
+ - ι
369
+ - ρ
370
+ - ό
371
+ - И
372
+ - з
373
+ - 京
374
+ - 北
375
+ - ď
376
+ - Ġ
377
+ - Ṭ
378
+ - −
379
+ - ☉
380
+ - '~'
381
+ - ®
382
+ - Ì
383
+ - Ò
384
+ - Õ
385
+ - ×
386
+ - ħ
387
+ - ĺ
388
+ - Ľ
389
+ - ũ
390
+ - ů
391
+ - Ų
392
+ - ǃ
393
+ - ǔ
394
+ - ̠
395
+ - ̲
396
+ - Κ
397
+ - Π
398
+ - ε
399
+ - ζ
400
+ - μ
401
+ - ς
402
+ - υ
403
+ - ψ
404
+ - І
405
+ - Ј
406
+ - А
407
+ - Е
408
+ - П
409
+ - а
410
+ - е
411
+ - м
412
+ - н
413
+ - Գ
414
+ - Զ
415
+ - ب
416
+ - د
417
+ - ر
418
+ - ل
419
+ - و
420
+ - ي
421
+ - ወ
422
+ - ደ
423
+ - ḍ
424
+ - ṅ
425
+ - ṭ
426
+ - ậ
427
+ - ắ
428
+ - ẵ
429
+ - ị
430
+ - ồ
431
+ - ờ
432
+ - ợ
433
+ - ủ
434
+ - ‐
435
+ - ―
436
+ - †
437
+ - ‹
438
+ - ›
439
+ - ₽
440
+ - ∈
441
+ - ∞
442
+ - ─
443
+ - い
444
+ - う
445
+ - た
446
+ - つ
447
+ - へ
448
+ - ま
449
+ - め
450
+ - や
451
+ - ゔ
452
+ - 扬
453
+ - 术
454
+ - 美
455
+ - 貴
456
+ - 青
457
+ - 馆
458
+ - Ꝑ
459
+ - ̐
460
+ - Ω
461
+ - ử
462
+ - ỳ
463
+ - ∨
464
+ - 乃
465
+ - 杜
466
+ - (
467
+ - Ē
468
+ - ǫ
469
+ - <sos/eos>
470
+ init: null
471
+ input_size: null
472
+ ctc_conf:
473
+ dropout_rate: 0.0
474
+ ctc_type: builtin
475
+ reduce: true
476
+ ignore_nan_grad: true
477
+ joint_net_conf: null
478
+ model_conf:
479
+ ctc_weight: 0.5
480
+ use_preprocessor: true
481
+ token_type: bpe
482
+ bpemodel: data/fr_token_list/bpe_unigram350/bpe.model
483
+ non_linguistic_symbols: null
484
+ cleaner: null
485
+ g2p: null
486
+ speech_volume_normalize: null
487
+ rir_scp: null
488
+ rir_apply_prob: 1.0
489
+ noise_scp: null
490
+ noise_apply_prob: 1.0
491
+ noise_db_range: '13_15'
492
+ frontend: default
493
+ frontend_conf:
494
+ fs: 16k
495
+ specaug: specaug
496
+ specaug_conf:
497
+ apply_time_warp: true
498
+ time_warp_window: 5
499
+ time_warp_mode: bicubic
500
+ apply_freq_mask: true
501
+ freq_mask_width_range:
502
+ - 0
503
+ - 27
504
+ num_freq_mask: 2
505
+ apply_time_mask: true
506
+ time_mask_width_ratio_range:
507
+ - 0.0
508
+ - 0.05
509
+ num_time_mask: 2
510
+ normalize: global_mvn
511
+ normalize_conf:
512
+ stats_file: exp/asr_stats_raw_fr_bpe350_sp/train/feats_stats.npz
513
+ preencoder: null
514
+ preencoder_conf: {}
515
+ encoder: vgg_rnn
516
+ encoder_conf:
517
+ rnn_type: lstm
518
+ bidirectional: true
519
+ use_projection: true
520
+ num_layers: 4
521
+ hidden_size: 1024
522
+ output_size: 1024
523
+ postencoder: null
524
+ postencoder_conf: {}
525
+ decoder: rnn
526
+ decoder_conf:
527
+ num_layers: 2
528
+ hidden_size: 1024
529
+ sampling_probability: 0
530
+ att_conf:
531
+ atype: location
532
+ adim: 1024
533
+ aconv_chans: 10
534
+ aconv_filts: 100
535
+ required:
536
+ - output_dir
537
+ - token_list
538
+ version: 0.10.6a1
539
+ distributed: false
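With `token_type: bpe` and the unigram model referenced by `bpemodel`, entries in `token_list` that begin with `▁` start a new word, and the remaining entries continue the current one. A minimal sketch of turning a piece sequence back into text, assuming the usual SentencePiece word-boundary convention. The piece sequence below is made up from entries in `token_list`; the real segmentation depends on `bpe.model`.

```python
WORD_BOUNDARY = "\u2581"  # the "▁" marker used in token_list


def detokenize(pieces):
    """Join subword pieces and turn boundary markers into spaces."""
    text = "".join(pieces).replace(WORD_BOUNDARY, " ")
    return text.strip()


# Hypothetical segmentation built only from tokens present in token_list.
pieces = [WORD_BOUNDARY + "LA", WORD_BOUNDARY + "M", "A", "IS", "ON"]
print(detokenize(pieces))
```

The special entries `<blank>`, `<unk>`, and `<sos/eos>` are control symbols and would be filtered out before a step like this.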
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/acc.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/backward_time.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/cer.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/cer_ctc.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/forward_time.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/gpu_max_cached_mem_GB.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/iter_time.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/loss.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/loss_att.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/loss_ctc.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/optim0_lr0.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/optim_step_time.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/train_time.png ADDED
exp/asr_train_asr_rnn_raw_fr_bpe350_sp/images/wer.png ADDED
meta.yaml ADDED
@@ -0,0 +1,8 @@
1
+ espnet: 0.10.6a1
2
+ files:
3
+ asr_model_file: exp/asr_train_asr_rnn_raw_fr_bpe350_sp/4epoch.pth
4
+ python: "3.9.5 (default, Jun 4 2021, 12:28:51) \n[GCC 7.5.0]"
5
+ timestamp: 1651267273.998803
6
+ torch: 1.8.1+cu102
7
+ yaml_files:
8
+ asr_train_config: exp/asr_train_asr_rnn_raw_fr_bpe350_sp/config.yaml
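The `timestamp` field in `meta.yaml` is a Unix epoch value in seconds. A short Python sketch for converting it to a readable UTC time follows (the variable names are illustrative):

```python
from datetime import datetime, timezone

# Value copied from meta.yaml.
TIMESTAMP = 1651267273.998803

# Interpret the epoch seconds as UTC to avoid depending on the local timezone.
packed_at = datetime.fromtimestamp(TIMESTAMP, tz=timezone.utc)
print(packed_at.isoformat())
```

This resolves to 2022-04-29 21:21:13 UTC, which is 17:21:13 EDT, under a minute after the `date` recorded in the results section.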