Commit 5993e48 by ftshijt
1 Parent(s): 0053460

Update model

README.md ADDED
@@ -0,0 +1,773 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - automatic-speech-recognition
+ language: noinfo
+ datasets:
+ - yolo_mixtec
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 ASR model
+
+ ### `espnet/ftshijt_espnet2_asr_yolo_mixtec_transformer`
+
+ This model was trained by ftshijt using the yolo_mixtec recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ ```bash
+ cd espnet
+
+ pip install -e .
+ cd egs2/yolo_mixtec/asr1
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/ftshijt_espnet2_asr_yolo_mixtec_transformer
+ ```
+
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Wed Nov 10 02:59:39 EST 2021`
+ - python version: `3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]`
+ - espnet version: `espnet 0.10.4a1`
+ - pytorch version: `pytorch 1.9.0`
+ - Git hash: ``
+ - Commit date: ``
+
+ ## asr_train_asr_transformer_specaug_raw_bpe500
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_lm_lm_train_bpe500_valid.loss.ave_asr_model_valid.acc.best/test|4985|81348|84.1|11.8|4.1|2.5|18.3|82.5|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_lm_lm_train_bpe500_valid.loss.ave_asr_model_valid.acc.best/test|4985|626187|93.4|2.2|4.4|2.4|9.0|82.5|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_lm_lm_train_bpe500_valid.loss.ave_asr_model_valid.acc.best/test|4985|325684|90.7|5.2|4.1|2.2|11.5|82.5|
+
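The Corr, Sub, Del, Ins, and Err columns in the tables above look like the usual sclite-style scoring columns, where Err = Sub + Del + Ins and Corr = 100 - Sub - Del, all as percentages of reference units. The sketch below checks the reported rows against that assumed relation. The `rows` dictionary, the `check_row` helper, and the 0.15 tolerance for rounding are illustrative choices, not part of the recipe.

```python
# Reported percentages copied from the WER/CER/TER tables above.
rows = {
    "WER": {"corr": 84.1, "sub": 11.8, "del": 4.1, "ins": 2.5, "err": 18.3},
    "CER": {"corr": 93.4, "sub": 2.2, "del": 4.4, "ins": 2.4, "err": 9.0},
    "TER": {"corr": 90.7, "sub": 5.2, "del": 4.1, "ins": 2.2, "err": 11.5},
}


def check_row(row, tol=0.15):
    """Return (err_gap, corr_gap, ok) under the assumed sclite-style relations."""
    err_gap = abs(row["sub"] + row["del"] + row["ins"] - row["err"])
    corr_gap = abs(100.0 - row["sub"] - row["del"] - row["corr"])
    return err_gap, corr_gap, err_gap <= tol and corr_gap <= tol


for name, row in rows.items():
    err_gap, corr_gap, ok = check_row(row)
    print(f"{name}: err gap {err_gap:.2f}, corr gap {corr_gap:.2f}, consistent={ok}")
```

The WER row differs from the sum of its parts by about 0.1, which is what independent rounding of each column to one decimal would produce.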
+ ## ASR config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/tuning/train_asr_transformer_specaug.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_train_asr_transformer_specaug_raw_bpe500
+ ngpu: 1
+ seed: 0
+ num_workers: 1
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: null
+ dist_rank: null
+ local_rank: 0
+ dist_master_addr: null
+ dist_master_port: null
+ dist_launcher: null
+ multiprocessing_distributed: false
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 100
+ patience: 15
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ grad_clip: 5
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 2
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 32
+ valid_batch_size: null
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_raw_bpe500/train/speech_shape
+ - exp/asr_stats_raw_bpe500/train/text_shape.bpe
+ valid_shape_file:
+ - exp/asr_stats_raw_bpe500/valid/speech_shape
+ - exp/asr_stats_raw_bpe500/valid/text_shape.bpe
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/train/wav.scp
+     - speech
+     - kaldi_ark
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/train/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/dev/wav.scp
+     - speech
+     - kaldi_ark
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/dev/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 1.0
+ scheduler: noamlr
+ scheduler_conf:
+     warmup_steps: 25000
+ token_list:
+ - <blank>
+ - <unk>
+ - '4'
+ - '3'
+ - '1'
+ - '2'
+ - A
+ - ▁NDI
+ - '''4'
+ - '''1'
+ - U
+ - ▁BA
+ - O
+ - ▁I
+ - E
+ - 4=
+ - ▁KU
+ - ▁TAN
+ - ▁KA
+ - '''3'
+ - NI
+ - ▁YA
+ - RA
+ - 3=
+ - 2=
+ - IN
+ - NA
+ - ▁TA
+ - AN
+ - ▁KAN
+ - ▁NI
+ - ▁NDA
+ - ▁NA
+ - ▁JI
+ - KAN
+ - CHI
+ - (3)=
+ - I
+ - UN
+ - 1-
+ - ▁SA
+ - (4)=
+ - ▁JA
+ - XI
+ - ▁KO
+ - ▁TI
+ - TA
+ - KU
+ - BI
+ - ▁YU
+ - ▁KWA
+ - KA
+ - XA
+ - 1=
+ - ▁YO
+ - RI
+ - NDO
+ - ▁XA
+ - TU
+ - ▁TU
+ - ▁ÑA
+ - ▁KI
+ - ▁XI
+ - YO
+ - NDU
+ - NDA
+ - ▁CHI
+ - (2)=
+ - ▁BI
+ - ▁NU
+ - KI
+ - (1)=
+ - YU
+ - 3-
+ - ▁MI
+ - 'ON'
+ - ▁A
+ - BA
+ - 4-
+ - KO
+ - ▁NDU
+ - ▁ÑU
+ - ▁NDO
+ - NU
+ - ÑU
+ - '143'
+ - ▁SI
+ - ▁SO
+ - 13-
+ - NDI
+ - ▁AN
+ - ▁SU
+ - TIN
+ - SA
+ - ▁BE
+ - TO
+ - RUN
+ - KWA
+ - KWI
+ - ▁NDE
+ - ▁KWI
+ - XIN
+ - ▁U
+ - SI
+ - SO
+ - ▁TUN
+ - EN
+ - ▁KWE
+ - YA
+ - (4)=2
+ - NDE
+ - TI
+ - TUN
+ - ▁TIN
+ - MA
+ - ▁SE
+ - ▁XU
+ - SU
+ - ▁LU
+ - ▁KE
+ - ▁
+ - MI
+ - ▁RAN
+ - (3)=2
+ - 14-
+ - ▁MA
+ - KUN
+ - LU
+ - N
+ - ▁O
+ - KE
+ - NGA
+ - ▁IS
+ - ▁JU
+ - '='
+ - ▁LA
+ - ÑA
+ - JA
+ - CHUN
+ - R
+ - TAN
+ - PU
+ - ▁TIEM
+ - LI
+ - LA
+ - CHIU
+ - ▁PA
+ - M
+ - ▁REY
+ - ▁BAN
+ - JI
+ - L
+ - SUN
+ - ▁SEÑOR
+ - ▁JO
+ - ▁TIO
+ - KWE
+ - CHU
+ - S
+ - ▁YE
+ - KIN
+ - XU
+ - BE
+ - ▁CUENTA
+ - ▁SAN
+ - RRU
+ - ▁¿
+ - CHA
+ - ▁TO
+ - RRA
+ - LO
+ - TE
+ - ▁AMIGU
+ - PA
+ - XAN
+ - ▁C
+ - C
+ - ▁CHA
+ - ▁TE
+ - ▁HIJO
+ - ▁MB
+ - ▁PI
+ - G
+ - ▁ÁNIMA
+ - ▁CHE
+ - ▁P
+ - B
+ - NDIO
+ - SE
+ - ▁SANTU
+ - MU
+ - ▁PADRE
+ - D
+ - JU
+ - Z
+ - ▁TORO
+ - ▁PO
+ - LE
+ - ▁LI
+ - RO
+ - ▁LO
+ - ▁MESA
+ - CA
+ - ▁CHIU
+ - DO
+ - ▁BU
+ - ▁BUTA
+ - JO
+ - T
+ - TRU
+ - RU
+ - ▁MBO
+ - ▁JUAN
+ - ▁MM
+ - ▁CA
+ - ▁M
+ - ▁MAS
+ - ▁DE
+ - V
+ - ▁MAÑA
+ - ▁UTA
+ - DA
+ - ▁MULA
+ - ▁YOLOXÓCHITL
+ - ▁CONSEJU
+ - ▁Y
+ - ▁LE
+ - ÓN
+ - ▁MISA
+ - TIU
+ - ▁CANDELA
+ - ▁PATRÓN
+ - ▁PADRINU
+ - ▁MARCU
+ - ▁V
+ - ▁G
+ - Í
+ - ▁XE
+ - ▁MU
+ - ▁XO
+ - NGUI
+ - ▁CO
+ - ▁HOMBRE
+ - ▁PESU
+ - ▁PE
+ - ▁D
+ - ▁MACHITI
+ - CO
+ - REN
+ - ▁RANCHU
+ - ▁MIS
+ - ▁MACHU
+ - J
+ - ▁PAN
+ - CHO
+ - H
+ - ▁CHU
+ - Y
+ - ▁TON
+ - GA
+ - X
+ - ▁VI
+ - ▁FE
+ - ▁TARRAYA
+ - ▁SANTÍSIMA
+ - ▁N
+ - ▁MAYÓ
+ - ▁CARRU
+ - ▁F
+ - ▁PAPÁ
+ - ▁PALOMA
+ - ▁MARÍA
+ - ▁PEDRU
+ - ▁CAFÉ
+ - ▁COMISARIO
+ - ▁PANELA
+ - ▁PELÓN
+ - É
+ - ▁POZO
+ - ▁CABRÓN
+ - ▁GUACHU
+ - ▁S
+ - RES
+ - ▁COSTUMBRE
+ - ▁SEÑA
+ - QUI
+ - ▁ORO
+ - CH
+ - ▁MAR
+ - SIN
+ - SAN
+ - ▁COSTA
+ - ▁MAMÁ
+ - ▁CINCUENTA
+ - ▁CHO
+ - ▁PEDR
+ - ▁JUNTA
+ - MÚ
+ - ▁TIENDA
+ - ▁JOSÉ
+ - NC
+ - ▁ES
+ - ▁SUERTE
+ - ▁FAMILIA
+ - ▁ZAPATU
+ - NTE
+ - ▁PASTO
+ - ▁CON
+ - Ñ
+ - ▁BOTE
+ - CIÓN
+ - ▁RE
+ - ▁BOLSA
+ - ▁MANGO
+ - ▁JWE
+ - ▁GASTU
+ - ▁T
+ - ▁B
+ - ▁KW
+ - ÍN
+ - ▁HIJA
+ - ▁CUARENT
+ - ▁VAQUERU
+ - ▁NECHITO
+ - ▁NOVIA
+ - ▁NOVIO
+ - JWE
+ - ▁PUENTE
+ - ▁SANDÍA
+ - ▁MALA
+ - Ó
+ - ▁ABONO
+ - ▁JESÚS
+ - ▁CUARTO
+ - ▁EFE
+ - ▁REINA
+ - ▁COMANDANTE
+ - ▁ESCUELA
+ - ▁MANZANA
+ - ▁MÁQUINA
+ - LLA
+ - ▁COR
+ - ▁JERÓNIMO
+ - ▁PISTOLA
+ - NGI
+ - CIO
+ - ▁FRANCISCU
+ - ▁TEODORO
+ - CER
+ - ▁SALUBI
+ - ▁MEZA
+ - ▁MÚSIC
+ - ▁RU
+ - ▁CONSTANTINO
+ - ▁GARCÍA
+ - ▁FRENU
+ - ▁ROSA
+ - ▁CERVEZA
+ - ▁CIGARRU
+ - ▁COMISIÓN
+ - ▁CUNIJO
+ - ▁FRANCISCO
+ - ▁HÍJOLE
+ - ▁NUEVE
+ - ▁MUL
+ - ▁PANTALÓN
+ - ▁CAMISA
+ - ▁CHINGADA
+ - ▁SEMANA
+ - ▁COM
+ - GAR
+ - ▁MARTÍN
+ - ▁SÁBADO
+ - ▁TRABAJO
+ - ▁CINCO
+ - ▁DIE
+ - ▁EST
+ - NDWA
+ - ▁LECHIN
+ - ▁COCO
+ - ILLU
+ - ▁CORRE
+ - ▁MADR
+ - ▁REC
+ - ▁BAUTISTA
+ - ▁VENTANA
+ - ▁CUÑAD
+ - ▁ANTONIU
+ - ▁COPALA
+ - LÍN
+ - ▁SECUND
+ - ▁COHETE
+ - ▁HISTORIA
+ - ▁POLICÍA
+ - ENCIA
+ - ▁CAD
+ - ▁LUIS
+ - ▁DOCTOR
+ - ▁GONZÁLEZ
+ - ▁JUEVE
+ - ▁LIBRU
+ - ▁QUESU
+ - ▁VIAJE
+ - ▁CART
+ - ▁LOCO
+ - ▁BOL
+ - ▁COMPADRE
+ - ▁JWI
+ - ▁METRU
+ - ▁BUENO
+ - ▁TRE
+ - ▁CASTILLO
+ - ▁COMITÉ
+ - ▁ETERNO
+ - ▁LÍQUIDO
+ - ▁MOLE
+ - ▁CAPULCU
+ - ▁DOMING
+ - ▁ROMA
+ - ▁CARAJU
+ - ▁RIATA
+ - ▁TRATU
+ - ▁SEIS
+ - ▁ADÁN
+ - ▁JUANCITO
+ - ▁HOR
+ - ''''
+ - ▁ARRÓ
+ - ▁COCINA
+ - ▁PALACIO
+ - ▁RÓMULO
+ - K
+ - ▁ALFONSO
+ - ▁BARTOLO
+ - ▁FELIPE
+ - ▁HERRER
+ - ▁PAULINO
+ - ▁YEGUA
+ - ▁LISTA
+ - Ú
+ - ▁ABRIL
+ - ▁CUATRO
+ - ▁DICIEMBRE
+ - ▁MARGARITO
+ - ▁MOJONERA
+ - ▁SOLEDAD
+ - ▁VESTIDO
+ - ▁PELOTA
+ - RRET
+ - ▁CAPITÁN
+ - ▁COMUNIÓN
+ - ▁CUCHARA
+ - ▁FERNANDO
+ - ▁GUADALUPE
+ - ▁MIGUEL
+ - ▁PELÚN
+ - ▁SECRETARIU
+ - ▁LENCHU
+ - ▁EVA
+ - ▁SEGUND
+ - ▁CANTOR
+ - ▁CHILPANCINGO
+ - ▁GABRIEL
+ - ▁QUINIENTO
+ - ▁RAÚL
+ - ▁SEVERIAN
+ - ▁TUMBADA
+ - ▁MALINCHI
+ - ▁PRIMU
+ - ▁MORAL
+ - ▁AGOSTO
+ - ▁CENTÍMETRO
+ - ▁FIRMA
+ - ▁HUEHUETÁN
+ - ▁MANGUERA
+ - ▁MEDI
+ - ▁MUERT
+ - ▁SALAZAR
+ - ▁VIERNI
+ - LILL
+ - ▁LL
+ - '-'
+ - ▁CAMPESINO
+ - ▁CIVIL
+ - ▁COMISARIADO
+ - )
+ - (
+ - Ã
+ - ‘
+ - ¿
+ - Ü
+ - ¡
+ - Q
+ - F
+ - Á
+ - P
+ - Ÿ
+ - W
+ - Ý
+ - <sos/eos>
+ init: xavier_uniform
+ input_size: null
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: true
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+ use_preprocessor: true
+ token_type: bpe
+ bpemodel: data/token_list/bpe_unigram500/bpe.model
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ frontend: default
+ frontend_conf:
+     fs: 16k
+ specaug: specaug
+ specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 30
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_range:
+     - 0
+     - 40
+     num_time_mask: 2
+ normalize: global_mvn
+ normalize_conf:
+     stats_file: exp/asr_stats_raw_bpe500/train/feats_stats.npz
+ preencoder: null
+ preencoder_conf: {}
+ encoder: transformer
+ encoder_conf:
+     input_layer: conv2d
+     num_blocks: 12
+     linear_units: 2048
+     dropout_rate: 0.1
+     output_size: 512
+     attention_heads: 4
+     attention_dropout_rate: 0.0
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     input_layer: embed
+     num_blocks: 6
+     linear_units: 2048
+     dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.4a1
+ distributed: false
+ ```
+
+ </details>
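The config above pairs `optim: adam` with `lr: 1.0` and `scheduler: noamlr` with `warmup_steps: 25000`, so the nominal learning rate of 1.0 is only a scale factor for the schedule. Below is a minimal sketch of the standard Noam schedule (linear warmup followed by inverse-square-root decay). The `model_size=320` default is an assumption, since the config does not set it, and `noam_lr` is a hypothetical helper name rather than an ESPnet function.

```python
def noam_lr(step, base_lr=1.0, model_size=320, warmup_steps=25000):
    """Noam schedule: linear warmup, then inverse-square-root decay.

    model_size=320 is an assumed default; it is not given in the config.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    scale = base_lr * model_size ** -0.5
    return scale * min(step ** -0.5, step * warmup_steps ** -1.5)


peak = noam_lr(25000)
for step in (1, 12500, 25000, 100000):
    lr = noam_lr(step)
    print(f"step {step:>6}: lr = {lr:.3e} ({lr / peak:.2f}x peak)")
```

Under these assumptions the rate peaks at roughly 3.5e-4 when the step count reaches `warmup_steps` and halves by four times that many steps.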
+
+
+
+ ### Citing ESPnet
+
+ ```bibtex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+
+
+
+
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
data/token_list/bpe_unigram500/bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:92b5c5a4f2da619fc71f8938c06e011f79a8fd1824ea835a78c65c88941c7004
+ size 244672
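The three added lines above are a Git LFS pointer, not the SentencePiece model itself: they record the spec version, a SHA-256 object id, and the size in bytes of the real file. A small sketch of reading such a pointer follows. `parse_lfs_pointer` is a hypothetical helper written for illustration, and it assumes the simple `key value` line layout seen here.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:92b5c5a4f2da619fc71f8938c06e011f79a8fd1824ea835a78c65c88941c7004
size 244672
"""
info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"], "bytes")
```

After downloading the real `bpe.model`, its byte length and SHA-256 digest could be compared against `info["size"]` and `info["digest"]` to confirm the download is intact.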
exp/asr_stats_raw_bpe500/train/feats_stats.npz ADDED
Binary file (1.4 kB).
exp/asr_train_asr_transformer_specaug_raw_bpe500/56epoch.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d825491b8c9d8ddf38476dd110957d6c6e5d7fad2bad317d0e38da2786caa66
+ size 284901797
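The pointer size gives a rough idea of how large the model is. As a back-of-the-envelope sketch, assuming the checkpoint stores mostly float32 weights (the config reports `train_dtype: float32`) and ignoring serialization overhead, the parameter count can be estimated as bytes divided by four. Both assumptions are mine and are not confirmed by the commit.

```python
checkpoint_bytes = 284901797  # "size" field of the LFS pointer above
bytes_per_param = 4           # assumption: float32 weights

# Rough estimate only: any non-weight data in the file inflates this number.
approx_params_millions = checkpoint_bytes / bytes_per_param / 1e6
print(f"about {approx_params_millions:.1f}M parameters")
```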
exp/asr_train_asr_transformer_specaug_raw_bpe500/RESULTS.md ADDED
@@ -0,0 +1,29 @@
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Wed Nov 10 02:59:39 EST 2021`
+ - python version: `3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]`
+ - espnet version: `espnet 0.10.4a1`
+ - pytorch version: `pytorch 1.9.0`
+ - Git hash: ``
+ - Commit date: ``
+
+ ## asr_train_asr_transformer_specaug_raw_bpe500
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_lm_lm_train_bpe500_valid.loss.ave_asr_model_valid.acc.best/test|4985|81348|84.1|11.8|4.1|2.5|18.3|82.5|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_lm_lm_train_bpe500_valid.loss.ave_asr_model_valid.acc.best/test|4985|626187|93.4|2.2|4.4|2.4|9.0|82.5|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_lm_lm_train_bpe500_valid.loss.ave_asr_model_valid.acc.best/test|4985|325684|90.7|5.2|4.1|2.2|11.5|82.5|
+
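The "Wrd" column counts a different unit in each table: words for WER, characters for CER, and BPE tokens for TER, all over the same 4985 test utterances. A short sketch of what those counts imply per word and per utterance follows. The variable names are illustrative, and whether the character count includes spaces is not stated in the source.

```python
n_utts = 4985
n_words, n_chars, n_tokens = 81348, 626187, 325684  # "Wrd" column of the WER, CER, TER rows

words_per_utt = n_words / n_utts
chars_per_word = n_chars / n_words    # may include spaces, depending on the scorer
tokens_per_word = n_tokens / n_words

print(f"{words_per_utt:.1f} words per utterance")
print(f"{chars_per_word:.1f} characters per word")
print(f"{tokens_per_word:.1f} BPE tokens per word")
```

Roughly four tokens per word is consistent with the token list, where tone digits such as `'4'` and `'3'` are separate units from the syllable pieces they follow.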
exp/asr_train_asr_transformer_specaug_raw_bpe500/config.yaml ADDED
@@ -0,0 +1,676 @@
+ config: conf/tuning/train_asr_transformer_specaug.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_train_asr_transformer_specaug_raw_bpe500
+ ngpu: 1
+ seed: 0
+ num_workers: 1
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: null
+ dist_rank: null
+ local_rank: 0
+ dist_master_addr: null
+ dist_master_port: null
+ dist_launcher: null
+ multiprocessing_distributed: false
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 100
+ patience: 15
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ grad_clip: 5
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 2
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 32
+ valid_batch_size: null
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_raw_bpe500/train/speech_shape
+ - exp/asr_stats_raw_bpe500/train/text_shape.bpe
+ valid_shape_file:
+ - exp/asr_stats_raw_bpe500/valid/speech_shape
+ - exp/asr_stats_raw_bpe500/valid/text_shape.bpe
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/train/wav.scp
+     - speech
+     - kaldi_ark
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/train/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/dev/wav.scp
+     - speech
+     - kaldi_ark
+ -   - /tmp/st-jiatong-54826.tbQP9L0N/raw/dev/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 1.0
+ scheduler: noamlr
+ scheduler_conf:
+     warmup_steps: 25000
+ token_list:
+ - <blank>
+ - <unk>
+ - '4'
+ - '3'
+ - '1'
+ - '2'
+ - A
+ - ▁NDI
+ - '''4'
+ - '''1'
+ - U
+ - ▁BA
+ - O
+ - ▁I
+ - E
+ - 4=
+ - ▁KU
+ - ▁TAN
+ - ▁KA
+ - '''3'
+ - NI
+ - ▁YA
+ - RA
+ - 3=
+ - 2=
+ - IN
+ - NA
+ - ▁TA
+ - AN
+ - ▁KAN
+ - ▁NI
+ - ▁NDA
+ - ▁NA
+ - ▁JI
+ - KAN
+ - CHI
+ - (3)=
+ - I
+ - UN
+ - 1-
+ - ▁SA
+ - (4)=
+ - ▁JA
+ - XI
+ - ▁KO
+ - ▁TI
+ - TA
+ - KU
+ - BI
+ - ▁YU
+ - ▁KWA
+ - KA
+ - XA
+ - 1=
+ - ▁YO
+ - RI
+ - NDO
+ - ▁XA
+ - TU
+ - ▁TU
+ - ▁ÑA
+ - ▁KI
+ - ▁XI
+ - YO
+ - NDU
+ - NDA
+ - ▁CHI
+ - (2)=
+ - ▁BI
+ - ▁NU
+ - KI
+ - (1)=
+ - YU
+ - 3-
+ - ▁MI
+ - 'ON'
+ - ▁A
+ - BA
+ - 4-
+ - KO
+ - ▁NDU
+ - ▁ÑU
+ - ▁NDO
+ - NU
+ - ÑU
+ - '143'
+ - ▁SI
+ - ▁SO
+ - 13-
+ - NDI
+ - ▁AN
+ - ▁SU
+ - TIN
+ - SA
+ - ▁BE
+ - TO
+ - RUN
+ - KWA
+ - KWI
+ - ▁NDE
+ - ▁KWI
+ - XIN
+ - ▁U
+ - SI
+ - SO
+ - ▁TUN
+ - EN
+ - ▁KWE
+ - YA
+ - (4)=2
+ - NDE
+ - TI
+ - TUN
+ - ▁TIN
+ - MA
+ - ▁SE
+ - ▁XU
+ - SU
+ - ▁LU
+ - ▁KE
+ - ▁
+ - MI
+ - ▁RAN
+ - (3)=2
+ - 14-
+ - ▁MA
+ - KUN
+ - LU
+ - N
+ - ▁O
+ - KE
+ - NGA
+ - ▁IS
+ - ▁JU
+ - '='
+ - ▁LA
+ - ÑA
+ - JA
+ - CHUN
+ - R
+ - TAN
+ - PU
+ - ▁TIEM
+ - LI
+ - LA
+ - CHIU
+ - ▁PA
+ - M
+ - ▁REY
+ - ▁BAN
+ - JI
+ - L
+ - SUN
+ - ▁SEÑOR
+ - ▁JO
+ - ▁TIO
+ - KWE
+ - CHU
+ - S
+ - ▁YE
+ - KIN
+ - XU
+ - BE
+ - ▁CUENTA
+ - ▁SAN
+ - RRU
+ - ▁¿
+ - CHA
+ - ▁TO
+ - RRA
+ - LO
+ - TE
+ - ▁AMIGU
+ - PA
+ - XAN
+ - ▁C
+ - C
+ - ▁CHA
+ - ▁TE
+ - ▁HIJO
+ - ▁MB
+ - ▁PI
+ - G
+ - ▁ÁNIMA
+ - ▁CHE
+ - ▁P
+ - B
+ - NDIO
+ - SE
+ - ▁SANTU
+ - MU
+ - ▁PADRE
+ - D
+ - JU
+ - Z
+ - ▁TORO
+ - ▁PO
+ - LE
+ - ▁LI
+ - RO
+ - ▁LO
+ - ▁MESA
+ - CA
+ - ▁CHIU
+ - DO
+ - ▁BU
+ - ▁BUTA
+ - JO
+ - T
+ - TRU
+ - RU
+ - ▁MBO
+ - ▁JUAN
+ - ▁MM
+ - ▁CA
+ - ▁M
+ - ▁MAS
+ - ▁DE
+ - V
+ - ▁MAÑA
+ - ▁UTA
+ - DA
+ - ▁MULA
+ - ▁YOLOXÓCHITL
+ - ▁CONSEJU
+ - ▁Y
+ - ▁LE
+ - ÓN
+ - ▁MISA
+ - TIU
+ - ▁CANDELA
+ - ▁PATRÓN
+ - ▁PADRINU
+ - ▁MARCU
+ - ▁V
+ - ▁G
+ - Í
+ - ▁XE
+ - ▁MU
+ - ▁XO
+ - NGUI
+ - ▁CO
+ - ▁HOMBRE
+ - ▁PESU
+ - ▁PE
+ - ▁D
+ - ▁MACHITI
+ - CO
+ - REN
+ - ▁RANCHU
+ - ▁MIS
+ - ▁MACHU
+ - J
+ - ▁PAN
+ - CHO
+ - H
+ - ▁CHU
+ - Y
+ - ▁TON
+ - GA
+ - X
+ - ▁VI
+ - ▁FE
+ - ▁TARRAYA
+ - ▁SANTÍSIMA
+ - ▁N
+ - ▁MAYÓ
+ - ▁CARRU
+ - ▁F
+ - ▁PAPÁ
+ - ▁PALOMA
+ - ▁MARÍA
+ - ▁PEDRU
+ - ▁CAFÉ
+ - ▁COMISARIO
+ - ▁PANELA
+ - ▁PELÓN
+ - É
+ - ▁POZO
+ - ▁CABRÓN
+ - ▁GUACHU
+ - ▁S
+ - RES
+ - ▁COSTUMBRE
+ - ▁SEÑA
+ - QUI
+ - ▁ORO
+ - CH
+ - ▁MAR
+ - SIN
+ - SAN
+ - ▁COSTA
+ - ▁MAMÁ
+ - ▁CINCUENTA
+ - ▁CHO
+ - ▁PEDR
+ - ▁JUNTA
+ - MÚ
+ - ▁TIENDA
+ - ▁JOSÉ
+ - NC
+ - ▁ES
+ - ▁SUERTE
+ - ▁FAMILIA
+ - ▁ZAPATU
+ - NTE
+ - ▁PASTO
+ - ▁CON
+ - Ñ
+ - ▁BOTE
+ - CIÓN
+ - ▁RE
+ - ▁BOLSA
+ - ▁MANGO
+ - ▁JWE
+ - ▁GASTU
+ - ▁T
+ - ▁B
+ - ▁KW
+ - ÍN
+ - ▁HIJA
+ - ▁CUARENT
+ - ▁VAQUERU
+ - ▁NECHITO
+ - ▁NOVIA
+ - ▁NOVIO
+ - JWE
+ - ▁PUENTE
+ - ▁SANDÍA
+ - ▁MALA
+ - Ó
+ - ▁ABONO
+ - ▁JESÚS
+ - ▁CUARTO
+ - ▁EFE
+ - ▁REINA
+ - ▁COMANDANTE
+ - ▁ESCUELA
+ - ▁MANZANA
+ - ▁MÁQUINA
+ - LLA
+ - ▁COR
+ - ▁JERÓNIMO
+ - ▁PISTOLA
+ - NGI
+ - CIO
+ - ▁FRANCISCU
+ - ▁TEODORO
+ - CER
+ - ▁SALUBI
+ - ▁MEZA
+ - ▁MÚSIC
+ - ▁RU
+ - ▁CONSTANTINO
+ - ▁GARCÍA
+ - ▁FRENU
+ - ▁ROSA
+ - ▁CERVEZA
+ - ▁CIGARRU
+ - ▁COMISIÓN
+ - ▁CUNIJO
+ - ▁FRANCISCO
+ - ▁HÍJOLE
+ - ▁NUEVE
+ - ▁MUL
+ - ▁PANTALÓN
+ - ▁CAMISA
+ - ▁CHINGADA
+ - ▁SEMANA
+ - ▁COM
+ - GAR
+ - ▁MARTÍN
+ - ▁SÁBADO
+ - ▁TRABAJO
+ - ▁CINCO
+ - ▁DIE
+ - ▁EST
+ - NDWA
+ - ▁LECHIN
+ - ▁COCO
+ - ILLU
+ - ▁CORRE
+ - ▁MADR
+ - ▁REC
+ - ▁BAUTISTA
+ - ▁VENTANA
+ - ▁CUÑAD
+ - ▁ANTONIU
+ - ▁COPALA
+ - LÍN
+ - ▁SECUND
+ - ▁COHETE
+ - ▁HISTORIA
+ - ▁POLICÍA
+ - ENCIA
+ - ▁CAD
+ - ▁LUIS
+ - ▁DOCTOR
+ - ▁GONZÁLEZ
+ - ▁JUEVE
+ - ▁LIBRU
+ - ▁QUESU
+ - ▁VIAJE
+ - ▁CART
+ - ▁LOCO
+ - ▁BOL
+ - ▁COMPADRE
+ - ▁JWI
+ - ▁METRU
+ - ▁BUENO
+ - ▁TRE
+ - ▁CASTILLO
+ - ▁COMITÉ
+ - ▁ETERNO
+ - ▁LÍQUIDO
+ - ▁MOLE
+ - ▁CAPULCU
+ - ▁DOMING
+ - ▁ROMA
+ - ▁CARAJU
+ - ▁RIATA
+ - ▁TRATU
+ - ▁SEIS
+ - ▁ADÁN
+ - ▁JUANCITO
+ - ▁HOR
+ - ''''
+ - ▁ARRÓ
+ - ▁COCINA
+ - ▁PALACIO
+ - ▁RÓMULO
+ - K
+ - ▁ALFONSO
+ - ▁BARTOLO
+ - ▁FELIPE
+ - ▁HERRER
+ - ▁PAULINO
+ - ▁YEGUA
+ - ▁LISTA
+ - Ú
+ - ▁ABRIL
+ - ▁CUATRO
+ - ▁DICIEMBRE
+ - ▁MARGARITO
+ - ▁MOJONERA
+ - ▁SOLEDAD
+ - ▁VESTIDO
+ - ▁PELOTA
+ - RRET
+ - ▁CAPITÁN
+ - ▁COMUNIÓN
+ - ▁CUCHARA
+ - ▁FERNANDO
+ - ▁GUADALUPE
+ - ▁MIGUEL
+ - ▁PELÚN
+ - ▁SECRETARIU
+ - ▁LENCHU
+ - ▁EVA
+ - ▁SEGUND
+ - ▁CANTOR
+ - ▁CHILPANCINGO
+ - ▁GABRIEL
+ - ▁QUINIENTO
+ - ▁RAÚL
+ - ▁SEVERIAN
+ - ▁TUMBADA
+ - ▁MALINCHI
+ - ▁PRIMU
+ - ▁MORAL
+ - ▁AGOSTO
+ - ▁CENTÍMETRO
+ - ▁FIRMA
+ - ▁HUEHUETÁN
+ - ▁MANGUERA
+ - ▁MEDI
+ - ▁MUERT
+ - ▁SALAZAR
+ - ▁VIERNI
+ - LILL
+ - ▁LL
+ - '-'
+ - ▁CAMPESINO
+ - ▁CIVIL
+ - ▁COMISARIADO
+ - )
+ - (
+ - Ã
+ - ‘
+ - ¿
+ - Ü
+ - ¡
+ - Q
+ - F
+ - Á
+ - P
+ - Ÿ
+ - W
+ - Ý
+ - <sos/eos>
+ init: xavier_uniform
+ input_size: null
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: true
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+ use_preprocessor: true
+ token_type: bpe
+ bpemodel: data/token_list/bpe_unigram500/bpe.model
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ frontend: default
+ frontend_conf:
+     fs: 16k
+ specaug: specaug
+ specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 30
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_range:
+     - 0
+     - 40
+     num_time_mask: 2
+ normalize: global_mvn
+ normalize_conf:
+     stats_file: exp/asr_stats_raw_bpe500/train/feats_stats.npz
+ preencoder: null
+ preencoder_conf: {}
+ encoder: transformer
+ encoder_conf:
+     input_layer: conv2d
+     num_blocks: 12
+     linear_units: 2048
+     dropout_rate: 0.1
+     output_size: 512
+     attention_heads: 4
+     attention_dropout_rate: 0.0
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     input_layer: embed
+     num_blocks: 6
+     linear_units: 2048
+     dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.4a1
+ distributed: false
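The `specaug_conf` block in this config asks for two frequency masks with widths drawn from the range 0 to 30 and two time masks with widths from 0 to 40, plus time warping. The sketch below illustrates only the masking part, in plain Python on a toy feature matrix. It is a simplified illustration of the SpecAugment idea, not ESPnet's implementation. The 200x80 input shape, the zero fill value, the exclusive upper bound on widths, and the helper name `mask_along_axis` are all assumptions, and time warping is omitted.

```python
import random


def mask_along_axis(spec, width_range, num_masks, axis, rng):
    """Zero out `num_masks` random bands along time (axis=0) or frequency (axis=1)."""
    n_frames, n_bins = len(spec), len(spec[0])
    size = n_frames if axis == 0 else n_bins
    out = [row[:] for row in spec]  # copy so the input is left untouched
    for _ in range(num_masks):
        # Width is sampled with the upper bound excluded (an assumption of this sketch).
        width = rng.randrange(width_range[0], width_range[1])
        start = rng.randrange(0, max(1, size - width))
        for t in range(n_frames):
            for f in range(n_bins):
                idx = t if axis == 0 else f
                if start <= idx < start + width:
                    out[t][f] = 0.0
    return out


rng = random.Random(0)
spec = [[1.0] * 80 for _ in range(200)]  # toy input: 200 frames x 80 bins
masked = mask_along_axis(spec, (0, 30), 2, axis=1, rng=rng)    # freq_mask_width_range, num_freq_mask
masked = mask_along_axis(masked, (0, 40), 2, axis=0, rng=rng)  # time_mask_width_range, num_time_mask
zeroed = sum(v == 0.0 for row in masked for v in row)
print(f"{zeroed} of {200 * 80} cells masked")
```

Because each mask width is sampled independently and may be zero, the masked fraction varies from utterance to utterance, which is the point of the augmentation.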
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/acc.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/backward_time.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/cer.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/cer_ctc.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/forward_time.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/gpu_max_cached_mem_GB.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/iter_time.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/loss.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/loss_att.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/loss_ctc.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/optim0_lr0.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/optim_step_time.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/train_time.png ADDED
exp/asr_train_asr_transformer_specaug_raw_bpe500/images/wer.png ADDED
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: 0.10.5a1
+ files:
+   asr_model_file: exp/asr_train_asr_transformer_specaug_raw_bpe500/56epoch.pth
+ python: "3.9.7 (default, Sep 16 2021, 13:09:58) \n[GCC 7.5.0]"
+ timestamp: 1640102337.378326
+ torch: 1.9.0
+ yaml_files:
+   asr_train_config: exp/asr_train_asr_transformer_specaug_raw_bpe500/config.yaml
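The `timestamp` field in `meta.yaml` looks like a Unix epoch value in seconds. Assuming that reading is right, the standard library can convert it to a calendar date, which helps relate the packaging time to the `Wed Nov 10 ... 2021` date recorded in the results. What event the timestamp marks (presumably when the model files were packed) is also an assumption.

```python
from datetime import datetime, timezone

timestamp = 1640102337.378326  # value from meta.yaml above, assumed to be epoch seconds
packed_at = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(packed_at.isoformat())
```

Under this assumption the metadata was written in late December 2021, several weeks after the scoring run, which would also fit the newer `espnet: 0.10.5a1` version recorded here compared with `0.10.4a1` in the training config.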