neillu committed
Commit 20f8953
1 Parent(s): 46e83e8

Update model

README.md ADDED
@@ -0,0 +1,856 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - automatic-speech-recognition
+ language: en
+ datasets:
+ - slurp_mixture
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 ASR model
+
+ ### `espnet/Yen-Ju_Lu_spatilaizedslurp_asr_train_asr_conformer_transformer_valid.acc.best`
+
+ This model was trained by neillu23 using the slurp_mixture recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ ```bash
+ cd espnet
+ git checkout 0fae8113d99d092e7cbe4bcc48f9361e7012cff2
+ pip install -e .
+ cd egs2/slurp_mixture/asr1
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/Yen-Ju_Lu_spatilaizedslurp_asr_train_asr_conformer_transformer_valid.acc.best
+ ```
+
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Tue Mar 29 04:17:37 UTC 2022`
+ - python version: `3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]`
+ - espnet version: `espnet 0.10.7a1`
+ - pytorch version: `pytorch 1.9.0`
+ - Git hash: `0fae8113d99d092e7cbe4bcc48f9361e7012cff2`
+ - Commit date: `Thu Mar 24 07:54:19 2022 +0000`
+
+ ## asr_train_asr_raw_en_word
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.best/devel|8690|109017|63.9|21.1|15.0|2.0|38.1|75.4|
+ |inference_asr_model_valid.acc.best/test|6099|77315|69.0|17.4|13.5|1.8|32.8|68.9|
+ |inference_asr_model_valid.acc.best/test_ineube|6099|77315|77.8|12.0|10.2|1.5|23.6|59.4|
+ |inference_asr_model_valid.acc.best/test_qut|6099|77315|68.4|17.9|13.6|1.8|33.3|69.5|
+ |inference_asr_model_valid.acc.best/test_qut_ineube|6099|77315|78.0|11.9|10.2|1.4|23.4|59.3|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.best/devel|8690|513265|79.6|9.3|11.0|3.6|23.9|75.4|
+ |inference_asr_model_valid.acc.best/test|6099|362039|82.6|7.6|9.8|3.0|20.4|68.9|
+ |inference_asr_model_valid.acc.best/test_ineube|6099|362039|87.5|4.9|7.6|2.1|14.6|59.4|
+ |inference_asr_model_valid.acc.best/test_qut|6099|362039|82.3|7.8|9.9|3.1|20.8|69.5|
+ |inference_asr_model_valid.acc.best/test_qut_ineube|6099|362039|87.5|4.9|7.6|2.1|14.6|59.3|
+
+ ## ASR config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/train_asr.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_train_asr_raw_en_word
+ ngpu: 1
+ seed: 0
+ num_workers: 1
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 3
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 35953
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 50
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 48
+ valid_batch_size: null
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_raw_en_word/train/speech_shape
+ - exp/asr_stats_raw_en_word/train/text_shape.word
+ valid_shape_file:
+ - exp/asr_stats_raw_en_word/valid/speech_shape
+ - exp/asr_stats_raw_en_word/valid/text_shape.word
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/train/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/devel/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/devel/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 0.0002
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 25000
+ token_list:
+ - <blank>
+ - <unk>
+ - ▁the
+ - s
+ - ▁to
+ - ▁i
+ - ▁me
+ - ▁you
+ - ▁what
+ - ▁a
+ - ▁is
+ - a
+ - ▁my
+ - ▁please
+ - y
+ - ''''
+ - ▁in
+ - ing
+ - ▁s
+ - e
+ - o
+ - ▁for
+ - i
+ - ▁on
+ - d
+ - t
+ - u
+ - er
+ - p
+ - ▁of
+ - es
+ - re
+ - l
+ - ▁it
+ - ▁p
+ - le
+ - ▁f
+ - ▁m
+ - ▁email
+ - ▁d
+ - m
+ - ▁c
+ - ▁b
+ - st
+ - r
+ - n
+ - ar
+ - ▁t
+ - ▁h
+ - b
+ - ▁that
+ - c
+ - ▁this
+ - h
+ - an
+ - email_query
+ - ▁play
+ - ▁re
+ - ▁do
+ - ▁can
+ - at
+ - ▁have
+ - g
+ - ▁from
+ - ▁and
+ - en
+ - email_sendemail
+ - ▁olly
+ - 'on'
+ - ▁new
+ - it
+ - qa_factoid
+ - calendar_set
+ - ▁any
+ - or
+ - ▁g
+ - ent
+ - ▁how
+ - ▁tell
+ - ch
+ - ▁not
+ - ▁about
+ - ▁at
+ - ate
+ - general_negate
+ - f
+ - ▁today
+ - ▁e
+ - ed
+ - ▁list
+ - ▁r
+ - in
+ - k
+ - ic
+ - social_post
+ - ▁are
+ - play_music
+ - general_quirky
+ - ▁l
+ - al
+ - v
+ - ▁n
+ - ▁be
+ - ▁an
+ - ▁st
+ - et
+ - ▁am
+ - general_praise
+ - ▁time
+ - weather_query
+ - ▁up
+ - ▁check
+ - calendar_query
+ - ▁w
+ - om
+ - ur
+ - ▁send
+ - ▁with
+ - ly
+ - w
+ - general_explain
+ - ad
+ - ▁th
+ - news_query
+ - ▁one
+ - ▁emails
+ - day
+ - ▁sh
+ - ce
+ - ▁
+ - ▁last
+ - ve
+ - ▁he
+ - z
+ - ▁ch
+ - ▁will
+ - ▁set
+ - ▁would
+ - ▁was
+ - x
+ - general_repeat
+ - ▁add
+ - ▁again
+ - ou
+ - ▁ex
+ - is
+ - ct
+ - general_affirm
+ - general_confirm
+ - ▁song
+ - ▁next
+ - ▁j
+ - ▁meeting
+ - um
+ - ation
+ - ▁turn
+ - ▁did
+ - if
+ - ▁alarm
+ - am
+ - ▁like
+ - datetime_query
+ - ter
+ - ▁remind
+ - ▁o
+ - qa_definition
+ - ▁said
+ - ▁calendar
+ - ll
+ - se
+ - ers
+ - ▁pr
+ - th
+ - ▁get
+ - our
+ - ▁need
+ - ▁all
+ - ot
+ - ▁want
+ - ▁off
+ - and
+ - ▁right
+ - ▁de
+ - ▁tr
+ - ut
+ - general_dontcare
+ - as
+ - ▁week
+ - ▁tweet
+ - ight
+ - ir
+ - ▁your
+ - ▁event
+ - ▁news
+ - ▁se
+ - ay
+ - ion
+ - ▁com
+ - ▁there
+ - ▁ye
+ - ▁weather
+ - un
+ - ▁confirm
+ - ld
+ - calendar_remove
+ - ▁y
+ - ▁lights
+ - ▁more
+ - ▁v
+ - play_radio
+ - ▁does
+ - ▁po
+ - ▁now
+ - id
+ - email_querycontact
+ - ▁show
+ - ▁could
+ - ery
+ - op
+ - ▁day
+ - ▁pm
+ - ▁music
+ - ▁tomorrow
+ - ▁train
+ - ▁u
+ - ine
+ - ▁or
+ - ange
+ - qa_currency
+ - ice
+ - ▁contact
+ - ▁just
+ - ▁jo
+ - ▁think
+ - qa_stock
+ - end
+ - ss
+ - ber
+ - ▁tw
+ - ▁command
+ - ▁make
+ - ▁no
+ - ▁mo
+ - pe
+ - ▁find
+ - general_commandstop
+ - ▁when
+ - social_query
+ - ▁so
+ - ong
+ - ▁co
+ - ant
+ - ow
+ - q
+ - ▁much
+ - ▁where
+ - ue
+ - ul
+ - ri
+ - ake
+ - ap
+ - ▁start
+ - ▁mar
+ - ▁by
+ - one
+ - ▁know
+ - ▁wor
+ - oo
+ - ▁give
+ - ▁let
+ - ▁events
+ - der
+ - ▁ro
+ - ▁pl
+ - play_podcasts
+ - art
+ - us
+ - ▁work
+ - ▁current
+ - ol
+ - cooking_recipe
+ - nt
+ - ▁correct
+ - transport_query
+ - ia
+ - ▁stock
+ - ▁br
+ - ive
+ - ▁app
+ - ▁two
+ - ▁latest
+ - lists_query
+ - recommendation_events
+ - ab
+ - ▁go
+ - ▁but
+ - ook
+ - ▁some
+ - ke
+ - alarm_set
+ - play_audiobook
+ - ▁k
+ - ▁response
+ - ▁wr
+ - cast
+ - ▁open
+ - ▁cle
+ - ▁done
+ - ▁got
+ - ▁ca
+ - ite
+ - ase
+ - ▁thank
+ - iv
+ - ag
+ - ah
+ - ▁answer
+ - ie
+ - ▁five
+ - ▁book
+ - ▁rec
+ - ore
+ - ▁john
+ - ist
+ - ment
+ - ▁appreci
+ - ▁fri
+ - ack
+ - ▁remove
+ - ated
+ - ock
+ - ree
+ - j
+ - ▁good
+ - ▁many
+ - orn
+ - fe
+ - ▁radio
+ - ▁we
+ - int
+ - ▁facebook
+ - ▁cl
+ - ▁sev
+ - ▁schedule
+ - ard
+ - ▁per
+ - ▁li
+ - ▁going
+ - nd
+ - ain
+ - recommendation_locations
+ - ▁post
+ - lists_createoradd
+ - ff
+ - ▁su
+ - red
+ - iot_hue_lightoff
+ - lists_remove
+ - ▁ar
+ - een
+ - ▁say
+ - ro
+ - ▁volume
+ - ▁le
+ - ▁reply
+ - ▁complaint
+ - ▁delete
+ - ▁out
+ - lly
+ - ame
+ - ▁ne
+ - ▁detail
+ - ▁if
+ - im
+ - ▁happ
+ - orr
+ - ich
+ - em
+ - ▁ev
+ - ction
+ - ▁dollar
+ - ▁as
+ - alarm_query
+ - audio_volume_mute
+ - ac
+ - music_query
+ - ▁mon
+ - ther
+ - ▁thanks
+ - cel
+ - ▁who
+ - ave
+ - ▁service
+ - ▁mail
+ - ▁hear
+ - ty
+ - de
+ - ▁si
+ - ▁wh
+ - ood
+ - ell
+ - ▁con
+ - icket
+ - ▁once
+ - ound
+ - ▁don
+ - ▁loc
+ - ▁light
+ - ▁birthday
+ - ▁inf
+ - ffe
+ - ▁has
+ - ▁playlist
+ - ort
+ - el
+ - ening
+ - ▁us
+ - ▁un
+ - own
+ - ▁inc
+ - ai
+ - ▁speak
+ - age
+ - ▁mess
+ - ast
+ - ci
+ - ver
+ - ▁ten
+ - ▁underst
+ - gh
+ - audio_volume_up
+ - ome
+ - transport_ticket
+ - ind
+ - iot_hue_lightchange
+ - iot_coffee
+ - pp
+ - ▁res
+ - plain
+ - io
+ - lar
+ - takeaway_query
+ - ge
+ - takeaway_order
+ - email_addcontact
+ - play_game
+ - ak
+ - ▁fa
+ - transport_traffic
+ - music_likeness
+ - ▁rep
+ - act
+ - ust
+ - transport_taxi
+ - iot_hue_lightdim
+ - ▁mu
+ - ▁ti
+ - ick
+ - ▁ha
+ - ould
+ - general_joke
+ - '1'
+ - qa_maths
+ - ▁lo
+ - iot_cleaning
+ - ill
+ - her
+ - iot_hue_lightup
+ - pl
+ - '2'
+ - alarm_remove
+ - orrect
+ - ▁cont
+ - mail
+ - out
+ - audio_volume_down
+ - book
+ - ail
+ - recommendation_movies
+ - ck
+ - ▁man
+ - ▁mus
+ - ▁che
+ - me
+ - ume
+ - ▁answ
+ - datetime_convert
+ - ▁late
+ - iot_wemo_on
+ - ▁twe
+ - music_settings
+ - iot_wemo_off
+ - orre
+ - ith
+ - ▁tom
+ - ▁fr
+ - ere
+ - ▁ad
+ - xt
+ - ▁ab
+ - ank
+ - general_greet
+ - now
+ - ▁meet
+ - ▁curre
+ - ▁respon
+ - ▁ag
+ - audio_volume_other
+ - ink
+ - ▁spe
+ - iot_hue_lighton
+ - ght
+ - ▁rem
+ - '?'
+ - urn
+ - ▁op
+ - ▁complain
+ - ▁comm
+ - let
+ - music_dislikeness
+ - ove
+ - ▁sch
+ - ather
+ - ▁rad
+ - edule
+ - ▁under
+ - lease
+ - ▁bir
+ - erv
+ - ▁birth
+ - ▁face
+ - ▁cur
+ - sw
+ - ▁serv
+ - ek
+ - aid
+ - '9'
+ - ▁vol
+ - edu
+ - '5'
+ - cooking_query
+ - lete
+ - ▁joh
+ - ▁det
+ - firm
+ - nder
+ - '0'
+ - _
+ - irm
+ - '8'
+ - '&'
+ - list
+ - pon
+ - qa_query
+ - '7'
+ - '3'
+ - '-'
+ - N
+ - A
+ - M
+ - E
+ - ']'
+ - '['
+ - ':'
+ - reci
+ - ▁doll
+ - <sos/eos>
+ init: null
+ input_size: null
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: true
+ joint_net_conf: null
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+     extract_feats_in_collect_stats: false
+ use_preprocessor: true
+ token_type: word
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ frontend: default
+ frontend_conf:
+     fs: 16k
+ specaug: specaug
+ specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 30
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_range:
+     - 0
+     - 40
+     num_time_mask: 2
+ normalize: utterance_mvn
+ normalize_conf: {}
+ preencoder: null
+ preencoder_conf: {}
+ encoder: conformer
+ encoder_conf:
+     output_size: 512
+     attention_heads: 8
+     linear_units: 2048
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d
+     normalize_before: true
+     macaron_style: true
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 8
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.7a1
+ distributed: true
+ ```
+
+ </details>
+
+
+
+ ### Citing ESPnet
+
+ ```BibTex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+
+
+
+
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
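The config embedded in this README sets `scheduler: warmuplr` with `warmup_steps: 25000` and an Adam `lr` of `0.0002`. The sketch below illustrates the learning-rate curve those values would produce, assuming the Noam-style warmup formula `lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)`. The function name `warmup_lr` is hypothetical and the formula is an assumption about the scheduler, not something stated in this commit.

```python
def warmup_lr(step: int, base_lr: float = 0.0002, warmup_steps: int = 25000) -> float:
    """Learning rate at a 1-based optimizer step under the assumed warmup formula.

    The rate grows linearly until `warmup_steps`, peaks at `base_lr`,
    then decays proportionally to 1/sqrt(step).
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# Print a few points along the curve: during warmup, at the peak, and during decay.
for step in (1, 12500, 25000, 100000):
    print(f"step {step:>6}: lr = {warmup_lr(step):.3e}")
```

Under this assumption the peak rate equals the configured `lr`, so changing `warmup_steps` moves the position of the peak without changing its height.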
exp/asr_train_asr_raw_en_word/35epoch.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1a301e560edb1d5351d8d39d3b19b78d5da8415629efa9fdc7f31b9a33ef797
+ size 437693801
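The three added lines are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 object ID, and the size in bytes of the real `35epoch.pth`. As a rough illustration, the pointer can be read with a few lines of standard-library Python. The helper name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f1a301e560edb1d5351d8d39d3b19b78d5da8415629efa9fdc7f31b9a33ef797
size 437693801
"""

info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / 2**20  # bytes -> MiB
print(info["hash_algo"], info["digest"][:12], f"{size_mib:.1f} MiB")
```

After downloading the checkpoint, comparing its SHA-256 digest and byte count against these fields is a simple integrity check.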
exp/asr_train_asr_raw_en_word/RESULTS.md ADDED
@@ -0,0 +1,30 @@
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Tue Mar 29 04:17:37 UTC 2022`
+ - python version: `3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]`
+ - espnet version: `espnet 0.10.7a1`
+ - pytorch version: `pytorch 1.9.0`
+ - Git hash: `0fae8113d99d092e7cbe4bcc48f9361e7012cff2`
+ - Commit date: `Thu Mar 24 07:54:19 2022 +0000`
+
+ ## asr_train_asr_raw_en_word
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.best/devel|8690|109017|63.9|21.1|15.0|2.0|38.1|75.4|
+ |inference_asr_model_valid.acc.best/test|6099|77315|69.0|17.4|13.5|1.8|32.8|68.9|
+ |inference_asr_model_valid.acc.best/test_ineube|6099|77315|77.8|12.0|10.2|1.5|23.6|59.4|
+ |inference_asr_model_valid.acc.best/test_qut|6099|77315|68.4|17.9|13.6|1.8|33.3|69.5|
+ |inference_asr_model_valid.acc.best/test_qut_ineube|6099|77315|78.0|11.9|10.2|1.4|23.4|59.3|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |inference_asr_model_valid.acc.best/devel|8690|513265|79.6|9.3|11.0|3.6|23.9|75.4|
+ |inference_asr_model_valid.acc.best/test|6099|362039|82.6|7.6|9.8|3.0|20.4|68.9|
+ |inference_asr_model_valid.acc.best/test_ineube|6099|362039|87.5|4.9|7.6|2.1|14.6|59.4|
+ |inference_asr_model_valid.acc.best/test_qut|6099|362039|82.3|7.8|9.9|3.1|20.8|69.5|
+ |inference_asr_model_valid.acc.best/test_qut_ineube|6099|362039|87.5|4.9|7.6|2.1|14.6|59.3|
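The score columns in these tables are related: `Corr + Sub + Del` should come to about 100%, and `Err` should equal `Sub + Del + Ins`, up to rounding of the one-decimal percentages. The sketch below checks this for the WER rows. It assumes the conventional sclite-style meaning of the columns, which the file itself does not spell out, and the helper names are hypothetical.

```python
# WER rows copied verbatim from the table above.
WER_ROWS = """\
|inference_asr_model_valid.acc.best/devel|8690|109017|63.9|21.1|15.0|2.0|38.1|75.4|
|inference_asr_model_valid.acc.best/test|6099|77315|69.0|17.4|13.5|1.8|32.8|68.9|
|inference_asr_model_valid.acc.best/test_ineube|6099|77315|77.8|12.0|10.2|1.5|23.6|59.4|
|inference_asr_model_valid.acc.best/test_qut|6099|77315|68.4|17.9|13.6|1.8|33.3|69.5|
|inference_asr_model_valid.acc.best/test_qut_ineube|6099|77315|78.0|11.9|10.2|1.4|23.4|59.3|
"""


def parse_row(row: str) -> dict:
    """Turn one markdown table row into a dict of typed fields."""
    cells = row.strip().strip("|").split("|")
    corr, sub, dele, ins, err, s_err = (float(c) for c in cells[3:9])
    return {
        "dataset": cells[0], "snt": int(cells[1]), "wrd": int(cells[2]),
        "corr": corr, "sub": sub, "del": dele, "ins": ins,
        "err": err, "s_err": s_err,
    }


def is_consistent(r: dict, tol: float = 0.21) -> bool:
    """Check both identities; tol absorbs rounding of up to four 1-decimal terms."""
    sums_to_100 = abs(r["corr"] + r["sub"] + r["del"] - 100.0) <= tol
    err_matches = abs(r["sub"] + r["del"] + r["ins"] - r["err"]) <= tol
    return sums_to_100 and err_matches


rows = [parse_row(line) for line in WER_ROWS.splitlines()]
for r in rows:
    print(r["dataset"].split("/")[-1], r["err"], is_consistent(r))
```

The same check can be pointed at the CER rows by swapping in their lines.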
exp/asr_train_asr_raw_en_word/config.yaml ADDED
@@ -0,0 +1,757 @@
+ config: conf/train_asr.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/asr_train_asr_raw_en_word
+ ngpu: 1
+ seed: 0
+ num_workers: 1
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 3
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 35953
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 50
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: null
+ batch_size: 48
+ valid_batch_size: null
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/asr_stats_raw_en_word/train/speech_shape
+ - exp/asr_stats_raw_en_word/train/text_shape.word
+ valid_shape_file:
+ - exp/asr_stats_raw_en_word/valid/speech_shape
+ - exp/asr_stats_raw_en_word/valid/text_shape.word
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/train/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/devel/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/devel/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adam
+ optim_conf:
+     lr: 0.0002
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 25000
+ token_list:
+ - <blank>
+ - <unk>
+ - ▁the
+ - s
+ - ▁to
+ - ▁i
+ - ▁me
+ - ▁you
+ - ▁what
+ - ▁a
+ - ▁is
+ - a
+ - ▁my
+ - ▁please
+ - y
+ - ''''
+ - ▁in
+ - ing
+ - ▁s
+ - e
+ - o
+ - ▁for
+ - i
+ - ▁on
+ - d
+ - t
+ - u
+ - er
+ - p
+ - ▁of
+ - es
+ - re
+ - l
+ - ▁it
+ - ▁p
+ - le
+ - ▁f
+ - ▁m
+ - ▁email
+ - ▁d
+ - m
+ - ▁c
+ - ▁b
+ - st
+ - r
+ - n
+ - ar
+ - ▁t
+ - ▁h
+ - b
+ - ▁that
+ - c
+ - ▁this
+ - h
+ - an
+ - email_query
+ - ▁play
+ - ▁re
+ - ▁do
+ - ▁can
+ - at
+ - ▁have
+ - g
+ - ▁from
+ - ▁and
+ - en
+ - email_sendemail
+ - ▁olly
+ - 'on'
+ - ▁new
+ - it
+ - qa_factoid
+ - calendar_set
+ - ▁any
+ - or
+ - ▁g
+ - ent
+ - ▁how
+ - ▁tell
+ - ch
+ - ▁not
+ - ▁about
+ - ▁at
+ - ate
+ - general_negate
+ - f
+ - ▁today
+ - ▁e
+ - ed
+ - ▁list
+ - ▁r
+ - in
+ - k
+ - ic
+ - social_post
+ - ▁are
+ - play_music
+ - general_quirky
+ - ▁l
+ - al
+ - v
+ - ▁n
+ - ▁be
+ - ▁an
+ - ▁st
+ - et
+ - ▁am
+ - general_praise
+ - ▁time
+ - weather_query
+ - ▁up
+ - ▁check
+ - calendar_query
+ - ▁w
+ - om
+ - ur
+ - ▁send
+ - ▁with
+ - ly
+ - w
+ - general_explain
+ - ad
+ - ▁th
+ - news_query
+ - ▁one
+ - ▁emails
+ - day
+ - ▁sh
+ - ce
+ - ▁
+ - ▁last
+ - ve
+ - ▁he
+ - z
+ - ▁ch
+ - ▁will
+ - ▁set
+ - ▁would
+ - ▁was
+ - x
+ - general_repeat
+ - ▁add
+ - ▁again
+ - ou
+ - ▁ex
+ - is
+ - ct
+ - general_affirm
+ - general_confirm
+ - ▁song
+ - ▁next
+ - ▁j
+ - ▁meeting
+ - um
+ - ation
+ - ▁turn
+ - ▁did
+ - if
+ - ▁alarm
+ - am
+ - ▁like
+ - datetime_query
+ - ter
+ - ▁remind
+ - ▁o
+ - qa_definition
+ - ▁said
+ - ▁calendar
+ - ll
+ - se
+ - ers
+ - ▁pr
+ - th
+ - ▁get
+ - our
+ - ▁need
+ - ▁all
+ - ot
+ - ▁want
+ - ▁off
+ - and
+ - ▁right
+ - ▁de
+ - ▁tr
+ - ut
+ - general_dontcare
+ - as
+ - ▁week
+ - ▁tweet
+ - ight
+ - ir
+ - ▁your
+ - ▁event
+ - ▁news
+ - ▁se
+ - ay
+ - ion
+ - ▁com
+ - ▁there
+ - ▁ye
+ - ▁weather
+ - un
+ - ▁confirm
+ - ld
+ - calendar_remove
+ - ▁y
+ - ▁lights
+ - ▁more
+ - ▁v
+ - play_radio
+ - ▁does
+ - ▁po
+ - ▁now
+ - id
+ - email_querycontact
+ - ▁show
+ - ▁could
+ - ery
+ - op
+ - ▁day
+ - ▁pm
+ - ▁music
+ - ▁tomorrow
+ - ▁train
+ - ▁u
+ - ine
+ - ▁or
+ - ange
+ - qa_currency
+ - ice
+ - ▁contact
+ - ▁just
+ - ▁jo
+ - ▁think
+ - qa_stock
+ - end
+ - ss
+ - ber
+ - ▁tw
+ - ▁command
+ - ▁make
+ - ▁no
+ - ▁mo
+ - pe
+ - ▁find
+ - general_commandstop
+ - ▁when
+ - social_query
+ - ▁so
+ - ong
+ - ▁co
+ - ant
+ - ow
+ - q
+ - ▁much
+ - ▁where
+ - ue
+ - ul
+ - ri
+ - ake
+ - ap
+ - ▁start
+ - ▁mar
+ - ▁by
+ - one
+ - ▁know
+ - ▁wor
+ - oo
+ - ▁give
+ - ▁let
+ - ▁events
+ - der
+ - ▁ro
+ - ▁pl
+ - play_podcasts
+ - art
+ - us
+ - ▁work
+ - ▁current
+ - ol
+ - cooking_recipe
+ - nt
+ - ▁correct
+ - transport_query
+ - ia
+ - ▁stock
+ - ▁br
+ - ive
+ - ▁app
+ - ▁two
+ - ▁latest
+ - lists_query
+ - recommendation_events
+ - ab
+ - ▁go
+ - ▁but
+ - ook
+ - ▁some
+ - ke
+ - alarm_set
+ - play_audiobook
+ - ▁k
+ - ▁response
+ - ▁wr
+ - cast
+ - ▁open
+ - ▁cle
+ - ▁done
+ - ▁got
+ - ▁ca
+ - ite
+ - ase
+ - ▁thank
+ - iv
+ - ag
+ - ah
+ - ▁answer
+ - ie
+ - ▁five
+ - ▁book
+ - ▁rec
+ - ore
+ - ▁john
+ - ist
+ - ment
+ - ▁appreci
+ - ▁fri
+ - ack
+ - ▁remove
+ - ated
+ - ock
+ - ree
+ - j
+ - ▁good
+ - ▁many
+ - orn
+ - fe
+ - ▁radio
+ - ▁we
+ - int
+ - ▁facebook
+ - ▁cl
+ - ▁sev
+ - ▁schedule
+ - ard
+ - ▁per
+ - ▁li
+ - ▁going
+ - nd
+ - ain
+ - recommendation_locations
+ - ▁post
+ - lists_createoradd
+ - ff
+ - ▁su
+ - red
+ - iot_hue_lightoff
+ - lists_remove
+ - ▁ar
+ - een
+ - ▁say
+ - ro
+ - ▁volume
+ - ▁le
+ - ▁reply
+ - ▁complaint
+ - ▁delete
+ - ▁out
+ - lly
+ - ame
+ - ▁ne
+ - ▁detail
+ - ▁if
+ - im
+ - ▁happ
+ - orr
+ - ich
+ - em
+ - ▁ev
+ - ction
+ - ▁dollar
+ - ▁as
+ - alarm_query
+ - audio_volume_mute
+ - ac
+ - music_query
+ - ▁mon
+ - ther
+ - ▁thanks
+ - cel
+ - ▁who
+ - ave
+ - ▁service
+ - ▁mail
+ - ▁hear
+ - ty
+ - de
+ - ▁si
+ - ▁wh
+ - ood
+ - ell
+ - ▁con
+ - icket
+ - ▁once
+ - ound
+ - ▁don
+ - ▁loc
+ - ▁light
+ - ▁birthday
+ - ▁inf
+ - ffe
+ - ▁has
+ - ▁playlist
+ - ort
+ - el
+ - ening
+ - ▁us
+ - ▁un
+ - own
+ - ▁inc
+ - ai
+ - ▁speak
+ - age
+ - ▁mess
+ - ast
+ - ci
+ - ver
+ - ▁ten
+ - ▁underst
+ - gh
+ - audio_volume_up
+ - ome
+ - transport_ticket
+ - ind
+ - iot_hue_lightchange
+ - iot_coffee
+ - pp
+ - ▁res
+ - plain
+ - io
+ - lar
+ - takeaway_query
+ - ge
+ - takeaway_order
+ - email_addcontact
+ - play_game
+ - ak
+ - ▁fa
+ - transport_traffic
+ - music_likeness
+ - ▁rep
+ - act
+ - ust
+ - transport_taxi
+ - iot_hue_lightdim
+ - ▁mu
+ - ▁ti
+ - ick
+ - ▁ha
+ - ould
+ - general_joke
+ - '1'
+ - qa_maths
+ - ▁lo
+ - iot_cleaning
+ - ill
+ - her
+ - iot_hue_lightup
+ - pl
+ - '2'
+ - alarm_remove
+ - orrect
+ - ▁cont
+ - mail
+ - out
+ - audio_volume_down
+ - book
+ - ail
+ - recommendation_movies
+ - ck
+ - ▁man
+ - ▁mus
+ - ▁che
+ - me
+ - ume
+ - ▁answ
+ - datetime_convert
+ - ▁late
+ - iot_wemo_on
+ - ▁twe
+ - music_settings
+ - iot_wemo_off
+ - orre
+ - ith
+ - ▁tom
+ - ▁fr
+ - ere
+ - ▁ad
+ - xt
+ - ▁ab
+ - ank
+ - general_greet
+ - now
+ - ▁meet
+ - ▁curre
+ - ▁respon
+ - ▁ag
+ - audio_volume_other
+ - ink
+ - ▁spe
+ - iot_hue_lighton
+ - ght
+ - ▁rem
+ - '?'
+ - urn
+ - ▁op
+ - ▁complain
+ - ▁comm
+ - let
+ - music_dislikeness
+ - ove
+ - ▁sch
+ - ather
+ - ▁rad
+ - edule
+ - ▁under
+ - lease
+ - ▁bir
+ - erv
+ - ▁birth
+ - ▁face
+ - ▁cur
+ - sw
+ - ▁serv
+ - ek
+ - aid
+ - '9'
+ - ▁vol
+ - edu
+ - '5'
+ - cooking_query
+ - lete
+ - ▁joh
+ - ▁det
+ - firm
+ - nder
+ - '0'
+ - _
+ - irm
+ - '8'
+ - '&'
+ - list
+ - pon
+ - qa_query
+ - '7'
+ - '3'
+ - '-'
+ - N
+ - A
+ - M
+ - E
+ - ']'
+ - '['
+ - ':'
+ - reci
+ - ▁doll
+ - <sos/eos>
+ init: null
+ input_size: null
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: true
+ joint_net_conf: null
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+     extract_feats_in_collect_stats: false
+ use_preprocessor: true
+ token_type: word
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ frontend: default
+ frontend_conf:
+     fs: 16k
+ specaug: specaug
+ specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 30
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_range:
+     - 0
+     - 40
+     num_time_mask: 2
+ normalize: utterance_mvn
+ normalize_conf: {}
+ preencoder: null
+ preencoder_conf: {}
+ encoder: conformer
+ encoder_conf:
+     output_size: 512
+     attention_heads: 8
+     linear_units: 2048
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d
+     normalize_before: true
+     macaron_style: true
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ postencoder: null
+ postencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 8
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.7a1
+ distributed: true
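This config sets `ctc_weight: 0.3`, which in a hybrid CTC/attention model is usually the interpolation weight between the two training losses. The sketch below shows that combination, assuming the common form `loss = ctc_weight * loss_ctc + (1 - ctc_weight) * loss_att`. The function name and the example loss values are hypothetical and do not come from this training run.

```python
def joint_loss(loss_ctc: float, loss_att: float, ctc_weight: float = 0.3) -> float:
    """Interpolate the CTC and attention-decoder losses with a fixed weight."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att


# Illustrative values only: with ctc_weight = 0.3 the attention loss
# contributes 70% of the total and the CTC loss 30%.
example = joint_loss(loss_ctc=10.0, loss_att=20.0)
print(f"joint loss = {example:.2f}")
```

Under this reading, the `loss`, `loss_att` and `loss_ctc` plots listed below would correspond to the combined value and its two components.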
exp/asr_train_asr_raw_en_word/images/acc.png ADDED
exp/asr_train_asr_raw_en_word/images/backward_time.png ADDED
exp/asr_train_asr_raw_en_word/images/cer.png ADDED
exp/asr_train_asr_raw_en_word/images/cer_ctc.png ADDED
exp/asr_train_asr_raw_en_word/images/forward_time.png ADDED
exp/asr_train_asr_raw_en_word/images/gpu_max_cached_mem_GB.png ADDED
exp/asr_train_asr_raw_en_word/images/iter_time.png ADDED
exp/asr_train_asr_raw_en_word/images/loss.png ADDED
exp/asr_train_asr_raw_en_word/images/loss_att.png ADDED
exp/asr_train_asr_raw_en_word/images/loss_ctc.png ADDED
exp/asr_train_asr_raw_en_word/images/optim0_lr0.png ADDED
exp/asr_train_asr_raw_en_word/images/optim_step_time.png ADDED
exp/asr_train_asr_raw_en_word/images/train_time.png ADDED
exp/asr_train_asr_raw_en_word/images/wer.png ADDED
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: 0.10.7a1
+ files:
+   asr_model_file: exp/asr_train_asr_raw_en_word/35epoch.pth
+ python: "3.8.12 (default, Oct 12 2021, 13:49:34) \n[GCC 7.5.0]"
+ timestamp: 1654349933.152364
+ torch: 1.9.0
+ yaml_files:
+   asr_train_config: exp/asr_train_asr_raw_en_word/config.yaml
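The `timestamp` field in `meta.yaml` looks like a Unix epoch value in seconds. Assuming that interpretation, which the file does not state, a short standard-library sketch converts it to a readable UTC time. The helper name `packed_at` is hypothetical.

```python
from datetime import datetime, timezone


def packed_at(timestamp: float) -> datetime:
    """Interpret a meta.yaml timestamp as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Value copied from meta.yaml above.
dt = packed_at(1654349933.152364)
print(dt.isoformat())
```

Read this way, the value falls in June 2022, later than the `date` recorded in RESULTS.md (29 March 2022), which would fit a model packed for upload some time after it was scored.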