“siddhu001” committed on
Commit c1abc67 · 1 Parent(s): 93faa41

Update model

Files changed (22)
  1. README.md +1337 -0
  2. exp/slu_train_asr_whisper_superb_raw_en_word_sp/RESULTS.md +44 -0
  3. exp/slu_train_asr_whisper_superb_raw_en_word_sp/config.yaml +1219 -0
  4. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/acc.png +0 -0
  5. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/backward_time.png +0 -0
  6. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/cer.png +0 -0
  7. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/cer_ctc.png +0 -0
  8. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/clip.png +0 -0
  9. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/forward_time.png +0 -0
  10. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/gpu_max_cached_mem_GB.png +0 -0
  11. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/grad_norm.png +0 -0
  12. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/iter_time.png +0 -0
  13. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss.png +0 -0
  14. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss_att.png +0 -0
  15. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss_ctc.png +0 -0
  16. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss_scale.png +0 -0
  17. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/optim0_lr0.png +0 -0
  18. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/optim_step_time.png +0 -0
  19. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/train_time.png +0 -0
  20. exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/wer.png +0 -0
  21. exp/slu_train_asr_whisper_superb_raw_en_word_sp/valid.loss.ave_10best.pth +3 -0
  22. meta.yaml +8 -0
README.md ADDED
@@ -0,0 +1,1337 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - automatic-speech-recognition
+ language: en
+ datasets:
+ - slue-voxceleb
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 ASR model
+
+ ### `espnet/sluevoxceleb_whisper_lightweight_asr`
+
+ This model was trained by “siddhu001” using the slue-voxceleb recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
+ if you haven't done that already.
+
+ ```bash
+ cd espnet
+ git checkout e23ef85f0b3116ad5c60d0833f186da0deec0734
+ pip install -e .
+ cd egs2/slue-voxceleb/slu1_asr
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/sluevoxceleb_whisper_lightweight_asr
+ ```
+
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Mon Feb 5 19:05:58 CST 2024`
+ - python version: `3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]`
+ - espnet version: `espnet 202310`
+ - pytorch version: `pytorch 2.1.0+cu121`
+ - Git hash: `21d2105784e4da98397bf487b2550d4c6e16d40d`
+ - Commit date: `Wed Jan 31 13:40:37 2024 -0600`
+
+ ## exp/slu_train_asr_whisper_superb_raw_en_word_sp
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc_slu_model_valid.cer_ctc.ave/test|3426|135368|88.9|6.9|4.2|3.5|14.6|91.9|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc_slu_model_valid.cer_ctc.ave/test|3426|591261|95.0|1.6|3.5|3.3|8.3|91.9|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ ## exp/slu_train_asr_whisper_superb_raw_en_word_sp/decode_asr_ctc_slu_model_valid.cer_ctc.ave
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1437|56031|90.5|6.0|3.6|3.0|12.5|89.8|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1437|241556|95.9|1.2|2.9|2.9|6.9|89.8|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+
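The columns in the WER and CER tables are related by simple arithmetic. The sketch below is an illustration and not part of the recipe. It assumes the usual sclite-style convention, where every column is a percentage of the reference length in `Wrd` and `Err = Sub + Del + Ins`. The row values are copied from the tables above; the helper names `recomputed_err` and `approx_error_count` are hypothetical. Because each column is rounded to one decimal place, the recomputed sum can differ from the reported `Err` by about 0.1.

```python
# Rows copied from the WER/CER tables above:
# (label, Wrd, Corr, Sub, Del, Ins, Err), all but Wrd in percent.
ROWS = [
    ("test WER", 135368, 88.9, 6.9, 4.2, 3.5, 14.6),
    ("test CER", 591261, 95.0, 1.6, 3.5, 3.3, 8.3),
    ("devel WER", 56031, 90.5, 6.0, 3.6, 3.0, 12.5),
    ("devel CER", 241556, 95.9, 1.2, 2.9, 2.9, 6.9),
]


def recomputed_err(sub, dele, ins):
    """Error rate as the sum of substitution, deletion and insertion rates."""
    return round(sub + dele + ins, 1)


def approx_error_count(n_ref, err_percent):
    """Approximate absolute number of errors implied by a percentage."""
    return round(n_ref * err_percent / 100)


if __name__ == "__main__":
    for label, wrd, corr, sub, dele, ins, err in ROWS:
        total = recomputed_err(sub, dele, ins)
        count = approx_error_count(wrd, err)
        print(f"{label}: reported {err}, recomputed {total}, about {count} errors")
```

For the test-set WER row the recomputed sum matches the reported 14.6 exactly; the other three rows differ by 0.1, which is consistent with per-column rounding.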
+ ## ASR config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/tuning/train_asr_whisper_superb.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: false
+ dry_run: false
+ iterator_type: sequence
+ valid_iterator_type: null
+ output_dir: exp/slu_train_asr_whisper_superb_raw_en_word_sp
+ ngpu: 1
+ seed: 2022
+ num_workers: 2
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 49737
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 70
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - cer_ctc
+     - min
+ -   - valid
+     - loss
+     - min
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param:
+ - encoder
+ num_iters_per_epoch: null
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 12000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/slu_stats_raw_en_word_sp/train/speech_shape
+ - exp/slu_stats_raw_en_word_sp/train/text_shape.word
+ valid_shape_file:
+ - exp/slu_stats_raw_en_word_sp/valid/speech_shape
+ - exp/slu_stats_raw_en_word_sp/valid/text_shape.word
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/train_sp/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/devel/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/devel/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.005
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 5000
+ token_list:
+ - <blank>
+ - <unk>
+ - ▁i
+ - ▁and
+ - ''''
+ - s
+ - ▁the
+ - ▁a
+ - ▁it
+ - ▁to
+ - ▁you
+ - ▁that
+ - ▁of
+ - ▁in
+ - ▁was
+ - ▁uh
+ - ▁know
+ - t
+ - ▁so
+ - ▁we
+ - ▁he
+ - ing
+ - m
+ - ▁um
+ - ▁like
+ - ed
+ - ▁is
+ - ▁but
+ - ▁just
+ - ▁they
+ - re
+ - y
+ - ▁this
+ - ▁for
+ - ▁be
+ - ▁my
+ - er
+ - ▁with
+ - ▁on
+ - ▁think
+ - ▁have
+ - ▁p
+ - ▁she
+ - ▁me
+ - e
+ - ▁really
+ - ▁there
+ - ▁what
+ - al
+ - ▁m
+ - ▁do
+ - ▁all
+ - a
+ - ve
+ - ▁as
+ - c
+ - n
+ - ▁about
+ - ▁not
+ - i
+ - ▁at
+ - l
+ - ▁t
+ - ▁had
+ - ▁when
+ - ▁c
+ - g
+ - in
+ - ▁b
+ - d
+ - le
+ - en
+ - ▁out
+ - u
+ - ly
+ - ▁an
+ - or
+ - ▁people
+ - ar
+ - ll
+ - o
+ - ▁are
+ - ▁very
+ - ▁because
+ - es
+ - ▁can
+ - ▁don
+ - ▁s
+ - ▁or
+ - ▁up
+ - it
+ - b
+ - ▁e
+ - ▁one
+ - an
+ - st
+ - ▁if
+ - ▁f
+ - ▁were
+ - p
+ - ▁mean
+ - ▁d
+ - ▁who
+ - ▁then
+ - ic
+ - 'on'
+ - ▁no
+ - ▁go
+ - ▁her
+ - ▁g
+ - ▁st
+ - ▁kind
+ - ri
+ - ▁would
+ - ▁get
+ - at
+ - r
+ - ▁time
+ - v
+ - ent
+ - ▁re
+ - h
+ - ▁from
+ - ▁l
+ - ▁said
+ - ▁w
+ - ▁him
+ - ▁how
+ - ▁well
+ - ▁h
+ - ▁gonna
+ - ▁lot
+ - ▁see
+ - w
+ - ▁his
+ - ce
+ - ion
+ - ▁been
+ - f
+ - ▁great
+ - ▁yeah
+ - ▁love
+ - ▁which
+ - ▁got
+ - k
+ - ▁them
+ - ▁way
+ - ▁n
+ - id
+ - ▁show
+ - ▁some
+ - ▁your
+ - ▁did
+ - ▁sort
+ - et
+ - ▁has
+ - ▁things
+ - ▁back
+ - ▁where
+ - ▁something
+ - ir
+ - ▁thing
+ - ad
+ - ▁su
+ - il
+ - as
+ - ▁j
+ - ▁more
+ - ▁co
+ - se
+ - ▁say
+ - nd
+ - ▁much
+ - ▁come
+ - ▁always
+ - ine
+ - ▁r
+ - ation
+ - ▁other
+ - th
+ - ur
+ - ▁se
+ - ▁now
+ - ate
+ - ▁doing
+ - ▁work
+ - ow
+ - ▁could
+ - ally
+ - ▁these
+ - ▁good
+ - ▁any
+ - ▁cause
+ - ▁ex
+ - ▁ch
+ - ers
+ - ▁little
+ - ▁actually
+ - ▁into
+ - ▁make
+ - ▁first
+ - ▁being
+ - ra
+ - ▁our
+ - ▁al
+ - ▁by
+ - ▁didn
+ - ▁v
+ - ct
+ - ity
+ - ch
+ - un
+ - ▁part
+ - ▁de
+ - is
+ - ▁film
+ - ie
+ - ▁right
+ - ▁pro
+ - ▁off
+ - ol
+ - ▁two
+ - ▁never
+ - ▁o
+ - ▁
+ - ▁le
+ - ot
+ - ut
+ - ▁movie
+ - ▁play
+ - ge
+ - ies
+ - el
+ - ▁going
+ - ke
+ - ▁want
+ - ▁con
+ - ck
+ - ▁feel
+ - ive
+ - ro
+ - ▁mo
+ - im
+ - ▁different
+ - ▁life
+ - ci
+ - am
+ - ▁oh
+ - all
+ - ▁lo
+ - ard
+ - ▁went
+ - and
+ - ist
+ - ▁sh
+ - ▁even
+ - ry
+ - ▁years
+ - ▁look
+ - ▁k
+ - ▁us
+ - ant
+ - ▁te
+ - ▁li
+ - ▁happen
+ - ure
+ - ▁their
+ - ▁those
+ - ▁take
+ - ment
+ - ▁day
+ - ast
+ - ▁every
+ - ill
+ - ▁thought
+ - ou
+ - us
+ - ▁th
+ - ay
+ - ▁put
+ - ▁story
+ - ▁new
+ - ▁down
+ - ish
+ - ▁big
+ - ▁wanna
+ - red
+ - ▁ro
+ - ▁also
+ - ▁read
+ - ▁around
+ - ous
+ - ▁through
+ - ▁came
+ - ▁character
+ - ess
+ - te
+ - ver
+ - ▁will
+ - ag
+ - ss
+ - ▁fun
+ - ▁over
+ - ▁many
+ - ▁bl
+ - ▁cl
+ - ▁man
+ - ▁than
+ - ▁pre
+ - ▁world
+ - ▁person
+ - z
+ - ▁sp
+ - ven
+ - ▁wanted
+ - ▁bit
+ - ▁before
+ - ▁mar
+ - one
+ - ab
+ - ain
+ - ▁en
+ - ▁set
+ - ▁ha
+ - ▁find
+ - ul
+ - ▁end
+ - ▁un
+ - ▁sc
+ - ▁after
+ - een
+ - ▁working
+ - ▁why
+ - ter
+ - me
+ - ▁such
+ - ne
+ - ▁whole
+ - om
+ - ▁kinda
+ - pe
+ - ▁bo
+ - ▁fi
+ - x
+ - ▁most
+ - ▁ad
+ - ▁guy
+ - ▁spe
+ - ars
+ - op
+ - ▁am
+ - ful
+ - pt
+ - ▁together
+ - ▁let
+ - ▁quite
+ - ▁everything
+ - ▁made
+ - ig
+ - ▁old
+ - able
+ - ▁comp
+ - ▁tr
+ - ak
+ - ▁fo
+ - ▁po
+ - ore
+ - ice
+ - ▁real
+ - ▁bas
+ - ▁knew
+ - ▁hard
+ - pp
+ - age
+ - ated
+ - ▁same
+ - ▁start
+ - ▁ever
+ - ning
+ - ▁watch
+ - art
+ - ▁again
+ - ▁here
+ - are
+ - ght
+ - ong
+ - ▁done
+ - ▁only
+ - ▁live
+ - ▁wasn
+ - ▁ho
+ - ▁u
+ - ▁maybe
+ - ▁need
+ - ▁everybody
+ - ust
+ - ▁three
+ - ▁having
+ - ▁music
+ - ack
+ - ld
+ - ▁trying
+ - ▁guys
+ - rou
+ - ach
+ - ving
+ - ▁tell
+ - ▁should
+ - ff
+ - ide
+ - ▁four
+ - ▁started
+ - ass
+ - ▁long
+ - ▁fe
+ - ans
+ - ▁course
+ - ▁called
+ - ▁own
+ - ress
+ - ▁moment
+ - ▁pl
+ - ▁still
+ - ▁anything
+ - ▁family
+ - ▁fin
+ - ▁dan
+ - ▁bro
+ - 'no'
+ - ▁com
+ - ther
+ - ▁amazing
+ - ▁stuff
+ - os
+ - ▁per
+ - ▁jo
+ - ▁certain
+ - ▁talk
+ - ater
+ - per
+ - ▁help
+ - ▁too
+ - ▁year
+ - ight
+ - ▁fa
+ - self
+ - ces
+ - ▁br
+ - ▁bet
+ - ▁someone
+ - ▁di
+ - ▁sing
+ - nt
+ - ick
+ - ▁ph
+ - row
+ - ▁script
+ - ▁remember
+ - ▁try
+ - qu
+ - ite
+ - ▁young
+ - ▁wh
+ - ▁ser
+ - ▁ask
+ - um
+ - ▁book
+ - ▁each
+ - ▁wr
+ - ▁best
+ - ▁ag
+ - ▁women
+ - ose
+ - ions
+ - ved
+ - j
+ - ue
+ - ▁does
+ - ty
+ - ▁five
+ - ▁both
+ - ▁friends
+ - ▁act
+ - iz
+ - ind
+ - cess
+ - ▁somebody
+ - ft
+ - ▁nice
+ - ▁tur
+ - ▁myself
+ - mb
+ - fe
+ - ict
+ - ▁child
+ - ud
+ - ▁hope
+ - ▁fact
+ - ▁saying
+ - les
+ - ave
+ - icul
+ - au
+ - ris
+ - ▁twenty
+ - ▁school
+ - ▁doesn
+ - ▁able
+ - pect
+ - ▁last
+ - ▁song
+ - od
+ - ▁str
+ - ▁interesting
+ - lf
+ - ▁wor
+ - sp
+ - ap
+ - og
+ - ▁ra
+ - ▁dis
+ - ▁coming
+ - ▁ab
+ - ▁house
+ - ▁next
+ - ▁tra
+ - ▁okay
+ - ere
+ - ib
+ - ary
+ - ▁incredib
+ - ▁car
+ - ▁job
+ - ▁used
+ - ▁give
+ - ▁god
+ - ▁americ
+ - ▁characters
+ - ▁app
+ - ▁walk
+ - ▁yes
+ - rew
+ - ▁getting
+ - ▁six
+ - ▁chan
+ - ▁ne
+ - ale
+ - ▁pretty
+ - mp
+ - ang
+ - ▁creat
+ - ▁another
+ - ▁ter
+ - ▁kids
+ - ▁felt
+ - ▁sometimes
+ - ▁place
+ - ▁int
+ - ically
+ - out
+ - ▁funny
+ - ase
+ - ich
+ - act
+ - ▁days
+ - ▁bring
+ - ▁making
+ - ▁become
+ - ute
+ - ▁wonderful
+ - ron
+ - ▁saw
+ - ▁point
+ - ia
+ - ▁realiz
+ - ▁away
+ - ays
+ - ▁home
+ - ace
+ - ▁relationship
+ - day
+ - ▁woman
+ - ▁everyone
+ - ▁comes
+ - ▁high
+ - ▁wee
+ - dd
+ - ▁night
+ - ath
+ - ts
+ - ▁else
+ - vent
+ - ▁shoot
+ - vers
+ - ▁sure
+ - ried
+ - ned
+ - ▁obviously
+ - ▁dra
+ - co
+ - iew
+ - man
+ - ▁playing
+ - ▁important
+ - ort
+ - uck
+ - ision
+ - pport
+ - ▁nor
+ - ▁seen
+ - ▁fl
+ - est
+ - ▁inter
+ - ks
+ - ▁actor
+ - ▁lear
+ - ▁worked
+ - ▁believe
+ - ▁gen
+ - ▁keep
+ - ull
+ - ▁friend
+ - ▁sw
+ - ▁des
+ - ▁times
+ - ▁sur
+ - ms
+ - ▁sit
+ - ▁probably
+ - ok
+ - ▁took
+ - ep
+ - ough
+ - ip
+ - ood
+ - ▁sa
+ - ▁season
+ - vel
+ - wn
+ - ▁dec
+ - ▁excited
+ - ame
+ - ian
+ - ire
+ - ▁name
+ - ▁im
+ - ▁month
+ - ner
+ - ▁min
+ - ▁rel
+ - ating
+ - body
+ - ition
+ - ▁loved
+ - ▁aw
+ - ▁hear
+ - ph
+ - ▁cool
+ - ▁list
+ - ord
+ - pl
+ - ble
+ - our
+ - ▁game
+ - ub
+ - ▁might
+ - ▁kid
+ - ▁movies
+ - ical
+ - ▁bad
+ - ▁scene
+ - iv
+ - ▁enough
+ - ▁sm
+ - ▁fift
+ - ▁eight
+ - ▁experience
+ - ▁actors
+ - ▁understand
+ - ▁few
+ - gin
+ - ting
+ - ▁director
+ - ▁almost
+ - ▁open
+ - ren
+ - ▁star
+ - ▁room
+ - ▁call
+ - oy
+ - ▁goes
+ - ▁told
+ - ▁once
+ - ▁found
+ - arly
+ - ations
+ - ward
+ - ▁audience
+ - ird
+ - ▁qu
+ - ▁ar
+ - ▁definitely
+ - ious
+ - iting
+ - ▁pol
+ - ▁huge
+ - ▁makes
+ - aking
+ - ▁la
+ - ▁ac
+ - iter
+ - ▁run
+ - ▁gotta
+ - ▁gr
+ - ▁cam
+ - sh
+ - ▁gets
+ - ▁wa
+ - ully
+ - ▁says
+ - ▁cont
+ - side
+ - ▁bus
+ - ▁shows
+ - ▁dr
+ - ▁inv
+ - ▁idea
+ - ▁talking
+ - way
+ - ▁art
+ - ▁whatever
+ - ▁write
+ - ash
+ - itt
+ - ▁met
+ - ▁wants
+ - ▁role
+ - if
+ - ▁mu
+ - ▁boy
+ - ▁wrote
+ - ger
+ - ately
+ - ▁exc
+ - ▁gu
+ - ▁mother
+ - ▁produ
+ - ▁cra
+ - ates
+ - ▁though
+ - av
+ - ▁episode
+ - ▁sl
+ - ▁change
+ - be
+ - ▁voice
+ - ▁played
+ - ily
+ - ▁guess
+ - ves
+ - ▁hand
+ - ady
+ - ▁happy
+ - ith
+ - ny
+ - ▁gi
+ - med
+ - ▁looking
+ - lev
+ - ream
+ - ▁acting
+ - aught
+ - iss
+ - ount
+ - rom
+ - ▁tw
+ - ▁john
+ - ▁far
+ - ▁res
+ - ▁sense
+ - ake
+ - ▁meet
+ - ▁bre
+ - ens
+ - ety
+ - ▁girl
+ - ▁york
+ - ▁count
+ - ▁shot
+ - ise
+ - ject
+ - ▁tot
+ - ▁stud
+ - ▁feels
+ - ▁thinking
+ - ma
+ - ▁head
+ - ▁cast
+ - ▁writing
+ - ▁imp
+ - ▁rehe
+ - ▁written
+ - ▁perfor
+ - ▁fan
+ - der
+ - ect
+ - ▁sk
+ - ▁hour
+ - ▁father
+ - ered
+ - ▁hundred
+ - ▁ind
+ - ▁che
+ - ▁acc
+ - up
+ - ▁while
+ - fort
+ - ▁true
+ - itch
+ - ▁inst
+ - ▁second
+ - ▁pick
+ - ▁record
+ - ross
+ - ▁quest
+ - ged
+ - ▁career
+ - ▁reason
+ - ▁since
+ - ▁bu
+ - ▁bra
+ - ▁char
+ - ree
+ - ▁girls
+ - ▁dad
+ - ▁fant
+ - ▁extra
+ - ▁laugh
+ - ▁stand
+ - ▁honest
+ - na
+ - als
+ - ▁yet
+ - ▁human
+ - ▁couple
+ - dy
+ - ▁mind
+ - ▁sound
+ - ▁ke
+ - ▁pop
+ - ▁ent
+ - ory
+ - ▁war
+ - ▁ten
+ - ink
+ - ▁bec
+ - ▁direct
+ - reat
+ - round
+ - ien
+ - ▁under
+ - ile
+ - ▁diff
+ - ually
+ - thing
+ - sic
+ - ▁gon
+ - ather
+ - ▁aud
+ - ert
+ - for
+ - ▁scen
+ - mber
+ - atch
+ - ▁sho
+ - ever
+ - tra
+ - ▁pe
+ - ▁hu
+ - ild
+ - int
+ - ▁ob
+ - ▁care
+ - ▁fam
+ - ▁ide
+ - ade
+ - right
+ - ▁may
+ - he
+ - mo
+ - ody
+ - ense
+ - ▁interest
+ - ah
+ - ork
+ - ▁episod
+ - ▁prob
+ - ▁rec
+ - ▁hop
+ - ited
+ - ▁exper
+ - gh
+ - ▁bel
+ - ▁el
+ - ▁stu
+ - enty
+ - ound
+ - ▁gott
+ - ▁id
+ - ime
+ - rie
+ - ▁inc
+ - ertain
+ - ▁wo
+ - ▁mon
+ - az
+ - xt
+ - riend
+ - now
+ - ▁y
+ - ple
+ - ome
+ - so
+ - ause
+ - ▁cou
+ - iously
+ - ▁sch
+ - ▁vo
+ - ▁fil
+ - ▁op
+ - ason
+ - ▁mov
+ - ▁hi
+ - ▁pers
+ - ▁ye
+ - ▁def
+ - ▁belie
+ - fore
+ - ix
+ - very
+ - ▁differe
+ - ▁wonder
+ - nder
+ - ▁obv
+ - ▁ep
+ - ship
+ - ▁lau
+ - ience
+ - ool
+ - ▁sin
+ - rect
+ - ▁happ
+ - ▁gir
+ - ▁hel
+ - du
+ - ng
+ - ▁underst
+ - most
+ - eric
+ - ouse
+ - time
+ - ▁cour
+ - ▁relation
+ - rough
+ - q
+ - ▁defin
+ - ▁reme
+ - redib
+ - ▁fir
+ - anna
+ - ways
+ - itten
+ - elt
+ - ▁sometime
+ - ':'
+ - alk
+ - ▁ok
+ - ably
+ - rote
+ - gether
+ - ▁definite
+ - ▁import
+ - '&'
+ - new
+ - fter
+ - onest
+ - erest
+ - ▁amaz
+ - ▁ano
+ - <sos/eos>
+ transcript_token_list: null
+ two_pass: false
+ pre_postencoder_norm: false
+ init: null
+ input_size: 1
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: null
+     zero_infinity: true
+     brctc_risk_strategy: exp
+     brctc_group_strategy: end
+     brctc_risk_factor: 0.0
+ joint_net_conf: null
+ use_preprocessor: true
+ token_type: word
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ short_noise_thres: 0.5
+ frontend: null
+ frontend_conf: {}
+ specaug: null
+ specaug_conf: {}
+ normalize: null
+ normalize_conf: {}
+ model: espnet
+ model_conf:
+     ctc_weight: 1.0
+     lsm_weight: 0.1
+     length_normalized_loss: false
+     weighted_sum: true
+     extract_feats_in_collect_stats: false
+ preencoder: null
+ preencoder_conf: {}
+ encoder: whisper
+ encoder_conf:
+     whisper_model: medium
+     dropout_rate: 0.0
+     use_specaug: true
+     specaug_conf:
+         apply_time_warp: true
+         time_warp_window: 5
+         time_warp_mode: bicubic
+         apply_freq_mask: true
+         freq_mask_width_range:
+         - 0
+         - 40
+         num_freq_mask: 2
+         apply_time_mask: true
+         time_mask_width_ratio_range:
+         - 0.0
+         - 0.12
+         num_time_mask: 5
+ prepostencoder: linear
+ prepostencoder_conf:
+     input_size: 1024
+     output_size: 80
+ postencoder: conformer_full
+ postencoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 2
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d1
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ deliberationencoder: null
+ deliberationencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ postdecoder: null
+ postdecoder_conf: {}
+ required:
+ - output_dir
+ - token_list
+ version: '202310'
+ distributed: true
+ ```
+
+ </details>
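The config pairs `scheduler: warmuplr` with `lr: 0.005` and `warmup_steps: 5000`. The sketch below shows what such a schedule looks like, assuming `warmuplr` follows the common Noam-style rule: linear warm-up to the base rate, then inverse-square-root decay. That formula is an assumption, as the config does not state it, and the function name `warmup_lr` is hypothetical.

```python
def warmup_lr(step, base_lr=0.005, warmup_steps=5000):
    """Learning rate at a given optimizer step (1-based).

    Assumed rule: base_lr * warmup_steps**0.5
                  * min(step**-0.5, step * warmup_steps**-1.5)
    The rate rises linearly until `warmup_steps`, peaks at `base_lr`,
    then decays proportionally to 1/sqrt(step).
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    for step in (1, 2500, 5000, 20000):
        print(step, warmup_lr(step))
```

Under this assumption the peak rate of 0.005 is reached at step 5000, and the rate is half of that both at step 2500 (during warm-up) and at step 20000 (during decay).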
+
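The checkpoint name `valid.loss.ave_10best.pth`, together with `keep_nbest_models: 10`, suggests that the uploaded weights are an element-wise average of the ten checkpoints with the best validation loss. The sketch below illustrates that idea only and is not the toolkit's code: plain Python lists stand in for tensors, and `average_checkpoints` is a hypothetical helper.

```python
def average_checkpoints(state_dicts):
    """Element-wise mean of several checkpoints.

    Each checkpoint is a dict mapping a parameter name to a flat list of
    floats. All checkpoints are assumed to share the same keys and shapes.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    n = len(state_dicts)
    averaged = {}
    for key in state_dicts[0]:
        # zip(*...) walks the same position of this parameter in every checkpoint
        columns = zip(*(sd[key] for sd in state_dicts))
        averaged[key] = [sum(col) / n for col in columns]
    return averaged


if __name__ == "__main__":
    ckpt_a = {"w": [1.0, 2.0], "b": [0.0]}
    ckpt_b = {"w": [3.0, 4.0], "b": [1.0]}
    print(average_checkpoints([ckpt_a, ckpt_b]))
```

Averaging several good checkpoints is a common way to smooth out step-to-step noise in the weights; with real models the same loop would run over tensors instead of lists.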
+
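Most entries in `token_list` start with `▁`, which in the SentencePiece convention marks the start of a word. The sketch below shows how a sequence of such pieces could be joined back into text, assuming that convention applies here. The config itself says `token_type: word` and `bpemodel: null`, so how the recipe actually maps pieces to text is not shown, and `pieces_to_text` is a hypothetical helper.

```python
WORD_START = "\u2581"  # the "▁" marker used in token_list


def pieces_to_text(pieces):
    """Join subword pieces and turn word-start markers into spaces."""
    return "".join(pieces).replace(WORD_START, " ").strip()


if __name__ == "__main__":
    # All five pieces appear in the token list above
    # (the YAML entry '''' is a single apostrophe).
    example = ["\u2581i", "\u2581don", "'", "t", "\u2581know"]
    print(pieces_to_text(example))
```

Pieces without the marker, such as `t` or `ing`, attach to the preceding piece, which is why the apostrophe and `t` end up inside the same word.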
+
+ ### Citing ESPnet
+
+ ```bibtex
+ @inproceedings{watanabe2018espnet,
+ author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+ title={{ESPnet}: End-to-End Speech Processing Toolkit},
+ year={2018},
+ booktitle={Proceedings of Interspeech},
+ pages={2207--2211},
+ doi={10.21437/Interspeech.2018-1456},
+ url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+ title={ESPnet: End-to-End Speech Processing Toolkit},
+ author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+ year={2018},
+ eprint={1804.00015},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
exp/slu_train_asr_whisper_superb_raw_en_word_sp/RESULTS.md ADDED
@@ -0,0 +1,44 @@
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Mon Feb 5 19:05:58 CST 2024`
+ - python version: `3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]`
+ - espnet version: `espnet 202310`
+ - pytorch version: `pytorch 2.1.0+cu121`
+ - Git hash: `21d2105784e4da98397bf487b2550d4c6e16d40d`
+ - Commit date: `Wed Jan 31 13:40:37 2024 -0600`
+
+ ## exp/slu_train_asr_whisper_superb_raw_en_word_sp
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc_slu_model_valid.cer_ctc.ave/test|3426|135368|88.9|6.9|4.2|3.5|14.6|91.9|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc_slu_model_valid.cer_ctc.ave/test|3426|591261|95.0|1.6|3.5|3.3|8.3|91.9|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ ## exp/slu_train_asr_whisper_superb_raw_en_word_sp/decode_asr_ctc_slu_model_valid.cer_ctc.ave
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1437|56031|90.5|6.0|3.6|3.0|12.5|89.8|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1437|241556|95.9|1.2|2.9|2.9|6.9|89.8|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
exp/slu_train_asr_whisper_superb_raw_en_word_sp/config.yaml ADDED
@@ -0,0 +1,1219 @@
+ config: conf/tuning/train_asr_whisper_superb.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: false
+ dry_run: false
+ iterator_type: sequence
+ valid_iterator_type: null
+ output_dir: exp/slu_train_asr_whisper_superb_raw_en_word_sp
+ ngpu: 1
+ seed: 2022
+ num_workers: 2
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 49737
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 70
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - cer_ctc
+     - min
+ -   - valid
+     - loss
+     - min
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param:
+ - encoder
+ num_iters_per_epoch: null
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 12000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/slu_stats_raw_en_word_sp/train/speech_shape
+ - exp/slu_stats_raw_en_word_sp/train/text_shape.word
+ valid_shape_file:
+ - exp/slu_stats_raw_en_word_sp/valid/speech_shape
+ - exp/slu_stats_raw_en_word_sp/valid/text_shape.word
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/train_sp/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/devel/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/devel/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.005
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 5000
+ token_list:
+ - <blank>
+ - <unk>
+ - ▁i
+ - ▁and
+ - ''''
+ - s
+ - ▁the
+ - ▁a
+ - ▁it
+ - ▁to
+ - ▁you
+ - ▁that
+ - ▁of
+ - ▁in
+ - ▁was
+ - ▁uh
+ - ▁know
+ - t
+ - ▁so
+ - ▁we
+ - ▁he
+ - ing
+ - m
+ - ▁um
+ - ▁like
+ - ed
+ - ▁is
+ - ▁but
+ - ▁just
+ - ▁they
+ - re
+ - y
+ - ▁this
+ - ▁for
+ - ▁be
+ - ▁my
+ - er
+ - ▁with
+ - ▁on
+ - ▁think
+ - ▁have
+ - ▁p
+ - ▁she
+ - ▁me
+ - e
+ - ▁really
+ - ▁there
+ - ▁what
+ - al
+ - ▁m
+ - ▁do
+ - ▁all
+ - a
+ - ve
+ - ▁as
+ - c
+ - n
+ - ▁about
+ - ▁not
+ - i
+ - ▁at
+ - l
+ - ▁t
+ - ▁had
+ - ▁when
+ - ▁c
+ - g
+ - in
+ - ▁b
+ - d
+ - le
+ - en
+ - ▁out
+ - u
+ - ly
+ - ▁an
+ - or
+ - ▁people
+ - ar
+ - ll
+ - o
+ - ▁are
+ - ▁very
+ - ▁because
+ - es
+ - ▁can
+ - ▁don
+ - ▁s
+ - ▁or
+ - ▁up
+ - it
+ - b
+ - ▁e
+ - ▁one
+ - an
+ - st
+ - ▁if
+ - ▁f
+ - ▁were
+ - p
+ - ▁mean
+ - ▁d
+ - ▁who
+ - ▁then
+ - ic
+ - 'on'
+ - ▁no
+ - ▁go
+ - ▁her
+ - ▁g
+ - ▁st
+ - ▁kind
+ - ri
+ - ▁would
+ - ▁get
+ - at
+ - r
+ - ▁time
+ - v
+ - ent
+ - ▁re
+ - h
+ - ▁from
+ - ▁l
+ - ▁said
+ - ▁w
+ - ▁him
+ - ▁how
+ - ▁well
+ - ▁h
+ - ▁gonna
+ - ▁lot
+ - ▁see
+ - w
+ - ▁his
+ - ce
+ - ion
+ - ▁been
+ - f
+ - ▁great
+ - ▁yeah
+ - ▁love
+ - ▁which
+ - ▁got
+ - k
+ - ▁them
+ - ▁way
+ - ▁n
+ - id
+ - ▁show
+ - ▁some
+ - ▁your
+ - ▁did
+ - ▁sort
+ - et
+ - ▁has
+ - ▁things
+ - ▁back
+ - ▁where
+ - ▁something
+ - ir
+ - ▁thing
+ - ad
+ - ▁su
+ - il
+ - as
+ - ▁j
+ - ▁more
+ - ▁co
+ - se
+ - ▁say
+ - nd
+ - ▁much
+ - ▁come
+ - ▁always
+ - ine
+ - ▁r
+ - ation
+ - ▁other
+ - th
+ - ur
+ - ▁se
+ - ▁now
+ - ate
+ - ▁doing
+ - ▁work
+ - ow
+ - ▁could
+ - ally
+ - ▁these
+ - ▁good
+ - ▁any
+ - ▁cause
+ - ▁ex
+ - ▁ch
+ - ers
+ - ▁little
+ - ▁actually
+ - ▁into
+ - ▁make
+ - ▁first
+ - ▁being
+ - ra
+ - ▁our
+ - ▁al
+ - ▁by
+ - ▁didn
+ - ▁v
+ - ct
+ - ity
+ - ch
+ - un
+ - ▁part
+ - ▁de
+ - is
+ - ▁film
+ - ie
+ - ▁right
+ - ▁pro
+ - ▁off
+ - ol
+ - ▁two
+ - ▁never
+ - ▁o
+ - ▁
+ - ▁le
+ - ot
+ - ut
+ - ▁movie
+ - ▁play
+ - ge
+ - ies
+ - el
+ - ▁going
+ - ke
+ - ▁want
+ - ▁con
+ - ck
+ - ▁feel
+ - ive
+ - ro
+ - ▁mo
+ - im
+ - ▁different
+ - ▁life
+ - ci
+ - am
+ - ▁oh
+ - all
+ - ▁lo
+ - ard
+ - ▁went
+ - and
+ - ist
+ - ▁sh
+ - ▁even
+ - ry
+ - ▁years
+ - ▁look
+ - ▁k
+ - ▁us
+ - ant
+ - ▁te
+ - ▁li
+ - ▁happen
+ - ure
+ - ▁their
+ - ▁those
+ - ▁take
+ - ment
+ - ▁day
+ - ast
+ - ▁every
+ - ill
+ - ▁thought
+ - ou
+ - us
+ - ▁th
+ - ay
+ - ▁put
+ - ▁story
+ - ▁new
+ - ▁down
+ - ish
+ - ▁big
+ - ▁wanna
+ - red
+ - ▁ro
+ - ▁also
+ - ▁read
+ - ▁around
+ - ous
+ - ▁through
+ - ▁came
+ - ▁character
+ - ess
+ - te
+ - ver
+ - ▁will
+ - ag
+ - ss
+ - ▁fun
+ - ▁over
+ - ▁many
+ - ▁bl
+ - ▁cl
+ - ▁man
+ - ▁than
+ - ▁pre
+ - ▁world
+ - ▁person
+ - z
+ - ▁sp
+ - ven
+ - ▁wanted
+ - ▁bit
+ - ▁before
+ - ▁mar
+ - one
+ - ab
+ - ain
+ - ▁en
+ - ▁set
+ - ▁ha
+ - ▁find
+ - ul
+ - ▁end
+ - ▁un
+ - ▁sc
+ - ▁after
+ - een
+ - ▁working
+ - ▁why
+ - ter
+ - me
+ - ▁such
+ - ne
+ - ▁whole
+ - om
+ - ▁kinda
+ - pe
+ - ▁bo
+ - ▁fi
+ - x
+ - ▁most
+ - ▁ad
+ - ▁guy
+ - ▁spe
+ - ars
+ - op
+ - ▁am
+ - ful
+ - pt
+ - ▁together
+ - ▁let
+ - ▁quite
+ - ▁everything
+ - ▁made
+ - ig
+ - ▁old
+ - able
+ - ▁comp
+ - ▁tr
+ - ak
+ - ▁fo
+ - ▁po
+ - ore
+ - ice
+ - ▁real
+ - ▁bas
+ - ▁knew
+ - ▁hard
+ - pp
+ - age
+ - ated
+ - ▁same
+ - ▁start
+ - ▁ever
+ - ning
+ - ▁watch
+ - art
+ - ▁again
+ - ▁here
+ - are
+ - ght
+ - ong
+ - ▁done
+ - ▁only
+ - ▁live
+ - ▁wasn
+ - ▁ho
+ - ▁u
+ - ▁maybe
+ - ▁need
+ - ▁everybody
+ - ust
+ - ▁three
+ - ▁having
+ - ▁music
+ - ack
+ - ld
+ - ▁trying
+ - ▁guys
+ - rou
+ - ach
+ - ving
+ - ▁tell
+ - ▁should
+ - ff
+ - ide
+ - ▁four
+ - ▁started
+ - ass
+ - ▁long
+ - ▁fe
+ - ans
+ - ▁course
+ - ▁called
+ - ▁own
+ - ress
+ - ▁moment
+ - ▁pl
+ - ▁still
+ - ▁anything
+ - ▁family
+ - ▁fin
+ - ▁dan
+ - ▁bro
+ - 'no'
+ - ▁com
+ - ther
+ - ▁amazing
+ - ▁stuff
+ - os
+ - ▁per
+ - ▁jo
+ - ▁certain
+ - ▁talk
+ - ater
+ - per
+ - ▁help
+ - ▁too
+ - ▁year
+ - ight
+ - ▁fa
+ - self
+ - ces
+ - ▁br
+ - ▁bet
+ - ▁someone
+ - ▁di
+ - ▁sing
+ - nt
+ - ick
+ - ▁ph
+ - row
+ - ▁script
+ - ▁remember
+ - ▁try
+ - qu
+ - ite
+ - ▁young
+ - ▁wh
+ - ▁ser
+ - ▁ask
+ - um
+ - ▁book
+ - ▁each
+ - ▁wr
+ - ▁best
+ - ▁ag
+ - ▁women
+ - ose
+ - ions
+ - ved
+ - j
+ - ue
+ - ▁does
+ - ty
+ - ▁five
+ - ▁both
+ - ▁friends
+ - ▁act
+ - iz
+ - ind
+ - cess
+ - ▁somebody
+ - ft
+ - ▁nice
+ - ▁tur
+ - ▁myself
+ - mb
+ - fe
+ - ict
+ - ▁child
+ - ud
+ - ▁hope
+ - ▁fact
+ - ▁saying
+ - les
+ - ave
+ - icul
+ - au
+ - ris
+ - ▁twenty
+ - ▁school
+ - ▁doesn
+ - ▁able
+ - pect
+ - ▁last
+ - ▁song
+ - od
+ - ▁str
+ - ▁interesting
+ - lf
+ - ▁wor
+ - sp
+ - ap
+ - og
+ - ▁ra
+ - ▁dis
+ - ▁coming
+ - ▁ab
+ - ▁house
+ - ▁next
+ - ▁tra
+ - ▁okay
+ - ere
+ - ib
+ - ary
+ - ▁incredib
+ - ▁car
+ - ▁job
+ - ▁used
+ - ▁give
+ - ▁god
+ - ▁americ
+ - ▁characters
+ - ▁app
+ - ▁walk
+ - ▁yes
+ - rew
+ - ▁getting
+ - ▁six
+ - ▁chan
+ - ▁ne
+ - ale
+ - ▁pretty
+ - mp
+ - ang
+ - ▁creat
+ - ▁another
+ - ▁ter
+ - ▁kids
+ - ▁felt
+ - ▁sometimes
+ - ▁place
+ - ▁int
+ - ically
+ - out
+ - ▁funny
+ - ase
+ - ich
+ - act
+ - ▁days
+ - ▁bring
+ - ▁making
+ - ▁become
+ - ute
+ - ▁wonderful
+ - ron
+ - ▁saw
+ - ▁point
+ - ia
+ - ▁realiz
+ - ▁away
+ - ays
+ - ▁home
+ - ace
+ - ▁relationship
+ - day
+ - ▁woman
+ - ▁everyone
+ - ▁comes
+ - ▁high
+ - ▁wee
+ - dd
+ - ▁night
+ - ath
+ - ts
+ - ▁else
+ - vent
+ - ▁shoot
+ - vers
+ - ▁sure
+ - ried
+ - ned
+ - ▁obviously
+ - ▁dra
+ - co
+ - iew
+ - man
+ - ▁playing
+ - ▁important
+ - ort
+ - uck
+ - ision
+ - pport
+ - ▁nor
+ - ▁seen
+ - ▁fl
+ - est
+ - ▁inter
+ - ks
+ - ▁actor
+ - ▁lear
+ - ▁worked
+ - ▁believe
+ - ▁gen
+ - ▁keep
+ - ull
+ - ▁friend
+ - ▁sw
+ - ▁des
+ - ▁times
+ - ▁sur
+ - ms
+ - ▁sit
+ - ▁probably
+ - ok
+ - ▁took
+ - ep
+ - ough
+ - ip
+ - ood
+ - ▁sa
+ - ▁season
+ - vel
+ - wn
+ - ▁dec
+ - ▁excited
+ - ame
+ - ian
+ - ire
+ - ▁name
+ - ▁im
+ - ▁month
+ - ner
+ - ▁min
+ - ▁rel
+ - ating
+ - body
+ - ition
+ - ▁loved
+ - ▁aw
+ - ▁hear
+ - ph
+ - ▁cool
+ - ▁list
+ - ord
+ - pl
+ - ble
+ - our
+ - ▁game
+ - ub
+ - ▁might
+ - ▁kid
+ - ▁movies
+ - ical
+ - ▁bad
+ - ▁scene
+ - iv
+ - ▁enough
+ - ▁sm
+ - ▁fift
+ - ▁eight
+ - ▁experience
+ - ▁actors
+ - ▁understand
+ - ▁few
+ - gin
+ - ting
+ - ▁director
+ - ▁almost
+ - ▁open
+ - ren
+ - ▁star
+ - ▁room
+ - ▁call
+ - oy
+ - ▁goes
+ - ▁told
+ - ▁once
+ - ▁found
+ - arly
+ - ations
+ - ward
+ - ▁audience
+ - ird
+ - ▁qu
+ - ▁ar
+ - ▁definitely
+ - ious
+ - iting
+ - ▁pol
+ - ▁huge
+ - ▁makes
+ - aking
+ - ▁la
+ - ▁ac
+ - iter
+ - ▁run
+ - ▁gotta
+ - ▁gr
+ - ▁cam
+ - sh
+ - ▁gets
+ - ▁wa
+ - ully
+ - ▁says
+ - ▁cont
+ - side
+ - ▁bus
+ - ▁shows
+ - ▁dr
+ - ▁inv
+ - ▁idea
+ - ▁talking
+ - way
+ - ▁art
+ - ▁whatever
+ - ▁write
+ - ash
+ - itt
+ - ▁met
+ - ▁wants
+ - ▁role
+ - if
+ - ▁mu
+ - ▁boy
+ - ▁wrote
+ - ger
+ - ately
+ - ▁exc
+ - ▁gu
+ - ▁mother
+ - ▁produ
+ - ▁cra
+ - ates
+ - ▁though
+ - av
+ - ▁episode
+ - ▁sl
+ - ▁change
+ - be
+ - ▁voice
+ - ▁played
+ - ily
+ - ▁guess
+ - ves
+ - ▁hand
+ - ady
+ - ▁happy
+ - ith
+ - ny
+ - ▁gi
+ - med
+ - ▁looking
+ - lev
+ - ream
+ - ▁acting
+ - aught
+ - iss
+ - ount
+ - rom
+ - ▁tw
+ - ▁john
+ - ▁far
+ - ▁res
+ - ▁sense
+ - ake
+ - ▁meet
+ - ▁bre
+ - ens
+ - ety
+ - ▁girl
+ - ▁york
+ - ▁count
+ - ▁shot
+ - ise
+ - ject
+ - ▁tot
+ - ▁stud
+ - ▁feels
+ - ▁thinking
+ - ma
+ - ▁head
+ - ▁cast
+ - ▁writing
+ - ▁imp
+ - ▁rehe
+ - ▁written
+ - ▁perfor
+ - ▁fan
+ - der
+ - ect
+ - ▁sk
+ - ▁hour
+ - ▁father
+ - ered
+ - ▁hundred
+ - ▁ind
+ - ▁che
+ - ▁acc
+ - up
+ - ▁while
+ - fort
+ - ▁true
+ - itch
+ - ▁inst
+ - ▁second
+ - ▁pick
+ - ▁record
+ - ross
+ - ▁quest
+ - ged
+ - ▁career
+ - ▁reason
+ - ▁since
+ - ▁bu
+ - ▁bra
+ - ▁char
+ - ree
+ - ▁girls
+ - ▁dad
+ - ▁fant
+ - ▁extra
+ - ▁laugh
+ - ▁stand
+ - ▁honest
+ - na
+ - als
+ - ▁yet
+ - ▁human
+ - ▁couple
+ - dy
+ - ▁mind
+ - ▁sound
+ - ▁ke
+ - ▁pop
+ - ▁ent
+ - ory
+ - ▁war
+ - ▁ten
+ - ink
+ - ▁bec
+ - ▁direct
+ - reat
+ - round
+ - ien
+ - ▁under
+ - ile
+ - ▁diff
+ - ually
+ - thing
+ - sic
+ - ▁gon
+ - ather
+ - ▁aud
+ - ert
+ - for
+ - ▁scen
+ - mber
+ - atch
+ - ▁sho
+ - ever
+ - tra
+ - ▁pe
+ - ▁hu
+ - ild
+ - int
+ - ▁ob
+ - ▁care
+ - ▁fam
+ - ▁ide
+ - ade
+ - right
+ - ▁may
+ - he
+ - mo
+ - ody
+ - ense
+ - ▁interest
+ - ah
+ - ork
+ - ▁episod
+ - ▁prob
+ - ▁rec
+ - ▁hop
+ - ited
+ - ▁exper
+ - gh
+ - ▁bel
+ - ▁el
+ - ▁stu
+ - enty
+ - ound
+ - ▁gott
+ - ▁id
+ - ime
+ - rie
+ - ▁inc
+ - ertain
+ - ▁wo
+ - ▁mon
+ - az
+ - xt
+ - riend
+ - now
+ - ▁y
+ - ple
+ - ome
+ - so
+ - ause
+ - ▁cou
+ - iously
+ - ▁sch
+ - ▁vo
+ - ▁fil
+ - ▁op
+ - ason
+ - ▁mov
+ - ▁hi
+ - ▁pers
+ - ▁ye
+ - ▁def
+ - ▁belie
+ - fore
+ - ix
+ - very
+ - ▁differe
+ - ▁wonder
+ - nder
+ - ▁obv
+ - ▁ep
+ - ship
+ - ▁lau
+ - ience
+ - ool
+ - ▁sin
+ - rect
+ - ▁happ
+ - ▁gir
+ - ▁hel
+ - du
+ - ng
+ - ▁underst
+ - most
+ - eric
+ - ouse
+ - time
+ - ▁cour
+ - ▁relation
+ - rough
+ - q
+ - ▁defin
+ - ▁reme
+ - redib
+ - ▁fir
+ - anna
+ - ways
+ - itten
+ - elt
+ - ▁sometime
+ - ':'
+ - alk
+ - ▁ok
+ - ably
+ - rote
+ - gether
+ - ▁definite
+ - ▁import
+ - '&'
+ - new
+ - fter
+ - onest
+ - erest
+ - ▁amaz
+ - ▁ano
+ - <sos/eos>
+ transcript_token_list: null
+ two_pass: false
+ pre_postencoder_norm: false
+ init: null
+ input_size: 1
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: null
+     zero_infinity: true
+     brctc_risk_strategy: exp
+     brctc_group_strategy: end
+     brctc_risk_factor: 0.0
+ joint_net_conf: null
+ use_preprocessor: true
+ token_type: word
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ short_noise_thres: 0.5
+ frontend: null
+ frontend_conf: {}
+ specaug: null
+ specaug_conf: {}
+ normalize: null
+ normalize_conf: {}
+ model: espnet
+ model_conf:
+     ctc_weight: 1.0
+     lsm_weight: 0.1
+     length_normalized_loss: false
+     weighted_sum: true
+     extract_feats_in_collect_stats: false
+ preencoder: null
+ preencoder_conf: {}
+ encoder: whisper
+ encoder_conf:
+     whisper_model: medium
+     dropout_rate: 0.0
+     use_specaug: true
+     specaug_conf:
+         apply_time_warp: true
+         time_warp_window: 5
+         time_warp_mode: bicubic
+         apply_freq_mask: true
+         freq_mask_width_range:
+         - 0
+         - 40
+         num_freq_mask: 2
+         apply_time_mask: true
+         time_mask_width_ratio_range:
+         - 0.0
+         - 0.12
+         num_time_mask: 5
+ prepostencoder: linear
+ prepostencoder_conf:
+     input_size: 1024
+     output_size: 80
+ postencoder: conformer_full
+ postencoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 2
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d1
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ deliberationencoder: null
+ deliberationencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ postdecoder: null
+ postdecoder_conf: {}
+ required:
+ - output_dir
+ - token_list
+ version: '202310'
+ distributed: true
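The config above pairs `optim: adam` (`lr: 0.005`) with `scheduler: warmuplr` (`warmup_steps: 5000`). As a rough illustration only, the sketch below computes the Noam-style warmup curve that ESPnet's `WarmupLR` scheduler is understood to follow; the function name `warmup_lr` is hypothetical and this is not the ESPnet implementation itself.

```python
def warmup_lr(step: int, base_lr: float = 0.005, warmup_steps: int = 5000) -> float:
    """Learning rate at a 1-based optimizer step under a Noam-style warmup.

    Assumed formula (not copied from the ESPnet source):
        lr = base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)
    The rate rises linearly up to `warmup_steps`, then decays as 1/sqrt(step).
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)


if __name__ == "__main__":
    # Print the rate at a few steps around the warmup boundary.
    for step in (1, 2500, 5000, 20000):
        print(step, warmup_lr(step))
```

Under this formula the peak rate equals the configured `lr` and is reached exactly at `warmup_steps`.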
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/acc.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/backward_time.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/cer.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/cer_ctc.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/clip.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/forward_time.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/gpu_max_cached_mem_GB.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/grad_norm.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/iter_time.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss_att.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss_ctc.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/loss_scale.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/optim0_lr0.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/optim_step_time.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/train_time.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/images/wer.png ADDED
exp/slu_train_asr_whisper_superb_raw_en_word_sp/valid.loss.ave_10best.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e850933a200d1eac3a7f07193e078303fb4124f45fbfa622ba1cd6423e76208d
+ size 1265398202
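The three lines above are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch, assuming a locally downloaded copy of the `.pth` file, of how a download could be checked against such a pointer; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and hash the pointer records."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Hash in chunks so a checkpoint of over 1 GB is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:e850933a200d1eac3a7f07193e078303fb4124f45fbfa622ba1cd6423e76208d
size 1265398202
"""
```

For example, `matches_pointer("valid.loss.ave_10best.pth", parse_lfs_pointer(POINTER_TEXT))` would report whether a local copy is complete and uncorrupted.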
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: '202310'
+ files:
+   slu_model_file: exp/slu_train_asr_whisper_superb_raw_en_word_sp/valid.loss.ave_10best.pth
+ python: "3.9.13 (main, Aug 25 2022, 23:26:10) \n[GCC 11.2.0]"
+ timestamp: 1715351510.883123
+ torch: 2.1.0+cu121
+ yaml_files:
+   slu_train_config: exp/slu_train_asr_whisper_superb_raw_en_word_sp/config.yaml