“siddhu001” committed on
Commit
d7d8bca
1 Parent(s): b4b01fa

Update model

Files changed (22)
  1. README.md +1358 -0
  2. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/RESULTS.md +65 -0
  3. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/config.yaml +1219 -0
  4. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/acc.png +0 -0
  5. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/backward_time.png +0 -0
  6. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/cer.png +0 -0
  7. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/cer_ctc.png +0 -0
  8. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/clip.png +0 -0
  9. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/forward_time.png +0 -0
  10. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/gpu_max_cached_mem_GB.png +0 -0
  11. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/grad_norm.png +0 -0
  12. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/iter_time.png +0 -0
  13. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss.png +0 -0
  14. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss_att.png +0 -0
  15. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss_ctc.png +0 -0
  16. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss_scale.png +0 -0
  17. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/optim0_lr0.png +0 -0
  18. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/optim_step_time.png +0 -0
  19. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/train_time.png +0 -0
  20. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/wer.png +0 -0
  21. exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/valid.acc.ave_10best.pth +3 -0
  22. meta.yaml +8 -0
README.md ADDED
@@ -0,0 +1,1358 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - automatic-speech-recognition
+ language: en
+ datasets:
+ - slue-voxceleb
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 ASR model
+
+ ### `espnet/sluevoxceleb_whisper_complex_slu`
+
+ This model was trained by “siddhu001” using the slue-voxceleb recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
+ if you haven't done that already.
+
+ ```bash
+ cd espnet
+ git checkout e23ef85f0b3116ad5c60d0833f186da0deec0734
+ pip install -e .
+ cd egs2/slue-voxceleb/slu1_correct
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/sluevoxceleb_whisper_complex_slu
+ ```
+
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Sat Feb 10 19:24:27 CST 2024`
+ - python version: `3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]`
+ - espnet version: `espnet 202310`
+ - pytorch version: `pytorch 2.1.0+cu121`
+ - Git hash: `21d2105784e4da98397bf487b2550d4c6e16d40d`
+ - Commit date: `Wed Jan 31 13:40:37 2024 -0600`
+
+ ## exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc0.3_beam10_slu_model_valid.acc.ave_10best/test|3530|144908|87.2|8.5|4.3|3.0|15.8|93.4|
+ |decode_asr_slu_model_valid.acc.ave_10best/devel|1450|58104|81.2|11.1|7.6|5.3|24.1|94.6|
+ |decode_asr_slu_model_valid.acc.ave_10best/test|3530|144908|79.5|12.3|8.2|5.8|26.3|96.1|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc0.3_beam10_slu_model_valid.acc.ave_10best/test|3530|647097|93.9|2.4|3.7|2.8|8.9|93.4|
+ |decode_asr_slu_model_valid.acc.ave_10best/devel|1450|256305|89.6|3.5|6.9|4.7|15.2|94.6|
+ |decode_asr_slu_model_valid.acc.ave_10best/test|3530|647097|88.6|3.8|7.6|5.2|16.6|96.1|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ ## exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/decode_asr_ctc0.3_beam10_slu_model_valid.acc.ave_10best
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|58267|88.7|7.3|4.0|2.4|13.7|91.5|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|256942|94.7|2.1|3.3|2.3|7.7|91.5|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ ## exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/decode_asr_slu_model_valid.acc.ave_10best
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|58267|81.2|11.1|7.7|5.3|24.2|94.6|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|256942|89.5|3.5|7.0|4.7|15.2|94.6|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+
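The score tables above use sclite-style percentage columns. In that convention `Err` is the sum of `Sub`, `Del` and `Ins`, and `Corr` is what remains after subtracting `Sub` and `Del` from 100. The following is a minimal sketch that checks the three top-level WER rows against those identities. The shortened row names and the rounding tolerance are illustrative assumptions, not part of the recipe.

```python
# Rows copied from the top-level WER table: (short name, Corr, Sub, Del, Ins, Err).
# The short names are hypothetical labels for the decode directories.
rows = [
    ("ctc0.3_beam10/test", 87.2, 8.5, 4.3, 3.0, 15.8),
    ("default/devel", 81.2, 11.1, 7.6, 5.3, 24.1),
    ("default/test", 79.5, 12.3, 8.2, 5.8, 26.3),
]


def check_row(corr, sub, dele, ins, err, tol=0.25):
    """Return True if the row satisfies both identities within rounding tolerance.

    Each column is rounded to one decimal, so small mismatches are expected.
    """
    err_ok = abs((sub + dele + ins) - err) <= tol
    corr_ok = abs((100.0 - sub - dele) - corr) <= tol
    return err_ok and corr_ok


for name, corr, sub, dele, ins, err in rows:
    print(name, check_row(corr, sub, dele, ins, err))
```

A row that fails this check would usually point to a copy error in the table rather than a scoring problem.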
+ ## ASR config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/tuning/train_asr_whisper_weighted_0.0005.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: false
+ dry_run: false
+ iterator_type: sequence
+ valid_iterator_type: null
+ output_dir: exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp
+ ngpu: 1
+ seed: 2022
+ num_workers: 2
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 53071
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 70
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ - - valid
+   - acc
+   - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 2
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param:
+ - encoder
+ num_iters_per_epoch: null
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 6000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/slu_stats_raw_en_word_sp/train/speech_shape
+ - exp/slu_stats_raw_en_word_sp/train/text_shape.word
+ valid_shape_file:
+ - exp/slu_stats_raw_en_word_sp/valid/speech_shape
+ - exp/slu_stats_raw_en_word_sp/valid/text_shape.word
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ - - dump/raw/train_sp/wav.scp
+   - speech
+   - sound
+ - - dump/raw/train_sp/text
+   - text
+   - text
+ valid_data_path_and_name_and_type:
+ - - dump/raw/devel/wav.scp
+   - speech
+   - sound
+ - - dump/raw/devel/text
+   - text
+   - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+   lr: 0.0005
+   weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+   warmup_steps: 5000
+ token_list:
+ - <blank>
+ - <unk>
+ - ▁i
+ - ▁and
+ - ''''
+ - s
+ - ▁the
+ - ▁a
+ - ▁it
+ - Neutral
+ - ▁to
+ - ▁you
+ - ▁that
+ - ▁of
+ - ▁in
+ - ▁was
+ - ▁uh
+ - ▁know
+ - t
+ - ▁so
+ - ▁we
+ - ▁he
+ - ing
+ - ▁um
+ - ed
+ - m
+ - ▁like
+ - ▁is
+ - ▁but
+ - Positive
+ - y
+ - ▁just
+ - ▁they
+ - re
+ - ▁this
+ - ▁for
+ - ▁be
+ - ▁my
+ - er
+ - ▁with
+ - ▁on
+ - ▁think
+ - ▁p
+ - ▁have
+ - ▁she
+ - e
+ - ▁me
+ - ▁really
+ - ▁there
+ - ▁what
+ - ▁m
+ - a
+ - ▁do
+ - ▁all
+ - i
+ - al
+ - ve
+ - c
+ - ▁as
+ - ▁about
+ - ▁not
+ - ▁t
+ - n
+ - ▁at
+ - l
+ - ▁had
+ - ▁b
+ - ▁when
+ - ▁c
+ - g
+ - ar
+ - ▁out
+ - en
+ - ▁s
+ - ▁an
+ - ▁people
+ - or
+ - an
+ - d
+ - o
+ - ll
+ - ▁are
+ - in
+ - ▁very
+ - p
+ - b
+ - u
+ - ▁because
+ - es
+ - ▁can
+ - ▁don
+ - ▁or
+ - ▁up
+ - it
+ - ▁one
+ - ly
+ - ▁if
+ - ▁f
+ - st
+ - ▁were
+ - ▁mean
+ - ▁d
+ - ▁who
+ - ▁then
+ - ic
+ - 'on'
+ - ▁no
+ - ▁go
+ - ▁her
+ - ▁g
+ - ent
+ - ▁st
+ - ▁kind
+ - ri
+ - ▁would
+ - ▁get
+ - ▁e
+ - le
+ - at
+ - r
+ - ▁time
+ - ▁w
+ - ▁re
+ - h
+ - ▁from
+ - ▁l
+ - ▁said
+ - ▁him
+ - ▁how
+ - v
+ - ▁well
+ - ▁h
+ - ▁gonna
+ - ▁lot
+ - ▁see
+ - f
+ - ▁his
+ - et
+ - ion
+ - ▁been
+ - ▁great
+ - ▁yeah
+ - ▁love
+ - ▁which
+ - ▁got
+ - k
+ - ▁them
+ - ▁way
+ - id
+ - ▁show
+ - w
+ - ▁some
+ - ▁your
+ - ▁did
+ - ▁sort
+ - ▁has
+ - ▁things
+ - ▁back
+ - ▁where
+ - ▁something
+ - ir
+ - ▁thing
+ - ad
+ - ▁su
+ - ▁ch
+ - ▁n
+ - il
+ - as
+ - ▁j
+ - ▁more
+ - se
+ - ▁say
+ - ▁co
+ - nd
+ - ▁much
+ - ▁always
+ - ine
+ - ▁r
+ - ation
+ - ur
+ - ▁other
+ - th
+ - ▁
+ - ▁se
+ - ▁now
+ - ate
+ - ▁doing
+ - ▁work
+ - ow
+ - ▁could
+ - ally
+ - ▁these
+ - Negative
+ - ▁good
+ - ▁any
+ - ers
+ - ce
+ - ▁cause
+ - ▁ex
+ - ▁pro
+ - ▁little
+ - ▁actually
+ - ▁into
+ - ▁make
+ - ▁first
+ - ▁being
+ - ra
+ - ▁our
+ - ▁al
+ - ▁by
+ - ▁film
+ - ▁didn
+ - ▁v
+ - ct
+ - ity
+ - ch
+ - un
+ - ▁part
+ - ▁de
+ - ▁come
+ - is
+ - ie
+ - ▁right
+ - ▁o
+ - ▁off
+ - ol
+ - ▁two
+ - ▁never
+ - ▁le
+ - ot
+ - ut
+ - ▁movie
+ - ▁play
+ - ge
+ - ies
+ - el
+ - ▁con
+ - am
+ - ▁going
+ - ke
+ - ▁want
+ - im
+ - ▁feel
+ - ive
+ - ro
+ - ▁mo
+ - ▁different
+ - ck
+ - ▁life
+ - ist
+ - ▁oh
+ - all
+ - ▁lo
+ - ard
+ - ▁went
+ - and
+ - ▁sh
+ - ▁even
+ - ry
+ - ▁years
+ - ▁look
+ - ▁us
+ - ant
+ - ▁te
+ - ▁k
+ - ▁li
+ - ▁happen
+ - ure
+ - ▁their
+ - ▁those
+ - ▁take
+ - ment
+ - ▁day
+ - ble
+ - ast
+ - ▁every
+ - um
+ - ill
+ - op
+ - ▁thought
+ - ou
+ - us
+ - ay
+ - ▁th
+ - ▁put
+ - ▁story
+ - ▁new
+ - ▁down
+ - ish
+ - ▁big
+ - ▁wanna
+ - ▁ro
+ - ▁also
+ - ▁read
+ - ▁around
+ - ous
+ - ▁through
+ - red
+ - ▁came
+ - ▁character
+ - ess
+ - te
+ - ver
+ - ▁will
+ - ag
+ - ss
+ - ▁fun
+ - ▁over
+ - ▁many
+ - ▁bl
+ - ▁cl
+ - ▁man
+ - ▁than
+ - ▁pre
+ - ▁world
+ - ▁person
+ - z
+ - ▁sp
+ - ven
+ - ▁wanted
+ - ▁bit
+ - ▁before
+ - ▁mar
+ - one
+ - ab
+ - ▁en
+ - ci
+ - ▁set
+ - ▁ha
+ - ▁find
+ - ul
+ - ▁fi
+ - ▁end
+ - ▁un
+ - ▁sc
+ - ▁after
+ - ind
+ - ter
+ - ▁working
+ - ▁why
+ - om
+ - me
+ - ▁such
+ - ▁whole
+ - ▁kinda
+ - ne
+ - ▁bo
+ - x
+ - ▁most
+ - ▁ad
+ - ▁guy
+ - ▁spe
+ - ars
+ - ▁am
+ - ful
+ - ▁together
+ - ▁let
+ - ▁quite
+ - ain
+ - ▁everything
+ - ▁made
+ - ig
+ - ▁old
+ - able
+ - ▁tr
+ - ak
+ - ▁fo
+ - ▁po
+ - ore
+ - ice
+ - ▁real
+ - ▁knew
+ - ▁hard
+ - pp
+ - age
+ - ated
+ - ▁same
+ - ▁start
+ - ▁ever
+ - ning
+ - ▁watch
+ - art
+ - ▁again
+ - ▁here
+ - are
+ - ght
+ - ong
+ - ▁done
+ - ▁only
+ - ▁live
+ - ▁wasn
+ - ▁ho
+ - ▁u
+ - ▁maybe
+ - ▁need
+ - ▁everybody
+ - ust
+ - ans
+ - ▁three
+ - ▁having
+ - ▁music
+ - ack
+ - ld
+ - ▁trying
+ - ▁guys
+ - rou
+ - ach
+ - ving
+ - ▁tell
+ - ▁should
+ - ff
+ - ide
+ - ▁four
+ - ▁started
+ - ▁com
+ - ass
+ - ▁long
+ - ▁fe
+ - ▁course
+ - ▁called
+ - ▁own
+ - ress
+ - ▁moment
+ - ▁pl
+ - ▁still
+ - ▁anything
+ - ▁family
+ - ▁fin
+ - ▁dan
+ - ▁bro
+ - 'no'
+ - ther
+ - ▁per
+ - ▁amazing
+ - ▁stuff
+ - per
+ - ▁jo
+ - ▁certain
+ - os
+ - ▁talk
+ - ater
+ - ▁help
+ - ▁too
+ - ▁year
+ - ight
+ - ▁fa
+ - self
+ - ces
+ - ▁br
+ - ▁bet
+ - ▁someone
+ - ▁di
+ - ▁sing
+ - nt
+ - ick
+ - ▁ph
+ - row
+ - ▁script
+ - ▁remember
+ - ▁try
+ - qu
+ - ite
+ - ▁young
+ - ▁wh
+ - ▁ser
+ - ▁ask
+ - ▁book
+ - ▁each
+ - ▁wr
+ - ▁best
+ - ▁ag
+ - ▁women
+ - ose
+ - ions
+ - ved
+ - j
+ - ue
+ - ▁does
+ - ▁five
+ - ▁both
+ - ▁friends
+ - ▁act
+ - iz
+ - cess
+ - pt
+ - ▁somebody
+ - ft
+ - ▁nice
+ - ▁myself
+ - een
+ - fe
+ - sp
+ - ict
+ - ty
+ - ▁child
+ - ud
+ - pe
+ - ▁hope
+ - ▁fact
+ - ▁saying
+ - ave
+ - icul
+ - au
+ - ale
+ - ris
+ - ▁twenty
+ - ▁school
+ - ▁doesn
+ - ▁able
+ - pect
+ - ▁last
+ - ber
+ - ▁song
+ - od
+ - ▁str
+ - ▁interesting
+ - lf
+ - ▁em
+ - ▁wor
+ - ap
+ - og
+ - ▁ra
+ - ▁dis
+ - ▁coming
+ - ▁ab
+ - ▁house
+ - ▁next
+ - ▁tra
+ - ▁okay
+ - ere
+ - ary
+ - ▁incredi
+ - ▁car
+ - ▁job
+ - ▁used
+ - ▁give
+ - ▁god
+ - ▁americ
+ - ▁characters
+ - ▁app
+ - ▁walk
+ - ▁yes
+ - rew
+ - ▁getting
+ - ▁six
+ - ▁chan
+ - ▁ne
+ - ▁pretty
+ - ang
+ - ▁creat
+ - ▁another
+ - ▁ter
+ - ▁kids
+ - ▁felt
+ - ▁sometimes
+ - ▁place
+ - out
+ - ▁funny
+ - ase
+ - ich
+ - act
+ - ▁days
+ - ▁hum
+ - ▁bring
+ - ts
+ - ▁making
+ - ▁comp
+ - ▁become
+ - ute
+ - ▁wonderful
+ - ron
+ - les
+ - ▁saw
+ - ▁point
+ - ia
+ - ▁realiz
+ - ▁int
+ - ▁away
+ - ays
+ - ▁home
+ - ace
+ - ▁relationship
+ - ▁woman
+ - ▁everyone
+ - ▁comes
+ - ▁high
+ - dd
+ - ▁night
+ - ath
+ - ▁else
+ - vent
+ - ▁shoot
+ - vers
+ - day
+ - ▁sure
+ - ried
+ - ned
+ - ▁obviously
+ - ▁dra
+ - ▁inter
+ - co
+ - ▁playing
+ - ▁important
+ - ort
+ - uck
+ - ision
+ - pport
+ - ▁seen
+ - pl
+ - ▁fl
+ - ound
+ - ▁bas
+ - ull
+ - est
+ - ▁actor
+ - ▁lear
+ - ▁worked
+ - ▁believe
+ - ▁gen
+ - ▁keep
+ - ▁friend
+ - ▁sw
+ - ▁des
+ - ▁times
+ - ▁im
+ - ▁sur
+ - ▁sit
+ - ▁probably
+ - ok
+ - ▁took
+ - ep
+ - ough
+ - ip
+ - ood
+ - ▁sa
+ - ▁season
+ - vel
+ - wn
+ - ▁dec
+ - ▁excited
+ - ian
+ - ire
+ - ph
+ - ▁month
+ - ner
+ - ▁min
+ - ▁rel
+ - ating
+ - body
+ - ition
+ - ▁loved
+ - ▁aw
+ - ▁hear
+ - ple
+ - ▁cool
+ - ▁y
+ - ord
+ - our
+ - ▁game
+ - ms
+ - ub
+ - ▁might
+ - ▁kid
+ - ▁movies
+ - ical
+ - ▁bad
+ - ▁scene
+ - iv
+ - ▁enough
+ - ▁sm
+ - bly
+ - ▁fift
+ - ▁eight
+ - ▁experience
+ - ▁actors
+ - ▁cou
+ - ▁understand
+ - ▁week
+ - ▁few
+ - gin
+ - ting
+ - ▁director
+ - ▁almost
+ - ▁open
+ - ren
+ - ▁star
+ - ▁room
+ - ▁call
+ - oy
+ - ▁goes
+ - ▁told
+ - ▁once
+ - ▁found
+ - arly
+ - ations
+ - ward
+ - ▁audience
+ - ird
+ - if
+ - ▁qu
+ - ▁ar
+ - ▁definitely
+ - ious
+ - iting
+ - ▁pol
+ - ▁huge
+ - ▁makes
+ - aking
+ - ream
+ - ance
+ - be
+ - ▁la
+ - ▁ac
+ - iter
+ - ▁run
+ - ▁gotta
+ - ▁gr
+ - ▁cam
+ - sh
+ - ▁gets
+ - ully
+ - ▁says
+ - ame
+ - side
+ - ▁bus
+ - ▁shows
+ - ▁dr
+ - ▁inv
+ - ▁idea
+ - ▁talking
+ - ▁wa
+ - way
+ - ▁art
+ - ▁whatever
+ - ▁write
+ - ash
+ - itt
+ - ▁met
+ - ▁wants
+ - ▁role
+ - ▁mu
+ - ▁boy
+ - ▁wrote
+ - ger
+ - ately
+ - ▁exc
+ - ▁mother
+ - ▁produ
+ - ▁cra
+ - ates
+ - ▁though
+ - av
+ - ▁episode
+ - ▁sl
+ - ▁change
+ - ▁voice
+ - ▁played
+ - ily
+ - ▁guess
+ - ves
+ - ▁hand
+ - ady
+ - ▁happy
+ - ith
+ - ▁name
+ - ny
+ - ▁gi
+ - ▁looking
+ - lev
+ - ▁acting
+ - aught
+ - iss
+ - ount
+ - rom
+ - ▁tw
+ - ▁cont
+ - ▁john
+ - ▁far
+ - ▁res
+ - ▁sense
+ - ake
+ - ▁basically
+ - ▁meet
+ - ▁gu
+ - ▁bre
+ - ens
+ - cept
+ - ety
+ - ▁girl
+ - ▁york
+ - ▁count
+ - ▁shot
+ - ise
+ - ject
+ - ▁tot
+ - ▁stud
+ - ▁feels
+ - ▁thinking
+ - ▁head
+ - ▁cast
+ - ▁writing
+ - ▁rehe
+ - ▁written
+ - ▁perform
+ - ▁fan
+ - der
+ - ect
+ - ▁sk
+ - ▁hour
+ - ▁father
+ - ered
+ - ▁hundred
+ - ▁ind
+ - ▁norm
+ - ▁acc
+ - up
+ - ▁while
+ - fort
+ - ▁nin
+ - ▁true
+ - itch
+ - ▁inst
+ - ▁second
+ - ▁pick
+ - ▁record
+ - ross
+ - ▁quest
+ - ged
+ - ▁career
+ - ween
+ - ▁bec
+ - ▁reason
+ - ▁since
+ - ▁bra
+ - ▁char
+ - ▁imp
+ - ree
+ - ▁girls
+ - ▁comple
+ - ▁turn
+ - ▁dad
+ - ▁fant
+ - ▁extra
+ - ▁laugh
+ - ▁stand
+ - ▁honest
+ - ▁comm
+ - na
+ - ▁listen
+ - als
+ - cial
+ - spe
+ - ▁ke
+ - ory
+ - view
+ - ink
+ - ▁direct
+ - reat
+ - round
+ - ien
+ - ▁under
+ - ile
+ - ▁diff
+ - ually
+ - ▁tur
+ - thing
+ - sic
+ - ▁gon
+ - ather
+ - ▁aud
+ - ▁scen
+ - atch
+ - ▁sho
+ - ever
+ - tra
+ - ▁pe
+ - mo
+ - ild
+ - ▁care
+ - int
+ - ▁fam
+ - ▁ob
+ - ▁ide
+ - ade
+ - right
+ - ▁may
+ - he
+ - ody
+ - ense
+ - ▁interest
+ - ah
+ - form
+ - ork
+ - ▁episod
+ - ▁rec
+ - iew
+ - ▁hop
+ - ited
+ - ▁exper
+ - gh
+ - ically
+ - ▁bel
+ - ▁el
+ - enty
+ - ▁gott
+ - ▁stu
+ - ▁id
+ - rie
+ - ▁nor
+ - ▁inc
+ - ertain
+ - tain
+ - ▁wo
+ - ▁mon
+ - az
+ - xt
+ - riend
+ - now
+ - ▁list
+ - ime
+ - ome
+ - so
+ - ause
+ - iously
+ - ▁sch
+ - ▁vo
+ - ▁op
+ - ason
+ - ▁mov
+ - ▁hi
+ - ▁pers
+ - ▁ye
+ - ▁def
+ - orm
+ - ▁belie
+ - fore
+ - ix
+ - mber
+ - very
+ - ▁differe
+ - ▁wonder
+ - ek
+ - nder
+ - ▁obv
+ - ▁ep
+ - ship
+ - ▁lau
+ - ience
+ - ool
+ - ▁sin
+ - rect
+ - ▁happ
+ - ▁gir
+ - du
+ - ng
+ - ▁underst
+ - most
+ - eric
+ - ouse
+ - time
+ - lm
+ - ▁hel
+ - redi
+ - ▁cour
+ - ▁relation
+ - rough
+ - q
+ - ▁defin
+ - ▁prob
+ - ▁reme
+ - ▁hu
+ - ▁fir
+ - anna
+ - ways
+ - itten
+ - elt
+ - ▁sometime
+ - ':'
+ - ▁kne
+ - alk
+ - ▁ok
+ - ably
+ - rote
+ - gether
+ - ▁definite
+ - ▁import
+ - '&'
+ - fter
+ - onest
+ - erest
+ - ▁amaz
+ - ▁ano
+ - <sos/eos>
+ transcript_token_list: null
+ two_pass: false
+ pre_postencoder_norm: false
+ init: null
+ input_size: 1
+ ctc_conf:
+   dropout_rate: 0.0
+   ctc_type: builtin
+   reduce: true
+   ignore_nan_grad: null
+   zero_infinity: true
+   brctc_risk_strategy: exp
+   brctc_group_strategy: end
+   brctc_risk_factor: 0.0
+ joint_net_conf: null
+ use_preprocessor: true
+ token_type: word
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ short_noise_thres: 0.5
+ frontend: null
+ frontend_conf: {}
+ specaug: null
+ specaug_conf: {}
+ normalize: null
+ normalize_conf: {}
+ model: espnet
+ model_conf:
+   ctc_weight: 0.3
+   lsm_weight: 0.1
+   length_normalized_loss: false
+   weighted_sum: true
+   extract_feats_in_collect_stats: false
+ preencoder: null
+ preencoder_conf: {}
+ encoder: whisper
+ encoder_conf:
+   whisper_model: medium
+   dropout_rate: 0.0
+   use_specaug: true
+   specaug_conf:
+     apply_time_warp: true
+     time_warp_window: 5
+     time_warp_mode: bicubic
+     apply_freq_mask: true
+     freq_mask_width_range:
+     - 0
+     - 40
+     num_freq_mask: 2
+     apply_time_mask: true
+     time_mask_width_ratio_range:
+     - 0.0
+     - 0.12
+     num_time_mask: 5
+ prepostencoder: linear
+ prepostencoder_conf:
+   input_size: 1024
+   output_size: 80
+ postencoder: conformer_full
+ postencoder_conf:
+   output_size: 256
+   attention_heads: 4
+   linear_units: 1024
+   num_blocks: 12
+   dropout_rate: 0.1
+   positional_dropout_rate: 0.1
+   attention_dropout_rate: 0.1
+   input_layer: conv2d2
+   normalize_before: true
+   macaron_style: true
+   rel_pos_type: latest
+   pos_enc_layer_type: rel_pos
+   selfattention_layer_type: rel_selfattn
+   activation_type: swish
+   use_cnn_module: true
+   cnn_module_kernel: 31
+ deliberationencoder: null
+ deliberationencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+   attention_heads: 4
+   linear_units: 2048
+   num_blocks: 6
+   dropout_rate: 0.1
+   positional_dropout_rate: 0.1
+   self_attention_dropout_rate: 0.1
+   src_attention_dropout_rate: 0.1
+ postdecoder: null
+ postdecoder_conf: {}
+ required:
+ - output_dir
+ - token_list
+ version: '202310'
+ distributed: true
+ ```
+
+ </details>
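The config pairs `optim_conf.lr: 0.0005` with `scheduler: warmuplr` and `warmup_steps: 5000`. The sketch below assumes the scheduler follows the Noam-style warmup rule documented for ESPnet's `WarmupLR`: a linear ramp up to the base rate at `warmup_steps`, then inverse-square-root decay. The function name `warmup_lr` is hypothetical and the formula is reproduced from memory, so treat it as an illustration rather than the recipe's code.

```python
def warmup_lr(step, base_lr=0.0005, warmup_steps=5000):
    """Learning rate at a given optimizer step (1-based) under Noam-style warmup.

    Assumed rule: base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)


# The rate rises linearly, peaks at warmup_steps, then decays.
for step in (1250, 5000, 20000):
    print(step, warmup_lr(step))
```

Under this rule the peak equals the configured `lr`, so changing `warmup_steps` moves the peak in time without changing its height.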
+
+ ### Citing ESPnet
+
+ ```bibtex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
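The `model_conf` in the README's ASR config sets `ctc_weight: 0.3`, and the commit ships separate `loss_ctc.png` and `loss_att.png` plots next to `loss.png`. In ESPnet's hybrid CTC/attention training the two losses are interpolated with that weight. A minimal sketch of the interpolation follows; the function name `joint_loss` and the example loss values are made up for illustration.

```python
def joint_loss(loss_ctc, loss_att, ctc_weight=0.3):
    """Interpolate the CTC and attention-decoder losses.

    ctc_weight=1.0 gives pure CTC training; ctc_weight=0.0 gives pure attention training.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att


# Hypothetical per-batch losses, only to show the weighting.
print(joint_loss(loss_ctc=10.0, loss_att=20.0))
```

With the configured weight, the attention loss contributes 70% of the total, which is the quantity the `loss.png` curve would track under this assumption.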
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/RESULTS.md ADDED
@@ -0,0 +1,65 @@
+ <!-- Generated by scripts/utils/show_asr_result.sh -->
+ # RESULTS
+ ## Environments
+ - date: `Sat Feb 10 19:24:27 CST 2024`
+ - python version: `3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0]`
+ - espnet version: `espnet 202310`
+ - pytorch version: `pytorch 2.1.0+cu121`
+ - Git hash: `21d2105784e4da98397bf487b2550d4c6e16d40d`
+ - Commit date: `Wed Jan 31 13:40:37 2024 -0600`
+
+ ## exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc0.3_beam10_slu_model_valid.acc.ave_10best/test|3530|144908|87.2|8.5|4.3|3.0|15.8|93.4|
+ |decode_asr_slu_model_valid.acc.ave_10best/devel|1450|58104|81.2|11.1|7.6|5.3|24.1|94.6|
+ |decode_asr_slu_model_valid.acc.ave_10best/test|3530|144908|79.5|12.3|8.2|5.8|26.3|96.1|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |decode_asr_ctc0.3_beam10_slu_model_valid.acc.ave_10best/test|3530|647097|93.9|2.4|3.7|2.8|8.9|93.4|
+ |decode_asr_slu_model_valid.acc.ave_10best/devel|1450|256305|89.6|3.5|6.9|4.7|15.2|94.6|
+ |decode_asr_slu_model_valid.acc.ave_10best/test|3530|647097|88.6|3.8|7.6|5.2|16.6|96.1|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ ## exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/decode_asr_ctc0.3_beam10_slu_model_valid.acc.ave_10best
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|58267|88.7|7.3|4.0|2.4|13.7|91.5|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|256942|94.7|2.1|3.3|2.3|7.7|91.5|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ ## exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/decode_asr_slu_model_valid.acc.ave_10best
+ ### WER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|58267|81.2|11.1|7.7|5.3|24.2|94.6|
+
+ ### CER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
+ |org/devel|1451|256942|89.5|3.5|7.0|4.7|15.2|94.6|
+
+ ### TER
+
+ |dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
+ |---|---|---|---|---|---|---|---|---|
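Every decode directory above refers to `valid.acc.ave_10best`, and the README config sets `keep_nbest_models: 10` with `best_model_criterion` of `valid`/`acc`/`max`. That naming suggests the released checkpoint is an element-wise average of the ten checkpoints with the highest validation accuracy. The sketch below shows the idea with plain Python lists standing in for tensors; `select_nbest`, `average_params` and the toy checkpoints are hypothetical and are not ESPnet's implementation.

```python
def select_nbest(checkpoints, n=10):
    """Keep the n checkpoints with the highest validation accuracy."""
    return sorted(checkpoints, key=lambda c: c["acc"], reverse=True)[:n]


def average_params(state_dicts):
    """Average parameters element-wise across checkpoints that share the same keys."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    n = len(state_dicts)
    averaged = {}
    for key in state_dicts[0]:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(sd[key] for sd in state_dicts))
        averaged[key] = [sum(col) / n for col in columns]
    return averaged


# Toy checkpoints: a validation accuracy plus a flat list of weights.
checkpoints = [
    {"acc": 0.80, "params": {"w": [1.0, 2.0]}},
    {"acc": 0.90, "params": {"w": [3.0, 4.0]}},
    {"acc": 0.70, "params": {"w": [100.0, 100.0]}},
]
best = select_nbest(checkpoints, n=2)
print(average_params([c["params"] for c in best]))
```

Averaging late checkpoints tends to smooth out step-to-step noise in the weights, which is a common reason for evaluating the averaged model rather than any single epoch.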
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/config.yaml ADDED
@@ -0,0 +1,1219 @@
+ config: conf/tuning/train_asr_whisper_weighted_0.0005.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: false
+ dry_run: false
+ iterator_type: sequence
+ valid_iterator_type: null
+ output_dir: exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp
+ ngpu: 1
+ seed: 2022
+ num_workers: 2
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 53071
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: false
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: true
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 70
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - acc
+     - max
+ keep_nbest_models: 10
+ nbest_averaging_interval: 0
+ grad_clip: 5.0
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 2
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: null
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param:
+ - encoder
+ num_iters_per_epoch: null
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 6000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/slu_stats_raw_en_word_sp/train/speech_shape
+ - exp/slu_stats_raw_en_word_sp/train/text_shape.word
+ valid_shape_file:
+ - exp/slu_stats_raw_en_word_sp/valid/speech_shape
+ - exp/slu_stats_raw_en_word_sp/valid/text_shape.word
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 80000
+ - 150
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/train_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/train_sp/text
+     - text
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/devel/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/devel/text
+     - text
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.0005
+     weight_decay: 1.0e-06
+ scheduler: warmuplr
+ scheduler_conf:
+     warmup_steps: 5000
+ token_list:
+ - <blank>
+ - <unk>
+ - ▁i
+ - ▁and
+ - ''''
+ - s
+ - ▁the
+ - ▁a
+ - ▁it
+ - Neutral
+ - ▁to
+ - ▁you
+ - ▁that
+ - ▁of
+ - ▁in
+ - ▁was
+ - ▁uh
+ - ▁know
+ - t
+ - ▁so
+ - ▁we
+ - ▁he
+ - ing
+ - ▁um
+ - ed
+ - m
+ - ▁like
+ - ▁is
+ - ▁but
+ - Positive
+ - y
+ - ▁just
+ - ▁they
+ - re
+ - ▁this
+ - ▁for
+ - ▁be
+ - ▁my
+ - er
+ - ▁with
+ - ▁on
+ - ▁think
+ - ▁p
+ - ▁have
+ - ▁she
+ - e
+ - ▁me
+ - ▁really
+ - ▁there
+ - ▁what
+ - ▁m
+ - a
+ - ▁do
+ - ▁all
+ - i
+ - al
+ - ve
+ - c
+ - ▁as
+ - ▁about
+ - ▁not
+ - ▁t
+ - n
+ - ▁at
+ - l
+ - ▁had
+ - ▁b
+ - ▁when
+ - ▁c
+ - g
+ - ar
+ - ▁out
+ - en
+ - ▁s
+ - ▁an
+ - ▁people
+ - or
+ - an
+ - d
+ - o
+ - ll
+ - ▁are
+ - in
+ - ▁very
+ - p
+ - b
+ - u
+ - ▁because
+ - es
+ - ▁can
+ - ▁don
+ - ▁or
+ - ▁up
+ - it
+ - ▁one
+ - ly
+ - ▁if
+ - ▁f
+ - st
+ - ▁were
+ - ▁mean
+ - ▁d
+ - ▁who
+ - ▁then
+ - ic
+ - 'on'
+ - ▁no
+ - ▁go
+ - ▁her
+ - ▁g
+ - ent
+ - ▁st
+ - ▁kind
+ - ri
+ - ▁would
+ - ▁get
+ - ▁e
+ - le
+ - at
+ - r
+ - ▁time
+ - ▁w
+ - ▁re
+ - h
+ - ▁from
+ - ▁l
+ - ▁said
+ - ▁him
+ - ▁how
+ - v
+ - ▁well
+ - ▁h
+ - ▁gonna
+ - ▁lot
+ - ▁see
+ - f
+ - ▁his
+ - et
+ - ion
+ - ▁been
+ - ▁great
+ - ▁yeah
+ - ▁love
+ - ▁which
+ - ▁got
+ - k
+ - ▁them
+ - ▁way
+ - id
+ - ▁show
+ - w
+ - ▁some
+ - ▁your
+ - ▁did
+ - ▁sort
+ - ▁has
+ - ▁things
+ - ▁back
+ - ▁where
+ - ▁something
+ - ir
+ - ▁thing
+ - ad
+ - ▁su
+ - ▁ch
+ - ▁n
+ - il
+ - as
+ - ▁j
+ - ▁more
+ - se
+ - ▁say
+ - ▁co
+ - nd
+ - ▁much
+ - ▁always
+ - ine
+ - ▁r
+ - ation
+ - ur
+ - ▁other
+ - th
+ - ▁
+ - ▁se
+ - ▁now
+ - ate
+ - ▁doing
+ - ▁work
+ - ow
+ - ▁could
+ - ally
+ - ▁these
+ - Negative
+ - ▁good
+ - ▁any
+ - ers
+ - ce
+ - ▁cause
+ - ▁ex
+ - ▁pro
+ - ▁little
+ - ▁actually
+ - ▁into
+ - ▁make
+ - ▁first
+ - ▁being
+ - ra
+ - ▁our
+ - ▁al
+ - ▁by
+ - ▁film
+ - ▁didn
+ - ▁v
+ - ct
+ - ity
+ - ch
+ - un
+ - ▁part
+ - ▁de
+ - ▁come
+ - is
+ - ie
+ - ▁right
+ - ▁o
+ - ▁off
+ - ol
+ - ▁two
+ - ▁never
+ - ▁le
+ - ot
+ - ut
+ - ▁movie
+ - ▁play
+ - ge
+ - ies
+ - el
+ - ▁con
+ - am
+ - ▁going
+ - ke
+ - ▁want
+ - im
+ - ▁feel
+ - ive
+ - ro
+ - ▁mo
+ - ▁different
+ - ck
+ - ▁life
+ - ist
+ - ▁oh
+ - all
+ - ▁lo
+ - ard
+ - ▁went
+ - and
+ - ▁sh
+ - ▁even
+ - ry
+ - ▁years
+ - ▁look
+ - ▁us
+ - ant
+ - ▁te
+ - ▁k
+ - ▁li
+ - ▁happen
+ - ure
+ - ▁their
+ - ▁those
+ - ▁take
+ - ment
+ - ▁day
+ - ble
+ - ast
+ - ▁every
+ - um
+ - ill
+ - op
+ - ▁thought
+ - ou
+ - us
+ - ay
+ - ▁th
+ - ▁put
+ - ▁story
+ - ▁new
+ - ▁down
+ - ish
+ - ▁big
+ - ▁wanna
+ - ▁ro
+ - ▁also
+ - ▁read
+ - ▁around
+ - ous
+ - ▁through
+ - red
+ - ▁came
+ - ▁character
+ - ess
+ - te
+ - ver
+ - ▁will
+ - ag
+ - ss
+ - ▁fun
+ - ▁over
+ - ▁many
+ - ▁bl
+ - ▁cl
+ - ▁man
+ - ▁than
+ - ▁pre
+ - ▁world
+ - ▁person
+ - z
+ - ▁sp
+ - ven
+ - ▁wanted
+ - ▁bit
+ - ▁before
+ - ▁mar
+ - one
+ - ab
+ - ▁en
+ - ci
+ - ▁set
+ - ▁ha
+ - ▁find
+ - ul
+ - ▁fi
+ - ▁end
+ - ▁un
+ - ▁sc
+ - ▁after
+ - ind
+ - ter
+ - ▁working
+ - ▁why
+ - om
+ - me
+ - ▁such
+ - ▁whole
+ - ▁kinda
+ - ne
+ - ▁bo
+ - x
+ - ▁most
+ - ▁ad
+ - ▁guy
+ - ▁spe
+ - ars
+ - ▁am
+ - ful
+ - ▁together
+ - ▁let
+ - ▁quite
+ - ain
+ - ▁everything
+ - ▁made
+ - ig
+ - ▁old
+ - able
+ - ▁tr
+ - ak
+ - ▁fo
+ - ▁po
+ - ore
+ - ice
+ - ▁real
+ - ▁knew
+ - ▁hard
+ - pp
+ - age
+ - ated
+ - ▁same
+ - ▁start
+ - ▁ever
+ - ning
+ - ▁watch
+ - art
+ - ▁again
+ - ▁here
+ - are
+ - ght
+ - ong
+ - ▁done
+ - ▁only
+ - ▁live
+ - ▁wasn
+ - ▁ho
+ - ▁u
+ - ▁maybe
+ - ▁need
+ - ▁everybody
+ - ust
+ - ans
+ - ▁three
+ - ▁having
+ - ▁music
+ - ack
+ - ld
+ - ▁trying
+ - ▁guys
+ - rou
+ - ach
+ - ving
+ - ▁tell
+ - ▁should
+ - ff
+ - ide
+ - ▁four
+ - ▁started
+ - ▁com
+ - ass
+ - ▁long
+ - ▁fe
+ - ▁course
+ - ▁called
+ - ▁own
+ - ress
+ - ▁moment
+ - ▁pl
+ - ▁still
+ - ▁anything
+ - ▁family
+ - ▁fin
+ - ▁dan
+ - ▁bro
+ - 'no'
+ - ther
+ - ▁per
+ - ▁amazing
+ - ▁stuff
+ - per
+ - ▁jo
+ - ▁certain
+ - os
+ - ▁talk
+ - ater
+ - ▁help
+ - ▁too
+ - ▁year
+ - ight
+ - ▁fa
+ - self
+ - ces
+ - ▁br
+ - ▁bet
+ - ▁someone
+ - ▁di
+ - ▁sing
+ - nt
+ - ick
+ - ▁ph
+ - row
+ - ▁script
+ - ▁remember
+ - ▁try
+ - qu
+ - ite
+ - ▁young
+ - ▁wh
+ - ▁ser
+ - ▁ask
+ - ▁book
+ - ▁each
+ - ▁wr
+ - ▁best
+ - ▁ag
+ - ▁women
+ - ose
+ - ions
+ - ved
+ - j
+ - ue
+ - ▁does
+ - ▁five
+ - ▁both
+ - ▁friends
+ - ▁act
+ - iz
+ - cess
+ - pt
+ - ▁somebody
+ - ft
+ - ▁nice
+ - ▁myself
+ - een
+ - fe
+ - sp
+ - ict
+ - ty
+ - ▁child
+ - ud
+ - pe
+ - ▁hope
+ - ▁fact
+ - ▁saying
+ - ave
+ - icul
+ - au
+ - ale
+ - ris
+ - ▁twenty
+ - ▁school
+ - ▁doesn
+ - ▁able
+ - pect
+ - ▁last
+ - ber
+ - ▁song
+ - od
+ - ▁str
+ - ▁interesting
+ - lf
+ - ▁em
+ - ▁wor
+ - ap
+ - og
+ - ▁ra
+ - ▁dis
+ - ▁coming
+ - ▁ab
+ - ▁house
+ - ▁next
+ - ▁tra
+ - ▁okay
+ - ere
+ - ary
+ - ▁incredi
+ - ▁car
+ - ▁job
+ - ▁used
+ - ▁give
+ - ▁god
+ - ▁americ
+ - ▁characters
+ - ▁app
+ - ▁walk
+ - ▁yes
+ - rew
+ - ▁getting
+ - ▁six
+ - ▁chan
+ - ▁ne
+ - ▁pretty
+ - ang
+ - ▁creat
+ - ▁another
+ - ▁ter
+ - ▁kids
+ - ▁felt
+ - ▁sometimes
+ - ▁place
+ - out
+ - ▁funny
+ - ase
+ - ich
+ - act
+ - ▁days
+ - ▁hum
+ - ▁bring
+ - ts
+ - ▁making
+ - ▁comp
+ - ▁become
+ - ute
+ - ▁wonderful
+ - ron
+ - les
+ - ▁saw
+ - ▁point
+ - ia
+ - ▁realiz
+ - ▁int
+ - ▁away
+ - ays
+ - ▁home
+ - ace
+ - ▁relationship
+ - ▁woman
+ - ▁everyone
+ - ▁comes
+ - ▁high
+ - dd
+ - ▁night
+ - ath
+ - ▁else
+ - vent
+ - ▁shoot
+ - vers
+ - day
+ - ▁sure
+ - ried
+ - ned
+ - ▁obviously
+ - ▁dra
+ - ▁inter
+ - co
+ - ▁playing
+ - ▁important
+ - ort
+ - uck
+ - ision
+ - pport
+ - ▁seen
+ - pl
+ - ▁fl
+ - ound
+ - ▁bas
+ - ull
+ - est
+ - ▁actor
+ - ▁lear
+ - ▁worked
+ - ▁believe
+ - ▁gen
+ - ▁keep
+ - ▁friend
+ - ▁sw
+ - ▁des
+ - ▁times
+ - ▁im
+ - ▁sur
+ - ▁sit
+ - ▁probably
+ - ok
+ - ▁took
+ - ep
+ - ough
+ - ip
+ - ood
+ - ▁sa
+ - ▁season
+ - vel
+ - wn
+ - ▁dec
+ - ▁excited
+ - ian
+ - ire
+ - ph
+ - ▁month
+ - ner
+ - ▁min
+ - ▁rel
+ - ating
+ - body
+ - ition
+ - ▁loved
+ - ▁aw
+ - ▁hear
+ - ple
+ - ▁cool
+ - ▁y
+ - ord
+ - our
+ - ▁game
+ - ms
+ - ub
+ - ▁might
+ - ▁kid
+ - ▁movies
+ - ical
+ - ▁bad
+ - ▁scene
+ - iv
+ - ▁enough
+ - ▁sm
+ - bly
+ - ▁fift
+ - ▁eight
+ - ▁experience
+ - ▁actors
+ - ▁cou
+ - ▁understand
+ - ▁week
+ - ▁few
+ - gin
+ - ting
+ - ▁director
+ - ▁almost
+ - ▁open
+ - ren
+ - ▁star
+ - ▁room
+ - ▁call
+ - oy
+ - ▁goes
+ - ▁told
+ - ▁once
+ - ▁found
+ - arly
+ - ations
+ - ward
+ - ▁audience
+ - ird
+ - if
+ - ▁qu
+ - ▁ar
+ - ▁definitely
+ - ious
+ - iting
+ - ▁pol
+ - ▁huge
+ - ▁makes
+ - aking
+ - ream
+ - ance
+ - be
+ - ▁la
+ - ▁ac
+ - iter
+ - ▁run
+ - ▁gotta
+ - ▁gr
+ - ▁cam
+ - sh
+ - ▁gets
+ - ully
+ - ▁says
+ - ame
+ - side
+ - ▁bus
+ - ▁shows
+ - ▁dr
+ - ▁inv
+ - ▁idea
+ - ▁talking
+ - ▁wa
+ - way
+ - ▁art
+ - ▁whatever
+ - ▁write
+ - ash
+ - itt
+ - ▁met
+ - ▁wants
+ - ▁role
+ - ▁mu
+ - ▁boy
+ - ▁wrote
+ - ger
+ - ately
+ - ▁exc
+ - ▁mother
+ - ▁produ
+ - ▁cra
+ - ates
+ - ▁though
+ - av
+ - ▁episode
+ - ▁sl
+ - ▁change
+ - ▁voice
+ - ▁played
+ - ily
+ - ▁guess
+ - ves
+ - ▁hand
+ - ady
+ - ▁happy
+ - ith
+ - ▁name
+ - ny
+ - ▁gi
+ - ▁looking
+ - lev
+ - ▁acting
+ - aught
+ - iss
+ - ount
+ - rom
+ - ▁tw
+ - ▁cont
+ - ▁john
+ - ▁far
+ - ▁res
+ - ▁sense
+ - ake
+ - ▁basically
+ - ▁meet
+ - ▁gu
+ - ▁bre
+ - ens
+ - cept
+ - ety
+ - ▁girl
+ - ▁york
+ - ▁count
+ - ▁shot
+ - ise
+ - ject
+ - ▁tot
+ - ▁stud
+ - ▁feels
+ - ▁thinking
+ - ▁head
+ - ▁cast
+ - ▁writing
+ - ▁rehe
+ - ▁written
+ - ▁perform
+ - ▁fan
+ - der
+ - ect
+ - ▁sk
+ - ▁hour
+ - ▁father
+ - ered
+ - ▁hundred
+ - ▁ind
+ - ▁norm
+ - ▁acc
+ - up
+ - ▁while
+ - fort
+ - ▁nin
+ - ▁true
+ - itch
+ - ▁inst
+ - ▁second
+ - ▁pick
+ - ▁record
+ - ross
+ - ▁quest
+ - ged
+ - ▁career
+ - ween
+ - ▁bec
+ - ▁reason
+ - ▁since
+ - ▁bra
+ - ▁char
+ - ▁imp
+ - ree
+ - ▁girls
+ - ▁comple
+ - ▁turn
+ - ▁dad
+ - ▁fant
+ - ▁extra
+ - ▁laugh
+ - ▁stand
+ - ▁honest
+ - ▁comm
+ - na
+ - ▁listen
+ - als
+ - cial
+ - spe
+ - ▁ke
+ - ory
+ - view
+ - ink
+ - ▁direct
+ - reat
+ - round
+ - ien
+ - ▁under
+ - ile
+ - ▁diff
+ - ually
+ - ▁tur
+ - thing
+ - sic
+ - ▁gon
+ - ather
+ - ▁aud
+ - ▁scen
+ - atch
+ - ▁sho
+ - ever
+ - tra
+ - ▁pe
+ - mo
+ - ild
+ - ▁care
+ - int
+ - ▁fam
+ - ▁ob
+ - ▁ide
+ - ade
+ - right
+ - ▁may
+ - he
+ - ody
+ - ense
+ - ▁interest
+ - ah
+ - form
+ - ork
+ - ▁episod
+ - ▁rec
+ - iew
+ - ▁hop
+ - ited
+ - ▁exper
+ - gh
+ - ically
+ - ▁bel
+ - ▁el
+ - enty
+ - ▁gott
+ - ▁stu
+ - ▁id
+ - rie
+ - ▁nor
+ - ▁inc
+ - ertain
+ - tain
+ - ▁wo
+ - ▁mon
+ - az
+ - xt
+ - riend
+ - now
+ - ▁list
+ - ime
+ - ome
+ - so
+ - ause
+ - iously
+ - ▁sch
+ - ▁vo
+ - ▁op
+ - ason
+ - ▁mov
+ - ▁hi
+ - ▁pers
+ - ▁ye
+ - ▁def
+ - orm
+ - ▁belie
+ - fore
+ - ix
+ - mber
+ - very
+ - ▁differe
+ - ▁wonder
+ - ek
+ - nder
+ - ▁obv
+ - ▁ep
+ - ship
+ - ▁lau
+ - ience
+ - ool
+ - ▁sin
+ - rect
+ - ▁happ
+ - ▁gir
+ - du
+ - ng
+ - ▁underst
+ - most
+ - eric
+ - ouse
+ - time
+ - lm
+ - ▁hel
+ - redi
+ - ▁cour
+ - ▁relation
+ - rough
+ - q
+ - ▁defin
+ - ▁prob
+ - ▁reme
+ - ▁hu
+ - ▁fir
+ - anna
+ - ways
+ - itten
+ - elt
+ - ▁sometime
+ - ':'
+ - ▁kne
+ - alk
+ - ▁ok
+ - ably
+ - rote
+ - gether
+ - ▁definite
+ - ▁import
+ - '&'
+ - fter
+ - onest
+ - erest
+ - ▁amaz
+ - ▁ano
+ - <sos/eos>
+ transcript_token_list: null
+ two_pass: false
+ pre_postencoder_norm: false
+ init: null
+ input_size: 1
+ ctc_conf:
+     dropout_rate: 0.0
+     ctc_type: builtin
+     reduce: true
+     ignore_nan_grad: null
+     zero_infinity: true
+     brctc_risk_strategy: exp
+     brctc_group_strategy: end
+     brctc_risk_factor: 0.0
+ joint_net_conf: null
+ use_preprocessor: true
+ token_type: word
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: null
+ g2p: null
+ speech_volume_normalize: null
+ rir_scp: null
+ rir_apply_prob: 1.0
+ noise_scp: null
+ noise_apply_prob: 1.0
+ noise_db_range: '13_15'
+ short_noise_thres: 0.5
+ frontend: null
+ frontend_conf: {}
+ specaug: null
+ specaug_conf: {}
+ normalize: null
+ normalize_conf: {}
+ model: espnet
+ model_conf:
+     ctc_weight: 0.3
+     lsm_weight: 0.1
+     length_normalized_loss: false
+     weighted_sum: true
+     extract_feats_in_collect_stats: false
+ preencoder: null
+ preencoder_conf: {}
+ encoder: whisper
+ encoder_conf:
+     whisper_model: medium
+     dropout_rate: 0.0
+     use_specaug: true
+     specaug_conf:
+         apply_time_warp: true
+         time_warp_window: 5
+         time_warp_mode: bicubic
+         apply_freq_mask: true
+         freq_mask_width_range:
+         - 0
+         - 40
+         num_freq_mask: 2
+         apply_time_mask: true
+         time_mask_width_ratio_range:
+         - 0.0
+         - 0.12
+         num_time_mask: 5
+ prepostencoder: linear
+ prepostencoder_conf:
+     input_size: 1024
+     output_size: 80
+ postencoder: conformer_full
+ postencoder_conf:
+     output_size: 256
+     attention_heads: 4
+     linear_units: 1024
+     num_blocks: 12
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     attention_dropout_rate: 0.1
+     input_layer: conv2d2
+     normalize_before: true
+     macaron_style: true
+     rel_pos_type: latest
+     pos_enc_layer_type: rel_pos
+     selfattention_layer_type: rel_selfattn
+     activation_type: swish
+     use_cnn_module: true
+     cnn_module_kernel: 31
+ deliberationencoder: null
+ deliberationencoder_conf: {}
+ decoder: transformer
+ decoder_conf:
+     attention_heads: 4
+     linear_units: 2048
+     num_blocks: 6
+     dropout_rate: 0.1
+     positional_dropout_rate: 0.1
+     self_attention_dropout_rate: 0.1
+     src_attention_dropout_rate: 0.1
+ postdecoder: null
+ postdecoder_conf: {}
+ required:
+ - output_dir
+ - token_list
+ version: '202310'
+ distributed: true
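The config above pairs `scheduler: warmuplr` with `warmup_steps: 5000` and a base `lr` of 0.0005. A minimal sketch of the learning-rate curve this implies, assuming the Noam-style rule documented for ESPnet's `WarmupLR` (linear ramp up to the base rate over the warmup steps, then inverse-square-root decay); the function name `warmup_lr` is hypothetical:

```python
def warmup_lr(step: int, base_lr: float = 0.0005, warmup_steps: int = 5000) -> float:
    """Learning rate at optimizer step `step` (step >= 1)."""
    # Below warmup_steps the second term of min() is smaller (linear ramp);
    # above it the first term is smaller (inverse-square-root decay).
    return base_lr * warmup_steps ** 0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


for step in (1, 2500, 5000, 20000):
    print(step, f"{warmup_lr(step):.6f}")
```

Under this rule the rate peaks at the base value when `step` equals `warmup_steps`, and falls back to half of it after four times as many steps.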
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/acc.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/backward_time.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/cer.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/cer_ctc.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/clip.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/forward_time.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/gpu_max_cached_mem_GB.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/grad_norm.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/iter_time.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss_att.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss_ctc.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/loss_scale.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/optim0_lr0.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/optim_step_time.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/train_time.png ADDED
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/images/wer.png ADDED
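Of the plots listed above, `loss_att.png` and `loss_ctc.png` track the two training objectives, and `model_conf.ctc_weight: 0.3` in config.yaml sets how they are mixed into `loss.png`. A minimal sketch, assuming the usual hybrid CTC/attention interpolation; `joint_loss` is a hypothetical name and the loss values are made-up placeholders, not numbers from this run:

```python
def joint_loss(loss_ctc: float, loss_att: float, ctc_weight: float = 0.3) -> float:
    """Interpolate the CTC and attention-decoder losses."""
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att


# Placeholder values chosen only to show the weighting.
print(joint_loss(loss_ctc=40.0, loss_att=20.0))
```

With `ctc_weight` at 0.3, the attention-decoder loss contributes 70% of the total.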
exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/valid.acc.ave_10best.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:636268417d582b45cae8a4af8c7f4f85f8a3276331f28a3f4dfcecf804618186
+ size 1358901978
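The three lines above are a Git LFS pointer, not the checkpoint itself: `oid` carries the SHA-256 of the real file and `size` its length in bytes (about 1.36 GB here). A minimal sketch of checking a downloaded file against such a pointer, assuming the checkpoint is available locally; `parse_lfs_pointer` and `verify_lfs_object` are hypothetical helper names:

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_lfs_object(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in chunks so a multi-gigabyte checkpoint is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:636268417d582b45cae8a4af8c7f4f85f8a3276331f28a3f4dfcecf804618186\n"
    "size 1358901978\n"
)
print(pointer["algo"], pointer["size"])
```

After downloading, calling `verify_lfs_object` with the local path of `valid.acc.ave_10best.pth` and this `pointer` would confirm that the file is complete.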
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: '202310'
+ files:
+   slu_model_file: exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/valid.acc.ave_10best.pth
+ python: "3.9.13 (main, Aug 25 2022, 23:26:10) \n[GCC 11.2.0]"
+ timestamp: 1715356620.081744
+ torch: 2.1.0+cu121
+ yaml_files:
+   slu_train_config: exp/slu_train_asr_whisper_weighted_0.0005_raw_en_word_sp/config.yaml
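The `timestamp` field in meta.yaml looks like a Unix epoch value in seconds, recording when the model was packed. A small sketch under that assumption, using only the standard library, that converts it to a readable UTC time:

```python
from datetime import datetime, timezone

# Value copied from the `timestamp` field of meta.yaml above.
packed_at = datetime.fromtimestamp(1715356620.081744, tz=timezone.utc)
print(packed_at.isoformat())
```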