add model
- README.md +904 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/RESULTS.md +29 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/config.yaml +806 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/acc.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/backward_time.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/cer.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/cer_ctc.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/forward_time.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/gpu_max_cached_mem_GB.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/iter_time.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/loss.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/loss_att.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/loss_ctc.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/optim0_lr0.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/optim_step_time.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/train_time.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/wer.png +0 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/score.log +46 -0
- exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/valid.acc.ave_10best.pth +3 -0
- meta.yaml +8 -0
README.md
ADDED
@@ -0,0 +1,904 @@
---
tags:
- espnet
- audio
- automatic-speech-recognition
language: en
datasets:
- slurp_entity
license: cc-by-4.0
---

## ESPnet2 ASR model

### `pyf98/slurp_entity_branchformer`

This model was trained by Yifan Peng using the slurp_entity recipe in [espnet](https://github.com/espnet/espnet/).

### Demo: How to use in ESPnet2

```bash
# Assumes the ESPnet repository has already been cloned into ./espnet.
cd espnet
# Check out the commit this demo is pinned to.
git checkout 55b6cc387fd0252d1a06db2042fd101bcea7bb34
pip install -e .
cd egs2/slurp_entity/asr1
# Skip training and download the pretrained model instead.
./run.sh --skip_data_prep false --skip_train true --download_model pyf98/slurp_entity_branchformer
```
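
Besides the recipe script, the model can probably also be called directly from Python. The following is a minimal sketch, assuming `espnet`, `espnet_model_zoo`, and `soundfile` are installed. The helper name `transcribe` and the file name `speech.wav` are hypothetical and only serve as illustration; the input is assumed to be 16 kHz mono audio, matching `fs: 16k` in the config.

```python
MODEL_TAG = "pyf98/slurp_entity_branchformer"


def transcribe(wav_path, model_tag=MODEL_TAG):
    """Return the best hypothesis text for one audio file (illustrative helper)."""
    # Imported lazily so this module can be loaded without ESPnet installed.
    import soundfile
    from espnet2.bin.asr_inference import Speech2Text

    # from_pretrained downloads and caches the model (needs espnet_model_zoo).
    speech2text = Speech2Text.from_pretrained(model_tag)
    speech, rate = soundfile.read(wav_path)
    # The call returns an n-best list; the first element of each entry is the text.
    nbests = speech2text(speech)
    text, *_ = nbests[0]
    return text
```

Calling `transcribe("speech.wav")` would then return the decoded string for that file; the exact output format depends on how the slurp_entity recipe encodes intents and entities in its transcripts.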

<!-- Generated by scripts/utils/show_asr_result.sh -->
# RESULTS
## Environments
- date: `Fri May 27 03:41:59 EDT 2022`
- python version: `3.9.12 (main, Apr 5 2022, 06:56:58) [GCC 7.5.0]`
- espnet version: `espnet 202204`
- pytorch version: `pytorch 1.11.0`
- Git hash: `4f36236ed7c8a25c2f869e518614e1ad4a8b50d6`
- Commit date: `Thu May 26 00:22:45 2022 -0400`

## asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word
### WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave_10best/devel|8690|178058|83.7|7.6|8.8|2.8|19.2|50.5|
|decode_asr_asr_model_valid.acc.ave_10best/test|13078|262176|82.6|7.9|9.5|2.7|20.1|49.2|

### CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave_10best/devel|8690|847400|90.1|3.0|6.9|3.3|13.2|50.5|
|decode_asr_asr_model_valid.acc.ave_10best/test|13078|1245475|89.0|3.2|7.8|3.1|14.1|49.2|
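
In the WER and CER tables, the `Err` column is the sum of the `Sub`, `Del`, and `Ins` columns, all given as percentages of the reference word (or character) count, as is usual for sclite-style score reports. The small check below uses the numbers copied from the rows above; `ROWS` and `error_rate` are names made up for this sketch.

```python
# (Sub, Del, Ins, Err) in percent, copied from the WER and CER tables.
ROWS = {
    "WER devel": (7.6, 8.8, 2.8, 19.2),
    "WER test": (7.9, 9.5, 2.7, 20.1),
    "CER devel": (3.0, 6.9, 3.3, 13.2),
    "CER test": (3.2, 7.8, 3.1, 14.1),
}


def error_rate(sub, deletions, ins):
    """Total error rate as the sum of the three edit-operation rates."""
    return round(sub + deletions + ins, 1)


for name, (sub, deletions, ins, err) in ROWS.items():
    # Print the recomputed value next to the reported one.
    print(name, error_rate(sub, deletions, ins), err)
```

Because insertions count as errors, `Err` can exceed `100 - Corr`: on the devel set the WER is 19.2% although 83.7% of the reference words are correct.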

### TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|

## ASR config

<details><summary>expand</summary>

```
config: conf/tuning/train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k.yaml
print_config: false
log_level: INFO
dry_run: false
iterator_type: sequence
output_dir: exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word
ngpu: 1
seed: 0
num_workers: 1
num_att_plot: 3
dist_backend: nccl
dist_init_method: env://
dist_world_size: null
dist_rank: null
local_rank: 0
dist_master_addr: null
dist_master_port: null
dist_launcher: null
multiprocessing_distributed: false
unused_parameters: false
sharded_ddp: false
cudnn_enabled: true
cudnn_benchmark: false
cudnn_deterministic: true
collect_stats: false
write_collected_feats: false
max_epoch: 50
patience: null
val_scheduler_criterion:
- valid
- loss
early_stopping_criterion:
- valid
- loss
- min
best_model_criterion:
- - valid
  - acc
  - max
keep_nbest_models: 10
nbest_averaging_interval: 0
grad_clip: 5.0
grad_clip_type: 2.0
grad_noise: false
accum_grad: 1
no_forward_run: false
resume: true
train_dtype: float32
use_amp: false
log_interval: null
use_matplotlib: true
use_tensorboard: true
use_wandb: false
wandb_project: null
wandb_id: null
wandb_entity: null
wandb_name: null
wandb_model_log_interval: -1
detect_anomaly: false
pretrain_path: null
init_param: []
ignore_init_mismatch: false
freeze_param: []
num_iters_per_epoch: null
batch_size: 64
valid_batch_size: null
batch_bins: 1000000
valid_batch_bins: null
train_shape_file:
- exp/asr_stats_raw_en_word/train/speech_shape
- exp/asr_stats_raw_en_word/train/text_shape.word
valid_shape_file:
- exp/asr_stats_raw_en_word/valid/speech_shape
- exp/asr_stats_raw_en_word/valid/text_shape.word
batch_type: folded
valid_batch_type: null
fold_length:
- 80000
- 150
sort_in_batch: descending
sort_batch: descending
multiple_iterator: false
chunk_length: 500
chunk_shift_ratio: 0.5
num_cache_chunks: 1024
train_data_path_and_name_and_type:
- - dump/raw/train/wav.scp
  - speech
  - kaldi_ark
- - dump/raw/train/text
  - text
  - text
valid_data_path_and_name_and_type:
- - dump/raw/devel/wav.scp
  - speech
  - kaldi_ark
- - dump/raw/devel/text
  - text
  - text
allow_variable_data_keys: false
max_cache_size: 0.0
max_cache_fd: 32
valid_max_cache_size: null
optim: adam
optim_conf:
    lr: 0.001
    weight_decay: 1.0e-06
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 35000
token_list:
- <blank>
- <unk>
- ▁SEP
- ▁FILL
- s
- ▁the
- a
- ▁to
- ▁i
- ▁me
- e
- ▁s
- ▁a
- i
- ▁you
- ▁what
- er
- ing
- u
- ▁is
- ''''
- o
- p
- ▁in
- ▁p
- y
- ▁my
- ▁please
- d
- c
- m
- ▁b
- l
- ▁m
- ▁c
- st
- date
- n
- ▁d
- le
- b
- ▁for
- re
- t
- ▁on
- en
- h
- 'on'
- ar
- person
- ▁re
- ▁f
- ▁g
- ▁of
- an
- ▁
- g
- ▁today
- ▁t
- or
- ▁it
- ▁this
- ▁h
- r
- f
- at
- ch
- ce
- place_name
- ▁email
- ▁do
- es
- ri
- ▁e
- ▁w
- ic
- in
- ▁that
- event_name
- ▁play
- ▁and
- al
- ▁n
- ▁can
- email_query
- ve
- ▁new
- day
- it
- ate
- ▁from
- ▁have
- k
- time
- ▁am
- media_type
- email_sendemail
- ent
- ▁olly
- qa_factoid
- se
- v
- et
- ck
- ▁any
- calendar_set
- ly
- th
- ▁how
- ▁meeting
- ed
- ▁tell
- ▁st
- x
- ur
- ro
- ▁at
- nd
- ▁list
- w
- ▁u
- ou
- ▁not
- ▁about
- ▁an
- ▁o
- general_negate
- ut
- ▁time
- ▁be
- ▁ch
- ▁are
- social_post
- business_name
- la
- ty
- play_music
- ot
- general_quirky
- ▁l
- ▁sh
- ▁tweet
- om
- ▁week
- um
- ▁one
- ter
- ▁he
- ▁up
- ▁com
- general_praise
- weather_query
- ▁next
- ▁th
- ▁check
- calendar_query
- ▁last
- ▁ro
- ad
- is
- ▁with
- ay
- ▁send
- pe
- ▁pm
- ▁tomorrow
- ▁j
- un
- ▁train
- general_explain
- ▁v
- one
- ▁r
- ra
- news_query
- ation
- ▁emails
- us
- if
- ct
- ▁co
- ▁add
- ▁will
- ▁se
- nt
- ▁was
- ine
- ▁de
- ▁set
- ▁ex
- ▁would
- ir
- ow
- ber
- general_repeat
- ight
- ook
- ▁again
- ▁song
- currency_name
- ll
- ▁ha
- ▁go
- relation
- te
- ion
- and
- ▁y
- ▁ye
- general_affirm
- general_confirm
- ery
- ▁po
- ff
- ▁we
- ▁turn
- ▁did
- ▁mar
- ▁alarm
- ▁like
- datetime_query
- ers
- ▁all
- ▁remind
- ▁so
- qa_definition
- ▁calendar
- end
- ▁said
- ci
- ▁off
- ▁john
- ▁day
- ss
- pla
- ume
- ▁get
- ail
- pp
- z
- ry
- am
- ▁need
- as
- ▁thank
- ▁wh
- ▁want
- ▁right
- ▁jo
- ▁facebook
- ▁k
- ge
- ld
- ▁fri
- ▁two
- general_dontcare
- ▁news
- ol
- oo
- ant
- ▁five
- ▁event
- ake
- definition_word
- transport_type
- ▁your
- vi
- orn
- op
- ▁weather
- ome
- ▁app
- ▁lo
- de
- ▁music
- weather_descriptor
- ak
- ke
- ▁there
- ▁si
- ▁lights
- ▁now
- ▁mo
- calendar_remove
- our
- ▁dollar
- food_type
- me
- ▁more
- ▁no
- ▁birthday
- orrect
- ▁rep
- ▁show
- play_radio
- ▁mon
- ▁does
- ood
- ag
- li
- ▁sto
- ▁contact
- cket
- email_querycontact
- ▁ev
- ▁could
- ange
- ▁just
- out
- ame
- .
- ▁ja
- ▁confirm
- qa_currency
- ▁man
- ▁late
- ▁think
- ▁some
- timeofday
- ▁bo
- qa_stock
- ong
- ▁start
- ▁work
- ▁ten
- int
- ▁command
- all
- ▁make
- ▁la
- j
- ▁answ
- ▁hour
- ▁cle
- ah
- ▁find
- ▁service
- ▁fa
- qu
- general_commandstop
- ai
- ▁when
- ▁te
- ▁by
- social_query
- ard
- ▁tw
- ul
- id
- ▁seven
- ▁where
- ▁much
- art
- ▁appointment
- ver
- artist_name
- el
- device_type
- ▁know
- ▁three
- ▁events
- ▁tr
- ▁li
- ork
- red
- ect
- ▁let
- ▁respon
- ▁par
- zz
- ▁give
- ▁twenty
- ▁ti
- ▁curre
- play_podcasts
- ▁radio
- cooking_recipe
- transport_query
- ▁con
- gh
- ▁le
- lists_query
- ▁rem
- recommendation_events
- house_place
- alarm_set
- play_audiobook
- ist
- ase
- music_genre
- ive
- ast
- player_setting
- ort
- lly
- news_topic
- list_name
- ▁playlist
- ▁ne
- business_type
- personal_info
- ind
- ust
- di
- ress
- recommendation_locations
- lists_createoradd
- iot_hue_lightoff
- lists_remove
- ord
- ▁light
- ere
- alarm_query
- audio_volume_mute
- music_query
- ▁audio
- rain
- ▁date
- ▁order
- audio_volume_up
- ▁ar
- ▁podcast
- transport_ticket
- mail
- iot_hue_lightchange
- iot_coffee
- radio_name
- ill
- ▁ri
- '@'
- takeaway_query
- song_name
- takeaway_order
- ▁ra
- email_addcontact
- play_game
- book
- transport_traffic
- ▁house
- music_likeness
- her
- transport_taxi
- iot_hue_lightdim
- ment
- ght
- fo
- order_type
- color_type
- '1'
- ven
- ould
- general_joke
- ess
- ain
- qa_maths
- ▁place
- ▁twe
- cast
- iot_cleaning
- ▁che
- ▁cont
- ith
- audiobook_name
- email_address
- game_name
- ▁cal
- general_frequency
- ▁tom
- ▁food
- act
- iot_hue_lightup
- '2'
- alarm_remove
- podcast_descriptor
- ▁definition
- audio_volume_down
- ▁media
- email_folder
- dia
- meal_type
- ▁mus
- recommendation_movies
- ▁ad
- ree
- pt
- now
- playlist_name
- ▁person
- change_amount
- ▁pla
- escri
- datetime_convert
- podcast_name
- ▁ab
- time_zone
- ▁def
- ting
- iot_wemo_on
- music_settings
- iot_wemo_off
- orre
- cy
- ank
- music_descriptor
- lar
- app_name
- row
- joke_type
- xt
- of
- ition
- ▁meet
- ink
- ▁confir
- transport_agency
- general_greet
- ▁business
- ▁art
- ▁ag
- urn
- escript
- rom
- ▁rel
- ▁au
- ▁currency
- audio_volume_other
- iot_hue_lighton
- ▁artist
- '?'
- ▁bus
- cooking_type
- movie_name
- coffee_type
- ingredient
- ather
- music_dislikeness
- sp
- q
- ▁ser
- esc
- ▁bir
- ▁cur
- name
- ▁tran
- ▁hou
- ek
- uch
- ▁conf
- ▁face
- '9'
- ▁birth
- I
- sw
- transport_descriptor
- ▁comm
- lease
- transport_name
- aid
- movie_type
- ▁device
- alarm_type
- audiobook_author
- '5'
- drink_type
- ▁joh
- ▁defin
- word
- ▁curren
- order
- iness
- W
- cooking_query
- sport_type
- ▁relation
- oint
- H
- '8'
- A
- '0'
- ▁dol
- vice
- ▁pers
- '&'
- T
- ▁appoint
- _
- '7'
- '3'
- '-'
- game_type
- ▁pod
- N
- M
- E
- list
- music_album
- dio
- ▁transport
- qa_query
- C
- O
- U
- query_detail
- ']'
- '['
- descriptor
- ':'
- spon
- <sos/eos>
init: null
input_size: null
ctc_conf:
    dropout_rate: 0.0
    ctc_type: builtin
    reduce: true
    ignore_nan_grad: true
joint_net_conf: null
use_preprocessor: true
token_type: word
bpemodel: null
non_linguistic_symbols: null
cleaner: null
g2p: null
speech_volume_normalize: null
rir_scp: null
rir_apply_prob: 1.0
noise_scp: null
noise_apply_prob: 1.0
noise_db_range: '13_15'
frontend: default
frontend_conf:
    fs: 16k
specaug: specaug
specaug_conf:
    apply_time_warp: true
    time_warp_window: 5
    time_warp_mode: bicubic
    apply_freq_mask: true
    freq_mask_width_range:
    - 0
    - 30
    num_freq_mask: 2
    apply_time_mask: true
    time_mask_width_range:
    - 0
    - 40
    num_time_mask: 2
normalize: utterance_mvn
normalize_conf: {}
model: espnet
model_conf:
    ctc_weight: 0.3
    lsm_weight: 0.1
    length_normalized_loss: false
    extract_feats_in_collect_stats: false
preencoder: null
preencoder_conf: {}
encoder: branchformer
encoder_conf:
    output_size: 512
    use_attn: true
    attention_heads: 8
    attention_layer_type: rel_selfattn
    pos_enc_layer_type: rel_pos
    rel_pos_type: latest
    use_cgmlp: true
    cgmlp_linear_units: 2048
    cgmlp_conv_kernel: 31
    use_linear_after_conv: false
    gate_activation: identity
    merge_method: concat
    cgmlp_weight: 0.5
    attn_branch_drop_rate: 0.0
    num_blocks: 18
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d
    stochastic_depth_rate: 0.0
postencoder: null
postencoder_conf: {}
decoder: transformer
decoder_conf:
    attention_heads: 8
    linear_units: 2048
    num_blocks: 6
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.1
    src_attention_dropout_rate: 0.1
required:
- output_dir
- token_list
version: '202204'
distributed: false
```
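
The `ctc_weight: 0.3` entry under `model_conf` controls how the CTC loss and the attention-decoder loss are mixed during hybrid CTC/attention training. A minimal sketch of that interpolation, assuming the usual ESPnet convention that the attention loss receives the remaining weight; `joint_loss` is a name chosen for this illustration, not an ESPnet function.

```python
def joint_loss(loss_ctc, loss_att, ctc_weight=0.3):
    """Interpolate the CTC and attention-decoder losses."""
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_att


# Made-up loss values, only to show the 0.3 / 0.7 weighting.
print(joint_loss(10.0, 20.0))
```

With this setting the attention decoder contributes 70% of the training loss and CTC the remaining 30%.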

</details>
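
The learning-rate settings in the config (`optim_conf.lr: 0.001`, `scheduler: warmuplr`, `warmup_steps: 35000`) can be made concrete with a short calculation. This is a sketch, assuming the `warmuplr` scheduler follows the usual Noam-style formula used in ESPnet; `warmup_lr` is a standalone function written for this example, not the library class.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=35000):
    """Noam-style schedule: linear warmup, then inverse-square-root decay.

    `step` is the 1-based optimizer step; it must be at least 1.
    """
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)


for step in (1, 17500, 35000, 140000):
    # Print the learning rate at a few representative steps.
    print(step, warmup_lr(step))
```

Under this formula the rate rises linearly during warmup, reaches `base_lr` exactly when `step == warmup_steps`, and afterwards decays as the inverse square root of the step.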

### Citing ESPnet

```bibtex
@inproceedings{watanabe2018espnet,
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  title={{ESPnet}: End-to-End Speech Processing Toolkit},
  year={2018},
  booktitle={Proceedings of Interspeech},
  pages={2207--2211},
  doi={10.21437/Interspeech.2018-1456},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
}
```

or arXiv:

```bibtex
@misc{watanabe2018espnet,
  title={ESPnet: End-to-End Speech Processing Toolkit},
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  year={2018},
  eprint={1804.00015},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/RESULTS.md
ADDED
@@ -0,0 +1,29 @@
<!-- Generated by scripts/utils/show_asr_result.sh -->
# RESULTS
## Environments
- date: `Fri May 27 03:41:59 EDT 2022`
- python version: `3.9.12 (main, Apr 5 2022, 06:56:58) [GCC 7.5.0]`
- espnet version: `espnet 202204`
- pytorch version: `pytorch 1.11.0`
- Git hash: `4f36236ed7c8a25c2f869e518614e1ad4a8b50d6`
- Commit date: `Thu May 26 00:22:45 2022 -0400`

## asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word
### WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave_10best/devel|8690|178058|83.7|7.6|8.8|2.8|19.2|50.5|
|decode_asr_asr_model_valid.acc.ave_10best/test|13078|262176|82.6|7.9|9.5|2.7|20.1|49.2|

### CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave_10best/devel|8690|847400|90.1|3.0|6.9|3.3|13.2|50.5|
|decode_asr_asr_model_valid.acc.ave_10best/test|13078|1245475|89.0|3.2|7.8|3.1|14.1|49.2|

### TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
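
The `valid.acc.ave_10best` part of the decode directory names refers to a model whose parameters are the average of the 10 checkpoints with the highest validation accuracy (the training config sets `keep_nbest_models: 10` and selects on `valid` / `acc` / `max`). The idea can be sketched as follows, assuming plain Python dicts of float lists as stand-ins for PyTorch state dicts; `average_checkpoints` is a hypothetical helper, not the ESPnet implementation.

```python
def average_checkpoints(state_dicts):
    """Element-wise mean of parameters across checkpoints with identical keys."""
    n = len(state_dicts)
    keys = state_dicts[0].keys()
    return {
        # zip(*...) walks the same parameter position across all checkpoints.
        k: [sum(vals) / n for vals in zip(*(sd[k] for sd in state_dicts))]
        for k in keys
    }


# Two toy "checkpoints" with a single two-element parameter each.
ckpts = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
averaged = average_checkpoints(ckpts)
print(averaged)
```

Averaging several good checkpoints in this way is commonly used to reduce the variance of the final model compared with picking a single best epoch.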
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/config.yaml
ADDED
@@ -0,0 +1,806 @@
+config: conf/tuning/train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k.yaml
+print_config: false
+log_level: INFO
+dry_run: false
+iterator_type: sequence
+output_dir: exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word
+ngpu: 1
+seed: 0
+num_workers: 1
+num_att_plot: 3
+dist_backend: nccl
+dist_init_method: env://
+dist_world_size: null
+dist_rank: null
+local_rank: 0
+dist_master_addr: null
+dist_master_port: null
+dist_launcher: null
+multiprocessing_distributed: false
+unused_parameters: false
+sharded_ddp: false
+cudnn_enabled: true
+cudnn_benchmark: false
+cudnn_deterministic: true
+collect_stats: false
+write_collected_feats: false
+max_epoch: 50
+patience: null
+val_scheduler_criterion:
+- valid
+- loss
+early_stopping_criterion:
+- valid
+- loss
+- min
+best_model_criterion:
+-   - valid
+    - acc
+    - max
+keep_nbest_models: 10
+nbest_averaging_interval: 0
+grad_clip: 5.0
+grad_clip_type: 2.0
+grad_noise: false
+accum_grad: 1
+no_forward_run: false
+resume: true
+train_dtype: float32
+use_amp: false
+log_interval: null
+use_matplotlib: true
+use_tensorboard: true
+use_wandb: false
+wandb_project: null
+wandb_id: null
+wandb_entity: null
+wandb_name: null
+wandb_model_log_interval: -1
+detect_anomaly: false
+pretrain_path: null
+init_param: []
+ignore_init_mismatch: false
+freeze_param: []
+num_iters_per_epoch: null
+batch_size: 64
+valid_batch_size: null
+batch_bins: 1000000
+valid_batch_bins: null
+train_shape_file:
+- exp/asr_stats_raw_en_word/train/speech_shape
+- exp/asr_stats_raw_en_word/train/text_shape.word
+valid_shape_file:
+- exp/asr_stats_raw_en_word/valid/speech_shape
+- exp/asr_stats_raw_en_word/valid/text_shape.word
+batch_type: folded
+valid_batch_type: null
+fold_length:
+- 80000
+- 150
+sort_in_batch: descending
+sort_batch: descending
+multiple_iterator: false
+chunk_length: 500
+chunk_shift_ratio: 0.5
+num_cache_chunks: 1024
+train_data_path_and_name_and_type:
+-   - dump/raw/train/wav.scp
+    - speech
+    - kaldi_ark
+-   - dump/raw/train/text
+    - text
+    - text
+valid_data_path_and_name_and_type:
+-   - dump/raw/devel/wav.scp
+    - speech
+    - kaldi_ark
+-   - dump/raw/devel/text
+    - text
+    - text
+allow_variable_data_keys: false
+max_cache_size: 0.0
+max_cache_fd: 32
+valid_max_cache_size: null
+optim: adam
+optim_conf:
+    lr: 0.001
+    weight_decay: 1.0e-06
+scheduler: warmuplr
+scheduler_conf:
+    warmup_steps: 35000
+token_list:
+- <blank>
+- <unk>
+- ▁SEP
+- ▁FILL
+- s
+- ▁the
+- a
+- ▁to
+- ▁i
+- ▁me
+- e
+- ▁s
+- ▁a
+- i
+- ▁you
+- ▁what
+- er
+- ing
+- u
+- ▁is
+- ''''
+- o
+- p
+- ▁in
+- ▁p
+- y
+- ▁my
+- ▁please
+- d
+- c
+- m
+- ▁b
+- l
+- ▁m
+- ▁c
+- st
+- date
+- n
+- ▁d
+- le
+- b
+- ▁for
+- re
+- t
+- ▁on
+- en
+- h
+- 'on'
+- ar
+- person
+- ▁re
+- ▁f
+- ▁g
+- ▁of
+- an
+- ▁
+- g
+- ▁today
+- ▁t
+- or
+- ▁it
+- ▁this
+- ▁h
+- r
+- f
+- at
+- ch
+- ce
+- place_name
+- ▁email
+- ▁do
+- es
+- ri
+- ▁e
+- ▁w
+- ic
+- in
+- ▁that
+- event_name
+- ▁play
+- ▁and
+- al
+- ▁n
+- ▁can
+- email_query
+- ve
+- ▁new
+- day
+- it
+- ate
+- ▁from
+- ▁have
+- k
+- time
+- ▁am
+- media_type
+- email_sendemail
+- ent
+- ▁olly
+- qa_factoid
+- se
+- v
+- et
+- ck
+- ▁any
+- calendar_set
+- ly
+- th
+- ▁how
+- ▁meeting
+- ed
+- ▁tell
+- ▁st
+- x
+- ur
+- ro
+- ▁at
+- nd
+- ▁list
+- w
+- ▁u
+- ou
+- ▁not
+- ▁about
+- ▁an
+- ▁o
+- general_negate
+- ut
+- ▁time
+- ▁be
+- ▁ch
+- ▁are
+- social_post
+- business_name
+- la
+- ty
+- play_music
+- ot
+- general_quirky
+- ▁l
+- ▁sh
+- ▁tweet
+- om
+- ▁week
+- um
+- ▁one
+- ter
+- ▁he
+- ▁up
+- ▁com
+- general_praise
+- weather_query
+- ▁next
+- ▁th
+- ▁check
+- calendar_query
+- ▁last
+- ▁ro
+- ad
+- is
+- ▁with
+- ay
+- ▁send
+- pe
+- ▁pm
+- ▁tomorrow
+- ▁j
+- un
+- ▁train
+- general_explain
+- ▁v
+- one
+- ▁r
+- ra
+- news_query
+- ation
+- ▁emails
+- us
+- if
+- ct
+- ▁co
+- ▁add
+- ▁will
+- ▁se
+- nt
+- ▁was
+- ine
+- ▁de
+- ▁set
+- ▁ex
+- ▁would
+- ir
+- ow
+- ber
+- general_repeat
+- ight
+- ook
+- ▁again
+- ▁song
+- currency_name
+- ll
+- ▁ha
+- ▁go
+- relation
+- te
+- ion
+- and
+- ▁y
+- ▁ye
+- general_affirm
+- general_confirm
+- ery
+- ▁po
+- ff
+- ▁we
+- ▁turn
+- ▁did
+- ▁mar
+- ▁alarm
+- ▁like
+- datetime_query
+- ers
+- ▁all
+- ▁remind
+- ▁so
+- qa_definition
+- ▁calendar
+- end
+- ▁said
+- ci
+- ▁off
+- ▁john
+- ▁day
+- ss
+- pla
+- ume
+- ▁get
+- ail
+- pp
+- z
+- ry
+- am
+- ▁need
+- as
+- ▁thank
+- ▁wh
+- ▁want
+- ▁right
+- ▁jo
+- ▁facebook
+- ▁k
+- ge
+- ld
+- ▁fri
+- ▁two
+- general_dontcare
+- ▁news
+- ol
+- oo
+- ant
+- ▁five
+- ▁event
+- ake
+- definition_word
+- transport_type
+- ▁your
+- vi
+- orn
+- op
+- ▁weather
+- ome
+- ▁app
+- ▁lo
+- de
+- ▁music
+- weather_descriptor
+- ak
+- ke
+- ▁there
+- ▁si
+- ▁lights
+- ▁now
+- ▁mo
+- calendar_remove
+- our
+- ▁dollar
+- food_type
+- me
+- ▁more
+- ▁no
+- ▁birthday
+- orrect
+- ▁rep
+- ▁show
+- play_radio
+- ▁mon
+- ▁does
+- ood
+- ag
+- li
+- ▁sto
+- ▁contact
+- cket
+- email_querycontact
+- ▁ev
+- ▁could
+- ange
+- ▁just
+- out
+- ame
+- .
+- ▁ja
+- ▁confirm
+- qa_currency
+- ▁man
+- ▁late
+- ▁think
+- ▁some
+- timeofday
+- ▁bo
+- qa_stock
+- ong
+- ▁start
+- ▁work
+- ▁ten
+- int
+- ▁command
+- all
+- ▁make
+- ▁la
+- j
+- ▁answ
+- ▁hour
+- ▁cle
+- ah
+- ▁find
+- ▁service
+- ▁fa
+- qu
+- general_commandstop
+- ai
+- ▁when
+- ▁te
+- ▁by
+- social_query
+- ard
+- ▁tw
+- ul
+- id
+- ▁seven
+- ▁where
+- ▁much
+- art
+- ▁appointment
+- ver
+- artist_name
+- el
+- device_type
+- ▁know
+- ▁three
+- ▁events
+- ▁tr
+- ▁li
+- ork
+- red
+- ect
+- ▁let
+- ▁respon
+- ▁par
+- zz
+- ▁give
+- ▁twenty
+- ▁ti
+- ▁curre
+- play_podcasts
+- ▁radio
+- cooking_recipe
+- transport_query
+- ▁con
+- gh
+- ▁le
+- lists_query
+- ▁rem
+- recommendation_events
+- house_place
+- alarm_set
+- play_audiobook
+- ist
+- ase
+- music_genre
+- ive
+- ast
+- player_setting
+- ort
+- lly
+- news_topic
+- list_name
+- ▁playlist
+- ▁ne
+- business_type
+- personal_info
+- ind
+- ust
+- di
+- ress
+- recommendation_locations
+- lists_createoradd
+- iot_hue_lightoff
+- lists_remove
+- ord
+- ▁light
+- ere
+- alarm_query
+- audio_volume_mute
+- music_query
+- ▁audio
+- rain
+- ▁date
+- ▁order
+- audio_volume_up
+- ▁ar
+- ▁podcast
+- transport_ticket
+- mail
+- iot_hue_lightchange
+- iot_coffee
+- radio_name
+- ill
+- ▁ri
+- '@'
+- takeaway_query
+- song_name
+- takeaway_order
+- ▁ra
+- email_addcontact
+- play_game
+- book
+- transport_traffic
+- ▁house
+- music_likeness
+- her
+- transport_taxi
+- iot_hue_lightdim
+- ment
+- ght
+- fo
+- order_type
+- color_type
+- '1'
+- ven
+- ould
+- general_joke
+- ess
+- ain
+- qa_maths
+- ▁place
+- ▁twe
+- cast
+- iot_cleaning
+- ▁che
+- ▁cont
+- ith
+- audiobook_name
+- email_address
+- game_name
+- ▁cal
+- general_frequency
+- ▁tom
+- ▁food
+- act
+- iot_hue_lightup
+- '2'
+- alarm_remove
+- podcast_descriptor
+- ▁definition
+- audio_volume_down
+- ▁media
+- email_folder
+- dia
+- meal_type
+- ▁mus
+- recommendation_movies
+- ▁ad
+- ree
+- pt
+- now
+- playlist_name
+- ▁person
+- change_amount
+- ▁pla
+- escri
+- datetime_convert
+- podcast_name
+- ▁ab
+- time_zone
+- ▁def
+- ting
+- iot_wemo_on
+- music_settings
+- iot_wemo_off
+- orre
+- cy
+- ank
+- music_descriptor
+- lar
+- app_name
+- row
+- joke_type
+- xt
+- of
+- ition
+- ▁meet
+- ink
+- ▁confir
+- transport_agency
+- general_greet
+- ▁business
+- ▁art
+- ▁ag
+- urn
+- escript
+- rom
+- ▁rel
+- ▁au
+- ▁currency
+- audio_volume_other
+- iot_hue_lighton
+- ▁artist
+- '?'
+- ▁bus
+- cooking_type
+- movie_name
+- coffee_type
+- ingredient
+- ather
+- music_dislikeness
+- sp
+- q
+- ▁ser
+- esc
+- ▁bir
+- ▁cur
+- name
+- ▁tran
+- ▁hou
+- ek
+- uch
+- ▁conf
+- ▁face
+- '9'
+- ▁birth
+- I
+- sw
+- transport_descriptor
+- ▁comm
+- lease
+- transport_name
+- aid
+- movie_type
+- ▁device
+- alarm_type
+- audiobook_author
+- '5'
+- drink_type
+- ▁joh
+- ▁defin
+- word
+- ▁curren
+- order
+- iness
+- W
+- cooking_query
+- sport_type
+- ▁relation
+- oint
+- H
+- '8'
+- A
+- '0'
+- ▁dol
+- vice
+- ▁pers
+- '&'
+- T
+- ▁appoint
+- _
+- '7'
+- '3'
+- '-'
+- game_type
+- ▁pod
+- N
+- M
+- E
+- list
+- music_album
+- dio
+- ▁transport
+- qa_query
+- C
+- O
+- U
+- query_detail
+- ']'
+- '['
+- descriptor
+- ':'
+- spon
+- <sos/eos>
+init: null
+input_size: null
+ctc_conf:
+    dropout_rate: 0.0
+    ctc_type: builtin
+    reduce: true
+    ignore_nan_grad: true
+joint_net_conf: null
+use_preprocessor: true
+token_type: word
+bpemodel: null
+non_linguistic_symbols: null
+cleaner: null
+g2p: null
+speech_volume_normalize: null
+rir_scp: null
+rir_apply_prob: 1.0
+noise_scp: null
+noise_apply_prob: 1.0
+noise_db_range: '13_15'
+frontend: default
+frontend_conf:
+    fs: 16k
+specaug: specaug
+specaug_conf:
+    apply_time_warp: true
+    time_warp_window: 5
+    time_warp_mode: bicubic
+    apply_freq_mask: true
+    freq_mask_width_range:
+    - 0
+    - 30
+    num_freq_mask: 2
+    apply_time_mask: true
+    time_mask_width_range:
+    - 0
+    - 40
+    num_time_mask: 2
+normalize: utterance_mvn
+normalize_conf: {}
+model: espnet
+model_conf:
+    ctc_weight: 0.3
+    lsm_weight: 0.1
+    length_normalized_loss: false
+    extract_feats_in_collect_stats: false
+preencoder: null
+preencoder_conf: {}
+encoder: branchformer
+encoder_conf:
+    output_size: 512
+    use_attn: true
+    attention_heads: 8
+    attention_layer_type: rel_selfattn
+    pos_enc_layer_type: rel_pos
+    rel_pos_type: latest
+    use_cgmlp: true
+    cgmlp_linear_units: 2048
+    cgmlp_conv_kernel: 31
+    use_linear_after_conv: false
+    gate_activation: identity
+    merge_method: concat
+    cgmlp_weight: 0.5
+    attn_branch_drop_rate: 0.0
+    num_blocks: 18
+    dropout_rate: 0.1
+    positional_dropout_rate: 0.1
+    attention_dropout_rate: 0.1
+    input_layer: conv2d
+    stochastic_depth_rate: 0.0
+postencoder: null
+postencoder_conf: {}
+decoder: transformer
+decoder_conf:
+    attention_heads: 8
+    linear_units: 2048
+    num_blocks: 6
+    dropout_rate: 0.1
+    positional_dropout_rate: 0.1
+    self_attention_dropout_rate: 0.1
+    src_attention_dropout_rate: 0.1
+required:
+- output_dir
+- token_list
+version: '202204'
+distributed: false
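The config pairs `scheduler: warmuplr` with `lr: 0.001` and `warmup_steps: 35000`. The sketch below shows the learning-rate curve those values would give, assuming `warmuplr` follows the usual Noam-style rule (linear ramp up to the base rate, then inverse-square-root decay). The function name `warmup_lr` is hypothetical and is not part of the uploaded files.

```python
def warmup_lr(step: int, base_lr: float = 1e-3, warmup_steps: int = 35000) -> float:
    """Noam-style warmup (assumed form): ramp linearly, then decay as 1/sqrt(step)."""
    if step < 1:
        raise ValueError("step must be >= 1")
    # During warmup the second term of min() is smaller (linear ramp);
    # after warmup the first term is smaller (inverse-sqrt decay).
    scale = min(step ** -0.5, step * warmup_steps ** -1.5)
    return base_lr * warmup_steps ** 0.5 * scale


if __name__ == "__main__":
    for step in (1, 17500, 35000, 140000):
        print(step, warmup_lr(step))
```

Under this assumption the rate peaks at the configured `lr` exactly at step 35000 and is half of it both halfway through warmup and at four times the warmup length.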
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/acc.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/backward_time.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/cer.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/cer_ctc.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/forward_time.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/gpu_max_cached_mem_GB.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/iter_time.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/loss.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/loss_att.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/loss_ctc.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/optim0_lr0.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/optim_step_time.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/train_time.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/images/wer.png
ADDED
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/score.log
ADDED
@@ -0,0 +1,46 @@
+Valid Intent Classification Result
+0.8727272727272728
+Test Intent Classification Result
+0.8653463832390274
+╒════════════╤═════════════╤══════════╤═════════════╕
+│ Scenario   │   Precision │   Recall │   F-Measure │
+╞════════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL    │      0.9048 │   0.9048 │      0.9048 │
+╘════════════╧═════════════╧══════════╧═════════════╛
+
+╒══════════╤═════════════╤══════════╤═════════════╕
+│ Action   │   Precision │   Recall │   F-Measure │
+╞══════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL  │      0.8761 │   0.8761 │      0.8761 │
+╘══════════╧═════════════╧══════════╧═════════════╛
+
+╒═════════════════════╤═════════════╤══════════╤═════════════╕
+│ Intent (scen_act)   │   Precision │   Recall │   F-Measure │
+╞═════════════════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL             │      0.8653 │   0.8653 │      0.8653 │
+╘═════════════════════╧═════════════╧══════════╧═════════════╛
+
+╒════════════╤═════════════╤══════════╤═════════════╕
+│ Entities   │   Precision │   Recall │   F-Measure │
+╞════════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL    │      0.7419 │   0.7007 │      0.7207 │
+╘════════════╧═════════════╧══════════╧═════════════╛
+
+╒════════════════════════════╤═════════════╤══════════╤═════════════╕
+│ Entities (distance word)   │   Precision │   Recall │   F-Measure │
+╞════════════════════════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL                    │      0.7805 │   0.7414 │      0.7604 │
+╘════════════════════════════╧═════════════╧══════════╧═════════════╛
+
+╒════════════════════════════╤═════════════╤══════════╤═════════════╕
+│ Entities (distance char)   │   Precision │   Recall │   F-Measure │
+╞════════════════════════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL                    │      0.8146 │   0.7721 │      0.7928 │
+╘════════════════════════════╧═════════════╧══════════╧═════════════╛
+
+╒══════════╤═════════════╤══════════╤═════════════╕
+│ Slu f1   │   Precision │   Recall │   F-Measure │
+╞══════════╪═════════════╪══════════╪═════════════╡
+│ OVERALL  │      0.7972 │   0.7564 │      0.7763 │
+╘══════════╧═════════════╧══════════╧═════════════╛
+
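The F-Measure column in score.log is consistent with the harmonic mean of precision and recall. The sketch below recomputes it from the reported (rounded) values, assuming that standard definition; the scoring script itself is not part of this diff, and `f_measure` is a hypothetical helper name.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-Measure) copied from the OVERALL rows of score.log.
reported = {
    "Entities": (0.7419, 0.7007, 0.7207),
    "Entities (distance word)": (0.7805, 0.7414, 0.7604),
    "Entities (distance char)": (0.8146, 0.7721, 0.7928),
    "Slu f1": (0.7972, 0.7564, 0.7763),
}

if __name__ == "__main__":
    for name, (p, r, f) in reported.items():
        print(f"{name}: recomputed {f_measure(p, r):.4f}, reported {f:.4f}")
```

Because the inputs are already rounded to four decimals, a recomputed value can in principle differ from the reported one in the last digit.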
exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/valid.acc.ave_10best.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:eb1c501888c63379c68912c3f3c25185122a01aa7d12fd4f050dc003fe89e6c1
+size 382894173
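The three lines added for `valid.acc.ave_10best.pth` are a Git LFS pointer, not the checkpoint itself. As a rough illustration, the pointer can be read as space-separated `key value` pairs; `parse_lfs_pointer` below is a hypothetical helper written for this note, not a Git LFS API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:eb1c501888c63379c68912c3f3c25185122a01aa7d12fd4f050dc003fe89e6c1
size 382894173
"""

if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER))
```

After downloading the real checkpoint, its SHA-256 digest and byte size can be compared against these fields to confirm the file is intact.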
meta.yaml
ADDED
@@ -0,0 +1,8 @@
+espnet: '202204'
+files:
+  asr_model_file: exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/valid.acc.ave_10best.pth
+python: "3.9.12 (main, Apr 5 2022, 06:56:58) \n[GCC 7.5.0]"
+timestamp: 1653637321.587523
+torch: 1.11.0
+yaml_files:
+  asr_train_config: exp/asr_train_asr_branchformer_e18_d6_size512_lr1e-3_warmup35k_raw_en_word/config.yaml