Commit fefae8c by csukuangfj
1 Parent(s): 1bd2a85

Add model.

Files changed:
- .gitattributes +3 -0
- README.md +82 -6
- exp/pretrained.pt +3 -0
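The per-file `+N -M` counts above are diffstat totals of added and removed lines. A minimal sketch of computing such counts from a unified diff with `awk` (the diff text below is a made-up miniature for illustration, not this commit's real diff):

```shell
# A miniature, made-up unified diff (NOT this commit's real content):
diff='--- a/README.md
+++ b/README.md
@@ -1,2 +1,3 @@
 context
-old line
+new line
+added line'

# Count added/removed lines the way the "+82 -6" style diffstat does:
counts=$(printf '%s\n' "$diff" | awk '
  /^\+\+\+/ || /^--- / { next }   # skip the two file-header lines
  /^\+/ { add++ }                 # a line added by the commit
  /^-/  { del++ }                 # a line removed by the commit
  END   { printf "+%d -%d", add, del }')
echo "$counts"
```

For this miniature input the script prints `+2 -1`.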
.gitattributes CHANGED
@@ -25,3 +25,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zstandard filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+exp/cpu_jit.pt filter=lfs diff=lfs merge=lfs -text
+exp/pretrained.pt filter=lfs diff=lfs merge=lfs -text
+exp filter=lfs diff=lfs merge=lfs -text
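The three added lines route the new `exp/` artifacts through git-lfs's clean/smudge filter. A throwaway-repo sketch of how git resolves such an attribute line (assumes only `git` itself is on PATH; git-lfs is not needed just to query the attribute):

```shell
# Create a scratch repository and reproduce one of the added lines:
tmp=$(mktemp -d)
cd "$tmp"
git init -q
printf 'exp/pretrained.pt filter=lfs diff=lfs merge=lfs -text\n' > .gitattributes

# Ask git which content filter applies to the path:
attr=$(git check-attr filter -- exp/pretrained.pt)
echo "$attr"
```

`git check-attr` prints `exp/pretrained.pt: filter: lfs`, which is how git knows to hand this path to the lfs filter driver on checkout/commit.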
README.md CHANGED
@@ -51,7 +51,8 @@ The command for decoding is:
   --nbest-scale 0.5
 ```
 
-You can find the log in this
+You can find the decoding log for the above command in this
+repo: [log/log-decode-2021-11-09-17-38-28](log/log-decode-2021-11-09-17-38-28).
 
 The best WER for the librispeech test dataset is:
 
@@ -59,7 +60,7 @@ The best WER for the librispeech test dataset is:
 |-----|------------|------------|
 | WER | 2.42 | 5.73 |
 
-The best scale values are:
+Scale values used in n-gram LM rescoring and attention rescoring for the best WERs are:
 
 | ngram_lm_scale | attention_scale |
 |----------------|-----------------|
@@ -68,13 +69,14 @@ The best scale values are:
 
 # File description
 
-
-
+- [log/](log), this directory contains the decoding log
+- [test_wavs](test_wavs), this directory contains wave files for testing the pre-trained model
+- [data/](data), this directory contains files generated by `./prepare.sh`
 
 Note: For the `data/lm` directory, we provide only `G_4_gram.pt`. If you need other files
 in this directory, please run `./prepare.sh`.
 
-
+- [exp](exp), this directory contains two files: `pretrained.pt` and `cpu_jit.pt`.
 
 `exp/pretrained.pt` is generated by the following command:
 ```
@@ -86,6 +88,14 @@ in this directory, please run `./prepare.sh`.
   --exp-dir conformer_ctc/exp_500_att0.8
 ```
 
+**HINT**: To use `pretrained.pt` to compute the WER for test-clean and test-other,
+just do the following:
+```
+cp exp/pretrained
+cp icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/pretrained.pt /path/to/icefall/egs/librispeech/ASR/conformer_ctc/exp/epoch-999.pt
+```
+and pass `--epoch 999 --avg 1` to `conformer_ctc/decode.py`.
+
 `exp/cpu_jit.pt` is generated by the following command:
 ```
 ./conformer_ctc/export.py \
@@ -108,7 +118,73 @@ git checkout v2.0-pre
 mkdir build_release
 cd build_release
 cmake -DCMAKE_BUILD_TYPE=Release ..
-make -j ctc_decode ngram_lm_rescore attention_rescore
+make -j ctc_decode hlg_decode ngram_lm_rescore attention_rescore
 ```
 
+## CTC decoding
+```
+cd k2/build_release
+./bin/ctc_decode \
+  --use_gpu true \
+  --nn_model ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/cpu_jit.pt \
+  --bpe_model ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/bpe.model \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav
+```
+
+## HLG decoding
+
+```
+./bin/hlg_decode \
+  --use_gpu true \
+  --nn_model ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/cpu_jit.pt \
+  --hlg ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt \
+  --word_table ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav
+```
+
+## HLG decoding + n-gram LM rescoring
+
+**NOTE**: V100 GPU with 16 GB RAM is known NOT to work because of OOM.
+V100 GPU with 32 GB RAM is known to work.
+
+```
+./bin/ngram_lm_rescore \
+  --use_gpu true \
+  --nn_model ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/cpu_jit.pt \
+  --hlg ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt \
+  --g ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt \
+  --ngram_lm_scale 1.0 \
+  --word_table ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav
+```
+
+## HLG decoding + n-gram LM rescoring + attention decoder rescoring
+
+```
+./bin/attention_rescore \
+  --use_gpu true \
+  --nn_model ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/exp/cpu_jit.pt \
+  --hlg ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/HLG.pt \
+  --g ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lm/G_4_gram.pt \
+  --ngram_lm_scale 2.0 \
+  --attention_scale 2.0 \
+  --num_paths 100 \
+  --nbest_scale 0.5 \
+  --word_table ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/data/lang_bpe_500/words.txt \
+  --sos_id 1 \
+  --eos_id 1 \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1089-134686-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0001.wav \
+  ./icefall-asr-librispeech-conformer-ctc-jit-bpe-500-2021-11-09/test_wavs/1221-135766-0002.wav
+```
+
+**NOTE**: V100 GPU with 16 GB RAM is known NOT to work because of OOM.
+V100 GPU with 32 GB RAM is known to work.
+
 [icefall]: https://github.com/k2-fsa/icefall
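The **HINT** in the README above works because passing `--epoch 999 --avg 1` makes `conformer_ctc/decode.py` load a single checkpoint named after that epoch from the experiment directory; copying the released checkpoint to `epoch-999.pt` therefore makes decoding use exactly that file. A stand-in sketch with placeholder files (the `epoch-N.pt` naming and `demo/` paths are inferred from the HINT, not verified against icefall):

```shell
# Stand-in demo of the HINT's rename trick; no real checkpoint is needed here.
# The epoch-N.pt naming is inferred from the HINT's `--epoch 999 --avg 1` advice.
mkdir -p demo/exp
: > demo/pretrained.pt            # placeholder for the real 437 MB checkpoint
cp demo/pretrained.pt demo/exp/epoch-999.pt
ls demo/exp
```

With the real repository, the `cp` target would be `/path/to/icefall/egs/librispeech/ASR/conformer_ctc/exp/epoch-999.pt` as shown in the diff.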
exp/pretrained.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:11b6dd6bc02557030840d729923b9ae3e6db3ae665f048fb0056609c205a9ef9
+size 437166367
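What git actually stores for `exp/pretrained.pt` is only this small git-lfs pointer; the ~437 MB weights live in LFS storage and are fetched by `git lfs pull`. A minimal sketch of pulling the digest and expected size out of such a pointer (the pointer text is copied from the entry above; after the real file is fetched, `sha256sum exp/pretrained.pt` should match the oid):

```shell
# git-lfs pointer content, copied from the exp/pretrained.pt entry above:
pointer='version https://git-lfs.github.com/spec/v1
oid sha256:11b6dd6bc02557030840d729923b9ae3e6db3ae665f048fb0056609c205a9ef9
size 437166367'

# Extract the sha256 digest (stripping the "sha256:" prefix) and the byte size:
oid=$(printf '%s\n' "$pointer" | awk '$1 == "oid"  { sub("sha256:", "", $2); print $2 }')
size=$(printf '%s\n' "$pointer" | awk '$1 == "size" { print $2 }')
echo "$oid $size"
```

Comparing these two values against the downloaded file (`sha256sum`, `wc -c`) is a quick way to confirm an LFS fetch completed correctly.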