Commit 70628de by luomingshuang
1 Parent(s): 6e17241
add tedlium3-pruned-transducer-stateless files
Files changed:
- README.md +39 -0
- data/lang_bpe_500/bpe.model +3 -0
- exp/pretrained_average_17_to_29.pt +3 -0
- log/beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-17-40-20 +69 -0
- log/beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
- log/beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
- log/greedy_search/errs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
- log/greedy_search/errs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
- log/greedy_search/log-decode-epoch-29-avg-13-context-2-max-sym-per-frame-3-2022-03-21-16-33-45 +27 -0
- log/greedy_search/recogs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
- log/greedy_search/recogs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
- log/greedy_search/wer-summary-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +2 -0
- log/greedy_search/wer-summary-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +2 -0
- log/modified_beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/modified_beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/modified_beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-16-57-34 +69 -0
- log/modified_beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/modified_beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
- log/modified_beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
- log/modified_beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
- test_wavs/RESULTS.md +28 -0
- test_wavs/RichBenjamin_2015W01.wav +0 -0
- test_wavs/RichBenjamin_2015W02.wav +0 -0
- test_wavs/RichBenjamin_2015W03.wav +0 -0
README.md
ADDED
@@ -0,0 +1,39 @@
1 |
+
Note: This recipe was trained with the code from PR https://github.com/k2-fsa/icefall/pull/261
|
2 |
+
and the SpecAugment code from PR https://github.com/lhotse-speech/lhotse/pull/604.
|
3 |
+
|
4 |
+
# Pre-trained Transducer-Stateless models for the TEDLium3 dataset with icefall.
|
5 |
+
The model was trained on the full [TEDLium3](https://www.openslr.org/51) dataset with the scripts in [icefall](https://github.com/k2-fsa/icefall).
|
6 |
+
## Training procedure
|
7 |
+
The main repositories are listed below. We will update the training and decoding scripts as new versions are released.
|
8 |
+
k2: https://github.com/k2-fsa/k2
|
9 |
+
icefall: https://github.com/k2-fsa/icefall
|
10 |
+
lhotse: https://github.com/lhotse-speech/lhotse
|
11 |
+
* Install k2 and lhotse. For k2, see the installation guide at https://k2.readthedocs.io/en/latest/installation/index.html; for lhotse, see https://lhotse.readthedocs.io/en/latest/getting-started.html#installation. The latest versions should work. Please also install the requirements listed in icefall.
|
12 |
+
* Clone icefall (https://github.com/k2-fsa/icefall) and check out the commit shown above.
|
13 |
+
```
|
14 |
+
git clone https://github.com/k2-fsa/icefall
|
15 |
+
cd icefall
|
16 |
+
```
|
17 |
+
* Prepare the data.
|
18 |
+
```
|
19 |
+
cd egs/tedlium3/ASR
|
20 |
+
bash ./prepare.sh
|
21 |
+
```
|
22 |
+
* Training
|
23 |
+
```
|
24 |
+
export CUDA_VISIBLE_DEVICES="0,1,2,3"
|
25 |
+
./pruned_transducer_stateless/train.py \
|
26 |
+
--world-size 4 \
|
27 |
+
--num-epochs 30 \
|
28 |
+
--start-epoch 0 \
|
29 |
+
--exp-dir pruned_transducer_stateless/exp \
|
30 |
+
--max-duration 300
|
31 |
+
```
|
32 |
+
## Evaluation results
|
33 |
+
The decoding results (WER%) on the TEDLium3 dev and test sets are listed below. These results were obtained by averaging the models from epoch 17 to epoch 29.
|
34 |
+
The WERs are:
|
35 |
+
| | dev | test | comment |
|
36 |
+
|------------------------------------|------------|------------|------------------------------------------|
|
37 |
+
| greedy search | 7.27 | 6.69 | --epoch 29, --avg 13, --max-duration 100 |
|
38 |
+
| beam search (beam size 4) | 6.70 | 6.04 | --epoch 29, --avg 13, --max-duration 100 |
|
39 |
+
| modified beam search (beam size 4) | 6.72 | 6.12 | --epoch 29, --avg 13, --max-duration 100 |
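
The `--epoch 29, --avg 13` settings in the comment column line up with the averaged range of epochs 17 to 29 mentioned above. The sketch below illustrates that relationship and the idea of parameter averaging in plain Python. It assumes the range runs from `epoch - avg + 1` through `epoch`, which matches the checkpoint names in the decode logs. The helper names `checkpoint_paths` and `average_params` are hypothetical and are not icefall functions, and real checkpoints hold tensors rather than lists of floats.

```python
from typing import Dict, List


def checkpoint_paths(exp_dir: str, epoch: int, avg: int) -> List[str]:
    """List the per-epoch checkpoints covered by --epoch/--avg (assumed convention)."""
    start = epoch - avg + 1
    if avg < 1 or start < 0:
        raise ValueError("avg must be >= 1 and must not reach before epoch 0")
    return [f"{exp_dir}/epoch-{i}.pt" for i in range(start, epoch + 1)]


def average_params(states: List[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Element-wise mean of several parameter dicts (toy stand-in for tensors)."""
    n = len(states)
    return {
        key: [sum(values) / n for values in zip(*(s[key] for s in states))]
        for key in states[0]
    }


paths = checkpoint_paths("pruned_transducer_stateless/exp", epoch=29, avg=13)
print(paths[0], paths[-1], len(paths))

# Toy example: two "checkpoints" with a single two-element parameter.
toy_average = average_params([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}])
print(toy_average)
```

With `epoch=29` and `avg=13` the list starts at `epoch-17.pt` and ends at `epoch-29.pt`, which is the same set of 13 files that the decode logs report averaging.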
|
data/lang_bpe_500/bpe.model
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:f950ca4200a0611ae3a2b2cb561f34ed7f39ae554512dce54134c55aa29d7188
|
3 |
+
size 244890
|
exp/pretrained_average_17_to_29.pt
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:d2743a86f6d631d11183a2aca89428c71e76d749587a64eb5043598bb9c32aa5
|
3 |
+
size 1014598105
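
The two files above are stored as Git LFS pointers: three `key value` lines giving the spec version, the content hash, and the size in bytes. As a rough illustration, the sketch below parses such a pointer and reports the size in megabytes. The function name `parse_lfs_pointer` is hypothetical, and the parser only handles the three keys shown in this diff.

```python
from typing import Dict, Union


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Parse a Git LFS pointer made of 'key value' lines (version, oid, size)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer text copied from exp/pretrained_average_17_to_29.pt in this commit.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d2743a86f6d631d11183a2aca89428c71e76d749587a64eb5043598bb9c32aa5
size 1014598105
"""

pointer = parse_lfs_pointer(pointer_text)
size_mb = pointer["size_bytes"] / 1e6
print(f"{pointer['hash_algo']} digest, {size_mb:.1f} MB")
```

The pointer itself is only a few lines long; the roughly 1 GB checkpoint is fetched separately when the repository is cloned with Git LFS installed.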
|
log/beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-17-40-20
ADDED
@@ -0,0 +1,69 @@
1 |
+
2022-03-21 17:40:20,878 INFO [decode.py:427] Decoding started
|
2 |
+
2022-03-21 17:40:20,878 INFO [decode.py:433] Device: cuda:0
|
3 |
+
2022-03-21 17:40:20,880 INFO [decode.py:443] {'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'epoch': 29, 'avg': 13, 'exp_dir': PosixPath('pruned_transducer_stateless/exp'), 'bpe_model': 'data/lang_bpe_500/bpe.model', 'decoding_method': 'beam_search', 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'res_dir': PosixPath('pruned_transducer_stateless/exp/beam_search'), 'suffix': 'epoch-29-avg-13-beam-4', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
|
4 |
+
2022-03-21 17:40:20,880 INFO [decode.py:445] About to create model
|
5 |
+
2022-03-21 17:40:21,494 INFO [decode.py:456] averaging ['pruned_transducer_stateless/exp/epoch-17.pt', 'pruned_transducer_stateless/exp/epoch-18.pt', 'pruned_transducer_stateless/exp/epoch-19.pt', 'pruned_transducer_stateless/exp/epoch-20.pt', 'pruned_transducer_stateless/exp/epoch-21.pt', 'pruned_transducer_stateless/exp/epoch-22.pt', 'pruned_transducer_stateless/exp/epoch-23.pt', 'pruned_transducer_stateless/exp/epoch-24.pt', 'pruned_transducer_stateless/exp/epoch-25.pt', 'pruned_transducer_stateless/exp/epoch-26.pt', 'pruned_transducer_stateless/exp/epoch-27.pt', 'pruned_transducer_stateless/exp/epoch-28.pt', 'pruned_transducer_stateless/exp/epoch-29.pt']
|
6 |
+
2022-03-21 17:41:03,798 INFO [decode.py:465] Number of model parameters: 84514780
|
7 |
+
2022-03-21 17:41:03,798 INFO [asr_datamodule.py:357] About to get dev cuts
|
8 |
+
2022-03-21 17:41:03,824 INFO [asr_datamodule.py:362] About to get test cuts
|
9 |
+
2022-03-21 17:41:03,877 INFO [asr_datamodule.py:300] About to create dev dataset
|
10 |
+
2022-03-21 17:41:03,878 INFO [asr_datamodule.py:319] About to create dev dataloader
|
11 |
+
2022-03-21 17:41:18,426 INFO [decode.py:352] batch 0/?, cuts processed until now is 22
|
12 |
+
2022-03-21 17:41:47,198 INFO [decode.py:352] batch 2/?, cuts processed until now is 69
|
13 |
+
2022-03-21 17:42:17,014 INFO [decode.py:352] batch 4/?, cuts processed until now is 94
|
14 |
+
2022-03-21 17:42:38,356 INFO [decode.py:352] batch 6/?, cuts processed until now is 126
|
15 |
+
2022-03-21 17:43:07,763 INFO [decode.py:352] batch 8/?, cuts processed until now is 148
|
16 |
+
2022-03-21 17:43:34,568 INFO [decode.py:352] batch 10/?, cuts processed until now is 188
|
17 |
+
2022-03-21 17:44:08,157 INFO [decode.py:352] batch 12/?, cuts processed until now is 201
|
18 |
+
2022-03-21 17:44:38,968 INFO [decode.py:352] batch 14/?, cuts processed until now is 224
|
19 |
+
2022-03-21 17:44:59,567 INFO [decode.py:352] batch 16/?, cuts processed until now is 243
|
20 |
+
2022-03-21 17:45:28,356 INFO [decode.py:352] batch 18/?, cuts processed until now is 278
|
21 |
+
2022-03-21 17:45:57,868 INFO [decode.py:352] batch 20/?, cuts processed until now is 314
|
22 |
+
2022-03-21 17:46:24,161 INFO [decode.py:352] batch 22/?, cuts processed until now is 359
|
23 |
+
2022-03-21 17:46:48,186 INFO [decode.py:352] batch 24/?, cuts processed until now is 380
|
24 |
+
2022-03-21 17:47:12,656 INFO [decode.py:352] batch 26/?, cuts processed until now is 401
|
25 |
+
2022-03-21 17:47:32,501 INFO [decode.py:352] batch 28/?, cuts processed until now is 425
|
26 |
+
2022-03-21 17:47:59,243 INFO [decode.py:352] batch 30/?, cuts processed until now is 445
|
27 |
+
2022-03-21 17:48:18,201 INFO [decode.py:352] batch 32/?, cuts processed until now is 457
|
28 |
+
2022-03-21 17:48:27,738 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt
|
29 |
+
2022-03-21 17:48:27,768 INFO [utils.py:406] [dev-beam_4] %WER 6.70% [1221 / 18226, 187 ins, 378 del, 656 sub ]
|
30 |
+
2022-03-21 17:48:27,851 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt
|
31 |
+
2022-03-21 17:48:27,852 INFO [decode.py:399]
|
32 |
+
For dev, WER of different settings are:
|
33 |
+
beam_4 6.7 best for dev
|
34 |
+
|
35 |
+
2022-03-21 17:48:42,194 INFO [decode.py:352] batch 0/?, cuts processed until now is 29
|
36 |
+
2022-03-21 17:49:09,860 INFO [decode.py:352] batch 2/?, cuts processed until now is 107
|
37 |
+
2022-03-21 17:49:37,820 INFO [decode.py:352] batch 4/?, cuts processed until now is 140
|
38 |
+
2022-03-21 17:50:03,444 INFO [decode.py:352] batch 6/?, cuts processed until now is 216
|
39 |
+
2022-03-21 17:50:32,865 INFO [decode.py:352] batch 8/?, cuts processed until now is 246
|
40 |
+
2022-03-21 17:50:59,370 INFO [decode.py:352] batch 10/?, cuts processed until now is 304
|
41 |
+
2022-03-21 17:51:30,258 INFO [decode.py:352] batch 12/?, cuts processed until now is 320
|
42 |
+
2022-03-21 17:51:59,659 INFO [decode.py:352] batch 14/?, cuts processed until now is 351
|
43 |
+
2022-03-21 17:52:27,258 INFO [decode.py:352] batch 16/?, cuts processed until now is 393
|
44 |
+
2022-03-21 17:52:54,338 INFO [decode.py:352] batch 18/?, cuts processed until now is 434
|
45 |
+
2022-03-21 17:53:23,627 INFO [decode.py:352] batch 20/?, cuts processed until now is 465
|
46 |
+
2022-03-21 17:53:46,037 INFO [decode.py:352] batch 22/?, cuts processed until now is 583
|
47 |
+
2022-03-21 17:54:13,663 INFO [decode.py:352] batch 24/?, cuts processed until now is 624
|
48 |
+
2022-03-21 17:54:41,276 INFO [decode.py:352] batch 26/?, cuts processed until now is 658
|
49 |
+
2022-03-21 17:55:09,769 INFO [decode.py:352] batch 28/?, cuts processed until now is 699
|
50 |
+
2022-03-21 17:55:37,868 INFO [decode.py:352] batch 30/?, cuts processed until now is 738
|
51 |
+
2022-03-21 17:56:05,526 INFO [decode.py:352] batch 32/?, cuts processed until now is 794
|
52 |
+
2022-03-21 17:56:26,775 INFO [decode.py:352] batch 34/?, cuts processed until now is 836
|
53 |
+
2022-03-21 17:56:54,645 INFO [decode.py:352] batch 36/?, cuts processed until now is 881
|
54 |
+
2022-03-21 17:57:23,873 INFO [decode.py:352] batch 38/?, cuts processed until now is 907
|
55 |
+
2022-03-21 17:57:38,814 INFO [decode.py:352] batch 40/?, cuts processed until now is 943
|
56 |
+
2022-03-21 17:58:01,999 INFO [decode.py:352] batch 42/?, cuts processed until now is 979
|
57 |
+
2022-03-21 17:58:30,187 INFO [decode.py:352] batch 44/?, cuts processed until now is 1011
|
58 |
+
2022-03-21 17:58:50,432 INFO [decode.py:352] batch 46/?, cuts processed until now is 1068
|
59 |
+
2022-03-21 17:59:06,546 INFO [decode.py:352] batch 48/?, cuts processed until now is 1084
|
60 |
+
2022-03-21 17:59:31,094 INFO [decode.py:352] batch 50/?, cuts processed until now is 1113
|
61 |
+
2022-03-21 17:59:47,050 INFO [decode.py:352] batch 52/?, cuts processed until now is 1155
|
62 |
+
2022-03-21 17:59:47,184 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt
|
63 |
+
2022-03-21 17:59:47,231 INFO [utils.py:406] [test-beam_4] %WER 6.04% [1717 / 28430, 219 ins, 602 del, 896 sub ]
|
64 |
+
2022-03-21 17:59:47,357 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt
|
65 |
+
2022-03-21 17:59:47,358 INFO [decode.py:399]
|
66 |
+
For test, WER of different settings are:
|
67 |
+
beam_4 6.04 best for test
|
68 |
+
|
69 |
+
2022-03-21 17:59:47,358 INFO [decode.py:491] Done!
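
The `%WER` lines in the log above report the total error count, the number of reference words, and the split into insertions, deletions and substitutions. The sketch below shows how the percentage follows from those counts, using the standard word error rate definition (errors divided by reference words). The function name `wer_percent` is hypothetical; the counts are copied from the beam search log.

```python
def wer_percent(ins: int, dels: int, subs: int, ref_words: int) -> float:
    """Word error rate in percent: (ins + del + sub) / reference words."""
    if ref_words <= 0:
        raise ValueError("ref_words must be positive")
    return 100.0 * (ins + dels + subs) / ref_words


# (insertions, deletions, substitutions, reference words) from the log above.
beam_search_counts = {
    "dev": (187, 378, 656, 18226),
    "test": (219, 602, 896, 28430),
}

wers = {
    name: round(wer_percent(*counts), 2)
    for name, counts in beam_search_counts.items()
}
print(wers)
```

Rounded to two decimals, these values agree with the 6.70 and 6.04 figures that the log prints for dev and test.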
|
log/beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
@@ -0,0 +1,2 @@
1 |
+
settings WER
|
2 |
+
beam_4 6.7
|
log/beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
@@ -0,0 +1,2 @@
1 |
+
settings WER
|
2 |
+
beam_4 6.04
|
log/greedy_search/errs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/greedy_search/errs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/greedy_search/log-decode-epoch-29-avg-13-context-2-max-sym-per-frame-3-2022-03-21-16-33-45
ADDED
@@ -0,0 +1,27 @@
1 |
+
2022-03-21 16:33:45,405 INFO [decode.py:427] Decoding started
|
2 |
+
2022-03-21 16:33:45,405 INFO [decode.py:433] Device: cuda:0
|
3 |
+
2022-03-21 16:33:45,411 INFO [decode.py:443] {'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'epoch': 29, 'avg': 13, 'exp_dir': PosixPath('pruned_transducer_stateless/exp'), 'bpe_model': 'data/lang_bpe_500/bpe.model', 'decoding_method': 'greedy_search', 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'res_dir': PosixPath('pruned_transducer_stateless/exp/greedy_search'), 'suffix': 'epoch-29-avg-13-context-2-max-sym-per-frame-3', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
|
4 |
+
2022-03-21 16:33:45,411 INFO [decode.py:445] About to create model
|
5 |
+
2022-03-21 16:33:46,108 INFO [decode.py:456] averaging ['pruned_transducer_stateless/exp/epoch-17.pt', 'pruned_transducer_stateless/exp/epoch-18.pt', 'pruned_transducer_stateless/exp/epoch-19.pt', 'pruned_transducer_stateless/exp/epoch-20.pt', 'pruned_transducer_stateless/exp/epoch-21.pt', 'pruned_transducer_stateless/exp/epoch-22.pt', 'pruned_transducer_stateless/exp/epoch-23.pt', 'pruned_transducer_stateless/exp/epoch-24.pt', 'pruned_transducer_stateless/exp/epoch-25.pt', 'pruned_transducer_stateless/exp/epoch-26.pt', 'pruned_transducer_stateless/exp/epoch-27.pt', 'pruned_transducer_stateless/exp/epoch-28.pt', 'pruned_transducer_stateless/exp/epoch-29.pt']
|
6 |
+
2022-03-21 16:34:07,582 INFO [decode.py:465] Number of model parameters: 84514780
|
7 |
+
2022-03-21 16:34:07,582 INFO [asr_datamodule.py:357] About to get dev cuts
|
8 |
+
2022-03-21 16:34:07,617 INFO [asr_datamodule.py:362] About to get test cuts
|
9 |
+
2022-03-21 16:34:07,686 INFO [asr_datamodule.py:300] About to create dev dataset
|
10 |
+
2022-03-21 16:34:07,688 INFO [asr_datamodule.py:319] About to create dev dataloader
|
11 |
+
2022-03-21 16:34:11,811 INFO [decode.py:352] batch 0/?, cuts processed until now is 22
|
12 |
+
2022-03-21 16:35:58,689 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/greedy_search/recogs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
|
13 |
+
2022-03-21 16:35:58,722 INFO [utils.py:406] [dev-greedy_search] %WER 7.27% [1325 / 18226, 171 ins, 491 del, 663 sub ]
|
14 |
+
2022-03-21 16:35:58,806 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/greedy_search/errs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
|
15 |
+
2022-03-21 16:35:58,807 INFO [decode.py:399]
|
16 |
+
For dev, WER of different settings are:
|
17 |
+
greedy_search 7.27 best for dev
|
18 |
+
|
19 |
+
2022-03-21 16:36:02,945 INFO [decode.py:352] batch 0/?, cuts processed until now is 29
|
20 |
+
2022-03-21 16:38:54,255 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/greedy_search/recogs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
|
21 |
+
2022-03-21 16:38:54,300 INFO [utils.py:406] [test-greedy_search] %WER 6.69% [1902 / 28430, 197 ins, 802 del, 903 sub ]
|
22 |
+
2022-03-21 16:38:54,399 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/greedy_search/errs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
|
23 |
+
2022-03-21 16:38:54,400 INFO [decode.py:399]
|
24 |
+
For test, WER of different settings are:
|
25 |
+
greedy_search 6.69 best for test
|
26 |
+
|
27 |
+
2022-03-21 16:38:54,400 INFO [decode.py:491] Done!
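
When collecting results from several decode logs, the `%WER` summary line can be pulled out with a regular expression. This is a minimal sketch that assumes the line layout seen in the log above. The pattern and the function name `parse_wer_line` are illustrative and are not part of icefall.

```python
import re
from typing import Dict, Optional, Union

# Matches e.g. "[dev-greedy_search] %WER 7.27% [1325 / 18226, 171 ins, 491 del, 663 sub ]"
WER_LINE = re.compile(
    r"\[(?P<name>[^\]]+)\] %WER (?P<wer>[\d.]+)% "
    r"\[(?P<errs>\d+) / (?P<ref>\d+), "
    r"(?P<ins>\d+) ins, (?P<dels>\d+) del, (?P<subs>\d+) sub \]"
)


def parse_wer_line(line: str) -> Optional[Dict[str, Union[str, float, int]]]:
    """Return the fields of a %WER log line, or None if the line does not match."""
    match = WER_LINE.search(line)
    if match is None:
        return None
    result: Dict[str, Union[str, float, int]] = {
        "name": match["name"],
        "wer": float(match["wer"]),
    }
    for key in ("errs", "ref", "ins", "dels", "subs"):
        result[key] = int(match[key])
    return result


# Line copied from the greedy search log above.
sample = (
    "2022-03-21 16:35:58,722 INFO [utils.py:406] [dev-greedy_search] "
    "%WER 7.27% [1325 / 18226, 171 ins, 491 del, 663 sub ]"
)
parsed = parse_wer_line(sample)
print(parsed)
```

Lines that do not contain a `%WER` summary, such as the batch progress messages, return `None` and can simply be skipped.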
|
log/greedy_search/recogs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/greedy_search/recogs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/greedy_search/wer-summary-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
ADDED
@@ -0,0 +1,2 @@
1 |
+
settings WER
|
2 |
+
greedy_search 7.27
|
log/greedy_search/wer-summary-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
ADDED
@@ -0,0 +1,2 @@
1 |
+
settings WER
|
2 |
+
greedy_search 6.69
|
log/modified_beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/modified_beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/modified_beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-16-57-34
ADDED
@@ -0,0 +1,69 @@
1 |
+
2022-03-21 16:57:34,038 INFO [decode.py:427] Decoding started
|
2 |
+
2022-03-21 16:57:34,039 INFO [decode.py:433] Device: cuda:0
|
3 |
+
2022-03-21 16:57:34,041 INFO [decode.py:443] {'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'epoch': 29, 'avg': 13, 'exp_dir': PosixPath('pruned_transducer_stateless/exp'), 'bpe_model': 'data/lang_bpe_500/bpe.model', 'decoding_method': 'modified_beam_search', 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'res_dir': PosixPath('pruned_transducer_stateless/exp/modified_beam_search'), 'suffix': 'epoch-29-avg-13-beam-4', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
|
4 |
+
2022-03-21 16:57:34,042 INFO [decode.py:445] About to create model
|
5 |
+
2022-03-21 16:57:34,689 INFO [decode.py:456] averaging ['pruned_transducer_stateless/exp/epoch-17.pt', 'pruned_transducer_stateless/exp/epoch-18.pt', 'pruned_transducer_stateless/exp/epoch-19.pt', 'pruned_transducer_stateless/exp/epoch-20.pt', 'pruned_transducer_stateless/exp/epoch-21.pt', 'pruned_transducer_stateless/exp/epoch-22.pt', 'pruned_transducer_stateless/exp/epoch-23.pt', 'pruned_transducer_stateless/exp/epoch-24.pt', 'pruned_transducer_stateless/exp/epoch-25.pt', 'pruned_transducer_stateless/exp/epoch-26.pt', 'pruned_transducer_stateless/exp/epoch-27.pt', 'pruned_transducer_stateless/exp/epoch-28.pt', 'pruned_transducer_stateless/exp/epoch-29.pt']
|
6 |
+
2022-03-21 16:58:25,221 INFO [decode.py:465] Number of model parameters: 84514780
|
7 |
+
2022-03-21 16:58:25,222 INFO [asr_datamodule.py:357] About to get dev cuts
|
8 |
+
2022-03-21 16:58:25,253 INFO [asr_datamodule.py:362] About to get test cuts
|
9 |
+
2022-03-21 16:58:25,323 INFO [asr_datamodule.py:300] About to create dev dataset
|
10 |
+
2022-03-21 16:58:25,324 INFO [asr_datamodule.py:319] About to create dev dataloader
|
11 |
+
2022-03-21 16:58:31,243 INFO [decode.py:352] batch 0/?, cuts processed until now is 22
|
12 |
+
2022-03-21 16:58:42,078 INFO [decode.py:352] batch 2/?, cuts processed until now is 69
|
13 |
+
2022-03-21 16:58:52,943 INFO [decode.py:352] batch 4/?, cuts processed until now is 94
|
14 |
+
2022-03-21 16:59:00,982 INFO [decode.py:352] batch 6/?, cuts processed until now is 126
|
15 |
+
2022-03-21 16:59:11,292 INFO [decode.py:352] batch 8/?, cuts processed until now is 148
|
16 |
+
2022-03-21 16:59:22,146 INFO [decode.py:352] batch 10/?, cuts processed until now is 188
|
17 |
+
2022-03-21 16:59:32,590 INFO [decode.py:352] batch 12/?, cuts processed until now is 201
|
18 |
+
2022-03-21 16:59:42,641 INFO [decode.py:352] batch 14/?, cuts processed until now is 224
|
19 |
+
2022-03-21 16:59:49,883 INFO [decode.py:352] batch 16/?, cuts processed until now is 243
|
20 |
+
2022-03-21 17:00:00,443 INFO [decode.py:352] batch 18/?, cuts processed until now is 278
|
21 |
+
2022-03-21 17:00:11,335 INFO [decode.py:352] batch 20/?, cuts processed until now is 314
|
22 |
+
2022-03-21 17:00:21,748 INFO [decode.py:352] batch 22/?, cuts processed until now is 359
|
23 |
+
2022-03-21 17:00:30,249 INFO [decode.py:352] batch 24/?, cuts processed until now is 380
|
24 |
+
2022-03-21 17:00:38,706 INFO [decode.py:352] batch 26/?, cuts processed until now is 401
|
25 |
+
2022-03-21 17:00:45,626 INFO [decode.py:352] batch 28/?, cuts processed until now is 425
|
26 |
+
2022-03-21 17:00:54,943 INFO [decode.py:352] batch 30/?, cuts processed until now is 445
|
27 |
+
2022-03-21 17:01:01,447 INFO [decode.py:352] batch 32/?, cuts processed until now is 457
|
28 |
+
2022-03-21 17:01:05,423 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/modified_beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt
|
29 |
+
2022-03-21 17:01:05,455 INFO [utils.py:406] [dev-beam_4] %WER 6.72% [1225 / 18226, 175 ins, 397 del, 653 sub ]
|
30 |
+
2022-03-21 17:01:05,535 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/modified_beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt
|
31 |
+
2022-03-21 17:01:05,535 INFO [decode.py:399]
|
32 |
+
For dev, WER of different settings are:
|
33 |
+
beam_4 6.72 best for dev
|
34 |
+
|
35 |
+
2022-03-21 17:01:11,300 INFO [decode.py:352] batch 0/?, cuts processed until now is 29
|
36 |
+
2022-03-21 17:01:22,172 INFO [decode.py:352] batch 2/?, cuts processed until now is 107
|
37 |
+
2022-03-21 17:01:32,987 INFO [decode.py:352] batch 4/?, cuts processed until now is 140
|
38 |
+
2022-03-21 17:01:42,837 INFO [decode.py:352] batch 6/?, cuts processed until now is 216
|
39 |
+
2022-03-21 17:01:53,801 INFO [decode.py:352] batch 8/?, cuts processed until now is 246
|
40 |
+
2022-03-21 17:02:04,846 INFO [decode.py:352] batch 10/?, cuts processed until now is 304
|
41 |
+
2022-03-21 17:02:15,147 INFO [decode.py:352] batch 12/?, cuts processed until now is 320
|
42 |
+
2022-03-21 17:02:25,857 INFO [decode.py:352] batch 14/?, cuts processed until now is 351
|
43 |
+
2022-03-21 17:02:36,852 INFO [decode.py:352] batch 16/?, cuts processed until now is 393
|
44 |
+
2022-03-21 17:02:47,304 INFO [decode.py:352] batch 18/?, cuts processed until now is 434
|
45 |
+
2022-03-21 17:02:57,987 INFO [decode.py:352] batch 20/?, cuts processed until now is 465
|
46 |
+
2022-03-21 17:03:07,180 INFO [decode.py:352] batch 22/?, cuts processed until now is 583
|
47 |
+
2022-03-21 17:03:17,898 INFO [decode.py:352] batch 24/?, cuts processed until now is 624
|
48 |
+
2022-03-21 17:03:28,845 INFO [decode.py:352] batch 26/?, cuts processed until now is 658
|
49 |
+
2022-03-21 17:03:39,757 INFO [decode.py:352] batch 28/?, cuts processed until now is 699
|
50 |
+
2022-03-21 17:03:50,155 INFO [decode.py:352] batch 30/?, cuts processed until now is 738
|
51 |
+
2022-03-21 17:04:01,078 INFO [decode.py:352] batch 32/?, cuts processed until now is 794
|
52 |
+
2022-03-21 17:04:09,367 INFO [decode.py:352] batch 34/?, cuts processed until now is 836
|
53 |
+
2022-03-21 17:04:19,291 INFO [decode.py:352] batch 36/?, cuts processed until now is 881
|
54 |
+
2022-03-21 17:04:26,154 INFO [decode.py:352] batch 38/?, cuts processed until now is 907
|
55 |
+
2022-03-21 17:04:29,678 INFO [decode.py:352] batch 40/?, cuts processed until now is 943
|
56 |
+
2022-03-21 17:04:34,928 INFO [decode.py:352] batch 42/?, cuts processed until now is 979
|
57 |
+
2022-03-21 17:04:41,495 INFO [decode.py:352] batch 44/?, cuts processed until now is 1011
|
58 |
+
2022-03-21 17:04:46,178 INFO [decode.py:352] batch 46/?, cuts processed until now is 1068
|
59 |
+
2022-03-21 17:04:49,535 INFO [decode.py:352] batch 48/?, cuts processed until now is 1084
|
60 |
+
2022-03-21 17:04:55,171 INFO [decode.py:352] batch 50/?, cuts processed until now is 1113
|
61 |
+
2022-03-21 17:04:58,825 INFO [decode.py:352] batch 52/?, cuts processed until now is 1155
|
62 |
+
2022-03-21 17:04:58,950 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/modified_beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt
|
63 |
+
2022-03-21 17:04:58,989 INFO [utils.py:406] [test-beam_4] %WER 6.12% [1741 / 28430, 202 ins, 643 del, 896 sub ]
|
64 |
+
2022-03-21 17:04:59,118 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/modified_beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt
|
65 |
+
2022-03-21 17:04:59,119 INFO [decode.py:399]
|
66 |
+
For test, WER of different settings are:
|
67 |
+
beam_4 6.12 best for test
|
68 |
+
|
69 |
+
2022-03-21 17:04:59,119 INFO [decode.py:491] Done!
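
The first and last timestamps of each decode log give a rough wall-clock comparison of the three decoding methods. The sketch below computes the elapsed time for each run from the "Decoding started" and "Done!" lines in this commit's logs. The function name `elapsed_seconds` is hypothetical. The figures include model loading and cover both dev and test sets in a single run each, so they are an indication rather than a benchmark.

```python
from datetime import datetime

# Timestamp layout used by the logs, e.g. "2022-03-21 16:33:45,405".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# ("Decoding started", "Done!") timestamps copied from the three logs.
runs = {
    "greedy_search": ("2022-03-21 16:33:45,405", "2022-03-21 16:38:54,400"),
    "modified_beam_search": ("2022-03-21 16:57:34,038", "2022-03-21 17:04:59,119"),
    "beam_search": ("2022-03-21 17:40:20,878", "2022-03-21 17:59:47,358"),
}

elapsed = {name: elapsed_seconds(start, end) for name, (start, end) in runs.items()}
for name, seconds in elapsed.items():
    print(f"{name}: {seconds / 60:.1f} min")
```

In these particular runs, greedy search finished fastest and beam search took the longest, with modified beam search in between.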
|
log/modified_beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/modified_beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
The diff for this file is too large to render.
See raw diff
|
|
log/modified_beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
@@ -0,0 +1,2 @@
1 |
+
settings WER
|
2 |
+
beam_4 6.72
|
log/modified_beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt
ADDED
@@ -0,0 +1,2 @@
1 |
+
settings WER
|
2 |
+
beam_4 6.12
|
test_wavs/RESULTS.md
ADDED
@@ -0,0 +1,28 @@
1 |
+
You can use the following command to test pretrained.py with the pre-trained files:
|
2 |
+
```
|
3 |
+
CUDA_VISIBLE_DEVICES='1' python pruned_transducer_stateless/pretrained.py \
  --checkpoint icefall_asr_tedlium3_pruned_transducer_stateless/exp/pretrained_average_17_to_29.pt \
  --bpe-model icefall_asr_tedlium3_pruned_transducer_stateless/data/lang_bpe_500/bpe.model \
  --method greedy_search \
  icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav \
  icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav \
  icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav
|
4 |
+
```
|
5 |
+
|
6 |
+
The output is as follows:
|
7 |
+
```
|
8 |
+
2022-03-21 16:19:46,125 INFO [pretrained.py:253] {'sample_rate': 16000, 'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'checkpoint': 'icefall_asr_tedlium3_pruned_transducer_stateless/exp/pretrained_average_17_to_29.pt', 'bpe_model': 'icefall_asr_tedlium3_pruned_transducer_stateless/data/lang_bpe_500/bpe.model', 'method': 'greedy_search', 'sound_files': ['icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav'], 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
|
9 |
+
2022-03-21 16:19:46,125 INFO [pretrained.py:259] device: cuda:0
|
10 |
+
2022-03-21 16:19:46,125 INFO [pretrained.py:261] Creating model
|
11 |
+
2022-03-21 16:20:03,351 INFO [pretrained.py:270] Constructing Fbank computer
|
12 |
+
2022-03-21 16:20:03,354 INFO [pretrained.py:280] Reading sound files: ['icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav']
|
13 |
+
2022-03-21 16:20:03,379 INFO [pretrained.py:286] Decoding started
|
14 |
+
2022-03-21 16:20:03,519 INFO [pretrained.py:306] Using greedy_search
|
15 |
+
2022-03-21 16:20:05,040 INFO [pretrained.py:334]
|
16 |
+
icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav:
|
17 |
+
choice isn 't it i don 't live there but i did journey on a twenty seven thousand mile trip for two years to the fastest growing and widest counties in america what is a whitopia i define whitopia in three ways first a whitopia has posted at least six percent
|
18 |
+
|
19 |
+
icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav:
|
20 |
+
population growth since two thousand secondly the majority of that growth comes from white migrants and third a whitopia has an ineffable charm a pleasant look and feel a jenesequa to learn how and why whitopias are ticking i immersed myself for several months apiece in three of them first
|
21 |
+
|
22 |
+
icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav:
|
23 |
+
st george utah second coeuralene idaho and third for siteth county georgia first stop st george a beautiful town of red rock landscapes in the 1850s brigham young dispatched families to st george to grow cotton because of the hot arid climate and so they called it utah 's dixie and the name sticks to this day
|
24 |
+
|
25 |
+
|
26 |
+
2022-03-21 16:20:05,040 INFO [pretrained.py:336] Decoding Done
|
27 |
+
|
28 |
+
```
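
For scripted use, the same command can be assembled as an argument list in Python. This is only a sketch: the function name `build_pretrained_cmd` is hypothetical, the flags are the ones used in the command above, and only `greedy_search` is confirmed for `--method` by this file. The block prints the command rather than running it, since running it requires an icefall checkout and the downloaded model files.

```python
import shlex
from typing import List


def build_pretrained_cmd(repo_dir: str, wav_names: List[str],
                         method: str = "greedy_search") -> List[str]:
    """Assemble the pretrained.py invocation shown above as an argv list."""
    return [
        "python",
        "pruned_transducer_stateless/pretrained.py",
        "--checkpoint", f"{repo_dir}/exp/pretrained_average_17_to_29.pt",
        "--bpe-model", f"{repo_dir}/data/lang_bpe_500/bpe.model",
        "--method", method,
        *[f"{repo_dir}/test_wavs/{name}" for name in wav_names],
    ]


repo_dir = "icefall_asr_tedlium3_pruned_transducer_stateless"
wav_names = [f"RichBenjamin_2015W0{i}.wav" for i in (1, 2, 3)]
cmd = build_pretrained_cmd(repo_dir, wav_names)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))

# To run it from the icefall recipe directory, something like:
#   subprocess.run(cmd, check=True, env={**os.environ, "CUDA_VISIBLE_DEVICES": "1"})
```

Passing the arguments as a list avoids shell quoting problems if the model directory or wave file names contain spaces.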
|
test_wavs/RichBenjamin_2015W01.wav
ADDED
Binary file (961 kB). View file
|
|
test_wavs/RichBenjamin_2015W02.wav
ADDED
Binary file (961 kB). View file
|
|
test_wavs/RichBenjamin_2015W03.wav
ADDED
Binary file (959 kB). View file
|
|