luomingshuang committed on
Commit
70628de
1 Parent(s): 6e17241

add tedlium3-pruned-transducer-stateless files

Files changed (28)
  1. README.md +39 -0
  2. data/lang_bpe_500/bpe.model +3 -0
  3. exp/pretrained_average_17_to_29.pt +3 -0
  4. log/beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  5. log/beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  6. log/beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-17-40-20 +69 -0
  7. log/beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  8. log/beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  9. log/beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
  10. log/beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
  11. log/greedy_search/errs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
  12. log/greedy_search/errs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
  13. log/greedy_search/log-decode-epoch-29-avg-13-context-2-max-sym-per-frame-3-2022-03-21-16-33-45 +27 -0
  14. log/greedy_search/recogs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
  15. log/greedy_search/recogs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +0 -0
  16. log/greedy_search/wer-summary-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +2 -0
  17. log/greedy_search/wer-summary-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt +2 -0
  18. log/modified_beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  19. log/modified_beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  20. log/modified_beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-16-57-34 +69 -0
  21. log/modified_beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  22. log/modified_beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt +0 -0
  23. log/modified_beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
  24. log/modified_beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt +2 -0
  25. test_wavs/RESULTS.md +28 -0
  26. test_wavs/RichBenjamin_2015W01.wav +0 -0
  27. test_wavs/RichBenjamin_2015W02.wav +0 -0
  28. test_wavs/RichBenjamin_2015W03.wav +0 -0
README.md ADDED
@@ -0,0 +1,39 @@
+ Note: This recipe was trained with the code from PR https://github.com/k2-fsa/icefall/pull/261
+ and the SpecAugment code from PR https://github.com/lhotse-speech/lhotse/pull/604.
+
+ # Pre-trained Transducer-Stateless models for the TEDLium3 dataset with icefall.
+ The model was trained on the full [TEDLium3](https://www.openslr.org/51) dataset with the scripts in [icefall](https://github.com/k2-fsa/icefall).
+ ## Training procedure
+ The main repositories are listed below. We will update the training and decoding scripts as new versions are released.
+ k2: https://github.com/k2-fsa/k2
+ icefall: https://github.com/k2-fsa/icefall
+ lhotse: https://github.com/lhotse-speech/lhotse
+ * Install k2 and lhotse. For k2, see the installation guide at https://k2.readthedocs.io/en/latest/installation/index.html; for lhotse, see https://lhotse.readthedocs.io/en/latest/getting-started.html#installation. The latest versions should work. Also install the requirements listed in icefall.
+ * Clone icefall (https://github.com/k2-fsa/icefall) and check out the commit shown above.
+ ```
+ git clone https://github.com/k2-fsa/icefall
+ cd icefall
+ ```
+ * Prepare the data.
+ ```
+ cd egs/tedlium3/ASR
+ bash ./prepare.sh
+ ```
+ * Train the model.
+ ```
+ export CUDA_VISIBLE_DEVICES="0,1,2,3"
+ ./pruned_transducer_stateless/train.py \
+   --world-size 4 \
+   --num-epochs 30 \
+   --start-epoch 0 \
+   --exp-dir pruned_transducer_stateless/exp \
+   --max-duration 300
+ ```
+ ## Evaluation results
+ The decoding results (WER%) on TEDLium3 (dev and test) are listed below. They were obtained by averaging the models from epoch 17 to epoch 29.
+ The WERs are:
+ | decoding method                    | dev        | test       | comment                                  |
+ |------------------------------------|------------|------------|------------------------------------------|
+ | greedy search                      | 7.27       | 6.69       | --epoch 29, --avg 13, --max-duration 100 |
+ | beam search (beam size 4)          | 6.70       | 6.04       | --epoch 29, --avg 13, --max-duration 100 |
+ | modified beam search (beam size 4) | 6.72       | 6.12       | --epoch 29, --avg 13, --max-duration 100 |
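The results table reports `--epoch 29, --avg 13`, while the text says the models from epoch 17 to 29 were averaged. The sketch below shows how the two descriptions relate, assuming that the last `avg` checkpoints ending at `epoch` (inclusive) are selected, which is consistent with the checkpoint lists printed in the decode logs. The helper names `checkpoints_to_average` and `average_parameters` are hypothetical, and plain floats stand in for tensors, so this illustrates the idea rather than icefall's actual implementation.

```python
from typing import Dict, List


def checkpoints_to_average(epoch: int, avg: int, exp_dir: str) -> List[str]:
    """Return the checkpoint paths selected by --epoch and --avg.

    Assumption: the last `avg` epochs ending at `epoch` (inclusive) are used.
    """
    start = epoch - avg + 1
    if start < 0:
        raise ValueError("avg is larger than the number of available epochs")
    return [f"{exp_dir}/epoch-{i}.pt" for i in range(start, epoch + 1)]


def average_parameters(states: List[Dict[str, float]]) -> Dict[str, float]:
    """Element-wise mean of parameter dicts (floats stand in for tensors)."""
    n = len(states)
    return {name: sum(state[name] for state in states) / n for name in states[0]}


paths = checkpoints_to_average(
    epoch=29, avg=13, exp_dir="pruned_transducer_stateless/exp"
)
print(paths[0], paths[-1], len(paths))

# Toy example: two "checkpoints" with two scalar parameters each.
toy_states = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}]
print(average_parameters(toy_states))
```

With `epoch=29` and `avg=13` the first selected checkpoint is `epoch-17.pt`, matching the name of the released file `pretrained_average_17_to_29.pt`.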
data/lang_bpe_500/bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f950ca4200a0611ae3a2b2cb561f34ed7f39ae554512dce54134c55aa29d7188
+ size 244890
exp/pretrained_average_17_to_29.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d2743a86f6d631d11183a2aca89428c71e76d749587a64eb5043598bb9c32aa5
+ size 1014598105
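The BPE model and the averaged checkpoint are committed as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. A minimal sketch of reading such a pointer, assuming exactly this three-field layout (the function name `parse_lfs_pointer` is hypothetical):

```python
from typing import Dict

# Pointer text copied from exp/pretrained_average_17_to_29.pt in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:d2743a86f6d631d11183a2aca89428c71e76d749587a64eb5043598bb9c32aa5
size 1014598105
"""


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each non-empty 'key value' line at the first space."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
algorithm, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
print(algorithm, len(digest), f"{size_bytes / 1e9:.2f} GB")
```

The pointer only identifies the object; the actual file has to be fetched through Git LFS before `pretrained.py` can load it.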
log/beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-17-40-20 ADDED
@@ -0,0 +1,69 @@
+ 2022-03-21 17:40:20,878 INFO [decode.py:427] Decoding started
+ 2022-03-21 17:40:20,878 INFO [decode.py:433] Device: cuda:0
+ 2022-03-21 17:40:20,880 INFO [decode.py:443] {'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'epoch': 29, 'avg': 13, 'exp_dir': PosixPath('pruned_transducer_stateless/exp'), 'bpe_model': 'data/lang_bpe_500/bpe.model', 'decoding_method': 'beam_search', 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'res_dir': PosixPath('pruned_transducer_stateless/exp/beam_search'), 'suffix': 'epoch-29-avg-13-beam-4', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
+ 2022-03-21 17:40:20,880 INFO [decode.py:445] About to create model
+ 2022-03-21 17:40:21,494 INFO [decode.py:456] averaging ['pruned_transducer_stateless/exp/epoch-17.pt', 'pruned_transducer_stateless/exp/epoch-18.pt', 'pruned_transducer_stateless/exp/epoch-19.pt', 'pruned_transducer_stateless/exp/epoch-20.pt', 'pruned_transducer_stateless/exp/epoch-21.pt', 'pruned_transducer_stateless/exp/epoch-22.pt', 'pruned_transducer_stateless/exp/epoch-23.pt', 'pruned_transducer_stateless/exp/epoch-24.pt', 'pruned_transducer_stateless/exp/epoch-25.pt', 'pruned_transducer_stateless/exp/epoch-26.pt', 'pruned_transducer_stateless/exp/epoch-27.pt', 'pruned_transducer_stateless/exp/epoch-28.pt', 'pruned_transducer_stateless/exp/epoch-29.pt']
+ 2022-03-21 17:41:03,798 INFO [decode.py:465] Number of model parameters: 84514780
+ 2022-03-21 17:41:03,798 INFO [asr_datamodule.py:357] About to get dev cuts
+ 2022-03-21 17:41:03,824 INFO [asr_datamodule.py:362] About to get test cuts
+ 2022-03-21 17:41:03,877 INFO [asr_datamodule.py:300] About to create dev dataset
+ 2022-03-21 17:41:03,878 INFO [asr_datamodule.py:319] About to create dev dataloader
+ 2022-03-21 17:41:18,426 INFO [decode.py:352] batch 0/?, cuts processed until now is 22
+ 2022-03-21 17:41:47,198 INFO [decode.py:352] batch 2/?, cuts processed until now is 69
+ 2022-03-21 17:42:17,014 INFO [decode.py:352] batch 4/?, cuts processed until now is 94
+ 2022-03-21 17:42:38,356 INFO [decode.py:352] batch 6/?, cuts processed until now is 126
+ 2022-03-21 17:43:07,763 INFO [decode.py:352] batch 8/?, cuts processed until now is 148
+ 2022-03-21 17:43:34,568 INFO [decode.py:352] batch 10/?, cuts processed until now is 188
+ 2022-03-21 17:44:08,157 INFO [decode.py:352] batch 12/?, cuts processed until now is 201
+ 2022-03-21 17:44:38,968 INFO [decode.py:352] batch 14/?, cuts processed until now is 224
+ 2022-03-21 17:44:59,567 INFO [decode.py:352] batch 16/?, cuts processed until now is 243
+ 2022-03-21 17:45:28,356 INFO [decode.py:352] batch 18/?, cuts processed until now is 278
+ 2022-03-21 17:45:57,868 INFO [decode.py:352] batch 20/?, cuts processed until now is 314
+ 2022-03-21 17:46:24,161 INFO [decode.py:352] batch 22/?, cuts processed until now is 359
+ 2022-03-21 17:46:48,186 INFO [decode.py:352] batch 24/?, cuts processed until now is 380
+ 2022-03-21 17:47:12,656 INFO [decode.py:352] batch 26/?, cuts processed until now is 401
+ 2022-03-21 17:47:32,501 INFO [decode.py:352] batch 28/?, cuts processed until now is 425
+ 2022-03-21 17:47:59,243 INFO [decode.py:352] batch 30/?, cuts processed until now is 445
+ 2022-03-21 17:48:18,201 INFO [decode.py:352] batch 32/?, cuts processed until now is 457
+ 2022-03-21 17:48:27,738 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:48:27,768 INFO [utils.py:406] [dev-beam_4] %WER 6.70% [1221 / 18226, 187 ins, 378 del, 656 sub ]
+ 2022-03-21 17:48:27,851 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:48:27,852 INFO [decode.py:399]
+ For dev, WER of different settings are:
+ beam_4 6.7 best for dev
+
+ 2022-03-21 17:48:42,194 INFO [decode.py:352] batch 0/?, cuts processed until now is 29
+ 2022-03-21 17:49:09,860 INFO [decode.py:352] batch 2/?, cuts processed until now is 107
+ 2022-03-21 17:49:37,820 INFO [decode.py:352] batch 4/?, cuts processed until now is 140
+ 2022-03-21 17:50:03,444 INFO [decode.py:352] batch 6/?, cuts processed until now is 216
+ 2022-03-21 17:50:32,865 INFO [decode.py:352] batch 8/?, cuts processed until now is 246
+ 2022-03-21 17:50:59,370 INFO [decode.py:352] batch 10/?, cuts processed until now is 304
+ 2022-03-21 17:51:30,258 INFO [decode.py:352] batch 12/?, cuts processed until now is 320
+ 2022-03-21 17:51:59,659 INFO [decode.py:352] batch 14/?, cuts processed until now is 351
+ 2022-03-21 17:52:27,258 INFO [decode.py:352] batch 16/?, cuts processed until now is 393
+ 2022-03-21 17:52:54,338 INFO [decode.py:352] batch 18/?, cuts processed until now is 434
+ 2022-03-21 17:53:23,627 INFO [decode.py:352] batch 20/?, cuts processed until now is 465
+ 2022-03-21 17:53:46,037 INFO [decode.py:352] batch 22/?, cuts processed until now is 583
+ 2022-03-21 17:54:13,663 INFO [decode.py:352] batch 24/?, cuts processed until now is 624
+ 2022-03-21 17:54:41,276 INFO [decode.py:352] batch 26/?, cuts processed until now is 658
+ 2022-03-21 17:55:09,769 INFO [decode.py:352] batch 28/?, cuts processed until now is 699
+ 2022-03-21 17:55:37,868 INFO [decode.py:352] batch 30/?, cuts processed until now is 738
+ 2022-03-21 17:56:05,526 INFO [decode.py:352] batch 32/?, cuts processed until now is 794
+ 2022-03-21 17:56:26,775 INFO [decode.py:352] batch 34/?, cuts processed until now is 836
+ 2022-03-21 17:56:54,645 INFO [decode.py:352] batch 36/?, cuts processed until now is 881
+ 2022-03-21 17:57:23,873 INFO [decode.py:352] batch 38/?, cuts processed until now is 907
+ 2022-03-21 17:57:38,814 INFO [decode.py:352] batch 40/?, cuts processed until now is 943
+ 2022-03-21 17:58:01,999 INFO [decode.py:352] batch 42/?, cuts processed until now is 979
+ 2022-03-21 17:58:30,187 INFO [decode.py:352] batch 44/?, cuts processed until now is 1011
+ 2022-03-21 17:58:50,432 INFO [decode.py:352] batch 46/?, cuts processed until now is 1068
+ 2022-03-21 17:59:06,546 INFO [decode.py:352] batch 48/?, cuts processed until now is 1084
+ 2022-03-21 17:59:31,094 INFO [decode.py:352] batch 50/?, cuts processed until now is 1113
+ 2022-03-21 17:59:47,050 INFO [decode.py:352] batch 52/?, cuts processed until now is 1155
+ 2022-03-21 17:59:47,184 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:59:47,231 INFO [utils.py:406] [test-beam_4] %WER 6.04% [1717 / 28430, 219 ins, 602 del, 896 sub ]
+ 2022-03-21 17:59:47,357 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:59:47,358 INFO [decode.py:399]
+ For test, WER of different settings are:
+ beam_4 6.04 best for test
+
+ 2022-03-21 17:59:47,358 INFO [decode.py:491] Done!
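Each `%WER` line in the log reports the total error count, the number of reference words, and the split into insertions, deletions, and substitutions. The sketch below parses one such line and recomputes the percentage as (ins + del + sub) / reference words. The regular expression and the function name `parse_wer_line` are assumptions based only on the line format visible in this log.

```python
import re
from typing import Dict

# Line copied from the beam search log above (dev set).
WER_LINE = "[dev-beam_4] %WER 6.70% [1221 / 18226, 187 ins, 378 del, 656 sub ]"

WER_PATTERN = re.compile(
    r"%WER (?P<wer>[\d.]+)% \[(?P<errs>\d+) / (?P<ref>\d+), "
    r"(?P<ins>\d+) ins, (?P<dels>\d+) del, (?P<sub>\d+) sub \]"
)


def parse_wer_line(line: str) -> Dict[str, int]:
    """Extract the integer error counts from a '%WER' log line."""
    match = WER_PATTERN.search(line)
    if match is None:
        raise ValueError("not a %WER line")
    counts = match.groupdict()
    counts.pop("wer")  # the reported percentage is recomputed below
    return {name: int(value) for name, value in counts.items()}


stats = parse_wer_line(WER_LINE)
total_errors = stats["ins"] + stats["dels"] + stats["sub"]
recomputed_wer = 100.0 * total_errors / stats["ref"]
print(total_errors, f"{recomputed_wer:.2f}")
```

The same check can be run on the test-set line (`1717 / 28430`) or on the `%WER` lines of the other decoding logs.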
log/beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt ADDED
@@ -0,0 +1,2 @@
+ settings WER
+ beam_4 6.7
log/beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt ADDED
@@ -0,0 +1,2 @@
+ settings WER
+ beam_4 6.04
log/greedy_search/errs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt ADDED
The diff for this file is too large to render. See raw diff
log/greedy_search/errs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt ADDED
The diff for this file is too large to render. See raw diff
log/greedy_search/log-decode-epoch-29-avg-13-context-2-max-sym-per-frame-3-2022-03-21-16-33-45 ADDED
@@ -0,0 +1,27 @@
+ 2022-03-21 16:33:45,405 INFO [decode.py:427] Decoding started
+ 2022-03-21 16:33:45,405 INFO [decode.py:433] Device: cuda:0
+ 2022-03-21 16:33:45,411 INFO [decode.py:443] {'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'epoch': 29, 'avg': 13, 'exp_dir': PosixPath('pruned_transducer_stateless/exp'), 'bpe_model': 'data/lang_bpe_500/bpe.model', 'decoding_method': 'greedy_search', 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'res_dir': PosixPath('pruned_transducer_stateless/exp/greedy_search'), 'suffix': 'epoch-29-avg-13-context-2-max-sym-per-frame-3', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
+ 2022-03-21 16:33:45,411 INFO [decode.py:445] About to create model
+ 2022-03-21 16:33:46,108 INFO [decode.py:456] averaging ['pruned_transducer_stateless/exp/epoch-17.pt', 'pruned_transducer_stateless/exp/epoch-18.pt', 'pruned_transducer_stateless/exp/epoch-19.pt', 'pruned_transducer_stateless/exp/epoch-20.pt', 'pruned_transducer_stateless/exp/epoch-21.pt', 'pruned_transducer_stateless/exp/epoch-22.pt', 'pruned_transducer_stateless/exp/epoch-23.pt', 'pruned_transducer_stateless/exp/epoch-24.pt', 'pruned_transducer_stateless/exp/epoch-25.pt', 'pruned_transducer_stateless/exp/epoch-26.pt', 'pruned_transducer_stateless/exp/epoch-27.pt', 'pruned_transducer_stateless/exp/epoch-28.pt', 'pruned_transducer_stateless/exp/epoch-29.pt']
+ 2022-03-21 16:34:07,582 INFO [decode.py:465] Number of model parameters: 84514780
+ 2022-03-21 16:34:07,582 INFO [asr_datamodule.py:357] About to get dev cuts
+ 2022-03-21 16:34:07,617 INFO [asr_datamodule.py:362] About to get test cuts
+ 2022-03-21 16:34:07,686 INFO [asr_datamodule.py:300] About to create dev dataset
+ 2022-03-21 16:34:07,688 INFO [asr_datamodule.py:319] About to create dev dataloader
+ 2022-03-21 16:34:11,811 INFO [decode.py:352] batch 0/?, cuts processed until now is 22
+ 2022-03-21 16:35:58,689 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/greedy_search/recogs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
+ 2022-03-21 16:35:58,722 INFO [utils.py:406] [dev-greedy_search] %WER 7.27% [1325 / 18226, 171 ins, 491 del, 663 sub ]
+ 2022-03-21 16:35:58,806 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/greedy_search/errs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
+ 2022-03-21 16:35:58,807 INFO [decode.py:399]
+ For dev, WER of different settings are:
+ greedy_search 7.27 best for dev
+
+ 2022-03-21 16:36:02,945 INFO [decode.py:352] batch 0/?, cuts processed until now is 29
+ 2022-03-21 16:38:54,255 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/greedy_search/recogs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
+ 2022-03-21 16:38:54,300 INFO [utils.py:406] [test-greedy_search] %WER 6.69% [1902 / 28430, 197 ins, 802 del, 903 sub ]
+ 2022-03-21 16:38:54,399 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/greedy_search/errs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt
+ 2022-03-21 16:38:54,400 INFO [decode.py:399]
+ For test, WER of different settings are:
+ greedy_search 6.69 best for test
+
+ 2022-03-21 16:38:54,400 INFO [decode.py:491] Done!
log/greedy_search/recogs-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt ADDED
The diff for this file is too large to render. See raw diff
log/greedy_search/recogs-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt ADDED
The diff for this file is too large to render. See raw diff
log/greedy_search/wer-summary-dev-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt ADDED
@@ -0,0 +1,2 @@
+ settings WER
+ greedy_search 7.27
log/greedy_search/wer-summary-test-greedy_search-epoch-29-avg-13-context-2-max-sym-per-frame-3.txt ADDED
@@ -0,0 +1,2 @@
+ settings WER
+ greedy_search 6.69
log/modified_beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/modified_beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/modified_beam_search/log-decode-epoch-29-avg-13-beam-4-2022-03-21-16-57-34 ADDED
@@ -0,0 +1,69 @@
+ 2022-03-21 16:57:34,038 INFO [decode.py:427] Decoding started
+ 2022-03-21 16:57:34,039 INFO [decode.py:433] Device: cuda:0
+ 2022-03-21 16:57:34,041 INFO [decode.py:443] {'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'epoch': 29, 'avg': 13, 'exp_dir': PosixPath('pruned_transducer_stateless/exp'), 'bpe_model': 'data/lang_bpe_500/bpe.model', 'decoding_method': 'modified_beam_search', 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200.0, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'res_dir': PosixPath('pruned_transducer_stateless/exp/modified_beam_search'), 'suffix': 'epoch-29-avg-13-beam-4', 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
+ 2022-03-21 16:57:34,042 INFO [decode.py:445] About to create model
+ 2022-03-21 16:57:34,689 INFO [decode.py:456] averaging ['pruned_transducer_stateless/exp/epoch-17.pt', 'pruned_transducer_stateless/exp/epoch-18.pt', 'pruned_transducer_stateless/exp/epoch-19.pt', 'pruned_transducer_stateless/exp/epoch-20.pt', 'pruned_transducer_stateless/exp/epoch-21.pt', 'pruned_transducer_stateless/exp/epoch-22.pt', 'pruned_transducer_stateless/exp/epoch-23.pt', 'pruned_transducer_stateless/exp/epoch-24.pt', 'pruned_transducer_stateless/exp/epoch-25.pt', 'pruned_transducer_stateless/exp/epoch-26.pt', 'pruned_transducer_stateless/exp/epoch-27.pt', 'pruned_transducer_stateless/exp/epoch-28.pt', 'pruned_transducer_stateless/exp/epoch-29.pt']
+ 2022-03-21 16:58:25,221 INFO [decode.py:465] Number of model parameters: 84514780
+ 2022-03-21 16:58:25,222 INFO [asr_datamodule.py:357] About to get dev cuts
+ 2022-03-21 16:58:25,253 INFO [asr_datamodule.py:362] About to get test cuts
+ 2022-03-21 16:58:25,323 INFO [asr_datamodule.py:300] About to create dev dataset
+ 2022-03-21 16:58:25,324 INFO [asr_datamodule.py:319] About to create dev dataloader
+ 2022-03-21 16:58:31,243 INFO [decode.py:352] batch 0/?, cuts processed until now is 22
+ 2022-03-21 16:58:42,078 INFO [decode.py:352] batch 2/?, cuts processed until now is 69
+ 2022-03-21 16:58:52,943 INFO [decode.py:352] batch 4/?, cuts processed until now is 94
+ 2022-03-21 16:59:00,982 INFO [decode.py:352] batch 6/?, cuts processed until now is 126
+ 2022-03-21 16:59:11,292 INFO [decode.py:352] batch 8/?, cuts processed until now is 148
+ 2022-03-21 16:59:22,146 INFO [decode.py:352] batch 10/?, cuts processed until now is 188
+ 2022-03-21 16:59:32,590 INFO [decode.py:352] batch 12/?, cuts processed until now is 201
+ 2022-03-21 16:59:42,641 INFO [decode.py:352] batch 14/?, cuts processed until now is 224
+ 2022-03-21 16:59:49,883 INFO [decode.py:352] batch 16/?, cuts processed until now is 243
+ 2022-03-21 17:00:00,443 INFO [decode.py:352] batch 18/?, cuts processed until now is 278
+ 2022-03-21 17:00:11,335 INFO [decode.py:352] batch 20/?, cuts processed until now is 314
+ 2022-03-21 17:00:21,748 INFO [decode.py:352] batch 22/?, cuts processed until now is 359
+ 2022-03-21 17:00:30,249 INFO [decode.py:352] batch 24/?, cuts processed until now is 380
+ 2022-03-21 17:00:38,706 INFO [decode.py:352] batch 26/?, cuts processed until now is 401
+ 2022-03-21 17:00:45,626 INFO [decode.py:352] batch 28/?, cuts processed until now is 425
+ 2022-03-21 17:00:54,943 INFO [decode.py:352] batch 30/?, cuts processed until now is 445
+ 2022-03-21 17:01:01,447 INFO [decode.py:352] batch 32/?, cuts processed until now is 457
+ 2022-03-21 17:01:05,423 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/modified_beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:01:05,455 INFO [utils.py:406] [dev-beam_4] %WER 6.72% [1225 / 18226, 175 ins, 397 del, 653 sub ]
+ 2022-03-21 17:01:05,535 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/modified_beam_search/errs-dev-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:01:05,535 INFO [decode.py:399]
+ For dev, WER of different settings are:
+ beam_4 6.72 best for dev
+
+ 2022-03-21 17:01:11,300 INFO [decode.py:352] batch 0/?, cuts processed until now is 29
+ 2022-03-21 17:01:22,172 INFO [decode.py:352] batch 2/?, cuts processed until now is 107
+ 2022-03-21 17:01:32,987 INFO [decode.py:352] batch 4/?, cuts processed until now is 140
+ 2022-03-21 17:01:42,837 INFO [decode.py:352] batch 6/?, cuts processed until now is 216
+ 2022-03-21 17:01:53,801 INFO [decode.py:352] batch 8/?, cuts processed until now is 246
+ 2022-03-21 17:02:04,846 INFO [decode.py:352] batch 10/?, cuts processed until now is 304
+ 2022-03-21 17:02:15,147 INFO [decode.py:352] batch 12/?, cuts processed until now is 320
+ 2022-03-21 17:02:25,857 INFO [decode.py:352] batch 14/?, cuts processed until now is 351
+ 2022-03-21 17:02:36,852 INFO [decode.py:352] batch 16/?, cuts processed until now is 393
+ 2022-03-21 17:02:47,304 INFO [decode.py:352] batch 18/?, cuts processed until now is 434
+ 2022-03-21 17:02:57,987 INFO [decode.py:352] batch 20/?, cuts processed until now is 465
+ 2022-03-21 17:03:07,180 INFO [decode.py:352] batch 22/?, cuts processed until now is 583
+ 2022-03-21 17:03:17,898 INFO [decode.py:352] batch 24/?, cuts processed until now is 624
+ 2022-03-21 17:03:28,845 INFO [decode.py:352] batch 26/?, cuts processed until now is 658
+ 2022-03-21 17:03:39,757 INFO [decode.py:352] batch 28/?, cuts processed until now is 699
+ 2022-03-21 17:03:50,155 INFO [decode.py:352] batch 30/?, cuts processed until now is 738
+ 2022-03-21 17:04:01,078 INFO [decode.py:352] batch 32/?, cuts processed until now is 794
+ 2022-03-21 17:04:09,367 INFO [decode.py:352] batch 34/?, cuts processed until now is 836
+ 2022-03-21 17:04:19,291 INFO [decode.py:352] batch 36/?, cuts processed until now is 881
+ 2022-03-21 17:04:26,154 INFO [decode.py:352] batch 38/?, cuts processed until now is 907
+ 2022-03-21 17:04:29,678 INFO [decode.py:352] batch 40/?, cuts processed until now is 943
+ 2022-03-21 17:04:34,928 INFO [decode.py:352] batch 42/?, cuts processed until now is 979
+ 2022-03-21 17:04:41,495 INFO [decode.py:352] batch 44/?, cuts processed until now is 1011
+ 2022-03-21 17:04:46,178 INFO [decode.py:352] batch 46/?, cuts processed until now is 1068
+ 2022-03-21 17:04:49,535 INFO [decode.py:352] batch 48/?, cuts processed until now is 1084
+ 2022-03-21 17:04:55,171 INFO [decode.py:352] batch 50/?, cuts processed until now is 1113
+ 2022-03-21 17:04:58,825 INFO [decode.py:352] batch 52/?, cuts processed until now is 1155
+ 2022-03-21 17:04:58,950 INFO [decode.py:369] The transcripts are stored in pruned_transducer_stateless/exp/modified_beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:04:58,989 INFO [utils.py:406] [test-beam_4] %WER 6.12% [1741 / 28430, 202 ins, 643 del, 896 sub ]
+ 2022-03-21 17:04:59,118 INFO [decode.py:382] Wrote detailed error stats to pruned_transducer_stateless/exp/modified_beam_search/errs-test-beam_4-epoch-29-avg-13-beam-4.txt
+ 2022-03-21 17:04:59,119 INFO [decode.py:399]
+ For test, WER of different settings are:
+ beam_4 6.12 best for test
+
+ 2022-03-21 17:04:59,119 INFO [decode.py:491] Done!
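The three decode logs in this commit carry a "Decoding started" timestamp on their first line and a "Done!" timestamp on their last, so the wall-clock cost of each decoding method can be read off directly. The sketch below does that arithmetic with timestamps copied from the logs. Treat the result as a rough indication from a single run: the span includes model creation and checkpoint averaging, and the function name `elapsed_seconds` is hypothetical.

```python
from datetime import datetime

# Timestamp layout used by the decode logs, e.g. "2022-03-21 16:33:45,405".
FMT = "%Y-%m-%d %H:%M:%S,%f"

# ("Decoding started" timestamp, "Done!" timestamp) copied from each log.
RUNS = {
    "greedy_search": ("2022-03-21 16:33:45,405", "2022-03-21 16:38:54,400"),
    "modified_beam_search": ("2022-03-21 16:57:34,038", "2022-03-21 17:04:59,119"),
    "beam_search": ("2022-03-21 17:40:20,878", "2022-03-21 17:59:47,358"),
}


def elapsed_seconds(start: str, end: str) -> float:
    """Wall-clock seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


elapsed = {name: elapsed_seconds(*span) for name, span in RUNS.items()}
for name, seconds in sorted(elapsed.items(), key=lambda item: item[1]):
    print(f"{name:22s} {seconds:8.1f} s")
```

In this run, modified beam search finished in well under half the wall-clock time of beam search while its WER differed by less than 0.1 absolute on both sets.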
log/modified_beam_search/recogs-dev-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/modified_beam_search/recogs-test-beam_4-epoch-29-avg-13-beam-4.txt ADDED
The diff for this file is too large to render. See raw diff
log/modified_beam_search/wer-summary-dev-beam_4-epoch-29-avg-13-beam-4.txt ADDED
@@ -0,0 +1,2 @@
+ settings WER
+ beam_4 6.72
log/modified_beam_search/wer-summary-test-beam_4-epoch-29-avg-13-beam-4.txt ADDED
@@ -0,0 +1,2 @@
+ settings WER
+ beam_4 6.12
test_wavs/RESULTS.md ADDED
@@ -0,0 +1,28 @@
+ You can use the following command to test `pretrained.py` with the pretrained files:
+ ```
+ CUDA_VISIBLE_DEVICES='1' python pruned_transducer_stateless/pretrained.py --checkpoint icefall_asr_tedlium3_pruned_transducer_stateless/exp/pretrained_average_17_to_29.pt --bpe-model icefall_asr_tedlium3_pruned_transducer_stateless/data/lang_bpe_500/bpe.model --method greedy_search icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav
+ ```
+
+ The output is as follows:
+ ```
+ 2022-03-21 16:19:46,125 INFO [pretrained.py:253] {'sample_rate': 16000, 'feature_dim': 80, 'subsampling_factor': 4, 'attention_dim': 512, 'nhead': 8, 'dim_feedforward': 2048, 'num_encoder_layers': 12, 'vgg_frontend': False, 'embedding_dim': 512, 'env_info': {'k2-version': '1.13', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '5ee082ea55f50e8bd42203ba266945ea5a236ab8', 'k2-git-date': 'Sun Feb 27 09:00:48 2022', 'lhotse-version': '1.0.0.dev+git.d917411.clean', 'torch-cuda-available': True, 'torch-cuda-version': '10.1', 'python-version': '3.8', 'icefall-git-branch': 'tedlium3-pruned-transducer-stateless-recipe', 'icefall-git-sha1': 'ad28c8c-dirty', 'icefall-git-date': 'Fri Mar 18 11:39:06 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-meixu/luomingshuang/k2/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.0.0.dev0+git.d917411.clean-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0307200233-b554c565c-lf9qd', 'IP address': '10.177.74.201'}, 'checkpoint': 'icefall_asr_tedlium3_pruned_transducer_stateless/exp/pretrained_average_17_to_29.pt', 'bpe_model': 'icefall_asr_tedlium3_pruned_transducer_stateless/data/lang_bpe_500/bpe.model', 'method': 'greedy_search', 'sound_files': ['icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav'], 'beam_size': 4, 'context_size': 2, 'max_sym_per_frame': 3, 'blank_id': 0, 'unk_id': 2, 'vocab_size': 500}
+ 2022-03-21 16:19:46,125 INFO [pretrained.py:259] device: cuda:0
+ 2022-03-21 16:19:46,125 INFO [pretrained.py:261] Creating model
+ 2022-03-21 16:20:03,351 INFO [pretrained.py:270] Constructing Fbank computer
+ 2022-03-21 16:20:03,354 INFO [pretrained.py:280] Reading sound files: ['icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav', 'icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav']
+ 2022-03-21 16:20:03,379 INFO [pretrained.py:286] Decoding started
+ 2022-03-21 16:20:03,519 INFO [pretrained.py:306] Using greedy_search
+ 2022-03-21 16:20:05,040 INFO [pretrained.py:334]
+ icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W01.wav:
+ choice isn 't it i don 't live there but i did journey on a twenty seven thousand mile trip for two years to the fastest growing and widest counties in america what is a whitopia i define whitopia in three ways first a whitopia has posted at least six percent
+
+ icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W02.wav:
+ population growth since two thousand secondly the majority of that growth comes from white migrants and third a whitopia has an ineffable charm a pleasant look and feel a jenesequa to learn how and why whitopias are ticking i immersed myself for several months apiece in three of them first
+
+ icefall_asr_tedlium3_pruned_transducer_stateless/test_wavs/RichBenjamin_2015W03.wav:
+ st george utah second coeuralene idaho and third for siteth county georgia first stop st george a beautiful town of red rock landscapes in the 1850s brigham young dispatched families to st george to grow cotton because of the hot arid climate and so they called it utah 's dixie and the name sticks to this day
+
+
+ 2022-03-21 16:20:05,040 INFO [pretrained.py:336] Decoding Done
+
+ ```
test_wavs/RichBenjamin_2015W01.wav ADDED
Binary file (961 kB). View file
test_wavs/RichBenjamin_2015W02.wav ADDED
Binary file (961 kB). View file
test_wavs/RichBenjamin_2015W03.wav ADDED
Binary file (959 kB). View file