luomingshuang committed
Commit 7a6760a
1 Parent(s): ffa2a91

change for README.md

Files changed (1)
  1. README.md (+18 -15)
README.md CHANGED
@@ -1,7 +1,6 @@
- Note: This recipe is trained with the code from this PR: https://github.com/k2-fsa/icefall/pull/355
- and the SpecAugment code from this PR: https://github.com/lhotse-speech/lhotse/pull/604.
- # Pre-trained Transducer-Stateless2 models for the Aidatatang_200zh dataset with icefall.
- The model was trained on the full [Aidatatang_200zh](https://www.openslr.org/62) dataset with the scripts in [icefall](https://github.com/k2-fsa/icefall), based on the latest version of k2.
  ## Training procedure
  The main repositories are listed below; we will update the training and decoding scripts as these projects are updated.
  k2: https://github.com/k2-fsa/k2
@@ -15,25 +14,29 @@ cd icefall
  ```
  * Preparing data.
  ```
- cd egs/aidatatang_200zh/ASR
  bash ./prepare.sh
  ```
  * Training
  ```
- export CUDA_VISIBLE_DEVICES="0,1"
  ./pruned_transducer_stateless2/train.py \
- --world-size 2 \
- --num-epochs 30 \
  --start-epoch 0 \
  --exp-dir pruned_transducer_stateless2/exp \
  --lang-dir data/lang_char \
- --max-duration 250
  ```
  ## Evaluation results
- The decoding results (WER%) on Aidatatang_200zh (dev and test) are listed below; they were obtained by averaging the models from epochs 11 to 29.
  The WERs are
- |                                    | dev  | test | comment                                   |
- |------------------------------------|------|------|-------------------------------------------|
- | greedy search                      | 5.53 | 6.59 | --epoch 29, --avg 19, --max-duration 100  |
- | modified beam search (beam size 4) | 5.28 | 6.32 | --epoch 29, --avg 19, --max-duration 100  |
- | fast beam search (set as default)  | 5.29 | 6.33 | --epoch 29, --avg 19, --max-duration 1500 |
 
+ Note: This recipe is trained with the code from this PR: https://github.com/k2-fsa/icefall/pull/349
+ # Pre-trained Transducer-Stateless2 models for the WenetSpeech dataset with icefall.
+ The model was trained on the L subset of WenetSpeech with the scripts in [icefall](https://github.com/k2-fsa/icefall), based on the latest version of k2.
  ## Training procedure
  The main repositories are listed below; we will update the training and decoding scripts as these projects are updated.
  k2: https://github.com/k2-fsa/k2
 
  ```
  * Preparing data.
  ```
+ cd egs/wenetspeech/ASR
  bash ./prepare.sh
  ```
  * Training
  ```
+ export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
  ./pruned_transducer_stateless2/train.py \
+ --world-size 8 \
+ --num-epochs 15 \
  --start-epoch 0 \
  --exp-dir pruned_transducer_stateless2/exp \
  --lang-dir data/lang_char \
+ --max-duration 180 \
+ --valid-interval 3000 \
+ --model-warm-step 3000 \
+ --save-every-n 8000 \
+ --training-subset L
  ```
  ## Evaluation results
+ The decoding results (WER%) on WenetSpeech (dev, test-net, and test-meeting) are listed below; they were obtained by averaging the models from epochs 9 and 10.
  The WERs are
+ |                                    | dev  | test-net | test-meeting | comment                                  |
+ |------------------------------------|------|----------|--------------|------------------------------------------|
+ | greedy search                      | 7.80 | 8.75     | 13.49        | --epoch 10, --avg 2, --max-duration 100  |
+ | modified beam search (beam size 4) | 7.76 | 8.71     | 13.41        | --epoch 10, --avg 2, --max-duration 100  |
+ | fast beam search (set as default)  | 7.94 | 8.74     | 13.80        | --epoch 10, --avg 2, --max-duration 1500 |
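
For reference, the comment column above maps onto the flags of the recipe's decoding script. Below is a minimal sketch of the run behind the "modified beam search" row, assuming the standard `./pruned_transducer_stateless2/decode.py` interface in icefall; flag names can change between icefall versions, so verify them against the script's `--help` output.

```
# Sketch of a decoding run matching the "modified beam search" row above.
# Assumes icefall's pruned_transducer_stateless2/decode.py; flag names may
# differ across versions, so check:
#   ./pruned_transducer_stateless2/decode.py --help
# With --epoch 10 and --avg 2, the checkpoints of epochs 9 and 10 are averaged.
cd egs/wenetspeech/ASR
./pruned_transducer_stateless2/decode.py \
  --epoch 10 \
  --avg 2 \
  --exp-dir pruned_transducer_stateless2/exp \
  --lang-dir data/lang_char \
  --max-duration 100 \
  --decoding-method modified_beam_search \
  --beam-size 4
```

Swapping `--decoding-method` to `greedy_search`, or to `fast_beam_search` with `--max-duration 1500`, should correspond to the other two rows of the table.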