csukuangfj committed on
Commit
c4c0def
1 Parent(s): 592e43b

Update README.

Files changed (1)
  1. README.md +166 -0
README.md ADDED
@@ -0,0 +1,166 @@
+ ---
+ language: "en"
+ tags:
+ - icefall
+ - k2
+ - transducer
+ - aishell
+ - ASR
+ - stateless transducer
+ - PyTorch
+ license: "apache-2.0"
+ datasets:
+ - aishell
+ metrics:
+ - WER
+ ---
+
+ # Introduction
+
+ This repo contains a pre-trained model trained with the recipe from
+ <https://github.com/k2-fsa/icefall/pull/219>.
+
+ It is trained on the [AIShell](https://www.openslr.org/33/) dataset
+ using the modified transducer from [optimized_transducer](https://github.com/csukuangfj/optimized_transducer).
+
+ ## How to clone this repo
+ ```bash
+ sudo apt-get install git-lfs
+ git clone https://huggingface.co/csukuangfj/icefall-aishell-transducer-stateless-modified-2022-03-01
+
+ cd icefall-aishell-transducer-stateless-modified-2022-03-01
+ git lfs pull
+ ```
+
+ **Caution**: You have to run `git lfs pull`. Otherwise, files tracked by Git LFS are left as small pointer files and cannot be loaded.
+
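One way to confirm that `git lfs pull` actually fetched the large files is to look for leftover pointer files. The sketch below relies on the fact that a Git LFS pointer file starts with the line `version https://git-lfs.github.com/spec/v1`; the helper names `is_lfs_pointer` and `find_lfs_pointers` are made up for this example and are not part of icefall or Git LFS.

```python
from pathlib import Path

# First line of every Git LFS pointer file (Git LFS pointer spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` still holds an LFS pointer instead of real data."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_lfs_pointers(repo_dir):
    """List files under `repo_dir` (skipping .git) that are still LFS pointers."""
    repo = Path(repo_dir)
    return sorted(
        p
        for p in repo.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(repo).parts
        and is_lfs_pointer(p)
    )
```

If `find_lfs_pointers("icefall-aishell-transducer-stateless-modified-2022-03-01")` returns an empty list, no pointer files are left in the checkout.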
+ The model in this repo is trained using the commit `TODO`.
+
+ You can use
+
+ ```bash
+ git clone https://github.com/k2-fsa/icefall
+ cd icefall
+ git checkout TODO
+ ```
+ to download `icefall`.
+
+ You can find the model information by visiting <https://github.com/k2-fsa/icefall/blob/TODO/egs/aishell/ASR/transducer_stateless_modified/train.py#L232>.
+
+ In short, the encoder is a Conformer model with 8 attention heads, 12 encoder layers, 512-dim attention, and 2048-dim feedforward layers;
+ the decoder contains a 512-dim embedding layer and a Conv1d with kernel size 2.
+
+ The decoder architecture is modified from
+ [RNN-Transducer with Stateless Prediction Network](https://ieeexplore.ieee.org/document/9054419).
+ A Conv1d layer is placed right after the input embedding layer.
+
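To make "stateless" concrete, here is a toy sketch in plain Python. It is not the icefall implementation: it only assumes that the decoder output is computed from the embeddings of the last two tokens, combined by a kernel-size-2 convolution. The 3-dim embeddings, the weights, and the names `EMBED`, `CONV_W`, and `decoder_output` are invented for illustration; the real model uses 512 dimensions and learned parameters.

```python
# Toy vocabulary of 4 token IDs with 3-dim embeddings (made-up values).
EMBED = {
    0: [0.0, 0.0, 0.0],  # blank, also used here as left padding (an assumption)
    1: [1.0, 0.0, 2.0],
    2: [0.0, 1.0, 1.0],
    3: [2.0, 2.0, 0.0],
}

# Made-up convolution weights with kernel size 2:
# one weight per window position and embedding dimension.
CONV_W = [
    [0.5, 0.5, 0.5],  # applied to the older token in the window
    [1.0, 1.0, 1.0],  # applied to the most recent token
]

CONTEXT_SIZE = 2


def decoder_output(history):
    """Compute the decoder output from the last CONTEXT_SIZE tokens only."""
    padded = [0] * CONTEXT_SIZE + list(history)
    window = padded[-CONTEXT_SIZE:]
    out = [0.0] * len(CONV_W[0])
    for k, token in enumerate(window):
        for d, value in enumerate(EMBED[token]):
            out[d] += CONV_W[k][d] * value
    return out
```

Two histories that end in the same two tokens, such as `[1, 2, 3]` and `[3, 3, 2, 3]`, give the same output here, whereas an RNN decoder would in general carry information from the earlier tokens as well.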
+ -----
+
+ ## Description
+
+ This repo provides a pre-trained Conformer transducer model for the AIShell dataset,
+ trained using [icefall][icefall]. There are no RNNs in the decoder. The decoder is stateless
+ and contains only an embedding layer and a Conv1d.
+
+ The commands for training are:
+
+ ```bash
+ cd egs/aishell/ASR
+ ./prepare.sh --stop-stage 6
+
+ export CUDA_VISIBLE_DEVICES="0,1,2"
+
+ ./transducer_stateless_modified/train.py \
+   --world-size 3 \
+   --num-epochs 90 \
+   --start-epoch 0 \
+   --exp-dir transducer_stateless_modified/exp-4 \
+   --max-duration 250 \
+   --lr-factor 2.0 \
+   --context-size 2 \
+   --modified-transducer-prob 0.25
+ ```
+
+ The TensorBoard training log can be found at
+ <https://tensorboard.dev/experiment/C27M8YxRQCa1t2XglTqlWg>.
+
+ The commands for decoding are:
+
+ ```bash
+ # greedy search
+ for epoch in 64; do
+   for avg in 33; do
+     ./transducer_stateless_modified/decode.py \
+       --epoch $epoch \
+       --avg $avg \
+       --exp-dir transducer_stateless_modified/exp-4 \
+       --max-duration 100 \
+       --context-size 2 \
+       --decoding-method greedy_search \
+       --max-sym-per-frame 1
+   done
+ done
+
+ # modified beam search
+ for epoch in 64; do
+   for avg in 33; do
+     ./transducer_stateless_modified/decode.py \
+       --epoch $epoch \
+       --avg $avg \
+       --exp-dir transducer_stateless_modified/exp-4 \
+       --max-duration 100 \
+       --context-size 2 \
+       --decoding-method modified_beam_search \
+       --beam-size 4
+   done
+ done
+ ```
+
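With `--epoch 64 --avg 33`, decoding uses parameters averaged over several checkpoints rather than a single one. The sketch below rests on two assumptions: that the averaged range is the last `avg` epochs ending at `epoch` (epochs 32 to 64 here), and that averaging is a plain element-wise mean. The helper names `checkpoints_to_average` and `average_params` are hypothetical, and dicts of floats stand in for PyTorch state dicts.

```python
def checkpoints_to_average(epoch, avg):
    """Return the checkpoint file names for the last `avg` epochs up to `epoch`."""
    start = max(epoch - avg + 1, 0)
    return [f"epoch-{i}.pt" for i in range(start, epoch + 1)]


def average_params(state_dicts):
    """Element-wise mean of parameter dicts that share the same keys."""
    n = len(state_dicts)
    return {key: sum(sd[key] for sd in state_dicts) / n for key in state_dicts[0]}
```

Under these assumptions, `--avg 1` selects exactly one file, so no averaging takes place.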
+ You can find the decoding logs for the above commands in this
+ repo (in the folder [log][log]).
+
+ The WER (%) for the test dataset is:
+
+ | decoding method        | test | comment                                                         |
+ |------------------------|------|-----------------------------------------------------------------|
+ | greedy search          | 5.22 | --epoch 64, --avg 33, --max-duration 100, --max-sym-per-frame 1 |
+ | modified beam search   | 5.02 | --epoch 64, --avg 33, --max-duration 100, --beam-size 4         |
+
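For reference, an error rate of this kind is the total edit distance between the reference and hypothesis transcripts divided by the total reference length. The sketch below assumes that each character counts as one token, a guess based on the `lang_char` directory used in this repo. It is not the scoring code that produced the numbers above, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(
                min(
                    prev[j] + 1,  # deletion
                    curr[j - 1] + 1,  # insertion
                    prev[j - 1] + (r != h),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def error_rate(refs, hyps):
    """Total edit distance over total reference length, in percent."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total
```

For example, `error_rate(["abcd"], ["abed"])` gives `25.0`: one substitution over four reference tokens.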
+ # File description
+
+ - [log][log]: this directory contains the decoding logs and decoding results
+ - [test_wavs][test_wavs]: this directory contains wave files for testing the pre-trained model
+ - [data][data]: this directory contains files generated by [prepare.sh][prepare]
+ - [exp][exp]: this directory contains only one file, `pretrained.pt`
+
+ `exp/pretrained.pt` is generated by the following command:
+
+ ```bash
+ epoch=64
+ avg=33
+
+ ./transducer_stateless_modified/export.py \
+   --exp-dir ./transducer_stateless_modified/exp-4 \
+   --lang-dir ./data/lang_char \
+   --epoch $epoch \
+   --avg $avg
+ ```
+
+ **HINT**: To use `pretrained.pt` to compute the WER for the `test` dataset,
+ just do the following:
+
+ ```bash
+ cp icefall-aishell-transducer-stateless-modified-2022-03-01/exp/pretrained.pt \
+   /path/to/icefall/egs/aishell/ASR/transducer_stateless_modified/exp/epoch-999.pt
+ ```
+ and pass `--epoch 999 --avg 1` to `transducer_stateless_modified/decode.py`.
+
+ [icefall]: https://github.com/k2-fsa/icefall
+ [prepare]: https://github.com/k2-fsa/icefall/blob/master/egs/aishell/ASR/prepare.sh
+ [exp]: https://huggingface.co/csukuangfj/icefall-aishell-transducer-stateless-modified-2022-03-01/tree/main/exp
+ [data]: https://huggingface.co/csukuangfj/icefall-aishell-transducer-stateless-modified-2022-03-01/tree/main/data
+ [test_wavs]: https://huggingface.co/csukuangfj/icefall-aishell-transducer-stateless-modified-2022-03-01/tree/main/test_wavs
+ [log]: https://huggingface.co/csukuangfj/icefall-aishell-transducer-stateless-modified-2022-03-01/tree/main/log