johnBamma committed on
Commit
e08d077
1 Parent(s): c24dd82

Update README.md

Files changed (1)
  1. README.md +40 -1
README.md CHANGED
@@ -5,4 +5,43 @@ language:
  pipeline_tag: automatic-speech-recognition
  tags:
  - icefall
- ---
+ ---
+
+ # icefall-asr-ksponspeech-pruned-transducer-stateless7-streaming-2024-06-12
+
+ KsponSpeech is a large-scale spontaneous speech corpus of Korean.
+ This corpus contains 969 hours of open-domain dialog utterances,
+ spoken by about 2,000 native Korean speakers in a clean environment.
+
+ All data were collected by recording two people conversing freely
+ on a variety of topics, then manually transcribing the utterances.
+
+ Each transcript provides a dual transcription (orthographic and phonetic),
+ plus disfluency tags that mark spontaneous-speech phenomena such as filler words, repeated words, and word fragments.
+
+ The original audio files have a .pcm extension.
+ During preprocessing, each file is converted to FLAC and saved as a new .flac file.
+
+ KsponSpeech is publicly available on an open data hub operated by the Korean government.
+ The dataset must be downloaded manually.
+
+ For more details, please visit:
+
+ - Dataset: https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=123
+ - Paper: https://www.mdpi.com/2076-3417/10/19/6936
+
+ ### zipformer (Zipformer + pruned stateless transducer)
+
+ #### [zipformer](./zipformer)
+
+ Number of model parameters: 74,778,511, i.e., 74.78 M
+
+ ##### Training on KsponSpeech (with MUSAN)
+
+ The character error rates (CERs, %) are:
+
+ | decoding method      | eval_clean | eval_other | comment             |
+ |----------------------|------------|------------|---------------------|
+ | greedy search        | 10.60      | 11.56      | --epoch 30 --avg 9  |
+ | fast beam search     | 10.59      | 11.54      | --epoch 30 --avg 9  |
+ | modified beam search | 10.35      | 11.35      | --epoch 30 --avg 9  |
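
For readers unfamiliar with the metric in the table above, a CER is typically the character-level edit distance between the reference and the hypothesis, summed over a test set and divided by the total number of reference characters. The sketch below is only an illustration of that definition and is not part of the commit. The function names `edit_distance` and `cer` are hypothetical, and stripping spaces before scoring is an assumption; the scoring script behind the reported numbers may normalize text differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference character
                    curr[j - 1] + 1,     # insertion of a hypothesis character
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1]


def cer(refs, hyps, strip_spaces=True):
    """Corpus-level CER in percent: total edits / total reference characters.

    strip_spaces=True is an assumption made for this sketch, not a
    documented setting of any particular toolkit.
    """
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(refs, hyps):
        if strip_spaces:
            ref = ref.replace(" ", "")
            hyp = hyp.replace(" ", "")
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    return 100.0 * total_edits / total_chars
```

Calling `cer(refs, hyps)` with two equal-length lists of reference and hypothesis strings returns a single percentage for the whole set. Because the edits are pooled before dividing, long utterances weigh more than short ones, which is the usual convention for corpus-level error rates.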