Yurii Paniv committed on
Commit
72475af
1 Parent(s): 6fb0a7e

Add missing information to README.md

1. Add disclaimer.
2. Add link to Coqui STT.
3. Hide guide by default.
4. Update guide.

Files changed (1)
  1. README.md +27 -8
README.md CHANGED
@@ -1,27 +1,42 @@
1
  # voice-recognition-ua
2
- This repository aims to apply [DeepSpeech](https://github.com/mozilla/DeepSpeech "DeepSpeech"), a state-of-the-art speech recognition model, to the Ukrainian language.
3
  You can see online demo here: https://voice-recognition-ua.herokuapp.com (your voice is not stored).
4
  Source code is in this repository together with auto-deploy pipeline scripts.
5
- P.S. Due to the small size of the dataset (20 hours), don't expect production-grade performance.
6
  Contribute your voice to [Common Voice project](https://commonvoice.mozilla.org/uk "Common Voice") yourself, so we can improve model accuracy.
7
 
 
 
 
 
 
 
8
  ## Pre-run requirements
9
  Make sure to download:
10
- 1. https://github.com/robinhad/voice-recognition-ua/releases/download/v0.2/uk.tflite
11
- 3. https://github.com/mozilla/DeepSpeech/releases/download/v0.9.1/deepspeech-0.9.1-models.tflite
12
 
13
  ## How to launch
14
  ```
15
  export FLASK_APP=main.py
 
16
  flask run
17
  ```
18
 
19
  # How to train your own model
20
 
 
 
21
  Most of the guide is taken from here:
22
- https://deepspeech.readthedocs.io/en/v0.9.1/TRAINING.html
 
 
 
23
 
24
  ## Steps:
 
 
 
25
  1. Create g4dn.xlarge instance on AWS, Deep Learning AMI (Ubuntu 18.04), 150 GB of space.
26
 
27
  2. Install Python requirements:
@@ -126,8 +141,10 @@ WER - Word Error Rate, measures how many words were guessed incorrectly.
126
  CER - Character Error Rate, measures how many characters were guessed incorrectly.
127
  Here we have WER 95% and CER 36%.
128
  It is high because we don't use a scorer (a language model that maps a character sequence to the closest word match) during training. You can improve performance by creating a scorer for the Ukrainian language. As a text corpus you can use Wikipedia articles.
129
- ```
130
- Test on ../cv-corpus-5.1-2020-06-22/uk/clips/test.csv - WER: 0.950863, CER: 0.357779, loss: 59.444176
 
 
131
  --------------------------------------------------------------------------------
132
  Best WER:
133
  --------------------------------------------------------------------------------
@@ -210,7 +227,8 @@ WER: 2.000000, CER: 0.333333, loss: 10.796988
210
  - src: "легітимність"
211
  - res: "вегі пимнсть"
212
  --------------------------------------------------------------------------------
213
- ```
 
214
  16. To export model for later usage:
215
  ```
216
  mkdir model
@@ -230,3 +248,4 @@ python3 DeepSpeech.py \
230
  --epochs 0
231
  ```
232
  For advanced usage please refer to https://deepspeech.readthedocs.io/en/v0.9.1/USING.html
 
 
1
  # voice-recognition-ua
2
+ This repository aims to apply [Coqui STT](https://github.com/coqui-ai/STT "STT") (formerly [DeepSpeech](https://github.com/mozilla/DeepSpeech)), a state-of-the-art speech recognition model, to the Ukrainian language.
3
  You can see online demo here: https://voice-recognition-ua.herokuapp.com (your voice is not stored).
4
  Source code is in this repository together with auto-deploy pipeline scripts.
5
+ P.S. Due to the small size of the dataset (50 hours), don't expect production-grade performance.
6
  Contribute your voice to [Common Voice project](https://commonvoice.mozilla.org/uk "Common Voice") yourself, so we can improve model accuracy.
7
 
8
+ <h2>CAUTION: THIS MODEL AND SCORER ARE PUBLISHED ONLY FOR RESEARCH AND NON-COMMERCIAL USE.</h2>
9
+
10
+ Check out the latest releases here: https://github.com/robinhad/voice-recognition-ua/releases/.
11
+
12
+ If you'd like to check out other models for the Ukrainian language, please visit https://github.com/egorsmkv/speech-recognition-uk.
13
+
14
  ## Pre-run requirements
15
  Make sure to download:
16
+ 1. https://github.com/robinhad/voice-recognition-ua/releases/download/v0.4/uk.tflite
17
+ 2. https://github.com/robinhad/voice-recognition-ua/releases/download/v0.4/kenlm.scorer
18
 
19
  ## How to launch
20
  ```
21
  export FLASK_APP=main.py
22
+ export TOKEN=<Telegram bot API key>
23
  flask run
24
  ```
25
 
26
  # How to train your own model
27
 
28
+ Guides for importing data are available in the [/scripts](/scripts) folder.
29
+
30
  Most of the guide is taken from here:
31
+ https://deepspeech.readthedocs.io/en/v0.9.3/TRAINING.html
32
+
33
+ Disclaimer: if you would like to continue working on the model, use https://github.com/coqui-ai/STT (maintained by the former DeepSpeech team, where development continues).
34
+
35
 
36
  ## Steps:
37
+
38
+ <details>
39
+ <summary>Please be aware that this guide may be outdated.</summary>
40
  1. Create g4dn.xlarge instance on AWS, Deep Learning AMI (Ubuntu 18.04), 150 GB of space.
41
 
42
  2. Install Python requirements:
 
141
  CER - Character Error Rate, measures how many characters were guessed incorrectly.
142
  Here we have WER 95% and CER 36%.
143
  It is high because we don't use a scorer (a language model that maps a character sequence to the closest word match) during training. You can improve performance by creating a scorer for the Ukrainian language. As a text corpus you can use Wikipedia articles.
144
+
145
+ <details>
146
+ <summary>Test on ../cv-corpus-5.1-2020-06-22/uk/clips/test.csv - WER: 0.950863, CER: 0.357779, loss: 59.444176</summary>
147
+
148
  --------------------------------------------------------------------------------
149
  Best WER:
150
  --------------------------------------------------------------------------------
 
227
  - src: "легітимність"
228
  - res: "вегі пимнсть"
229
  --------------------------------------------------------------------------------
230
+ </details>
231
+
232
  16. To export model for later usage:
233
  ```
234
  mkdir model
 
248
  --epochs 0
249
  ```
250
  For advanced usage please refer to https://deepspeech.readthedocs.io/en/v0.9.1/USING.html
251
+ </details>
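
The WER and CER figures quoted in the README can be illustrated with a small sketch. It assumes the standard definitions (edit distance over words or characters, divided by the length of the reference) and is not the evaluation code used by DeepSpeech or Coqui STT. The function names `edit_distance`, `wer`, and `cer` are hypothetical and chosen only for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word Error Rate: word-level edits divided by the number of reference words."""
    ref_words = src.split()
    return edit_distance(ref_words, res.split()) / len(ref_words)


def cer(src, res):
    """Character Error Rate: character-level edits divided by the reference length."""
    return edit_distance(src, res) / len(src)


# The src/res pair shown in the test report above.
src, res = "легітимність", "вегі пимнсть"
print(f"WER: {wer(src, res):.6f}, CER: {cer(src, res):.6f}")
```

Because the reference here is a single word and the recognized text contains two, the word-level distance is 2 and the WER is 2.0, which shows why WER can exceed 100%. The character-level distance is 4 over 12 reference characters, giving a CER of about 0.333.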