---
language: "rw"
thumbnail:
pipeline_tag: automatic-speech-recognition
tags:
- Coqui
- Deepspeech
- LSTM
license: "apache-2.0"
datasets:
- commonvoice
metrics:
- wer
---

**Model card - Kinyarwanda Coqui STT model**

**Model details**
- Kinyarwanda speech-to-text model
- Developed by [Digital Umuganda](digitalumuganda.com)
- Based on: Baidu Deep Speech end-to-end RNN model
- Paper: [Deep Speech: Scaling up end-to-end speech recognition](https://arxiv.org/pdf/1412.5567.pdf)
- Documentation on model: [DeepSpeech documentation](https://deepspeech.readthedocs.io/)
- License: Mozilla 2.0 License
- Feedback on the model: samuel@digitalumuganda.com

**Intended use cases**
- Intended to be used for:
  - simple keyword spotting
  - simple transcription
  - transfer learning for better Kinyarwanda and other African language models
- Intended to be used by:
  - app developers
  - organizations that want to transcribe Kinyarwanda recordings
  - ML researchers
  - other researchers working on Kinyarwanda and technology use in Kinyarwanda (e.g. linguists, journalists)
- Not intended to be used as:
  - a fully fledged voice assistant
  - a voice recognition application
  - a multilingual STT system
  - a language detection system
  
**Factors**
- Bias: speaker characteristics that can influence the accuracy of the model
  - gender
  - accents and dialects
  - age
- Voice quality: recording characteristics that can influence the accuracy of the model
  - background noise
  - short sentences
- Voice format: audio must be converted to WAV format before transcription
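
As an illustration of the WAV requirement above, the sketch below checks a file's format before it is passed to the model. The card only says audio must be WAV; the target values used here (16 kHz, mono, 16-bit PCM) are an assumption based on the usual DeepSpeech-style defaults, and the function name `wav_format_problems` is hypothetical.

```python
import wave

# Assumed target format (not stated in this model card): 16 kHz, mono, 16-bit PCM.
EXPECTED_RATE = 16000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample


def wav_format_problems(path):
    """Return a list of human-readable format problems; empty if the file matches."""
    problems = []
    # wave.open raises wave.Error if the file is not a readable PCM WAV file.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {EXPECTED_RATE} Hz"
            )
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(
                f"{wav.getnchannels()} channels, expected {EXPECTED_CHANNELS}"
            )
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(
                f"sample width is {wav.getsampwidth()} bytes, "
                f"expected {EXPECTED_SAMPLE_WIDTH}"
            )
    return problems
```

An empty list means the file matches the assumed format; otherwise the recording would need to be resampled or converted with an audio tool of your choice before transcription.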
  
**Metrics**
- Word error rate (WER) on the Common Voice Kinyarwanda test set

|Test Corpus|WER|
|-----------|---|
|Common Voice|39.1\%|
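
For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal, generic implementation for illustration; it is not the evaluation script used to produce the figure above, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)
```

With a four-word reference, one substituted word and one dropped word give a WER of 2/4 = 0.5. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.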

**Training data**
- [Common Voice crowdsourced datasets](https://commonvoice.mozilla.org/en/datasets)

**Evaluation data**
- [Common Voice crowdsourced datasets](https://commonvoice.mozilla.org/en/datasets)