chrisjay committed on
Commit 93c9f2d
1 parent: 7a577ae

final commits to project

Files changed (1)
  1. article.py +4 -0
article.py CHANGED
@@ -15,6 +15,10 @@ This dataset will boost speech technologies (like speech-to-text, text-to-speech
  - Very easy dataset to train your model on and get good results. With this dataset, you can easily train a model to recognize numbers in your language.
  - Opens up opportunities for more sophisticated speech processing models for African languages.
 
+ **What about License and security?**
+ - The safety and interests of the recorders come first. With that in mind, we are exploring options such as a gated dataset ([here is an example of a gated dataset](https://huggingface.co/datasets/mozilla-foundation/common_voice_9_0)) to protect contributors' anonymity and safety, as well as a better license for the dataset.
+ - If you have ideas for better privacy protections, or for licensing that is more beneficial to the contributors, please reach out to me. My contact details are below.
+
  **About the dataset**
 
  - The data (metadata, text, and audio recording) are uploaded to [a public Hugging Face dataset](https://huggingface.co/datasets/chrisjay/crowd-speech-africa). [This](https://huggingface.co/spaces/chrisjay/afro-speech/blob/main/app.py#L90-L106) is the part of our code that handles the upload.