Committed by chrisjay
Commit 7a577ae
1 Parent(s): 73257d5

final commits to project

Files changed (3)
  1. app.py +4 -3
  2. article.py +9 -5
  3. data +1 -1
app.py CHANGED
@@ -214,13 +214,14 @@ This is a platform to contribute to your African language by recording your voic
 markdown="""
 # 🌍 African Digits Recording Sprint
 
-> Record numbers 0-9 in your African language.
+> Record numbers 0-9 in your African language. The ten people who record the most from 19th May - 19th June will each receive a prize.
 
 1. Fill in your email. This is completely optional. We need this to track your progress for the prize.
+__Note:__ You should record all numbers shown till the end. It does not count if you stop mid-way.
 2. Choose your African language
 3. Fill in the speaker metadata (age, gender, accent). This is optional but important to build better speech models.
 4. You will see the image of a number __(this is the number you will record)__.
-5. Fill in the word of that number (optional)
+5. Fill in the word of that number (optional). You can leave this blank.
 6. Click record and say the number in your African language.
 7. Click ‘Submit’. It will save your record and go to the next number.
 8. Repeat 4-7
@@ -232,7 +233,7 @@ markdown="""
 block = gr.Blocks(css=BLOCK_CSS)
 with block:
     gr.Markdown(markdown)
-    email = gr.inputs.Textbox(placeholder='your email',label="Email (if you want join the sprint)",default='')
+    email = gr.inputs.Textbox(placeholder='your email',label="Email (Your email is not made public. We need it to consider you for the prize.)",default='')
     with gr.Tabs():
 
         with gr.TabItem('Record'):
article.py CHANGED
@@ -7,14 +7,17 @@ Existing speech recognition services are not available in many African languages
 
 This dataset will boost speech technologies (like speech-to-text, text-to-speech, speech translation, and modeling) for African languages, which hitherto had little or no public dataset.
 
-**Note:** This is a continuous effort. This sprint is just to kick-start the event.
+**Note:** This is a continuous effort. This sprint is just to kick-start the event. Please feel free to share with your family and friends and keep recording more.
 
 **Benefits of such a dataset**
-- Useful dataset to introduce people to audio-related Machine Learning. It can be used as a simple training and/or evaluation dataset for speech processing tasks.
+- Useful dataset to learn audio-related Machine Learning (automatics speech recognition, text-to-speech, other types of speech processing).
+- It can be used as a simple training and/or evaluation dataset for speech processing tasks.
+- Very easy dataset to train your model on and get good results. With this dataset, you can easily train a model to recognize numbers in your language.
+- Opens up opportunities for more sophisticated speech processing models for African languages.
 
 **About the dataset**
 
-- The data (metadat,text, and audio recording) are uploaded to [a public Hugging Face dataset](https://huggingface.co/datasets/chrisjay/crowd-speech-africa).
+- The data (metadata, text, and audio recording) are uploaded to [a public Hugging Face dataset](https://huggingface.co/datasets/chrisjay/crowd-speech-africa). [This](https://huggingface.co/spaces/chrisjay/afro-speech/blob/main/app.py#L90-L106) is the part of our code that handles the upload.
 - We do not collect your name, address or other sensitive information.
 - If for some reason you want to remove your entry, please reach out by email.
 - Your email, if given, is used only to keep track of your progress in order to give the prizes to the top scorers. They are temporarily stored in [this private dataset](https://huggingface.co/datasets/chrisjay/african-digits-recording-sprint-email) and immediately deleted after the sprint.
@@ -22,7 +25,8 @@ This dataset will boost speech technologies (like speech-to-text, text-to-speech
 **Contact**
 
 In case of questions, issues or anything contact Chris Emezue at:
-- chris@huggingface.co
-
+- Email: chris@huggingface.co
+- [Twitter](https://twitter.com/ChrisEmezue)
+- [Telegram](https://t.me/realchrisjay)
 
 """
data CHANGED
@@ -1 +1 @@
-Subproject commit f378d5cd72892211c6ff30dbecf891f953836e0f
+Subproject commit 4dcfaf1e20e56703bd47ac80aafcaddd43af279b
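
Note on the prize rule: the text added in app.py says the ten people who record the most receive a prize, and that a run only counts if all numbers are recorded to the end. The commit does not show how progress is tallied, so the following is only a minimal sketch of one way that rule could be computed. The `(email, session_id, digit)` record layout, the `top_recorders` function, and the scoring (ten points per complete 0-9 session) are all assumptions made for illustration, not the app's actual code or storage format.

```python
from collections import defaultdict

DIGITS = frozenset(range(10))  # the sprint asks for the numbers 0-9


def top_recorders(submissions, k=10):
    """Rank emails by recordings made in complete 0-9 sessions.

    `submissions` is an iterable of (email, session_id, digit) tuples.
    This record layout is hypothetical, not the app's real format.
    """
    # Collect the distinct digits recorded in each (email, session) pair.
    digits_by_session = defaultdict(set)
    for email, session_id, digit in submissions:
        if email:  # the email field is optional; skip anonymous entries
            digits_by_session[(email, session_id)].add(digit)

    # Only sessions that cover every digit count toward the score.
    scores = defaultdict(int)
    for (email, _), digits in digits_by_session.items():
        if digits >= DIGITS:
            scores[email] += len(DIGITS)

    # Sort by score (descending), then by email so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# One complete session and one that stopped mid-way.
example = [("a@example.com", 1, d) for d in range(10)]
example += [("b@example.com", 1, d) for d in range(5)]
print(top_recorders(example))  # [('a@example.com', 10)]
```

Grouping by session before scoring is what enforces the "does not count if you stop mid-way" note: a partial session contributes nothing, however many digits it contains.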