Spaces: ivrit-ai/hebrew-transcription-leaderboard

benderrodriguez committed 10 days ago

Commit 9ab26e5 • 1 Parent(s): dcd6a69

Benchmark description update

Files changed (1)

src/about.py CHANGED

@@ -54,17 +54,17 @@ The following datasets are used in our evaluation:
 "SASPEECH: A Hebrew Single Speaker Dataset for Text To Speech and Voice Conversion" (Sharoni, O., Shenberg, R., Cooper, E. (2023) SASPEECH: A Hebrew Single Speaker Dataset for Text To Speech and Voice Conversion. Proc. INTERSPEECH 2023,)
 ### [google/fleurs/he](https://huggingface.co/datasets/google/fleurs)
-- **Size**: X hours
+- **Size**: 2 hours (test set of the corpus)
 - **Domain**: Read speech covering common topics and phrases in Hebrew
 - **Source**: Created as part of Google's FLEURS project, designed for multilingual speech tasks and evaluation. Data collected through crowdsourcing from Hebrew speakers.
 ### [mozilla-foundation/common_voice_17_0/he](https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0)
-- **Size**: X hours (test set of the corpus)
+- **Size**: 2 hours (validated set of the corpus)
 - **Domain**: Read sentences in Hebrew from various texts.
 - **Source**: Collected through Mozilla's Common Voice initiative, where volunteers contribute recordings and validate other speakers' contributions
 ### [imvladikon/hebrew_speech_kan](https://huggingface.co/datasets/imvladikon/hebrew_speech_kan)
-- **Size**: 1.7 hours (validation setof the corpus)
+- **Size**: 1.7 hours (validation set of the corpus)
 - **Domain**: Varied content types from the Kan (Israeli Public Broadcasting Corporation) youtube channel
 - **Source**: Published by Vladimir Gurevich. Scraped audio and subtitles data from YouTube channel "כאן" (Kan).
 """