Language Technologies, Bangor University
committed on
Commit 8a93704
1 Parent(s): b04dd81
Update README.md
README.md
CHANGED
@@ -1,3 +1,25 @@
---
license: apache-2.0
---

# Pre-training wav2vec2 models for Welsh speech recognition

At the moment, the best Welsh speech recognition models are obtained by fine-tuning the https://huggingface.co/facebook/wav2vec2-large-xlsr-53 and https://huggingface.co/facebook/wav2vec2-xls-r-1b models from Facebook/Meta AI.

This experimental model investigates whether pre-training with more Welsh-language speech can produce better base models that lower word error rate (WER) scores even further in subsequently fine-tuned models. The work draws heavily on resources and documentation from the HuggingFace examples:

https://github.com/huggingface/transformers/tree/main/examples/pytorch/speech-pretraining

This initial base model was pre-trained with the scripts at

https://github.com/techiaith/docker-wav2vec2-cy/tree/main/train/pre-train

using English speech from LibriSpeech's minimal subsets (`validation` and `test`) and 184 hours and 47 minutes of Welsh speech from various playlists on YouTube. The script [`build_youtube_playlists_corpus.sh`](https://github.com/techiaith/docker-wav2vec2-cy/blob/main/inference/python/build_youtube_playlists_corpus.sh) lists the playlists used.

Until thousands of hours of Welsh speech, rather than hundreds, have been collected, WER scores after fine-tuning will remain very high. The following word error rates (WER) and character error rates (CER) are from tests on a Welsh Common Voice test set as well as a [second set of YouTube videos with corrected transcriptions](https://git.techiaith.bangor.ac.uk/data-porth-technolegau-iaith/corpws-profi-adnabod-lleferydd/-/tree/master/data/trawsgrifio).

| Test Set | WER   | CER   | WER (+LM) | CER (+LM) |
| -------- | ----- | ----- | --------- | --------- |
| CV CY 10 | 94.83 | 85.55 | 92.31     | 82.25     |
| YouTube  | 95.43 | 90.26 | 93.60     | 89.33     |
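
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how WER and CER are conventionally computed from a reference transcription and a model hypothesis. It is not the evaluation code used for the results above. The function names, the whitespace tokenisation, the counting of spaces as characters in CER, and the scaling to percentages are all assumptions made for illustration, and the Welsh sentence is an invented example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences:
    the minimum number of substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage (assumes whitespace tokenisation)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage (assumes spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Invented example: two of the four reference words are wrong,
# and three of the twenty reference characters are missing.
reference = "mae hi'n braf heddiw"
hypothesis = "mae hi braf heddi"
print(wer(reference, hypothesis))  # 50.0
print(cer(reference, hypothesis))  # 15.0
```

Because both scores are normalised by the length of the reference, a hypothesis with many insertions can score above 100. Real evaluation pipelines usually also normalise case and punctuation before scoring, which this sketch leaves out.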