XLS-R-300m-FTSpeech

Model description

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, trained on the FTSpeech dataset, which contains 1,800 hours of transcribed speeches from the Danish Parliament.
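A minimal sketch of how the model might be used for transcription, assuming the `transformers` library (with a PyTorch backend and ffmpeg for audio decoding) is installed. The repository identifier is the one shown on the model's hub page; the `transcribe` helper and the example file name are hypothetical and not part of the original card.

```python
# Repository identifier as shown on the model's hub page.
MODEL_ID = "saattrupdan/wav2vec2-xls-r-300m-ftspeech"


def transcribe(audio_path: str) -> str:
    """Transcribe a single audio file (hypothetical helper, for illustration).

    The base XLS-R model was pretrained on 16 kHz audio; when given a file
    path, the pipeline decodes the file and resamples it to the rate the
    feature extractor expects.
    """
    # Imported lazily so this module can be loaded without transformers.
    from transformers import pipeline

    # Downloads the model weights on first use.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)
    return result["text"]


# Example call (assumes a local recording named "speech.wav" exists):
# print(transcribe("speech.wav"))
```

For repeated calls it would be cheaper to build the pipeline once and reuse it; the sketch rebuilds it per call only to stay short.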

Performance

The model achieves the following WER scores (lower is better):

| Dataset                          | WER without LM | WER with 5-gram LM |
| Danish part of Common Voice 8.0  | 20.48          | 17.91              |
| Alvenir test set                 | 15.46          | 13.84              |
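For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate can be computed: the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference. It is not the evaluation script used to produce the scores above; the function name `word_error_rate`, the whitespace tokenisation and the absence of text normalisation are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: splits on whitespace and applies no normalisation
    (casing, punctuation), which a real evaluation would likely handle.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,  # deletion of a reference word
                    curr[j - 1] + 1,  # insertion of a hypothesis word
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("er" -> "var") and one deletion ("god") over 5 words.
print(word_error_rate("det er en god dag", "det var en dag"))
```

The function returns a fraction; multiply by 100 if the scores are to be read as percentages, which is an assumption about the table since the card does not state the unit. Scores over a whole test set are usually computed by summing errors and reference lengths across utterances rather than by averaging per-utterance rates.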

License

Use of this model must comply with the license issued by the Danish Parliament.
