---
language:
- da
license: other
datasets:
- ftspeech
metrics:
- wer
tasks:
- automatic-speech-recognition
base_model: facebook/wav2vec2-xls-r-300m
model-index:
- name: wav2vec2-xls-r-300m-ftspeech
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      name: Danish Common Voice 8.0
      type: mozilla-foundation/common_voice_8_0
      args: da
    metrics:
    - type: wer
      value: 17.91
  - task:
      type: automatic-speech-recognition
    dataset:
      name: Alvenir ASR test dataset
      type: Alvenir/alvenir_asr_da_eval
    metrics:
    - type: wer
      value: 13.84
---

# XLS-R-300m-FTSpeech

## Model description

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m), trained on the [FTSpeech dataset](https://ftspeech.github.io/), a dataset of 1,800 hours of transcribed speeches from the Danish parliament.


## Performance

The model achieves the following word error rate (WER) scores, where lower is better:

| **Dataset** | **WER without LM** | **WER with 5-gram LM** |
| :---:   | ---: | ---: |
| [Danish part of Common Voice 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0/viewer/da/train) | 20.48 | 17.91 |
| [Alvenir test set](https://huggingface.co/datasets/Alvenir/alvenir_asr_da_eval) | 15.46 | 13.84 |
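
For readers unfamiliar with the metric, the following is a minimal sketch of how a WER score can be computed for a single utterance: the word-level edit distance (substitutions, insertions, and deletions) between the reference transcript and the model output, divided by the number of reference words. It assumes the scores above are percentages and that words are separated by whitespace. The function name `word_error_rate` and the Danish example sentences are hypothetical, and the evaluation behind the table may have applied text normalisation or aggregated errors over the whole test set, which this sketch does not reproduce.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent between a reference and a hypothesis transcript.

    Illustrative helper only: it splits on whitespace and applies no text
    normalisation (casing, punctuation), which a real evaluation may do.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("The reference transcript must contain at least one word.")

    # Word-level Levenshtein distance, keeping only the previous row of the table.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(
                min(
                    previous_row[j] + 1,  # deletion of a reference word
                    current_row[j - 1] + 1,  # insertion of a hypothesis word
                    previous_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous_row = current_row

    return 100.0 * previous_row[-1] / len(ref_words)


if __name__ == "__main__":
    # Hypothetical example: one of five reference words is missing in the output.
    print(word_error_rate("det er en god dag", "det er en dag"))
```

Because insertions count as errors, the WER of a single utterance can exceed 100 when the hypothesis is much longer than the reference.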


## License

Use of this model must adhere to [this license from the Danish Parliament](https://www.ft.dk/da/aktuelt/tv-fra-folketinget/deling-og-rettigheder).