---
library_name: transformers
license: openrail
datasets:
- alexandrainst/coral
language:
- da
metrics:
- wer
- cer
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
model-index:
- name: coral-1-whisper-large
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: CoRal read-aloud
      type: alexandrainst/coral
      split: test
      args: read_aloud
    metrics:
    - type: cer
      value: 4.3% ± 0.2%
      name: CER
    - type: wer
      value: 10.4% ± 0.3%
      name: WER
---

# Whisper Large v3 trained on CoRal release 1

This is a state-of-the-art Danish speech recognition model, trained by [Alvenir](https://www.alvenir.ai/).
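
The model can be used directly with the `transformers` library; the snippet below is a minimal sketch (the audio file path is a placeholder):

```py
# A minimal inference sketch using the transformers ASR pipeline.
# "audio.wav" is a placeholder for any audio file readable by ffmpeg.
from transformers import pipeline

transcriber = pipeline(
    "automatic-speech-recognition",
    model="Alvenir/coral-1-whisper-large",
)

result = transcriber("audio.wav")
print(result["text"])
```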


## Evaluation Results

| Model | Number of parameters | [CoRal](https://huggingface.co/datasets/alexandrainst/coral/viewer/read_aloud/test) CER | [CoRal](https://huggingface.co/datasets/alexandrainst/coral/viewer/read_aloud/test) WER |
|:---|---:|---:|---:|
| [Alvenir/coral-1-whisper-large](https://huggingface.co/Alvenir/coral-1-whisper-large) | 1540M | **4.3% ± 0.2%** | **10.4% ± 0.3%** | 
| [alexandrainst/roest-315m](https://huggingface.co/alexandrainst/roest-315m) | 315M | 6.6% ± 0.2% | 17.0% ± 0.4% | 
| [mhenrichsen/hviske-v2](https://huggingface.co/syvai/hviske-v2) | 1540M | 4.7% ± 0.07% | 11.8% ± 0.3% |
| [openai/whisper-large-v3](https://hf.co/openai/whisper-large-v3) | 1540M | 11.4% ± 0.3% | 28.3% ± 0.6% |

Evaluation results for more models and more datasets can be found in the [model card for Røst-315m](https://huggingface.co/alexandrainst/roest-315m).
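
The WER and CER figures above are standard word- and character-error rates; for illustration, here is a minimal sketch of computing them with the Hugging Face `evaluate` library (an assumption on tooling, any WER/CER implementation will do) on toy strings:

```py
# Sketch: computing WER/CER with the `evaluate` library
# (requires `pip install evaluate jiwer`).
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["hvad er klokken"]   # toy gold transcription
predictions = ["hvad er klokke"]   # toy model output

print("WER:", wer_metric.compute(references=references, predictions=predictions))
print("CER:", cer_metric.compute(references=references, predictions=predictions))
```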

## Model details

This is simply the [Whisper Large v3 model](https://hf.co/openai/whisper-large-v3) fine-tuned on the first release of the [CoRal dataset](https://huggingface.co/datasets/alexandrainst/coral).
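
For reference, the test split used for the evaluation above can be loaded with the `datasets` library; this is a sketch, with the `read_aloud` subset name taken from the dataset viewer links earlier in this card:

```py
# Sketch: loading the CoRal read-aloud test split with `datasets`.
from datasets import load_dataset

coral_test = load_dataset("alexandrainst/coral", "read_aloud", split="test")
print(coral_test)
```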

The model was trained for 30K steps using the configuration from the [CoRaL repository](https://github.com/alexandrainst/coral) by running:
```bash
python src/scripts/finetune_asr_model.py model=whisper-large max_steps=30000 model.learning_rate=1e-5
```

## License

Note that the dataset used for training is released under a custom license, adapted from OpenRAIL-M, which allows
commercial use with a few restrictions (notably, use for speech synthesis and biometric identification is not permitted).
See the
[license](https://huggingface.co/Alvenir/coral-1-whisper-large/blob/main/LICENSE) for details.


## Creators and Funders
The CoRal project is funded by [Innovation Fund
Denmark](https://innovationsfonden.dk/) and consists of the following partners:

- [Alexandra Institute](https://alexandra.dk/)
- [University of Copenhagen](https://www.ku.dk/)
- [Agency for Digital Government](https://digst.dk/)
- [Alvenir](https://www.alvenir.ai/)
- [Corti](https://www.corti.ai/)

We would specifically like to thank Dan Saattrup Nielsen (Alexandra Institute) for, among other things, the repository work, and Simon Leminen Madsen (Alexandra Institute) for the modelling work.