Update README.md
README.md
CHANGED
@@ -16,7 +16,42 @@ The model was trained using fairseq with [this config](https://github.com/centre
 ## Usage
+```Python
+import torch
+from datasets import load_dataset
+from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+
+# load model and tokenizer
+processor = Wav2Vec2Processor.from_pretrained(
+    "chcaa/xls-r-300m-danish-nst-cv9")
+model = Wav2Vec2ForCTC.from_pretrained(
+    "chcaa/xls-r-300m-danish-nst-cv9")
+
+# load dataset and read soundfiles
+ds = load_dataset("Alvenir/alvenir_asr_da_eval", split="test")
+
+# tokenize
+input_values = processor(
+    ds[0]["audio"]["array"], return_tensors="pt", padding="longest"
+).input_values  # Batch size 1
+
+# retrieve logits
+logits = model(input_values).logits
+
+# take argmax and decode
+predicted_ids = torch.argmax(logits, dim=-1)
+transcription = processor.batch_decode(predicted_ids)
+print(transcription)
+```
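
The last two steps of the added snippet (argmax over the logits, then `batch_decode`) amount to greedy CTC decoding. The sketch below illustrates that idea in plain Python, without `torch` or `transformers`. It is not the library's implementation: the function name `ctc_greedy_decode`, the toy vocabulary, and the choice of `<pad>` at id 0 as the CTC blank and `|` as the word delimiter are assumptions made for illustration, and the real tokenizer may handle further details such as special tokens and clean-up.

```python
from itertools import groupby


def ctc_greedy_decode(logits, vocab, blank_id=0, word_delimiter="|"):
    """Greedy CTC decoding for a single utterance (illustrative sketch).

    logits: list of per-frame score lists, shape (frames, vocab size).
    vocab:  hypothetical id -> token list, NOT the model's real vocabulary.
    """
    # 1. Argmax over the vocabulary axis for every frame.
    frame_ids = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    # 2. Collapse consecutive repeats of the same id.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # 3. Drop the blank token and map the remaining ids to characters.
    chars = [vocab[token_id] for token_id in collapsed if token_id != blank_id]
    # 4. Turn word delimiters into spaces.
    return "".join(chars).replace(word_delimiter, " ").strip()


# Toy example: one-hot "logits" built from a hand-written frame sequence.
vocab = ["<pad>", "|", "h", "e", "j"]


def one_hot(index, size=len(vocab)):
    return [1.0 if k == index else 0.0 for k in range(size)]


frame_ids = [2, 2, 0, 3, 4, 4, 0, 1, 2, 3, 3, 4]
logits = [one_hot(i) for i in frame_ids]
print(ctc_greedy_decode(logits, vocab))  # hej hej
```

The blank token is what lets CTC emit a doubled letter: the frame sequence `h, <pad>, h` decodes to `hh`, whereas `h, h` collapses to a single `h`.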
 ## Performance
+The table below shows the WER rate of four different Danish ASR models on three datasets.
+
+|Model                                  | [Alvenir](https://huggingface.co/datasets/Alvenir/alvenir_asr_da_eval)| [NST](https://www.nb.no/sprakbanken/en/resource-catalogue/oai-nb-no-sbr-19/)| [CV9.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_9_0)|
+|:--------------------------------------|------:|-----:|-----:|
+|Alvenir/wav2vec2-base-da-ft-nst        | 0.202| 0.099| 0.238|
+|chcaa/alvenir-wav2vec2-base-da-nst-cv9 | 0.233| 0.126| 0.256|
+|chcaa/xls-r-300m-nst-cv9-da            | 0.105| 0.060| 0.119|
+|chcaa/xls-r-300m-danish-nst-cv9        | 0.082| 0.051| 0.108|

 The model was finetuned in collaboration with [Alvenir](https://alvenir.ai).
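
The table reports word error rate (WER), i.e. the number of word-level substitutions, deletions, and insertions needed to turn a transcription into the reference, divided by the number of reference words. The README does not say how the scores were computed or how text was normalized, so the following is only a minimal sketch of the metric for a single sentence pair. The function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one row of the table.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion of a reference word
                    current[j - 1] + 1,  # insertion of a hypothesis word
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)


# One word of the four-word reference is missing from the hypothesis.
print(word_error_rate("jeg bor i aarhus", "jeg bor aarhus"))  # 0.25
```

Scores over a whole test set are usually obtained by summing the edit counts across utterances and dividing by the total number of reference words, rather than by averaging per-sentence rates. Which of the two was used for the table is not stated.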