marma committed on
Commit
d5531ab
·
1 Parent(s): 2889873

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -32,12 +32,12 @@ model-index:
  value: 3.946846
  ---
  # Wav2vec 2.0 large-voxpopuli-sv-swedish
- Additionally pretrined and finetuned version of Facebooks [VoxPopuli-sv large](https://huggingface.co/facebook/wav2vec2-large-sv-voxpopuli) model using Swedish radio broadasts, NST and Common Voice data. Evalutation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **3.95%**. WER for Common Voice test set is **10.99%** directly and **7.82%** with a 4-gram language model.
+ Additionally pretrined and finetuned version of Facebooks [VoxPopuli-sv large](https://huggingface.co/facebook/wav2vec2-large-sv-voxpopuli) model using Swedish radio broadcasts, NST and Common Voice data. Evalutation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **3.95%**. WER for Common Voice test set is **10.99%** directly and **7.82%** with a 4-gram language model.
 
  When using this model, make sure that your speech input is sampled at 16kHz.
 
  ## Training
- This model has been fine-tuned for 80000 updates on NST + CommonVoice and then for an additional 40000 steps on only CommonVoice. The additional fine-tuning on CommonVoce hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed].
+ This model has additionally pretrained on 1000h of Swedish local radio broadcasts, fine-tuned for 120000 updates on NST + CommonVoice and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed].
 
  ## Usage
  The model can be used directly (without a language model) as follows:
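
The hunk stops before the README's own snippet, so the following is only a minimal sketch of greedy CTC decoding with the `transformers` library, not the author's code. The `model_id` argument is a placeholder (the repository id does not appear in this diff), and `check_sample_rate` and `transcribe` are hypothetical helper names. It assumes `torch` and `transformers` are installed and that the audio is already a mono float waveform.

```python
EXPECTED_SAMPLE_RATE = 16_000  # the README asks for 16 kHz input


def check_sample_rate(sample_rate: int) -> None:
    """Raise ValueError if the audio is not at the rate the model expects."""
    if sample_rate != EXPECTED_SAMPLE_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLE_RATE} Hz audio, got {sample_rate} Hz; "
            "resample the input first"
        )


def transcribe(waveform, sample_rate: int, model_id: str) -> str:
    """Greedy CTC decoding without a language model (illustrative only).

    waveform: 1-D sequence of floats holding mono audio samples.
    model_id: placeholder for the Hub repository id of this model.
    """
    # Imported lazily so the sample-rate check works without these packages.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    check_sample_rate(sample_rate)

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # The processor normalises the waveform and returns a batch of size one.
    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Pick the most likely token per frame; batch_decode collapses
    # repeated tokens and removes CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe(samples, 16000, model_id)` with the model's actual repository id would return a single transcript string. Audio at any other rate is rejected by `check_sample_rate` rather than silently producing poor output.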