patrickvonplaten committed on
Commit
54e90f0
1 Parent(s): 0ab3fb9

Update README.md

  1. README.md +4 -0
README.md CHANGED
@@ -17,6 +17,10 @@ should probably proofread and complete it, then remove this comment. -->
 
  # Wav2vec2-xls-r-phoneme-300m-sv
 
+ **Note**: The tokenizer was created from the official Swedish phoneme vocabulary as defined here: https://github.com/microsoft/UniSpeech/blob/main/UniSpeech/examples/unispeech/data/sv/phonesMatches_reduced.json
+
+ One can simply download the file, rename it to `vocab.json`, and load it with `Wav2Vec2PhonemeCTCTokenizer.from_pretrained("./directory/with/vocab.json/")`.
+
  This model is a fine-tuned version of [wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the COMMON_VOICE - SV-SE dataset.
 
  It achieves the following results on the evaluation set:
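
The note added in this commit describes three steps: download the phoneme file, rename it to `vocab.json`, and point `Wav2Vec2PhonemeCTCTokenizer.from_pretrained` at the containing directory. A minimal sketch of those steps follows. The helper names (`to_raw_url`, `install_vocab`, `download_vocab`, `load_tokenizer`) and the directory `./sv_phoneme_tokenizer` are hypothetical, and the raw-download URL is an assumption derived from GitHub's usual `blob` to `raw.githubusercontent.com` pattern rather than something stated in the README.

```python
import shutil
import urllib.request
from pathlib import Path

# URL quoted in the README note (GitHub "blob" page, not the raw file).
BLOB_URL = (
    "https://github.com/microsoft/UniSpeech/blob/main/UniSpeech/examples/"
    "unispeech/data/sv/phonesMatches_reduced.json"
)


def to_raw_url(blob_url: str) -> str:
    """Turn a GitHub blob URL into a raw-file URL (assumed URL pattern)."""
    raw = blob_url.replace(
        "https://github.com/", "https://raw.githubusercontent.com/", 1
    )
    return raw.replace("/blob/", "/", 1)


def install_vocab(downloaded_file: Path, target_dir: Path) -> Path:
    """Copy an already-downloaded phoneme file into target_dir as vocab.json."""
    target_dir.mkdir(parents=True, exist_ok=True)
    vocab_path = target_dir / "vocab.json"
    shutil.copyfile(downloaded_file, vocab_path)
    return vocab_path


def download_vocab(target_dir: Path, blob_url: str = BLOB_URL) -> Path:
    """Download the phoneme file straight to target_dir/vocab.json."""
    target_dir.mkdir(parents=True, exist_ok=True)
    vocab_path = target_dir / "vocab.json"
    urllib.request.urlretrieve(to_raw_url(blob_url), str(vocab_path))
    return vocab_path


def load_tokenizer(target_dir: Path):
    """Load the tokenizer from a directory that contains vocab.json."""
    # Imported lazily so the file helpers work without transformers installed.
    # The phoneme tokenizer may also need the `phonemizer` package.
    from transformers import Wav2Vec2PhonemeCTCTokenizer

    return Wav2Vec2PhonemeCTCTokenizer.from_pretrained(str(target_dir))


# Example usage (requires network access and transformers):
#   directory = Path("./sv_phoneme_tokenizer")  # hypothetical directory name
#   download_vocab(directory)
#   tokenizer = load_tokenizer(directory)
```

If the file was already fetched by hand, `install_vocab` covers the "rename it to `vocab.json`" step without touching the network.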