not-tanh committed on
Commit b37ba76
1 Parent(s): f519008

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -42,7 +42,7 @@ import torchaudio
  from datasets import load_dataset
  from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
 
- test_dataset = load_dataset("common_voice", "vi", split="test") #TODO: replace {lang_id} in your language code here. Make sure the code is one of the *ISO codes* of [this](https://huggingface.co/languages) site.
+ test_dataset = load_dataset("common_voice", "vi", split="test")
 
  processor = Wav2Vec2Processor.from_pretrained("not-tanh/wav2vec2-large-xlsr-53-vietnamese")
  model = Wav2Vec2ForCTC.from_pretrained("not-tanh/wav2vec2-large-xlsr-53-vietnamese")
@@ -71,7 +71,7 @@ print("Reference:", test_dataset["sentence"][:2])
 
  ## Evaluation
 
- The model can be evaluated as follows on the {language} test data of Common Voice. # TODO: replace #TODO: replace language with your {language}, *e.g.* French
+ The model can be evaluated as follows on the Vietnamese test data of Common Voice.
 
 
  ```python
@@ -124,6 +124,6 @@ print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"],
  ## Training
  ## TODO
 
- The Common Voice `train`, `validation`, and `vivos` datasets were used for training
+ The Common Voice `train`, `validation`, the VIVOS and FOSD datasets were used for training
 
- The script used for training can be found ... # TODO: fill in a link to your training script here. If you trained your model in a colab, simply fill in the link here. If you trained the model locally, it would be great if you could upload the training script on github and paste the link here.
+ The script used for training can be found ... # TODO