anuragshas committed
Commit 6c4f40b
1 Parent(s): 9ad28b6

Update README.md

Files changed (1)
  1. README.md +54 -3
README.md CHANGED
@@ -6,11 +6,28 @@ tags:
 - automatic-speech-recognition
 - mozilla-foundation/common_voice_8_0
 - generated_from_trainer
+- robust-speech-event
 datasets:
-- common_voice
+- mozilla-foundation/common_voice_8_0
+metrics:
+- wer
 model-index:
-- name: ''
-  results: []
+- name: XLS-R-1B - Hindi
+  results:
+  - task:
+      name: Automatic Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Common Voice 8
+      type: mozilla-foundation/common_voice_8_0
+      args: hi
+    metrics:
+    - name: Test WER
+      type: wer
+      value: 38.892
+    - name: Test CER
+      type: cer
+      value: 19.665
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -88,3 +105,37 @@ The following hyperparameters were used during training:
 - Pytorch 1.10.1+cu102
 - Datasets 1.17.1.dev0
 - Tokenizers 0.11.0
+
+#### Evaluation Commands
+1. To evaluate on `mozilla-foundation/common_voice_8_0` with split `test`
+
+```bash
+python eval.py --model_id anuragshas/wav2vec2-xls-r-1b-hi-with-lm --dataset mozilla-foundation/common_voice_8_0 --config hi --split test
+```
+
+
+### Inference With LM
+
+```python
+import torch
+from datasets import load_dataset
+from transformers import AutoModelForCTC, AutoProcessor
+import torchaudio.functional as F
+model_id = "anuragshas/wav2vec2-xls-r-1b-hi-with-lm"
+sample_iter = iter(load_dataset("mozilla-foundation/common_voice_8_0", "hi", split="test", streaming=True, use_auth_token=True))
+sample = next(sample_iter)
+resampled_audio = F.resample(torch.tensor(sample["audio"]["array"]), 48_000, 16_000).numpy()
+model = AutoModelForCTC.from_pretrained(model_id)
+processor = AutoProcessor.from_pretrained(model_id)
+input_values = processor(resampled_audio, return_tensors="pt").input_values
+with torch.no_grad():
+    logits = model(input_values).logits
+transcription = processor.batch_decode(logits.numpy()).text
+# => "तुम्हारे पास तीन महीने बचे हैं"
+```
+
+### Eval results on Common Voice 8 "test" (WER):
+
+| Without LM | With LM (run `./eval.py`) |
+|---|---|
+| 50.8 | 38.892 |
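
Editor's note: the "Without LM" column in the table added above corresponds to plain greedy (argmax) CTC decoding. The commit does not include that snippet, so the following is only a minimal sketch; it assumes the `model`, `processor`, and `input_values` variables from the "Inference With LM" example are already defined, and that `processor` is a `Wav2Vec2ProcessorWithLM` whose underlying tokenizer handles LM-free decoding.

```python
# Sketch: greedy (argmax) CTC decoding without the n-gram LM, i.e. the
# "Without LM" setting of the table above. Reuses `model`, `processor`,
# and `input_values` from the "Inference With LM" snippet in this commit.
import torch

with torch.no_grad():
    logits = model(input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)
# `processor.tokenizer` is the plain CTC tokenizer inside Wav2Vec2ProcessorWithLM,
# so no language model is consulted here.
transcription_without_lm = processor.tokenizer.batch_decode(predicted_ids)[0]
print(transcription_without_lm)
```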
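
Editor's note: the `eval.py` script invoked by the evaluation command is not part of this commit. As a rough illustration of what the reported WER/CER figures measure, the metrics can be computed over decoded test-set transcriptions roughly as follows; the `predictions` and `references` lists shown here are placeholder assumptions, not data from the repository.

```python
# Sketch of the WER/CER computation behind the reported numbers; this is NOT the
# repository's eval.py. `predictions` and `references` are assumed to be lists of
# decoded and ground-truth transcriptions for the Common Voice 8.0 "hi" test split.
from datasets import load_metric  # the "wer" metric uses the jiwer package under the hood

wer = load_metric("wer")
cer = load_metric("cer")

predictions = ["तुम्हारे पास तीन महीने बचे हैं"]  # placeholder example
references = ["तुम्हारे पास तीन महीने बचे हैं"]   # placeholder example

print("WER (%):", 100 * wer.compute(predictions=predictions, references=references))
print("CER (%):", 100 * cer.compute(predictions=predictions, references=references))
```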