gchhablani committed on
Commit c30dc58
1 Parent(s): edf773e

Update README.md

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
 - xlsr-fine-tuning-week
 license: apache-2.0
 model-index:
-- name: GChhablani XLSR Wav2Vec2 Large 53 Marathi #TODO: replace {human_readable_name} with a name of your model as it should appear on the leaderboard. It could be something like `Elgeish XLSR Wav2Vec2 Large 53`
+- name: GChhablani XLSR Wav2Vec2 Large 53 Marathi
   results:
   - task:
       name: Speech Recognition
@@ -25,7 +25,7 @@ model-index:
       value: 14.53
 ---
 
-# Wav2Vec2-Large-XLSR-53-Marathi #TODO: replace language with your {language}, *e.g.* French
+# Wav2Vec2-Large-XLSR-53-Marathi
 
 Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Marthi using the [OpenSLR SLR64](http://openslr.org/64/) dataset.
 When using this model, make sure that your speech input is sampled at 16kHz.
@@ -48,7 +48,7 @@ model = Wav2Vec2ForCTC.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr")
 resampler = torchaudio.transforms.Resample(48_000, 16_000) # The original data was with 48,000 sampling rate. You can change it according to your input.
 
 # Preprocessing the datasets.
-# We need to read the aduio files as arrays
+# We need to read the audio files as arrays
 def speech_file_to_array_fn(batch):
     speech_array, sampling_rate = torchaudio.load(batch["path"])
     batch["speech"] = resampler(speech_array).squeeze().numpy()
@@ -120,6 +120,5 @@ print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"],
 
 ## Training
 
-90% of the OpenSLR Marathi dataset was used for training. # TODO: adapt to state all the datasets that were used for training.
-
+90% of the OpenSLR Marathi dataset was used for training.
 The colab notebook used for training can be found [here](https://colab.research.google.com/drive/1_BbLyLqDUsXG3RpSULfLRjC6UY3RjwME?usp=sharing)
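For context on the snippets visible in the hunks above, here is a minimal transcription sketch. Only the model id `gchhablani/wav2vec2-large-xlsr-mr` and the 48 kHz to 16 kHz resampling come from this diff; the use of `Wav2Vec2Processor`, the `example.wav` path, and the decoding steps are assumptions based on the standard Transformers XLSR usage pattern, not shown in full by this commit.

```python
# Minimal usage sketch, assuming the standard Transformers Wav2Vec2 API.
# "example.wav" is a hypothetical input file.
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr")
model = Wav2Vec2ForCTC.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr")

# The OpenSLR SLR64 recordings are 48 kHz; the model expects 16 kHz input.
resampler = torchaudio.transforms.Resample(48_000, 16_000)

speech_array, sampling_rate = torchaudio.load("example.wav")
speech = resampler(speech_array).squeeze().numpy()

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits

predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```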
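The last hunk header shows the reported 14.53 WER being printed via `wer.compute(...)`. A hedged sketch of that computation is below; the `load_metric("wer")` call and the toy sentence lists are assumptions, since the commit only shows the final print line and not the full evaluation loop.

```python
# Hedged evaluation sketch; only the print format string appears in the diff.
from datasets import load_metric

wer = load_metric("wer")

# Hypothetical predictions and references standing in for result["pred_strings"]
# and the test split's target sentences.
predictions = ["नमस्कार", "हे एक उदाहरण आहे"]
references = ["नमस्कार", "हे एक चांगले उदाहरण आहे"]

print("WER: {:2f}".format(100 * wer.compute(predictions=predictions, references=references)))
```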