sumedh committed on
Commit
c3645dc
1 Parent(s): a34a2d0

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -27,7 +27,7 @@ model-index:
27
 
28
  # Wav2Vec2-Large-XLSR-53-Marathi
29
  Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Marathi using the [OpenSLR SLR64](http://openslr.org/64/) dataset. When using this model, make sure that your speech input is sampled at 16kHz. This data contains only female voices but it works well for male voices too.
30
- **WER on the Test Set**: 12.70 %
31
  ## Usage
32
  The model can be used directly without a language model as follows, given that your dataset has Marathi `actual_text` and `path_in_folder` columns:
33
  ```python
@@ -68,7 +68,7 @@ processor = Wav2Vec2Processor.from_pretrained("sumedh/wav2vec2-large-xlsr-marath
68
  model = Wav2Vec2ForCTC.from_pretrained("sumedh/wav2vec2-large-xlsr-marathi")
69
  model.to("cuda")
70
 
71
- chars_to_ignore_regex = '[\\\\\\\\,\\\\\\\\?\\\\\\\\.\\\\\\\\!\\\\\\\\-\\\\\\\\;\\\\\\\\:\\\\\\\\"\\\\\\\\“]'
72
  resampler = torchaudio.transforms.Resample(48_000, 16_000)
73
  # Preprocessing the datasets. We need to read the aduio files as arrays
74
  def speech_file_to_array_fn(batch):
@@ -90,7 +90,7 @@ print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"],
90
 
91
  ## Training
92
  Train-Test ratio was 90:10.
93
- The colab notebook used for training can be found [here](https://colab.research.google.com/drive/1wX46fjExcgU5t3AsWhSPTipWg_aMDg2f?usp=sharing).
94
 
95
  ## Training Config and Summary
96
- weights-and-biases run profile [here](https://wandb.ai/wandb/xlsr/runs/3itdhtb8/overview?workspace=user-sumedhkhodke)
 
27
 
28
  # Wav2Vec2-Large-XLSR-53-Marathi
29
  Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Marathi using the [OpenSLR SLR64](http://openslr.org/64/) dataset. When using this model, make sure that your speech input is sampled at 16kHz. This data contains only female voices but it works well for male voices too.
30
+ **WER (Word Error Rate) on the Test Set**: 12.70 %
31
  ## Usage
32
  The model can be used directly without a language model as follows, given that your dataset has Marathi `actual_text` and `path_in_folder` columns:
33
  ```python
 
68
  model = Wav2Vec2ForCTC.from_pretrained("sumedh/wav2vec2-large-xlsr-marathi")
69
  model.to("cuda")
70
 
71
+ chars_to_ignore_regex = '[\,\?\.\!\-\;\:\"\“]'
72
  resampler = torchaudio.transforms.Resample(48_000, 16_000)
73
  # Preprocessing the datasets. We need to read the aduio files as arrays
74
  def speech_file_to_array_fn(batch):
 
90
 
91
  ## Training
92
  Train-Test ratio was 90:10.
93
+ Colab training notebook can be found [here](https://colab.research.google.com/drive/1wX46fjExcgU5t3AsWhSPTipWg_aMDg2f?usp=sharing).
94
 
95
  ## Training Config and Summary
96
+ weights-and-biases run summary [here](https://wandb.ai/wandb/xlsr/runs/3itdhtb8/overview?workspace=user-sumedhkhodke)
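
Note on the `chars_to_ignore_regex` change: the simplified character class can be checked in isolation. The sketch below assumes the pattern is applied with `re.sub` and followed by `.lower()`, as is common in XLSR evaluation scripts; the helper name `normalize_text` is hypothetical and not part of the README. A raw string is used here so that Python does not warn about unrecognized escapes such as `\,`.

```python
import re

# Same character class as the "+" line in this commit, written as a raw string.
chars_to_ignore_regex = r'[\,\?\.\!\-\;\:\"\“]'


def normalize_text(text: str) -> str:
    """Hypothetical helper: strip the ignored punctuation, then lowercase."""
    return re.sub(chars_to_ignore_regex, "", text).lower()


if __name__ == "__main__":
    # Comma and exclamation mark are removed; Devanagari is unaffected by lower().
    print(normalize_text("नमस्ते, जग!"))
    # Hyphen, colon, double quotes, and question mark are removed.
    print(normalize_text('Hello-World: "ok"?'))
```

The class contains the left curly quote `“` but not the right one `”`, so a closing curly quote in a transcript would pass through unchanged.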