tanmaylaud/wav2vec2-large-xlsr-hindi-marathi

Automatic Speech Recognition

xlsr-fine-tuning-week


tanmaylaud committed on Mar 29, 2021

Commit 863a64d • 1 Parent(s): cb7ac4f

Update README.md

Files changed (1)

README.md +33 -3

@@ -1,5 +1,35 @@
-# Wav2Vec2-Large-XLSR-53-Marathi
-### Fine-tuned facebook/wav2vec2-large-xlsr-53 on Marathi using the OpenSLR SLR64 dataset and InterSpeech 2021 Marathi datasets. Note that this data OpenSLR contains only female voices. Please keep this in mind before using the model for your task. When using this model, make sure that your speech input is sampled at 16kHz.
+---
+language: mr
+datasets:
+- openslr
+- interspeech_2021_asr
+metrics:
+- wer
+tags:
+- audio
+- automatic-speech-recognition
+- speech
+- xlsr-fine-tuning-week
+- hindi
+- marathi
+license: apache-2.0
+model-index:
+- name: XLSR Wav2Vec2 Large 53 Hindi-Marathi by Tanmay Laud
+  results:
+  - task:
+      name: Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: OpenSLR hi, OpenSLR mr
+      type: openslr, interspeech_2021_asr
+    metrics:
+       - name: Test WER
+         type: wer
+         value: 60.80
+---
+# Wav2Vec2-Large-XLSR-53-Hindi-Marathi
+### Fine-tuned facebook/wav2vec2-large-xlsr-53 on Hindi and Marathi using the OpenSLR SLR64 datasets. Note that this data OpenSLR contains only female voices. Please keep this in mind before using the model for your task. When using this model, make sure that your speech input is sampled at 16kHz.
 ## Usage
  The model can be used directly (without a language model) as follows, assuming you have a dataset with Marathi text and audio_path fields:
@@ -51,7 +81,7 @@ processor = Wav2Vec2Processor.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr
 model = Wav2Vec2ForCTC.from_pretrained("gchhablani/wav2vec2-large-xlsr-mr-3")
 model.to("cuda")
-chars_to_ignore_regex = '[\,\?\.\!\-\;\:\"\“\%\‘\”\�\–\…]'
+chars_to_ignore_regex = '[\\,\\?\\.\\!\\-\\;\\:\\"\\“\\%\\‘\\”\\�\\–\\…]'
 # Preprocessing the datasets.
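
Note on the `chars_to_ignore_regex` change: the only code edit in this commit doubles the backslashes. In an ordinary (non-raw) Python string literal, `'\\,'` denotes the same two characters as the raw string `r'\,'`, so the pattern should match the same punctuation as before. The sketch below checks that equivalence and shows one way such a pattern is commonly applied to transcripts before scoring. `raw_equivalent` and `normalize_sentence` are hypothetical names, and the lowercasing and trimming steps are assumptions, not something this diff shows.

```python
import re

# Pattern as written in the updated README. In a normal string literal each
# doubled backslash becomes a single backslash in the regex source.
chars_to_ignore_regex = '[\\,\\?\\.\\!\\-\\;\\:\\"\\“\\%\\‘\\”\\�\\–\\…]'

# The same pattern spelled as a raw string, for comparison (assumed name).
raw_equivalent = r'[\,\?\.\!\-\;\:\"\“\%\‘\”\�\–\…]'


def normalize_sentence(sentence: str) -> str:
    """Hypothetical helper: drop ignored punctuation, lowercase, trim."""
    return re.sub(chars_to_ignore_regex, "", sentence).lower().strip()


if __name__ == "__main__":
    # Both spellings should yield the identical regex source string.
    print(chars_to_ignore_regex == raw_equivalent)
    print(normalize_sentence("नमस्ते, जग!"))
```

Because removal leaves the surrounding spaces in place, a dash between two spaces produces a double space; collapsing whitespace afterwards would be a separate step if the evaluation needs it.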