Lingalingeswaran/whisper-small-ta

Automatic Speech Recognition

Generated from Trainer

Inference Endpoints


Lingalingeswaran committed on Oct 24

Commit 36509f3 • 1 Parent(s): 52b1014

Update README.md

Files changed (1)

README.md +8 -3

README.md CHANGED

@@ -40,15 +40,20 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure

 ## Model description
+This is a Whisper model fine-tuned for Tamil on the Common Voice 11.0 dataset. It targets speech-to-text transcription and language identification, and suits applications where Tamil is the primary language. Fine-tuning focused on reducing the transcription error rate for Tamil.
 ## Intended uses & limitations
+Intended Uses:
+Speech-to-text transcription in Tamil
+Limitations:
+May perform worse on dialects that are under-represented in the Common Voice dataset.
+Word Error Rate (WER) is likely to be higher in noisy environments and for speakers whose accents are not covered by the training data.
+The model is optimized for Tamil; performance on other languages may be suboptimal.
 ## Training and evaluation data
+The training data consists of Tamil voice recordings from the Mozilla Foundation's Common Voice 11.0 dataset (`mozilla-foundation/common_voice_11_0`). The dataset is a crowd-sourced collection of transcribed speech that covers a range of speaker accents, age groups, and speech styles.
 ## Training procedure
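
A note on the Word Error Rate (WER) mentioned under the limitations in this diff: the metric is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal sketch in plain Python, assuming simple whitespace tokenization and no text normalization. The function name `word_error_rate` is made up for illustration, and this is not necessarily how the WER reported for this model was computed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no lowercasing, punctuation stripping, or other normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("cat" -> "hat") and one deletion ("down")
    # over four reference words.
    print(word_error_rate("the cat sat down", "the hat sat"))  # 0.5
```

Because the denominator is the reference length, a hypothesis with many inserted words can yield a WER above 1.0. Published Whisper evaluations often normalize text before scoring, so values from this sketch may differ from reported figures.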