Commit 36509f3 by Lingalingeswaran
1 parent: 52b1014

Update README.md

Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -40,15 +40,20 @@ It achieves the following results on the evaluation set:
 
  ## Model description
 
- More information needed
+ This Whisper model has been fine-tuned for Tamil on the Common Voice 11.0 dataset. It is intended for tasks such as speech-to-text transcription and language identification in applications where Tamil is the primary language. Fine-tuning aimed to reduce the transcription error rate and improve overall accuracy for Tamil.
 
  ## Intended uses & limitations
+ Intended uses:
+ Speech-to-text transcription in Tamil.
 
- More information needed
+ Limitations:
+ May not perform as well on languages or dialects that are not well represented in the Common Voice dataset.
+ Word Error Rate (WER) is likely to be higher in noisy environments or for speakers whose accents are not covered by the training data.
+ The model is optimized for Tamil; performance in other languages may be suboptimal.
 
  ## Training and evaluation data
 
- More information needed
+ The training data consists of Tamil voice recordings from the Mozilla Foundation's Common Voice 11.0 dataset, a crowd-sourced collection of transcribed speech that covers a range of speaker accents, age groups, and speech styles.
 
  ## Training procedure
 
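
The limitations above refer to Word Error Rate (WER) without defining it. As a rough illustration only, WER is commonly computed as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below assumes plain whitespace tokenization and no text normalization; the function name `word_error_rate` is hypothetical and is not taken from this model's evaluation code, which may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly,
    with no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference length, a hypothesis with many inserted words can yield a WER above 1.0.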