Vaani | BRAHMAI RESEARCH

Model Type: Automatic Speech Recognition (ASR)
Base Model: whisper-small
Dataset: Mozilla Common Voice 11.0 Hindi dataset

Vaani (small - hindi) is a fine-tuned version of the whisper-small model by OpenAI, specifically optimized for Hindi speech recognition. It was fine-tuned on the Mozilla Common Voice 11.0 Hindi dataset by BRAHMAI Research. The model demonstrates strong performance on various Hindi speech recognition tasks and can be run locally on GPUs with as little as 4GB of memory.

Intended Use: The primary intended use case for vaani_small is automatic speech recognition and transcription of Hindi audio data. It can be employed in a wide range of applications that require accurate Hindi speech-to-text conversion, such as captioning, speech analytics, voice assistants, and accessibility tools.

Limitations and Biases: Although the model is optimized for Hindi speech recognition, its accuracy may vary across accents, dialects, and demographic groups within the Hindi-speaking population. Its biases and limitations are likely inherited from the training data (Mozilla Common Voice 11.0 Hindi dataset). Evaluate the model on your own use case and data before deployment.
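
One common way to carry out such an evaluation is to compute the word error rate (WER) of the model's transcripts against reference transcripts for your own audio. The following is a minimal sketch, not the evaluation procedure used by BRAHMAI Research: the function names are illustrative, words are split on whitespace only, and no text normalization (punctuation, numerals, Unicode forms) is applied, which you would likely want to add for real Hindi data.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words


if __name__ == "__main__":
    # Illustrative strings only, not model output.
    refs = ["मैं घर जा रहा हूँ"]
    hyps = ["मैं घर जा रहा"]
    print(word_error_rate(refs, hyps))
```

Computing WER separately for each accent, dialect, or speaker group in your data makes it easier to spot the kind of uneven performance described above.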

Training Data: The model was fine-tuned on the Mozilla Common Voice 11.0 Hindi dataset, an open-source dataset of crowdsourced audio recordings and transcriptions in Hindi. As with any crowdsourced corpus, potential biases or ethical concerns associated with the training data should be examined carefully.

Hardware and Software Requirements: vaani_small can be run locally on GPUs with at least 4GB of memory. It is recommended to use the Transformers library from Hugging Face for inference and deployment.
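
A minimal inference sketch with the Transformers `pipeline` API might look like the following. The value of `MODEL_ID` is a hypothetical placeholder, since this card does not state where the checkpoint is published; `pick_device` and `transcribe` are illustrative helper names. The sketch assumes `torch` and `transformers` are installed, that `ffmpeg` is available for decoding audio files, and that the installed Transformers version accepts `language` and `task` in `generate_kwargs` for Whisper models.

```python
import sys

# Hypothetical placeholder: replace with the actual Hub ID or local path of the checkpoint.
MODEL_ID = "path/to/vaani_small"


def pick_device(cuda_available: bool) -> int:
    """Return the pipeline device index: GPU 0 if available, otherwise CPU (-1)."""
    return 0 if cuda_available else -1


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a Hindi audio file and return the recognized text."""
    # Imported lazily so the helpers above can be used without the heavy dependencies.
    import torch
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=pick_device(torch.cuda.is_available()),
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )
    result = asr(
        audio_path,
        generate_kwargs={"language": "hindi", "task": "transcribe"},
    )
    return result["text"]


if __name__ == "__main__":
    # Usage: python transcribe.py recording.wav
    if len(sys.argv) > 1:
        print(transcribe(sys.argv[1]))
```

The sketch falls back to CPU when no GPU is detected, which is slower but useful for checking that the setup works.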