metadata
license: cc-by-nc-4.0
language:
- bn
library_name: nemo
pipeline_tag: automatic-speech-recognition
Hishab BN FastConformer
Hishab BN FastConformer is a FastConformer-based model trained on the MegaBNSpeech corpus (~18K hours).
Usage
This model can be used to transcribe Bangla audio, and it can also serve as a pre-trained model for fine-tuning on custom datasets with the NeMo framework.
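Fine-tuning in NeMo usually starts from a JSON-lines manifest in which each line describes one utterance with the keys `audio_filepath`, `duration` (in seconds), and `text`. The following is a minimal sketch of writing such a manifest for a custom Bangla dataset. The helper name `write_manifest` and the file names are hypothetical and used only for illustration; check the NeMo documentation for the exact fields your training configuration expects.

```python
import json
from pathlib import Path


def write_manifest(entries, manifest_path):
    """Write (audio_filepath, duration_seconds, text) tuples as a JSON-lines manifest.

    Hypothetical helper: one JSON object per line, one line per utterance.
    """
    path = Path(manifest_path)
    with path.open("w", encoding="utf-8") as f:
        for audio_filepath, duration, text in entries:
            record = {
                "audio_filepath": str(audio_filepath),
                "duration": float(duration),
                "text": text,
            }
            # ensure_ascii=False keeps the Bangla transcript readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path
```

A call such as `write_manifest([("clip_0001.wav", 4.2, "আজ সরকারি ছুটির দিন")], "train_manifest.json")` would produce a one-line manifest; the clip name and duration here are placeholders.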
Installation
To install NeMo, see the NeMo documentation.
Inference
Download test_bn_fastconformer.wav
# pip install -q 'nemo_toolkit[asr]'
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("hishab/hishab_bn_fastconformer")
audio_file = "test_bn_fastconformer.wav"
transcriptions = asr_model.transcribe([audio_file])
print(transcriptions)
# ['আজ সরকারি ছুটির দিন দেশের সব শিক্ষা প্রতিষ্ঠান সহ সরকারি আধা সরকারি স্বায়ত্তশাসিত প্রতিষ্ঠান ও ভবনে জাতীয় পতাকা অর্ধনমিত ও কালো পতাকা উত্তোলন করা হয়েছে']
Colab notebook for inference: Bangla FastConformer Infer.ipynb
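Depending on the installed NeMo version and the decoder type, `transcribe` may return plain strings, hypothesis objects, or a tuple whose first element holds the best hypotheses. The sketch below normalizes these shapes to a list of strings. The helper name `to_texts` is hypothetical, and the `.text` attribute on hypothesis-like objects is an assumption to verify against your NeMo release.

```python
def to_texts(results):
    """Return plain transcript strings from a transcribe() result.

    Hypothetical helper; the handled shapes are assumptions about
    what different NeMo versions may return.
    """
    # Some versions return (best_hypotheses, all_hypotheses) as a tuple.
    if isinstance(results, tuple):
        results = results[0]
    texts = []
    for item in results:
        if isinstance(item, str):
            texts.append(item)
        else:
            # Assumption: hypothesis-like objects expose the transcript as `.text`.
            texts.append(item.text)
    return texts
```

With the snippet above, `print(to_texts(transcriptions))` would print a list of strings regardless of which of these shapes was returned.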
Training Datasets
Channel Category | Hours |
---|---|
News | 17,640.00 |
Talkshow | 688.82 |
Vlog | 0.02 |
Crime Show | 4.08 |
Total | 18,332.92 |
Training Details
The training set comprises 17,640 hours of news channel content, 688.82 hours of talk shows, 0.02 hours of vlogs, and 4.08 hours of crime shows, for a total of 18,332.92 hours.
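As a quick check on the figures above, the per-category hours can be summed and converted to shares of the corpus. This is only an arithmetic sketch over the numbers reported in the table; the variable names are illustrative.

```python
# Hours per category, copied from the Training Datasets table.
hours = {
    "News": 17640.00,
    "Talkshow": 688.82,
    "Vlog": 0.02,
    "Crime Show": 4.08,
}

# Total hours, rounded to two decimals to match the table's precision.
total = round(sum(hours.values()), 2)

# Percentage share of each category in the full corpus.
shares = {name: round(100 * h / total, 2) for name, h in hours.items()}

print(f"Total hours: {total}")
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The sum matches the reported total of 18,332.92 hours, and news content makes up roughly 96% of the corpus.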