
metadata

license: cc-by-nc-4.0
language:
  - bn
library_name: nemo
pipeline_tag: automatic-speech-recognition

Hishab BN FastConformer

Hishab BN FastConformer is a FastConformer-based model trained on the ~18K-hour MegaBNSpeech corpus.

Usage

This model can be used to transcribe Bangla audio, and it can also serve as a pre-trained model for fine-tuning on custom datasets with the NeMo framework.
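For fine-tuning, NeMo's ASR tooling reads training data from a manifest file with one JSON object per line, giving the audio path, its duration in seconds, and the transcript. The sketch below builds such a manifest from (audio path, transcript) pairs using only the Python standard library. The helper names `wav_duration_seconds` and `write_manifest` are hypothetical, and the code assumes the audio is uncompressed PCM WAV; it is an illustration, not the authors' data pipeline.

```python
import json
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def write_manifest(samples, manifest_path):
    """Write one JSON line per sample: audio_filepath, duration, text.

    `samples` is an iterable of (audio_path, transcript) pairs.
    """
    with Path(manifest_path).open("w", encoding="utf-8") as out:
        for audio_path, text in samples:
            entry = {
                "audio_filepath": str(audio_path),
                "duration": round(wav_duration_seconds(audio_path), 3),
                "text": text,
            }
            # ensure_ascii=False keeps Bangla text readable in the file.
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

A call such as `write_manifest([("clip_001.wav", "...")], "train_manifest.json")` (file names are placeholders) produces a file that can then be referenced from a NeMo training configuration.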

Installation

To install NeMo, see the NeMo documentation.

Inference

Download test_bn_fastconformer.wav

# Install NeMo with the ASR extras first:
# pip install -q 'nemo_toolkit[asr]'

import nemo.collections.asr as nemo_asr

# Download the checkpoint from the Hugging Face Hub and load it.
asr_model = nemo_asr.models.ASRModel.from_pretrained("hishab/hishab_bn_fastconformer")

# transcribe() takes a list of audio file paths.
audio_file = "test_bn_fastconformer.wav"
transcriptions = asr_model.transcribe([audio_file])
print(transcriptions)
# ['আজ সরকারি ছুটির দিন দেশের সব শিক্ষা প্রতিষ্ঠান সহ সরকারি আধা সরকারি স্বায়ত্তশাসিত প্রতিষ্ঠান ও ভবনে জাতীয় পতাকা অর্ধনমিত ও কালো পতাকা উত্তোলন করা হয়েছে']

Colab notebook for inference: Bangla FastConformer Infer.ipynb
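To transcribe several recordings at once, the same `transcribe` call can be given a longer list of paths. The helper below is a minimal sketch: `transcribe_folder` is a hypothetical name, and it assumes only that the model object exposes `transcribe(list_of_paths)` as in the snippet above. Depending on the NeMo version, results may be plain strings or objects carrying a `.text` attribute, so the sketch accepts either; check the behavior of your installed version.

```python
from pathlib import Path


def transcribe_folder(asr_model, folder, pattern="*.wav"):
    """Transcribe every file in `folder` matching `pattern`.

    Returns a dict mapping file name to transcript.
    """
    audio_paths = sorted(Path(folder).glob(pattern))
    if not audio_paths:
        return {}
    results = asr_model.transcribe([str(p) for p in audio_paths])
    # Assumption: each result is either a string or an object with .text.
    texts = [getattr(result, "text", result) for result in results]
    return {path.name: text for path, text in zip(audio_paths, texts)}
```

With the model loaded as shown earlier, `transcribe_folder(asr_model, "recordings/")` would return one transcript per WAV file in that (placeholder) directory.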

Training Datasets

Channel Category	Hours
News	17,640.00
Talkshow	688.82
Vlog	0.02
Crime Show	4.08
Total	18,332.92

Training Details

The training dataset comprises 17.64k hours of news channel content, 688.82 hours of talk shows, 0.02 hours of vlogs, and 4.08 hours of crime shows.
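As a quick sanity check on these figures, the short sketch below sums the per-category hours from the table and computes each category's share of the corpus. The numbers are copied from the table; the variable names are illustrative only.

```python
# Hours per channel category, as listed in the Training Datasets table.
hours_by_category = {
    "News": 17640.00,
    "Talkshow": 688.82,
    "Vlog": 0.02,
    "Crime Show": 4.08,
}

# Round to two decimals to match the precision used in the table.
total_hours = round(sum(hours_by_category.values()), 2)

# Percentage of the corpus contributed by each category.
shares = {
    name: round(100 * hours / total_hours, 2)
    for name, hours in hours_by_category.items()
}

print(f"Total: {total_hours} hours")
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The computed total agrees with the 18,332.92 hours reported in the table, and news content accounts for roughly 96% of the corpus.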
