Assalamu alaikum my brother <3, raw transcript repetition problem
I tried to use this model, which is more accurate, masha'a Allah,
to transcribe recitation audio into segmented word-by-word timing,
but when the reciter repeats a phrase, word, or ayah, this model drops the second repetition 50-70% of the time.
Is this a deduplication feature in the ONNX?
This is the repo, but there I am using the yazinsae model ... you can swap model.onnx with yours to run a test and compare the raw transcription.json <3
When I tested the streaming model, it caught all the repetitions, masha'a Allah <3
Wa alaykum as-salam wa rahmatullah. Good observation, and it is not a dedup feature. It is inherent to greedy CTC decoding: when the same tokens repeat back-to-back with no blank frame between them, CTC collapses them into one. A clear pause inserts a blank and the repeat survives; an immediate repeat often does not, so it gets dropped. The streaming model decodes in chunks, which separates the repeats, which is exactly why it catches them (masha'Allah). So for your word-by-word timing use case the streaming model is the right one to use. The next version I am working on should improve this further. Barakallah feek.
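To make the collapse rule concrete, here is a minimal sketch of greedy CTC post-processing on toy per-frame labels. It is an illustration only, not the model's actual decoder: the function name `ctc_greedy_collapse`, the `_` blank symbol, and the frame sequences are all made up for the example.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen only for this illustration


def ctc_greedy_collapse(frames, blank=BLANK):
    """Greedy CTC collapse: merge consecutive duplicate labels, then drop blanks."""
    merged = [label for label, _ in groupby(frames)]
    return [label for label in merged if label != blank]


# The same token repeated with a blank frame in between (a clear pause).
with_pause = ["a", "a", "_", "a", "a"]
# The same token repeated with no blank frame in between (an immediate repeat).
no_pause = ["a", "a", "a", "a"]

print(ctc_greedy_collapse(with_pause))  # ['a', 'a']
print(ctc_greedy_collapse(no_pause))    # ['a']
```

With the blank frame present the repeat survives as two tokens; without it the frames merge into one, which is the behaviour described above.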
Thanks my brother <3 <3
Your offline model's accuracy on the letters within a word is good, masha'a Allah <3