Automatic Speech Recognition
NeMo
Persian
speech
FastConformer
Transducer
CTC
on-device
How to use Reza2kn/shenava-fa-fastconformer-115m with NeMo:

    import nemo.collections.asr as nemo_asr

    asr_model = nemo_asr.models.ASRModel.from_pretrained("Reza2kn/shenava-fa-fastconformer-115m")
    transcriptions = asr_model.transcribe(["file.wav"])
Shenava - FastConformer-Hybrid Large (fa) - de-poisoned Phase B v4
A 115M-parameter EncDecHybridRNNTCTCBPE model (RNNT and CTC heads) for 16 kHz audio. Built for on-device, offline Persian ASR in the VisualEars project.
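Since the model works on 16 kHz audio, it can help to check input files before transcribing them. The sketch below uses only Python's standard `wave` module. The helper name `check_wav_format` is hypothetical, and the mono requirement is an assumption (this card only states the sample rate).

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems found in a WAV file; an empty list means it looks fine.

    expected_rate comes from the 16 kHz figure on this card.
    expected_channels=1 (mono) is an assumption, not stated on the card.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"{channels} channel(s), expected {expected_channels}")
    return problems
```

Files that fail the check would need resampling or downmixing with an audio tool of your choice before being passed to `transcribe`.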
Golden6669 (held-out gold, official Persian normalizer)
| head | WER | CER |
|---|---|---|
| RNNT | 7.29% | 1.63% |
| CTC | 7.92% | 1.87% |
For comparison: the previous best (B2) scored 8.02% WER / 1.82% CER, and cloud Gemini scores 6.49%. This model runs fully offline.
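The WER and CER figures above are edit-distance ratios: the number of word (or character) substitutions, insertions, and deletions divided by the reference length. The following is a minimal sketch of that calculation, not the evaluation script used for this card. It applies no text normalization (the card's numbers rely on an official Persian normalizer that is not reproduced here), and dropping spaces for CER is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference item missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra item in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance, as a fraction of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance; spaces are dropped (an assumption)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)
```

For a whole test set, the usual convention is to sum the edit distances and the reference lengths over all utterances before dividing, rather than averaging per-utterance rates.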
Recipe
De-poisoned 7,417h corpus (crap-classifier cut + gates + telephooney-CTC 534h), Phase A continued (→ 8.73%) + Phase B gold-anchor with 1,420 human corrections (→ 7.29%).
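To put the quoted figures in perspective, the relative error reduction between two stages can be computed directly. This is plain arithmetic on numbers from this card. It assumes the 8.73% Phase A figure is a WER measured on the same held-out set as the final 7.29%, which the card implies but does not state outright.

```python
def relative_reduction(before, after):
    """Fraction by which an error rate dropped, relative to its starting value."""
    return (before - after) / before


# Error rates in percent, as quoted on this card.
steps = {
    "Phase A -> Phase B (RNNT WER)": (8.73, 7.29),
    "B2 -> this model (RNNT WER)": (8.02, 7.29),
    "B2 -> this model (RNNT CER)": (1.82, 1.63),
}

for name, (before, after) in steps.items():
    print(f"{name}: {relative_reduction(before, after):.1%} relative reduction")
```

Under that assumption, Phase B cuts WER by about 16.5% relative to Phase A, and the model improves on B2 by about 9.1% in WER and 10.4% in CER.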