# Model Card for Model ID
Preview release for FOSDEM 2025 at the current training epoch (training is still ongoing).
## Overview
This is a family of low-latency streaming models designed for use on edge devices.
Goal: provide lower latency or higher transcription quality than similarly sized Whisper and other models.
- Languages: English, French, German (7 more languages coming).
## Demos
- Browser Demo (CPU) — runs entirely in the browser using CPU.
- Gradio / Python Demo
## License
The license is still under consideration (likely Coqui). The model is intended to be dual-licensed:
- Free for non-commercial use.
- Affordable license for commercial use.
## Training
- Training is done with a modified k2/Icefall pipeline.
- Inference can be performed with the standard Sherpa project.
- Padding the input audio with silence and normalizing its volume may produce better results.
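The silence-padding and volume-normalization tip above can be sketched as a small preprocessing helper. This is a minimal illustration, not part of the model's pipeline: the function names, the padding length, and the target peak level are all assumptions.

```python
import numpy as np

def pad_with_silence(samples: np.ndarray, sample_rate: int, seconds: float = 0.5) -> np.ndarray:
    """Append `seconds` of silence so the streaming decoder can flush its final tokens.

    The 0.5 s default is an assumption, not a value from the model card.
    """
    pad = np.zeros(int(sample_rate * seconds), dtype=samples.dtype)
    return np.concatenate([samples, pad])

def normalize_volume(samples: np.ndarray, target_peak: float = 0.95) -> np.ndarray:
    """Scale the waveform so its absolute peak sits at `target_peak`.

    Simple peak normalization (a no-op for pure silence); the 0.95 target is an assumption.
    """
    peak = float(np.abs(samples).max())
    if peak == 0.0:
        return samples
    return samples * (target_peak / peak)

# Example: a quiet 1 kHz tone at 16 kHz, normalized and padded before decoding.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
audio = 0.1 * np.sin(2 * np.pi * 1000 * t).astype(np.float32)
audio = normalize_volume(audio)
audio = pad_with_silence(audio, sr, seconds=0.5)
```

The normalized, padded `audio` array can then be fed to whatever inference frontend you use (e.g. a Sherpa stream) in place of the raw waveform.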
## Acknowledgements
Special thanks to the Lhotse, Sherpa, k2, and Icefall teams for their support and tools.