ONNX export for local inference (transcribe-rs / Handy)

#4
by michaeldavidsen - opened

Hi! Thanks for hviske-v5.3. The Danish accuracy looks excellent.

I'd love to run it locally in lightweight Rust-based dictation apps. The transcribe-rs library (used by the Handy dictation app) already ships a working Cohere ASR engine via ONNX Runtime, which matches this model's CohereAsrForConditionalGeneration architecture. It expects these files:

cohere-encoder.int4.onnx (+ .onnx.data)
cohere-decoder.int4.onnx (+ .onnx.data)
tokens.txt
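
To make the layout concrete, here is a minimal sketch of a check that an export folder contains those files. It uses only the Python standard library and is not transcribe-rs code. The helper name `missing_model_files` is hypothetical, and the sidecar names are my assumption that "+ .onnx.data" means the model file name with `.data` appended.

```python
from pathlib import Path

# File names taken from the list above. The ".onnx.data" sidecar names are an
# assumption (model file name + ".data"), not confirmed against transcribe-rs.
EXPECTED_FILES = [
    "cohere-encoder.int4.onnx",
    "cohere-encoder.int4.onnx.data",
    "cohere-decoder.int4.onnx",
    "cohere-decoder.int4.onnx.data",
    "tokens.txt",
]


def missing_model_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from `missing_model_files("path/to/export")` would mean the folder has every file named above; it says nothing about whether the ONNX graphs themselves are compatible.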

Since you already produce Cohere-format ONNX exports (e.g. syvai/cohere-transcribe-diarize), would you consider publishing an int4 ONNX export of hviske-v5.3 in that same layout? That would let it run natively in transcribe-rs/Handy with no Python runtime.

Happy to test on Danish dictation samples and report WER. Thanks!
