Nemotron ASR ONNX Models

ONNX-exported Nemotron-3.5 ASR streaming model (0.6B) for speech recognition.

Variants

Variant     Provider                 Description
cpu/        CPUExecutionProvider     CPU-optimized (fp32 decoder/joint, int4/int8 encoder)
gpu-cuda/   CUDAExecutionProvider    NVIDIA GPU via CUDA (fp32)
gpu-dml/    DmlExecutionProvider     DirectML for Windows GPU (fp32)
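
As a rough illustration of how a caller might choose between these variants, the sketch below maps ONNX Runtime execution provider names to the variant directories listed above. The helper name `pick_variant` and the GPU-first preference order are assumptions made for this example; they are not part of the model repository or of NemotronSpeech.

```python
# Hypothetical helper: maps ONNX Runtime execution provider names to the
# variant directories listed in the Variants table.
VARIANT_BY_PROVIDER = {
    "CUDAExecutionProvider": "gpu-cuda",
    "DmlExecutionProvider": "gpu-dml",
    "CPUExecutionProvider": "cpu",
}

# Assumed preference order: try GPU providers first, fall back to CPU.
PREFERENCE = [
    "CUDAExecutionProvider",
    "DmlExecutionProvider",
    "CPUExecutionProvider",
]


def pick_variant(available_providers):
    """Return (variant_dir, provider) for the first preferred provider present."""
    for provider in PREFERENCE:
        if provider in available_providers:
            return VARIANT_BY_PROVIDER[provider], provider
    raise RuntimeError("no supported execution provider found")


if __name__ == "__main__":
    # Example input; a real caller would pass the providers reported at runtime.
    print(pick_variant(["DmlExecutionProvider", "CPUExecutionProvider"]))
```

With the `onnxruntime` Python package installed, `onnxruntime.get_available_providers()` returns a list that can be passed to this function directly.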

Model Architecture

  • Encoder: Conformer-based audio encoder
  • Decoder: Transformer-based text decoder
  • Joint: Transducer joint network
  • VAD: Silero VAD for voice activity detection
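
To show how the encoder, decoder, and joint components typically interact in a transducer model, here is a minimal sketch of the standard greedy transducer decoding loop. It is not the NemotronSpeech implementation: `predict`, `joint`, `blank_id`, and `max_symbols_per_frame` are illustrative names, and the toy stand-ins below replace the real ONNX sessions, whose tensor names and state handling are not described here.

```python
def greedy_transducer_decode(encoder_frames, predict, joint,
                             blank_id=0, max_symbols_per_frame=5):
    """Greedy transducer decoding over already-encoded audio frames.

    predict(tokens)         -> decoder output for the token history so far
    joint(frame, dec_state) -> list of scores, one per vocabulary entry
    """
    tokens = []
    for frame in encoder_frames:
        # Emit symbols for this frame until the joint network predicts blank
        # (or the per-frame cap is reached), then move to the next frame.
        for _ in range(max_symbols_per_frame):
            dec_state = predict(tokens)
            scores = joint(frame, dec_state)
            best = max(range(len(scores)), key=scores.__getitem__)
            if best == blank_id:
                break
            tokens.append(best)
    return tokens


# Toy stand-ins (assumptions for illustration only).
def toy_predict(tokens):
    # The "decoder state" is simply the last emitted token, or None.
    return tokens[-1] if tokens else None


def toy_joint(frame, last_token):
    # Each frame carries a target token id; emit it once, then prefer blank (0).
    scores = [0.0] * 4
    target = 0 if (frame == 0 or last_token == frame) else frame
    scores[target] = 1.0
    return scores


if __name__ == "__main__":
    print(greedy_transducer_decode([1, 0, 2, 2, 3], toy_predict, toy_joint))
```

In a real pipeline the frames would come from the encoder's ONNX output, and the VAD would decide which audio segments reach the encoder at all.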

Usage with NemotronSpeech (C#)

git clone https://github.com/DimQ1/nemotron-speech-csharp
cd nemotron-speech-csharp
dotnet run -c Release -- models-onnx/gpu-cuda --mic --language auto

Conversion

Converted from the NVIDIA NeMo .nemo checkpoint using Olive. Original model: nemotron-3.5-asr-streaming-0.6b.nemo

License

See the original NVIDIA Nemotron license.
