Parakeet Multitalker (int8 ONNX) - Recogment mirror

Streaming joint ASR + 4-slot speaker diarization. This repo packages int8-quantized ONNX exports of the upstream NVIDIA NeMo checkpoint so that runtimes which don't depend on PyTorch / NeMo can use the model directly.

โš ๏ธ Unofficial. This is a community-maintained mirror of derived artefacts. For the canonical model, citation, and training details see nvidia/multitalker-parakeet-streaming-0.6b-v1.

Files

File                      Purpose                               Size
encoder.int8.onnx         FastConformer encoder                 ~626 MB
decoder_joint.int8.onnx   RNN-T decoder + joint network         ~9 MB
tokenizer.model           SentencePiece tokenizer (1024 vocab)  ~245 KB
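
As a quick sanity check after downloading, the sketch below confirms that the three files listed above are present in a local directory and, optionally, prints the input and output tensors of one of the ONNX files. It is a minimal illustration, not the parakeet-rs runtime: the helper names `missing_files` and `describe_onnx_io` are hypothetical, and the second helper assumes the third-party `onnxruntime` package is installed. The tensor names are printed rather than assumed, because this README does not document them.

```python
from pathlib import Path

# File names as listed in the Files section of this README.
EXPECTED_FILES = ("encoder.int8.onnx", "decoder_joint.int8.onnx", "tokenizer.model")


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def describe_onnx_io(model_path):
    """Print the input and output tensors of an ONNX file.

    Assumption: the onnxruntime package is installed. It is imported lazily
    so that missing_files() works without it.
    """
    import onnxruntime as ort

    session = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])
    for arg in session.get_inputs():
        print("input ", arg.name, arg.shape, arg.type)
    for arg in session.get_outputs():
        print("output", arg.name, arg.shape, arg.type)
```

For example, `missing_files("./models")` returns an empty list once all three files are in place, after which `describe_onnx_io("./models/decoder_joint.int8.onnx")` can be used to inspect the smaller of the two graphs.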

Provenance

These ONNX files were exported from the upstream NeMo .nemo checkpoint of nvidia/multitalker-parakeet-streaming-0.6b-v1 and then quantized to int8 for CPU inference. The conversion pipeline lives in the parakeet-rs project's tooling, and the same crate provides the runtime that loads the resulting files.

No model weights have been retrained, fine-tuned, or otherwise modified beyond the int8 quantization step.

Why this mirror exists

The Recogment daemon ships with a pinned set of model SHA-256 hashes and runs in a zero-egress sandbox; the only outbound network access in the whole product comes from a companion downloader binary that fetches from public HTTPS sources. Before this mirror existed, the Parakeet files had to be shipped to beta testers out-of-band as a tarball. With the mirror in place, the downloader can fetch them automatically alongside the rest of the public catalogue.
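
The pinned-hash check described above can be sketched with the Python standard library alone. This is an illustration of the general technique, not Recogment's implementation: the function names and the `pinned` mapping are hypothetical, and the real hash values are not published in this README, so the caller has to supply them.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a large encoder file is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(model_dir, pinned):
    """Return the names whose file is missing or whose SHA-256 differs from the pin.

    `pinned` maps file name -> expected hex digest (hypothetical structure).
    """
    failures = []
    for name, expected in pinned.items():
        path = Path(model_dir) / name
        if not path.is_file():
            failures.append(name)
        elif sha256_of_file(path) != expected.lower():
            failures.append(name)
    return failures
```

An empty list from `verify_pinned` means every downloaded file matches its pin; any name in the result should be treated as untrusted and fetched again.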

If you're not building Recogment, you almost certainly want the official NVIDIA repo linked above instead; it includes the original .nemo checkpoint, configuration, and reference NeMo inference code.

License

This work is licensed under the NVIDIA Open Model License. See NOTICE.txt in this repo for the required attribution notice, or read the full license text at the URL in the frontmatter above.

Permitted under that license: commercial use, redistribution (this repo), and derivative works (the int8 quantization). Required: include the NVIDIA Open Model License notice when you redistribute further.

Citation

Please cite the upstream model paper, not this mirror:

@misc{nvidia2024multitalkerparakeet,
  title={Multitalker Parakeet streaming 0.6B v1},
  author={NVIDIA},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/nvidia/multitalker-parakeet-streaming-0.6b-v1}}
}