pyannote Segmentation 3.0 · OpenASR
pyannote segmentation-3.0: speaker-change and overlap-aware speech segmentation for OpenASR diarization
Speaker-diarization support pack for the OpenASR runtime: pure-Rust inference, no Python at inference time.
Highlights
- Speaker-change aware segmentation: PyanNet (SincNet + BiLSTM) with a powerset head that tracks up to 3 speakers per window, including overlapped speech (up to 2 speakers at once, as implied by the 7-class powerset)
- Quality upgrade for --diarize: installed alongside the CAM++ embedder pack, it replaces coarse VAD slices with fine speaker-turn boundaries
- Diarization, not identification: anonymous session-relative labels; nothing leaves the machine
- Bit-exact packaging: single raw-f32 build whose weights are stored bit-identically to the source; the pure-Rust forward pass matches the upstream ONNX logits to within a max abs error of ~7e-5
- Native in OpenASR: .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU
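
To make the powerset head concrete: a 7-class powerset over three speakers corresponds to the empty set, three single speakers, and three speaker pairs (1 + 3 + 3 = 7). The sketch below enumerates those classes and decodes one frame of scores into per-speaker activity flags. It is an illustration in Python, not the OpenASR Rust code; the class ordering (subsets sorted by size) and the helper names `powerset_classes` and `decode_frame` are assumptions made for this example.

```python
from itertools import combinations

NUM_SPEAKERS = 3  # speakers tracked per window
MAX_OVERLAP = 2   # assumption: at most two speakers active at once (gives 7 classes)

def powerset_classes(num_speakers=NUM_SPEAKERS, max_overlap=MAX_OVERLAP):
    """Enumerate speaker subsets ordered by size: (), (0,), (1,), (2,), (0, 1), ..."""
    classes = []
    for size in range(max_overlap + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes

def decode_frame(scores, classes):
    """Pick the highest-scoring class and return a 0/1 activity flag per speaker."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    active = set(classes[best])
    return [1 if spk in active else 0 for spk in range(NUM_SPEAKERS)]

classes = powerset_classes()
print(len(classes))  # 7
# Hypothetical scores for one frame; index 5 is the pair (0, 2) under the assumed ordering.
print(decode_frame([0.1, 0.2, 0.0, 0.1, 0.3, 2.5, 0.4], classes))  # [1, 0, 1]
```

Decoding to per-speaker flags is what lets overlapped speech show up as two speakers active in the same frame, rather than forcing a single label per frame.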
Quickstart
# 1. Install the OpenASR CLI · https://openasr.org
# 2. Pull the pack
openasr pull pyannote-segmentation-3.0:f32
# 3. Diarize any transcription (works with every OpenASR ASR model)
openasr transcribe meeting.wav --model xasr-zh-en --diarize --format srt
Pack
| Quant | File (.oasr) | Size |
|---|---|---|
| f32 | pyannote-segmentation-3.0-f32.oasr | 6 MB |
Single raw-f32 build: the pure-Rust forward pass consumes f32 directly and the parity gates assert bit-exact outputs vs the upstream weights, so no integer quantization is produced.
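
Comparing computed logits between two runtimes is usually a tolerance check rather than a bit-level one, which is how a figure such as "max abs error ~7e-5" arises. The following is a minimal sketch of such a check, not the actual OpenASR parity gate; the function name `max_abs_error`, the sample values, and the `TOLERANCE` threshold are all assumptions for illustration.

```python
def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("length mismatch")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)

TOLERANCE = 1e-4  # hypothetical threshold, not taken from the OpenASR test suite

# Made-up logits: the candidate differs from the reference by about 5e-5 in one entry.
reference = [0.25, -1.5, 3.0]
candidate = [0.25, -1.5 + 5e-5, 3.0]

err = max_abs_error(reference, candidate)
print(err <= TOLERANCE)  # True
```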
About pyannote Segmentation 3.0
pyannote segmentation-3.0 is the local speech-segmentation model from the pyannote speaker
diarization toolkit: a PyanNet (SincNet front-end + bidirectional LSTM) classifier over a 7-class
powerset that labels each frame of a 10 s window with which of up to three speakers are active, including
overlapped speech (at most two speakers at once). OpenASR uses it as the optional segmentation stage of its model-agnostic
diarization pipeline: when this pack is installed, --diarize splits speech at speaker changes
instead of relying on coarse VAD slices, then the CAM++ embedder pack clusters the segments into
anonymous speaker turns. Weights are extracted from the un-gated, MIT-licensed
onnx-community ONNX mirror at a pinned revision and repackaged as a raw-f32 .oasr pack that
runs in pure Rust, with no Python at inference time.
How this pack was made
Converted from onnx-community/pyannote-segmentation-3.0 with the OpenASR importer:
openasr model-pack import-pyannote-local <src>.safetensors <out>.oasr \
--package-id pyannote-segmentation-3.0
The .oasr container is GGUF-backed; every tensor is stored as raw f32 so the
pack round-trips bit-identically against the source weights.
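
A bit-identical round trip means the stored bytes, not just the approximate values, are unchanged. The sketch below shows one way to express that check for raw f32 data using Python's `struct` module. It assumes little-endian IEEE-754 f32 and does not read the actual .oasr or GGUF container; the helper names are hypothetical.

```python
import struct

def f32_to_bytes(values):
    """Serialize floats as little-endian IEEE-754 f32 (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)

def bytes_to_f32(blob):
    """Deserialize little-endian f32 bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

# Made-up "source weights" serialized as raw f32.
source_blob = f32_to_bytes([0.1, -2.5, 1e-8])

# Round trip: decode, re-encode, and compare the raw bytes.
restored_blob = f32_to_bytes(bytes_to_f32(source_blob))
print(restored_blob == source_blob)  # True

# A value that is merely close to the original is not bit-identical.
print(f32_to_bytes([0.1]) == f32_to_bytes([0.1 + 1e-7]))  # False
```

Comparing bytes avoids the pitfalls of float equality and makes the check independent of how values are printed.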
License
This pack inherits the upstream model's license: MIT (source). OpenASR packaging retains the upstream copyright; the only modification is format conversion.
Acknowledgements
This pack is a redistribution of pyannote segmentation-3.0, created by Hervé Bredin and the pyannote.audio project, via the un-gated ONNX mirror (onnx-community/pyannote-segmentation-3.0). All credit for the architecture, training, and weights belongs to the upstream authors; the license is inherited from and identical to the upstream model (MIT).
Links
- OpenASR: https://github.com/QuintinShaw/OpenASR
- Website: https://openasr.org
- Upstream model: onnx-community/pyannote-segmentation-3.0