voiceclonnx — pure-ONNX voice conversion
ONNX exports powering the vconnx voice-conversion library: one repo per engine, with parity reports and provenance.
ONNX export of LSCodec (Guo et al., Interspeech 2025) — a low-bitrate, speaker-decoupled discrete speech codec — packaged for voiceclonnx.
- lscodec_encoder.onnx: raw 16 kHz audio → 64-d content tokens (50 Hz).
- codebook.npy (300 entries): speaker-agnostic content.
- wavlm_l6.onnx: reference clip → WavLM-Large layer-6 prompt (exported at a fixed 4 s window; references are padded/cropped to 64000 samples).
- lscodec_vocoder.onnx: content + prompt → 24 kHz waveform (CTXVEC2WAV).
- *_q8.onnx: INT8 variants (fp32 recommended for quality).
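The fixed 4 s reference window (64000 samples at 16 kHz) can be illustrated with a small NumPy sketch. This is not the library's own preprocessing: the helper name `pad_or_crop` is hypothetical, and the choices to keep the start of a long clip and to append zeros to a short one are assumptions made for illustration. Mono input is assumed.

```python
import numpy as np

SAMPLE_RATE = 16_000      # input rate stated in the file list
WINDOW_SAMPLES = 64_000   # 4 s * 16 kHz, the fixed export window


def pad_or_crop(wav, target_len: int = WINDOW_SAMPLES) -> np.ndarray:
    """Return a 1-D float32 array of exactly target_len samples.

    Assumption: cropping keeps the start of the clip and padding appends
    zeros at the end; the library may make different choices.
    """
    wav = np.asarray(wav, dtype=np.float32).reshape(-1)
    if wav.shape[0] >= target_len:
        return wav[:target_len]
    return np.pad(wav, (0, target_len - wav.shape[0]))


short = pad_or_crop(np.ones(16_000))  # 1 s clip, zero-padded to 4 s
long = pad_or_crop(np.ones(96_000))   # 6 s clip, cropped to 4 s
```

Under these assumptions, a reference clip shorter than 4 s contributes trailing silence to the prompt, and anything beyond the first 4 s of a longer clip is ignored.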
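from voiceclonnx import VoiceCloner

# Select the LSCodec engine.
cloner = VoiceCloner(engine="lscodec")
# Convert source.wav using target_reference.wav as the voice reference; write the result to out.wav.
cloner.clone_voice("source.wav", "target_reference.wav", "out.wav")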
License: MIT. LSCodec code and weights are MIT-licensed (X-LANCE, cantabile-kwok/lscodec_50hz); WavLM-Large is MIT-licensed (Microsoft). These ONNX artifacts inherit the MIT license.