Piper TTS (en_US lessac medium) - GGUF
Native C++ GGUF conversion of the Piper VITS voice en_US-lessac-medium for use with CrispASR.
Files
| File | Size | Description |
|---|---|---|
| piper-en_US-lessac-medium-f16.gguf | 30 MB | F16 weights (full model) |
Piper models are small enough that quantization would save little in absolute terms, so F16 is the only format provided.
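As a rough illustration of the size argument, the sketch below estimates how large quantized variants of this file would be. It assumes the 30 MB file consists almost entirely of F16 weights (2 bytes each, metadata ignored) and uses the ggml Q8_0 and Q4_0 block layouts of 34 and 18 bytes per 32 weights. The function name is hypothetical and the results are estimates, not measurements.

```python
# Rough size estimate; assumes the whole file is F16 weights (metadata ignored).
F16_FILE_BYTES = 30 * 1000 * 1000  # 30 MB from the Files table, taken as decimal MB

# Bytes per weight. Q8_0 and Q4_0 pack 32 weights per block together with
# one F16 scale, giving 34 and 18 bytes per block respectively.
BYTES_PER_WEIGHT = {"F16": 2.0, "Q8_0": 34 / 32, "Q4_0": 18 / 32}


def estimate_sizes(f16_bytes):
    """Return (parameter count, {format: estimated size in MB})."""
    n_params = f16_bytes / BYTES_PER_WEIGHT["F16"]
    sizes = {fmt: n_params * bpw / 1e6 for fmt, bpw in BYTES_PER_WEIGHT.items()}
    return n_params, sizes


n_params, sizes = estimate_sizes(F16_FILE_BYTES)
print(f"~{n_params / 1e6:.0f}M parameters")
for fmt, mb in sizes.items():
    print(f"{fmt}: ~{mb:.1f} MB")
```

Under these assumptions the best case saves roughly 14 to 22 MB. This is likely an upper bound, since converters typically leave small tensors such as biases unquantized.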
Usage with CrispASR
./build/bin/crispasr --backend piper \
-m piper-en_US-lessac-medium-f16.gguf \
--tts "Hello, how are you today?" \
--tts-output hello.wav
Phonemization uses espeak-ng, which must be installed separately (on Debian or Ubuntu: apt install espeak-ng).
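The command above can also be driven from a script. The following is a minimal sketch that checks for espeak-ng before invoking the binary. It uses only the flags shown above; the helper names are hypothetical, and it assumes CrispASR finds espeak-ng through PATH, which this page does not state.

```python
import shlex
import shutil
import subprocess


def build_tts_command(model, text, out_wav, binary="./build/bin/crispasr"):
    """Assemble the CrispASR invocation using only the flags shown above."""
    return [binary, "--backend", "piper", "-m", model,
            "--tts", text, "--tts-output", out_wav]


def synthesize(model, text, out_wav):
    """Run CrispASR, failing early if the phonemizer is missing."""
    # Assumption: espeak-ng is discovered via PATH.
    if shutil.which("espeak-ng") is None:
        raise RuntimeError("espeak-ng not found on PATH (try: apt install espeak-ng)")
    subprocess.run(build_tts_command(model, text, out_wav), check=True)


# Show the command that synthesize() would execute, without running it.
print(shlex.join(build_tts_command(
    "piper-en_US-lessac-medium-f16.gguf", "Hello, how are you today?", "hello.wav")))
```

Calling synthesize() with the same three arguments runs the command. With check=True, a non-zero exit status raises subprocess.CalledProcessError.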
Architecture
- VITS (Conditional Variational Autoencoder with Adversarial Learning)
- Text encoder: 6-layer relative-position transformer (192-d, 2 heads)
- Duration predictor: Stochastic Duration Predictor with rational-quadratic spline flows
- Flow: 4 affine coupling blocks with WaveNet conditioning
- Decoder: HiFi-GAN (3 upsample stages, 9 MRF resblocks)
- Output: 22.05 kHz mono PCM
- License: MIT
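The output format listed above is easy to check on a generated file. The sketch below reads a WAV header with Python's standard wave module and compares it against 22.05 kHz mono. It assumes the file holds integer PCM, which is the only encoding the wave module reads; the function names are hypothetical.

```python
import wave

EXPECTED_RATE = 22050    # 22.05 kHz, from the list above
EXPECTED_CHANNELS = 1    # mono


def describe_wav(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def matches_piper_format(path):
    """True if the file has the sample rate and channel count listed above."""
    rate, channels, _ = describe_wav(path)
    return rate == EXPECTED_RATE and channels == EXPECTED_CHANNELS
```

For example, matches_piper_format("hello.wav") would be expected to return True for a file produced by the usage command, if the output is written as integer PCM.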
Conversion
python models/convert-piper-to-gguf.py \
--onnx en_US-lessac-medium.onnx \
--output piper-en_US-lessac-medium-f16.gguf
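After conversion, a quick sanity check is to read the fixed-size header of the resulting file. The sketch below parses the GGUF magic, version, tensor count, and metadata key count, assuming a little-endian file in GGUF version 2 or later. The function name is hypothetical and is not part of the conversion script.

```python
import struct


def read_gguf_header(path):
    """Parse the fixed-size GGUF header (little-endian, GGUF v2 or later)."""
    with open(path, "rb") as f:
        head = f.read(24)
    if len(head) < 24 or head[:4] != b"GGUF":
        raise ValueError(f"{path} is not a GGUF file")
    # Layout after the magic: uint32 version, uint64 tensor count,
    # uint64 metadata key/value count.
    version, n_tensors, n_kv = struct.unpack("<IQQ", head[4:24])
    return {"version": version, "tensors": n_tensors, "metadata_keys": n_kv}
```

Calling read_gguf_header("piper-en_US-lessac-medium-f16.gguf") on the converted file should return a small dictionary. A ValueError indicates that the conversion did not produce a valid GGUF file.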
Base model: rhasspy/piper-voices