ZONOS2-FP8

This repository provides a mixed FP8 Safetensors conversion of the original Zyphra/ZONOS2 model for use with the ZONOS2 TTS ComfyUI custom node.

The model was converted from the original PyTorch checkpoint format to .safetensors and quantized using a conservative mixed-precision policy. Only selected MoE expert projection weights were converted to FP8 E4M3, while the precision-sensitive parts of the model were kept in BF16 for stability and output quality.

Original Project

ZONOS2 is a text-to-speech model from Zyphra trained on more than 6 million hours of varied multilingual speech. It supports expressive speech generation and high-fidelity voice cloning.

ComfyUI Custom Node

This model package is intended for use with the ZONOS2 TTS ComfyUI custom node (https://github.com/Saganaki22/Zonos2_TTS-ComfyUI).

The ComfyUI node provides native ZONOS2 text-to-speech, audio-only voice cloning, mixed FP8 loading, BF16 compute support, SDPA/FlashAttention inference, progress reporting, and ComfyUI/AIMDO model-management integration.

Model File

Main model file:

  • zonos2-fp8-mixed.safetensors

Direct download:

Model Storage Location

Place the model and required assets under:

ComfyUI/
└── models/
    └── zonos2/
        ├── zonos2-fp8-mixed.safetensors
        ├── dac_44khz/
        └── speaker_encoder/

Expected layout:

ComfyUI/models/zonos2/
├── zonos2-fp8-mixed.safetensors
├── dac_44khz/
│   ├── config.json
│   ├── model.safetensors
│   └── preprocessor_config.json
└── speaker_encoder/
    ├── config.json
    ├── model.safetensors
    └── preprocessor_config.json

If download_if_missing is enabled in the ComfyUI node, missing assets can be downloaded automatically.
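If automatic download is disabled, the layout can be checked by hand before starting ComfyUI. The following is a minimal sketch: the file list is copied from the expected layout above, while the function name missing_assets and the default path are illustrative assumptions and not part of the custom node.

```python
from pathlib import Path

# Relative paths copied from the expected layout in this README.
EXPECTED_FILES = [
    "zonos2-fp8-mixed.safetensors",
    "dac_44khz/config.json",
    "dac_44khz/model.safetensors",
    "dac_44khz/preprocessor_config.json",
    "speaker_encoder/config.json",
    "speaker_encoder/model.safetensors",
    "speaker_encoder/preprocessor_config.json",
]


def missing_assets(model_dir):
    """Return the expected files that are not present under model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumed location; adjust to wherever ComfyUI is installed.
    missing = missing_assets("ComfyUI/models/zonos2")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected files are present.")
```

An empty result means every file named in the layout exists. The script only checks for presence; it does not validate file contents.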

Usage

Install the ComfyUI custom node:

cd ComfyUI/custom_nodes
git clone https://github.com/Saganaki22/Zonos2_TTS-ComfyUI.git

Then restart ComfyUI and load the ZONOS2 FP8 Mixed model from the node loader.

Recommended dtype settings for this checkpoint:

  • dtype: auto
  • dtype: bf16

The mixed FP8 checkpoint does not use the fp16 runtime option.

Quantization Details

This checkpoint was quantized as a mixed FP8/BF16 model.

The quantization policy is deliberately conservative:

  • Converted to FP8 E4M3

    • MoE expert gate/up projection weights
    • Specifically the expert w13 projections
  • Left in BF16

    • Attention layers
    • Dense feed-forward layers
    • Expert down projections (w2)
    • LM head
    • Routers
    • Token embeddings
    • Speaker embeddings and speaker projections
    • Normalization layers
    • Biases
    • Temperatures
    • Other precision-sensitive paths

In short, the large MoE expert gate/up weights were quantized to FP8 E4M3, while the parts most likely to affect stability, routing, speaker identity, and generation quality were kept in BF16.
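The policy above can be pictured as a per-tensor decision based on the tensor name. The sketch below is illustrative only: the key names (for example "layers.0.moe.experts.w13.weight") and the function storage_dtype are hypothetical, since the checkpoint's real key layout and the actual conversion script are not described here. Only the rule itself (expert w13 weights to FP8 E4M3, everything else in BF16) comes from this README.

```python
def storage_dtype(tensor_name):
    """Pick a storage dtype for one tensor under the conservative policy.

    Assumes (hypothetically) that expert gate/up weights contain both
    "experts" and "w13" in their key and that biases end in ".bias".
    """
    is_expert_gate_up = "experts" in tensor_name and "w13" in tensor_name
    is_bias = tensor_name.endswith(".bias")
    if is_expert_gate_up and not is_bias:
        return "F8_E4M3"
    # Attention, dense FFN, expert w2, routers, embeddings, norms, biases, etc.
    return "BF16"


def plan_conversion(tensor_names):
    """Group tensor names by the dtype they would be stored in."""
    plan = {"F8_E4M3": [], "BF16": []}
    for name in tensor_names:
        plan[storage_dtype(name)].append(name)
    return plan


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    example = [
        "layers.0.moe.experts.w13.weight",
        "layers.0.moe.experts.w2.weight",
        "layers.0.moe.router.weight",
        "layers.0.attn.qkv.weight",
    ]
    for dtype, names in plan_conversion(example).items():
        print(dtype, names)
```

Defaulting to BF16 and listing only what gets quantized keeps the rule conservative: any tensor the pattern does not recognize stays in higher precision.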

This reduces the main checkpoint from approximately 14.28 GiB for the BF16 version to approximately 9.78 GiB for the mixed FP8 version.
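As a rough worked example of what those two sizes imply: the difference is about 4.50 GiB, or roughly 31.5% of the BF16 file. Because a weight shrinks from 2 bytes (BF16) to 1 byte (FP8), each quantized weight saves about one byte, which suggests on the order of 4.8 billion weights were converted. This estimate ignores any scale tensors or metadata stored alongside the FP8 weights, so treat it as an approximation rather than an exact count. The function name size_savings is illustrative.

```python
GIB = 2**30  # bytes per GiB


def size_savings(bf16_gib, mixed_gib):
    """Return (saved GiB, saved fraction, approx. number of quantized weights)."""
    saved_gib = bf16_gib - mixed_gib
    saved_fraction = saved_gib / bf16_gib
    # BF16 = 2 bytes, FP8 = 1 byte, so each quantized weight saves ~1 byte.
    # Assumption: overhead from scales/metadata is negligible.
    approx_quantized_weights = saved_gib * GIB
    return saved_gib, saved_fraction, approx_quantized_weights


if __name__ == "__main__":
    saved, fraction, weights = size_savings(14.28, 9.78)
    print(f"Saved: {saved:.2f} GiB ({fraction:.1%})")
    print(f"Approx. quantized weights: {weights / 1e9:.2f} billion")
```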

The mixed FP8 checkpoint is primarily a memory-saving option. It is not guaranteed to generate faster than BF16 on every GPU or ComfyUI setup.
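To see how the stored bytes split between FP8 and BF16, the Safetensors header can be read without loading any tensor data. A .safetensors file begins with an 8-byte little-endian header length, followed by a JSON header that records each tensor's dtype, shape, and data offsets. The sketch below uses only the Python standard library; the function names and the default path are illustrative assumptions.

```python
import json
import struct
from collections import defaultdict
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file (tensor entries only)."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional metadata entry, not a tensor
    return header


def bytes_per_dtype(header):
    """Sum stored bytes per dtype from each tensor's data_offsets."""
    totals = defaultdict(int)
    for info in header.values():
        start, end = info["data_offsets"]
        totals[info["dtype"]] += end - start
    return dict(totals)


if __name__ == "__main__":
    # Assumed location; adjust to your ComfyUI install.
    model_path = Path("ComfyUI/models/zonos2/zonos2-fp8-mixed.safetensors")
    if model_path.is_file():
        totals = bytes_per_dtype(read_safetensors_header(model_path))
        for dtype, size in sorted(totals.items()):
            print(f"{dtype}: {size / 2**30:.2f} GiB")
    else:
        print(f"Not found: {model_path}")
```

In the Safetensors format, FP8 E4M3 tensors are labeled F8_E4M3 and BF16 tensors are labeled BF16, so a mixed checkpoint should report bytes under both labels.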

Notes

  • This repository is a mixed FP8 Safetensors package of the original ZONOS2 model.
  • The model architecture and original weights come from Zyphra/ZONOS2.
  • This package is provided for ComfyUI compatibility and convenience.
  • Mixed FP8 support requires the current ZONOS2 TTS ComfyUI custom node.
  • Voice cloning should only be used with voices you own or have explicit permission to use.

License

The original ZONOS2 model is released under the Apache License 2.0.

This converted mixed FP8 Safetensors package follows the same model license.

Responsible Use

Do not use this model for malicious impersonation, fraud, deception, harassment, non-consensual voice cloning, or any use intended to cause harm.

Only clone voices you own or have explicit permission to use.

Citation

If you find this model useful in an academic context, please cite the original ZONOS2 work:

@misc{zyphra2025zonos,
  title     = {Zonos V2 Technical Report},
  author    = {Gabriel Clark and Sofian Mejjoute and Mohamed Osman and George Close and Beren Millidge},
  year      = {2026},
}

Credits
