This is the Z-Image-Turbo text encoder quantized to NVFP4, with working inference scripts that emulate the OpenAI image generation endpoint included in extras. The scripts use vLLM for the actual encoder inference. In combination with nunchaku-ai/nunchaku-z-image-turbo, the model fits comfortably on a 16GB NVIDIA GeForce RTX 5060 Ti GPU, allowing very fast image generation with no sequential offload needed. I also uncensored the encoder with Heretic to hopefully improve prompt adherence.
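
If the scripts in extras expose an OpenAI-style image generation endpoint, a client call might look roughly like the sketch below. It uses only the Python standard library and follows the request and response shape of OpenAI's `images/generations` API. The base URL, port, and model name are assumptions for illustration and are not taken from the scripts themselves, so adjust them to match how the server is actually launched.

```python
import base64
import json
import urllib.request

# Hypothetical local address (assumption): the real host, port, and path
# depend on how the scripts in extras are started.
BASE_URL = "http://localhost:8000/v1"


def build_image_request(prompt, size="1024x1024", n=1, base_url=BASE_URL):
    """Build a POST request shaped like OpenAI's images/generations call."""
    payload = {
        "model": "z-image-turbo",  # placeholder model name (assumption)
        "prompt": prompt,
        "n": n,
        "size": size,
        "response_format": "b64_json",  # ask for base64-encoded image data
    }
    return urllib.request.Request(
        url=f"{base_url}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_body):
    """Return raw image bytes for each entry of an OpenAI-style response."""
    parsed = json.loads(response_body)
    return [base64.b64decode(item["b64_json"]) for item in parsed["data"]]


def generate(prompt, **kwargs):
    """Send the request and decode the images (needs a running server)."""
    request = build_image_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return decode_images(response.read())
```

With a server running, `generate("a lighthouse at dusk")[0]` would return the bytes of the first image, which can then be written to a file opened in binary mode.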

Safetensors: model size 2B params; tensor types F32, BF16, F8_E4M3, U8

Model: catplusplus/Z-Image-Turbo-Text-Encoder-Heretic-NVFP4 (quantized)