Qwen-Image-Bench: From Generation to Creation in Text-to-Image Evaluation
NVFP4 weight quantization of Qwen/Qwen-Image-Bench (Q-Judger) — a vision-language judge model fine-tuned by the Qwen team for automated evaluation of text-to-image generation quality.
llm-compressor (vLLM project): see recipe.yaml in the repo for the exact QuantizationModifier config.

default_stage:
  default_modifiers:
    QuantizationModifier:
      targets: [Linear]
      ignore: ['re:.*lm_head', 're:visual.*', 're:model.visual.*', 're:.*embed_tokens$']
      scheme: NVFP4
      bypass_divisibility_checks: false
Quantized on a RunPod node. The full recipe.yaml is included for reproducibility.
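To make the effect of the ignore list in the recipe easier to see, here is a minimal Python sketch that checks which module names would be skipped. It assumes that entries prefixed with `re:` are regular expressions matched from the start of the module name (`re.match` semantics) and that other entries are compared literally; this approximates the matching and is not llm-compressor's own code. The helper `is_ignored` and the example module names are hypothetical and were not read from the checkpoint.

```python
import re

# Ignore list copied from the recipe above.
IGNORE = ['re:.*lm_head', 're:visual.*', 're:model.visual.*', 're:.*embed_tokens$']


def is_ignored(module_name: str, ignore=IGNORE) -> bool:
    """Return True if any ignore entry matches the module name.

    Assumption: 're:' entries are regexes anchored at the start of the name
    (re.match semantics); any other entry is compared literally.
    """
    for entry in ignore:
        if entry.startswith("re:"):
            if re.match(entry[3:], module_name):
                return True
        elif entry == module_name:
            return True
    return False


# Hypothetical module names, for illustration only.
EXAMPLES = [
    "lm_head",
    "model.visual.blocks.0.attn.qkv",
    "model.language_model.embed_tokens",
    "model.language_model.layers.0.mlp.down_proj",
]

if __name__ == "__main__":
    for name in EXAMPLES:
        status = "left unquantized" if is_ignored(name) else "quantized to NVFP4"
        print(f"{name}: {status}")
```

Under these assumptions, the vision tower, the output head, and the token embeddings are left unquantized, and only the language model's Linear layers are quantized.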
Drop-in replacement for the BF16 base model on vLLM with NVFP4 support. Tested as the judge step of an image-generation evaluation pipeline (5-dim verdicts: overall_quality, prompt_match, aesthetic_appeal, lora_activation, confidence).
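A pipeline that consumes the five-dimension verdicts may want to validate them before using them. The sketch below assumes the judge's reply is a JSON object with one numeric value per dimension; the reply format, the value types, and the names `Verdict` and `parse_verdict` are assumptions made for illustration and are not taken from the upstream model or this repo.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Verdict:
    # The five dimensions named above; numeric values are an assumption.
    overall_quality: float
    prompt_match: float
    aesthetic_appeal: float
    lora_activation: float
    confidence: float


def parse_verdict(raw: str) -> Verdict:
    """Parse a JSON reply into a Verdict, rejecting missing or non-numeric fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    values = {}
    for field in fields(Verdict):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        value = data[field.name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric field: {field.name}")
        values[field.name] = float(value)
    return Verdict(**values)


if __name__ == "__main__":
    # Hypothetical reply, for illustration only.
    reply = (
        '{"overall_quality": 4, "prompt_match": 3.5, "aesthetic_appeal": 4, '
        '"lora_activation": 2, "confidence": 0.8}'
    )
    print(parse_verdict(reply))
```

Rejecting incomplete replies up front keeps a malformed judge output from being scored as if it were a low verdict.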
(dense + vision) weight loader path in Atlas's qwen35_dense.rs factory branch. See [TODO: link to filed issue] for details.

Apache 2.0, inherited from the upstream Qwen/Qwen-Image-Bench.
If you use this quant, please cite the original Qwen-Image-Bench paper:
@misc{li2026qwenimagebenchgenerationcreationtexttoimage,
title={Qwen-Image-Bench: From Generation to Creation in Text-to-Image Evaluation},
author={Niantong Li and Guangzheng Hu and Weixu Qiao and Ying Ba and Qichen Hong and Shijun Shen and Jinlin Wang and Fan Zhou and Jianye Kang and Xin Shang and Ziyi He and Wei Wang and Dalin Li and Jiahao Li and Jie Zhang and Kaiyuan Gao and Kun Yan and Lihan Jiang and Ningyuan Tang and Shengming Yin and Tianhe Wu and Xiao Xu and Xiaoyue Chen and Yuxiang Chen and Yan Shu and Yanran Zhang and Yilei Chen and Yixian Xu and Zekai Zhang and Zhendong Wang and Zihao Liu and Zikai Zhou and Hongzhu Shi and Yi Wang and Bing Zhao and Hu Wei and Lin Qu and Chenfei Wu},
year={2026},
eprint={2605.28091},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.28091},
}