nvidia/Gemma-4-31B-IT-NVFP4

Text Generation

Model Optimizer

This model wasn't trained with FP4 or NVFP4

#8 opened about 1 month ago by

1×H100 with vLLM 0.19.0 failed

#7 opened about 1 month ago by

Question about q_scale / KV cache scale fallback in vLLM for Gemma-4-31B-IT-NVFP4: expected accuracy impact?

#6 opened about 1 month ago by

Why not quantize the MATRICES of Wq, Wk, Wv, Wo?

#5 opened about 2 months ago by

This version is still too large for a single 5090 card

#4 opened about 2 months ago by

Why is this 4-bit version 32.7 GB in size?

#3 opened about 2 months ago by