GLM-5.2 NVFP4

Private NVFP4 quantization of zai-org/GLM-5.2.

Source checkpoint: zai-org/GLM-5.2 revision e32aaf0396e6987ee6dd2abb7f4d318b5f9b3cfe.

Quantization artifact: GLM-5.2-NVFP4-agentic-v2-b300-hotpath-max-4k-64-r4.

Quantization scope: the GLM-5.2 dense decode hot path, shared experts, and routed experts are quantized to NVFP4 with ModelOpt. The DSA indexer, router, lm_head, and MTP modules remain unquantized.
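
For readers unfamiliar with the format, the following is a minimal sketch of NVFP4-style block quantization, not the ModelOpt implementation. It assumes the commonly documented layout: 4-bit E2M1 values with one shared scale per block of 16 elements. As a simplification it keeps the block scale in full precision (a real NVFP4 checkpoint rounds it to FP8 E4M3 and adds a per-tensor FP32 scale) and uses plain nearest-value rounding. The function names are hypothetical.

```python
# E2M1 (FP4) representable magnitudes: 1 sign bit, 2 exponent bits, 1 mantissa bit.
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed NVFP4 block size: one scale per 16 values


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed E2M1 grid values."""
    amax = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest FP4 value (6.0).
    # Simplification: the scale stays a Python float instead of FP8 E4M3.
    scale = amax / E2M1_GRID[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        # Pick the closest representable magnitude (ties go to the smaller one).
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(-nearest if x < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate values from a block scale and its FP4 codes."""
    return [scale * c for c in codes]


def fake_quantize(values):
    """Quantize then dequantize a flat list, block by block."""
    out = []
    for start in range(0, len(values), BLOCK_SIZE):
        scale, codes = quantize_block(values[start:start + BLOCK_SIZE])
        out.extend(dequantize_block(scale, codes))
    return out
```

Values that already sit on the scaled grid survive the round trip unchanged; everything else snaps to the nearest representable value within its block, which is why outlier-sensitive modules are often left unquantized.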

Safetensors model size: 373B params. Tensor types: F32, BF16, F8_E4M3, U8.
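
The tensor types reported for this checkpoint (F8_E4M3 and U8 alongside F32 and BF16) are consistent with a packed NVFP4 layout, in which two 4-bit E2M1 values share one U8 byte and each block of 16 values carries an FP8 E4M3 scale. The sketch below decodes one packed byte and works out the resulting storage cost per weight. The nibble order (low nibble first) is an assumption and should be checked against the actual exporter; the helper names are hypothetical.

```python
# Magnitudes indexed by the low three bits of an E2M1 code.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_fp4(code):
    """Decode a 4-bit E2M1 code: bit 3 is the sign, bits 0-2 index the magnitude."""
    magnitude = E2M1_MAGNITUDES[code & 0b0111]
    return -magnitude if code & 0b1000 else magnitude


def unpack_byte(byte):
    """Split one U8 byte into two FP4 values.

    ASSUMPTION: the low nibble holds the first element, the high nibble the second.
    """
    return decode_fp4(byte & 0x0F), decode_fp4(byte >> 4)


def bits_per_weight(block_size=16, value_bits=4, scale_bits=8):
    """Storage per weight: the FP4 value plus its share of the block scale.

    The per-tensor FP32 scale is ignored because it is amortized over the whole tensor.
    """
    return value_bits + scale_bits / block_size
```

For example, `unpack_byte(0x72)` yields `(1.0, 6.0)` under the assumed nibble order, and `bits_per_weight()` gives 4.5 bits for the quantized tensors.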

Model tree for harmya-modal/glm-5.2-nvfp4-all: base model zai-org/GLM-5.2; this model is one of 63 quantized variants listed.