atlas-nvfp4-dense-gemm

Dense NVFP4/FP8 GEMM kernels for Qwen3.6-27B (dense attention projections + FFN) on NVIDIA GB10 (DGX Spark, SM121).
```python
from kernels import get_kernel

# Load the kernel from the Hugging Face Hub
dg = get_kernel("Atlas-Inference/nvfp4-dense-gemm", trust_remote_code=True)
dg.w4a16_gemm_t(A, B_packed, B_scale, scale2, C)
```
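
For checking results off-device, a NumPy reference of what a W4A16 GEMM with NVFP4 weights computes may help. This is a minimal sketch under assumptions, not the kernel's documented contract: it assumes `B_packed` holds two E2M1 codes per byte (low nibble first) with shape `[N, K/2]`, `B_scale` holds one already-decoded scale per 16-element block with shape `[N, K/16]`, `scale2` is a scalar per-tensor scale, and the `_t` suffix means the output is `A @ W.T`. The helper names `dequant_nvfp4` and `w4a16_gemm_t_reference` are hypothetical; CARD.md is the authority on the real layouts.

```python
import numpy as np

# FP4 E2M1 magnitudes for codes 0..7; bit 3 of the 4-bit code is the sign.
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)
FP4_LUT = np.concatenate([E2M1_VALUES, -E2M1_VALUES])


def dequant_nvfp4(packed, block_scale, global_scale, block=16):
    """Dequantize packed FP4 weights to float32.

    packed:       uint8   [N, K // 2]      (assumed: low nibble = even index)
    block_scale:  float32 [N, K // block]  (assumed: already decoded from FP8)
    global_scale: float                    (assumed: per-tensor scalar)
    """
    lo = packed & 0x0F
    hi = packed >> 4
    # Interleave nibbles back into element order: lo0, hi0, lo1, hi1, ...
    codes = np.stack([lo, hi], axis=-1).reshape(packed.shape[0], -1)
    values = FP4_LUT[codes]
    # Broadcast each block scale over its `block` consecutive elements along K.
    scales = np.repeat(block_scale.astype(np.float32), block, axis=1)
    return values * scales * np.float32(global_scale)


def w4a16_gemm_t_reference(A, B_packed, B_scale, scale2):
    """Reference result C = A @ W.T with W dequantized from NVFP4 (assumed semantics)."""
    W = dequant_nvfp4(B_packed, B_scale, scale2)
    return A.astype(np.float32) @ W.T
```

Comparing the kernel's output against such a reference with a loose tolerance is one way to catch layout mismatches early, since a wrong nibble order or scale axis produces large errors rather than rounding-level ones.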

See CARD.md. GB10 only (sm_121f). AGPL-3.0.

Source: https://github.com/Avarok-Cybersecurity/atlas-kernels
