First attempt at quantization.

Many existing quantizations use the default calibration datasets, ultrachat or CNN/DailyMail, but these are generic datasets.

My idea was to calibrate with 512 samples at a sequence length of 4096, but drawn from nex-agi/agent-sft, the dataset the original nex-n2 model was actually trained on.
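
As a rough illustration of the sampling described above, here is a minimal sketch of picking 512 records and truncating each to 4096 tokens. It is not the script used for this model (see the repository linked below for that). The `build_calibration_set` helper, the whitespace tokenizer, and the toy corpus are all hypothetical stand-ins; a real run would load nex-agi/agent-sft and use the model's own tokenizer.

```python
import random

NUM_CALIBRATION_SAMPLES = 512  # sample count mentioned above
MAX_SEQUENCE_LENGTH = 4096     # sequence length mentioned above


def build_calibration_set(records, tokenize,
                          num_samples=NUM_CALIBRATION_SAMPLES,
                          max_len=MAX_SEQUENCE_LENGTH, seed=0):
    """Pick up to num_samples records at random and truncate each to max_len tokens."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    indices = list(range(len(records)))
    rng.shuffle(indices)
    chosen = [records[i] for i in indices[:num_samples]]
    return [tokenize(text)[:max_len] for text in chosen]


def whitespace_tokenize(text):
    # Hypothetical stand-in tokenizer, for illustration only.
    return text.split()


# Toy corpus standing in for the real dataset (assumption, not real data).
toy_records = [f"sample {i} " + "tok " * (i * 10) for i in range(1000)]
calibration_set = build_calibration_set(toy_records, whitespace_tokenize)
```

Fixing the seed keeps the chosen subset stable between runs, which makes it easier to compare quantization attempts against each other.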

See more at https://github.com/Apro123/quantize-nex-efforts

All credit goes to the Nex-AGI team, NVIDIA for ModelOpt, and the vLLM project for the llmcompressor tools.

Refer to the original model card for details on the underlying model.

Format: Safetensors. Model size: 20B params. Tensor types: F32, BF16, F8_E4M3, U8.