Apro123/nex-n2-mini-nvfp4

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

First attempt at quantization.

Many existing quantizations were calibrated on the default ultrachat or CNN/DailyMail datasets, which are generic.

My idea was to calibrate with 512 samples at a sequence length of 4096, drawn from nex-agi/agent-sft, the dataset the original Nex-N2 model was trained on.
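As a rough illustration of what "512 samples at 4096 seq length" means in practice, here is a minimal sketch in plain Python. It assumes the dataset has already been tokenized into lists of token IDs; the function name `build_calibration_set` and its parameters are hypothetical and are not taken from the quantization scripts in the linked repository.

```python
import random

NUM_SAMPLES = 512    # number of calibration samples mentioned above
MAX_SEQ_LEN = 4096   # sequence length mentioned above


def build_calibration_set(tokenized_examples, num_samples=NUM_SAMPLES,
                          max_seq_len=MAX_SEQ_LEN, seed=0):
    """Pick up to `num_samples` examples and truncate each to `max_seq_len` tokens.

    `tokenized_examples` is assumed to be an iterable of token-ID lists
    (hypothetical input; the real pipeline may load and tokenize differently).
    """
    examples = list(tokenized_examples)
    rng = random.Random(seed)      # seeded so the selection is reproducible
    rng.shuffle(examples)
    selected = examples[:num_samples]
    # Truncate long examples; shorter ones are kept as-is (no padding here).
    return [ids[:max_seq_len] for ids in selected]
```

The seeded shuffle keeps the selection reproducible between runs. Whether the actual run shuffled, padded, or packed sequences is not stated here, so treat those details as assumptions.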

See more at https://github.com/Apro123/quantize-nex-efforts

Credit goes to the Nex-AGI team for the original model, NVIDIA for ModelOpt, and the vLLM project for the llmcompressor tools.

Refer to the original model card for details on the underlying model.

Downloads last month: 28

Safetensors

Model size: 20B params

Tensor type: F32, BF16, F8_E4M3
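For readers unfamiliar with the format in the model name: NVFP4, as generally described, stores weights as 4-bit E2M1 values in small blocks (typically 16 elements), with each block sharing a scale factor. The sketch below shows that block-scaling idea in plain Python. It is a simplification under stated assumptions: the scale is kept as an ordinary float rather than being encoded in FP8 E4M3, there is no second per-tensor scale, and the function names are hypothetical. It is not the code used to produce this checkpoint.

```python
# Magnitudes representable by a 4-bit E2M1 float (the sign is a separate bit).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed block size for this sketch


def quantize_block(block):
    """Return (scale, codes) for one block of floats.

    The scale maps the block's largest magnitude onto 6.0, the largest
    E2M1 value. Simplification: the scale stays a Python float here.
    """
    amax = max(abs(x) for x in block)
    scale = amax / E2M1_GRID[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        # Round the scaled magnitude to the nearest representable value.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        codes.append(-mag if x < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate floats from a block's scale and codes."""
    return [scale * c for c in codes]
```

Because each block gets its own scale, a single large weight only coarsens the rounding of the other values in its own block, not the whole tensor.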


Model tree for Apro123/nex-n2-mini-nvfp4

Base model: nex-agi/Nex-N2-mini

Quantized (56): this model

