Nex-N2-mini-MLX-VLM-8bit

Native MLX-VLM 8-bit quantized version of Nex-N2-mini.

Summary

  • Base model: Nex-N2-mini
  • Format: native MLX / MLX-VLM
  • Quantization: 8-bit MLX-VLM quantization
  • Group size: 64
  • Vision: supported
  • MTP: not included
  • Target runtime: MLX-VLM / oMLX / Apple Silicon
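
As a rough illustration of what 8-bit quantization with group size 64 means for storage, the sketch below estimates the effective bits per quantized weight. It assumes each group of 64 weights also stores one 16-bit scale and one 16-bit bias, which is how MLX affine quantization is commonly described; the function name and the 16-bit defaults are assumptions for illustration, not values taken from this release.

```python
def effective_bits_per_weight(
    bits: int,
    group_size: int,
    scale_bits: int = 16,  # assumption: one 16-bit scale per group
    bias_bits: int = 16,   # assumption: one 16-bit bias per group
) -> float:
    """Estimate storage per quantized weight, including per-group overhead."""
    # Each weight costs `bits`; the scale and bias are shared by the whole group.
    return bits + (scale_bits + bias_bits) / group_size


# Values from the summary above: 8-bit quantization, group size 64.
bpw = effective_bits_per_weight(bits=8, group_size=64)
print(bpw)  # 8.5
```

Under these assumptions the per-group overhead is 0.5 bits per weight. Layers that are left unquantized are not covered by this estimate.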

This version is intended as a directly compatible release for MLX-VLM and oMLX on Apple Silicon.

Quick test

python3 -m mlx_vlm.generate \
  --model joowon-jang/Nex-N2-mini-MLX-VLM-8bit \
  --image /path/to/image.jpg \
  --prompt "Describe this image in one sentence." \
  --max-tokens 128 \
  --temp 0.0
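
If the quick test needs to be scripted, for example to run it over several images, the same command can be assembled in Python. This is a minimal sketch that only builds the argument list from the flags shown above; `build_generate_command` is a hypothetical helper, not part of mlx_vlm, and nothing is executed here.

```python
import shlex

MODEL_ID = "joowon-jang/Nex-N2-mini-MLX-VLM-8bit"


def build_generate_command(
    image_path: str,
    prompt: str,
    model: str = MODEL_ID,
    max_tokens: int = 128,
    temp: float = 0.0,
) -> list[str]:
    """Return the quick-test command as an argument list (hypothetical helper)."""
    # Only the flags that appear in the quick test above are used.
    return [
        "python3", "-m", "mlx_vlm.generate",
        "--model", model,
        "--image", str(image_path),
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
        "--temp", str(temp),
    ]


cmd = build_generate_command(
    "/path/to/image.jpg",
    "Describe this image in one sentence.",
)
# Print a shell-quoted version for inspection.
print(shlex.join(cmd))
```

On a machine with mlx-vlm installed, the list can be passed to `subprocess.run(cmd, check=True)`. Using an argument list rather than a single string avoids quoting problems when the prompt contains spaces.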

MTP variant

For oMLX native MTP speculative decoding, use the separate MTP variant; this release does not include MTP.

License

Apache-2.0, following the base model license.
