Instructions to use bkideas/LFM2.5-8B-A1B-MLX-nvfp4 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use bkideas/LFM2.5-8B-A1B-MLX-nvfp4 with MLX:
# Download the model from the Hub pip install huggingface_hub[hf_xet] huggingface-cli download --local-dir LFM2.5-8B-A1B-MLX-nvfp4 bkideas/LFM2.5-8B-A1B-MLX-nvfp4
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
This model is a quantized NVFP4 MLX variant of LiquidAI/LFM2.5‑8B‑A1B‑MLX‑bf16, created by LiquidAI. Original model licensed under the LiquidAI Model License.
NVFP4 MLX Quantization — Performance & Quality
This model is a 4‑bit NVFP4 MLX‑quantized variant of the original BF16 LFM2.5‑8B‑A1B model. NVFP4 is MLX’s optimized 4‑bit format designed for efficient inference on Apple Silicon GPUs.
Why NVFP4?
NVFP4 reduces memory usage by ~65% and increases generation speed by ~1.6–1.8× on M‑series chips, while preserving most of the model’s quality.
Performance Comparison (Representative MLX Benchmarks)
| Metric | BF16 | NVFP4 | Notes |
|---|---|---|---|
| Memory usage | ~15 GB | ~5 GB | Fits on 16 GB Macs |
| Token speed (M5 Max) | ~41 tok/s | ~72 tok/s | ~1.75× faster |
| Perplexity | 1.00× | 1.02–1.03× | ~2–3% degradation |
| Output quality | Baseline | ~95–98% identical | Minor reasoning loss |
Pros
- Much lower memory footprint
- Faster inference on macOS
- Lower power usage
- Ideal for laptops and smaller RAM configs
Cons
- Slight quality degradation (1–3%)
- Not suitable for fine‑tuning
- Slightly more drift in very long generations
Practical Impact
For chat, summarization, and coding, NVFP4 behaves almost identically to the BF16 model.
For math/logic‑heavy tasks, BF16 remains slightly more accurate.
- Downloads last month
- 124
Model size
1B params
Tensor type
BF16
·
U32 ·
F32 ·
Hardware compatibility
Log In to add your hardware
4-bit
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for bkideas/LFM2.5-8B-A1B-MLX-nvfp4
Base model
LiquidAI/LFM2.5-8B-A1B-Base Finetuned
LiquidAI/LFM2.5-8B-A1B Finetuned
LiquidAI/LFM2.5-8B-A1B-MLX-bf16