No longer available on HF due to storage restrictions: archived here

See MiniMax-M2.7 in action: demonstration videos

Tested on an M3 Ultra with 512 GiB of memory using the Inferencer app

Q6-INF uses the data-agnostic INF method, tuned to maximize general accuracy within a 192 GiB memory budget

Quantization (bpw)	Perplexity	Token Accuracy	Missed Divergence
Q4.5	1.27343	92.40%	24.73%
Q6-INF	1.20312	97.40%	13.92%
Q6.5	1.21093	96.85%	11.74%
Q9	1.20312	97.50%	9.95%
Base	1.20312	100.0%	0.000%

Perplexity: Measures the model's confidence in predicting the base model's tokens (lower is better)
Token Accuracy: The percentage of generated tokens that match the base model's tokens (higher is better)
Missed Divergence: Measures the severity of misses, i.e. by how much a missed token diverged from the base token (lower is better)
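
The metrics above can be illustrated with a minimal Python sketch. This is not the Inferencer app's implementation. Perplexity and token accuracy follow their standard definitions. The exact formula for Missed Divergence is not given here, so `missed_divergence` below is a hypothetical interpretation: the average gap between the probability of the token the quantized model chose and the probability it gave the base token, taken over missed positions only. All function names and the sample data are assumptions made for illustration.

```python
import math


def perplexity(base_token_probs):
    """exp of the mean negative log-probability that the quantized
    model assigns to each base token (standard definition)."""
    nll = -sum(math.log(p) for p in base_token_probs) / len(base_token_probs)
    return math.exp(nll)


def token_accuracy(predicted, base):
    """Percentage of positions where the quantized model's token
    equals the base token."""
    matches = sum(p == b for p, b in zip(predicted, base))
    return 100.0 * matches / len(base)


def missed_divergence(top_probs, base_token_probs, predicted, base):
    """HYPOTHETICAL formula (an assumption, not the author's definition):
    mean of (probability of chosen token - probability of base token)
    over missed positions, as a percentage. Returns 0.0 with no misses."""
    gaps = [
        top - base_p
        for top, base_p, p, b in zip(top_probs, base_token_probs, predicted, base)
        if p != b
    ]
    return 100.0 * sum(gaps) / len(gaps) if gaps else 0.0


# Made-up sample data: four positions, one miss at index 2.
base = [5, 9, 2, 7]                        # tokens produced by the base model
predicted = [5, 9, 3, 7]                   # tokens produced by the quantized model
base_token_probs = [0.9, 0.8, 0.3, 0.95]   # quantized model's probability of each base token
top_probs = [0.9, 0.8, 0.5, 0.95]          # quantized model's probability of its own choice

print(perplexity(base_token_probs))
print(token_accuracy(predicted, base))
print(missed_divergence(top_probs, base_token_probs, predicted, base))
```

Under this reading, a model that reproduces every base token scores 100% token accuracy and 0% missed divergence, which is consistent with the Base row in the table.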

Model: inferencerlabs/MiniMax-M2.7-MLX-Q6-INF (a quantized variant of the base model)