chaddy81/Qwen3.6-27b-coder-6bit-mlx

This is a 6-bit quantized MLX conversion of chaddy81/Qwen3.6-27b-coder, for fast local inference on Apple Silicon.

Details

  • Base model: chaddy81/Qwen3.6-27b-coder
  • Architecture: Qwen3_5ForConditionalGeneration (qwen3_5) — vision-language model (text + image + video)
  • Text backbone: 64 layers, hidden 5120, 24 attention heads / 4 KV heads (GQA), vocab 248320, context length 262144 (256K)
  • Vision tower: 27 layers, hidden 1152
  • Quantization: 6-bit, 6.501 bits-per-weight (MLX affine), group size 64
  • Files: 5 safetensors shards, ~20 GB
  • Converted with: mlx_lm 0.31.1, source dtype bfloat16
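
The reported bits-per-weight is higher than the nominal 6 bits because group-wise affine quantization stores extra parameters alongside each group of weights. The following is a rough sketch of that arithmetic, not the converter's own accounting. It assumes each group of 64 weights carries one 16-bit scale and one 16-bit bias (an assumption about the storage layout, not stated in this card) and uses a round 27e9 parameter count. The function names are illustrative.

```python
def effective_bits_per_weight(bits: int, group_size: int, overhead_bits: int = 32) -> float:
    """Nominal bits plus the per-group overhead amortized over the group.

    overhead_bits=32 assumes one 16-bit scale and one 16-bit bias per group.
    """
    return bits + overhead_bits / group_size


def approx_size_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given parameter count."""
    return num_params * bits_per_weight / 8 / 2**30


bpw = effective_bits_per_weight(bits=6, group_size=64)
size = approx_size_gib(num_params=27e9, bits_per_weight=bpw)  # 27e9 is a rounded count
print(f"{bpw} bits per weight, about {size:.1f} GiB")
```

Under these assumptions the estimate lands close to the ~20 GB listed above. The sketch does not account for the small remainder between 6.5 and the reported 6.501.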

Usage

Install the package:

pip install mlx-lm

Then, in Python:

from mlx_lm import load, generate

model, tokenizer = load("chaddy81/Qwen3.6-27b-coder-6bit-mlx")
prompt = "Write a Python function that returns the nth Fibonacci number."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
# verbose=True already streams the completion to stdout, so the return value is not printed again.
response = generate(model, tokenizer, prompt=text, max_tokens=512, verbose=True)

Or from the CLI:

mlx_lm.generate --model chaddy81/Qwen3.6-27b-coder-6bit-mlx --prompt "Explain async/await in Python."
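
For interactive use, tokens can be printed as they are produced rather than after the full completion. Below is a minimal sketch using mlx_lm's stream_generate. The helper names build_messages and stream_reply and the optional system prompt are illustrative and not part of this repo. The mlx_lm import is deferred into the function so the helpers can be imported on machines without MLX.

```python
from typing import Optional

MODEL_ID = "chaddy81/Qwen3.6-27b-coder-6bit-mlx"


def build_messages(prompt: str, system: Optional[str] = None) -> list:
    """Build a chat message list, optionally prefixed with a system prompt."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


def stream_reply(prompt: str, system: Optional[str] = None, max_tokens: int = 512) -> str:
    """Stream a completion to stdout and return the full text."""
    # Imported lazily: requires Apple Silicon with mlx-lm installed.
    from mlx_lm import load, stream_generate

    model, tokenizer = load(MODEL_ID)
    chat_prompt = tokenizer.apply_chat_template(
        build_messages(prompt, system), add_generation_prompt=True
    )
    pieces = []
    for response in stream_generate(model, tokenizer, chat_prompt, max_tokens=max_tokens):
        print(response.text, end="", flush=True)  # emit each new text segment
        pieces.append(response.text)
    print()
    return "".join(pieces)
```

Calling stream_reply("Explain async/await in Python.") downloads the weights on first use and then prints the reply incrementally.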

Quantization variants

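| Variant | Bits per weight | Size   | Repo                                |
|---------|-----------------|--------|-------------------------------------|
| 8-bit   | 8.5             | ~27 GB | chaddy81/Qwen3.6-27b-coder-8bit-mlx |
| 6-bit   | 6.5             | ~20 GB | chaddy81/Qwen3.6-27b-coder-6bit-mlx |
| 4-bit   | 4.5             | ~14 GB | chaddy81/Qwen3.6-27b-coder-4bit-mlx |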
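
One way to choose among these variants is to pick the largest one whose weights fit in the memory you can spare. The sketch below uses the sizes listed above. The 6 GB headroom for the KV cache, activations, and the rest of the system is an arbitrary illustrative assumption, not a measured requirement, and pick_variant is a hypothetical helper.

```python
from typing import Optional

# (variant, approximate weight size in GB, repo), largest first; sizes from the list above.
VARIANTS = [
    ("8-bit", 27.0, "chaddy81/Qwen3.6-27b-coder-8bit-mlx"),
    ("6-bit", 20.0, "chaddy81/Qwen3.6-27b-coder-6bit-mlx"),
    ("4-bit", 14.0, "chaddy81/Qwen3.6-27b-coder-4bit-mlx"),
]


def pick_variant(available_gb: float, headroom_gb: float = 6.0) -> Optional[str]:
    """Return the repo of the largest variant that fits, or None if none does.

    headroom_gb is an assumed allowance on top of the weights; tune it for
    your context length and workload.
    """
    for _name, size_gb, repo in VARIANTS:
        if size_gb + headroom_gb <= available_gb:
            return repo
    return None


print(pick_variant(32))
```

Long contexts grow the KV cache, so the weight size is only a lower bound on memory use.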

Notes

  • Conversion emitted a tokenizer regex warning referencing a Mistral discussion; this is a generic tokenizers notice and does not affect the Qwen3.5 architecture or quantized weights. If you observe unexpected tokenization, load the tokenizer with fix_mistral_regex=True.
  • License inherited from the base model; refer to chaddy81/Qwen3.6-27b-coder.
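
If tokenization does look wrong, the flag mentioned in the first note can be passed at load time. This is a hedged sketch: it assumes that mlx_lm's load forwards tokenizer_config entries to the underlying Hugging Face tokenizer loader and that the installed transformers version recognizes fix_mistral_regex. The helper names are illustrative.

```python
MODEL_ID = "chaddy81/Qwen3.6-27b-coder-6bit-mlx"


def tokenizer_config(fix_regex: bool = True) -> dict:
    """Extra tokenizer keyword arguments; empty when the fix is not requested."""
    return {"fix_mistral_regex": True} if fix_regex else {}


def load_with_regex_fix(model_id: str = MODEL_ID):
    """Load model and tokenizer with the regex fix enabled (assumed to be forwarded)."""
    # Imported lazily: requires Apple Silicon with mlx-lm installed.
    from mlx_lm import load

    return load(model_id, tokenizer_config=tokenizer_config(True))
```

The returned pair can be used exactly like the result of a plain load call in the usage example.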