Qwen3.6-35B-rust-v2 4-bit MLX

A Rust-focused Qwen3.6-35B-A3B model for Apple Silicon, packaged in MLX.

Use it as a coding assistant for Rust projects: generating focused patches, explaining diffs, tightening tests, reading command output, and making small repo-aware edits. It was tested with Swival on local tool-calling workflows.

This is the plain 4-bit compatibility variant. It does not include native MTP (multi-token prediction) tensors, so it is the best starting point if your MLX loader does not support MTP sidecars.

Which Variant Should I Use?

Use this repo if you want the smallest plain MLX package or need a loader-friendly non-MTP model.
Use jedisct1/Qwen3.6-35B-rust-v2-8bit.mlx if you want more precision without native MTP.
Use jedisct1/Qwen3.6-35B-rust-v2-bf16.mlx if you want full precision without native MTP.
Use jedisct1/Qwen3.6-35B-rust-v2-MTP-4bit.mlx if your runtime supports native MTP and you want the faster MTP path.
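
The choice above can be sketched as a small lookup. This is only an illustration: the repo IDs are the ones listed in this section, while the `pick_variant` function, its parameters, and the `VARIANTS` table are hypothetical names that are not part of any package.

```python
# Hypothetical helper: repo IDs are taken from the list above; the function
# name, parameter names, and table layout are illustrative assumptions.
VARIANTS = {
    ("4bit", False): "jedisct1/Qwen3.6-35B-rust-v2-4bit.mlx",
    ("8bit", False): "jedisct1/Qwen3.6-35B-rust-v2-8bit.mlx",
    ("bf16", False): "jedisct1/Qwen3.6-35B-rust-v2-bf16.mlx",
    ("4bit", True): "jedisct1/Qwen3.6-35B-rust-v2-MTP-4bit.mlx",
}


def pick_variant(precision: str = "4bit", native_mtp: bool = False) -> str:
    """Return the repo ID for a precision / MTP combination listed above."""
    try:
        return VARIANTS[(precision, native_mtp)]
    except KeyError:
        # Only the combinations named in this README are known here.
        raise ValueError(
            f"no listed variant for precision={precision!r}, native_mtp={native_mtp}"
        ) from None
```

Calling `pick_variant()` with no arguments returns this repo, which matches the advice to start here when the loader does not support MTP.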

Usage

Requires mlx-lm:

pip install mlx-lm

from mlx_lm import load, generate

# Fetches the weights from the Hugging Face Hub if they are not cached locally.
model, tokenizer = load("jedisct1/Qwen3.6-35B-rust-v2-4bit.mlx")

messages = [
    {"role": "system", "content": "You are an expert Rust developer."},
    {"role": "user", "content": "Generate a focused patch that replaces unwrap() calls in parse_config() with proper error propagation."},
]
# Render the conversation with the model's chat template and keep it as text.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Raise max_tokens if patches come back truncated.
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)

What It Is Good At

Writing idiomatic Rust patches from a concise change request.
Explaining Rust diffs in commit-message style.
Following tool-calling workflows where it needs to inspect files before editing.
Keeping changes focused instead of turning small fixes into broad rewrites.
Working with tests, compiler errors, command output, and repository context.
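
As a rough sketch of the compiler-error workflow, the snippet below captures `cargo check` diagnostics and wraps them in chat messages that can be passed to `apply_chat_template` as in the Usage section. The helper names, the prompt wording, and the 4000-character cap are assumptions made for illustration, not something this model requires.

```python
import subprocess

SYSTEM_PROMPT = "You are an expert Rust developer."


def run_cargo_check(project_dir: str) -> str:
    """Run `cargo check` and return its diagnostics (cargo writes them to stderr)."""
    result = subprocess.run(
        ["cargo", "check", "--message-format=short"],
        cwd=project_dir,
        capture_output=True,
        text=True,
    )
    return result.stderr


def build_fix_request(compiler_output: str, max_chars: int = 4000) -> list[dict]:
    """Wrap compiler output in chat messages, keeping only the first max_chars."""
    # Assumption: the earliest diagnostics are the most useful, so keep the head.
    excerpt = compiler_output[:max_chars]
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": "Explain these compiler errors and propose a focused patch:\n\n"
            + excerpt,
        },
    ]
```

For example, `build_fix_request(run_cargo_check("."))` yields a message list for the current crate. Truncating the output keeps the prompt short, in line with the model's preference for focused changes.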

Limitations

Outputs should be reviewed before use, especially unsafe code, concurrency code, and changes that affect security boundaries.
The model works best on focused Rust changes, tests, and explanations. Very large refactors may need to be split into smaller steps.
Tool calling depends on the runtime and client preserving the chat template and tool schema format.
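
To illustrate the last point, here is a minimal sketch of passing a tool schema through the chat template instead of pasting it into the prompt by hand. It assumes the tokenizer returned by `load` exposes the Hugging Face `apply_chat_template` method with its `tools` argument, and that this model's template renders tools. The `read_file` tool and the `build_prompt` helper are hypothetical.

```python
# Hypothetical tool definition in the JSON-schema "function" format that
# Hugging Face chat templates accept; the tool itself is an illustration.
READ_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Return the contents of a file in the repository.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path relative to the repo root."}
            },
            "required": ["path"],
        },
    },
}


def build_prompt(tokenizer, messages, tools):
    """Render messages and tool schemas through the model's own chat template."""
    # Letting the template format the tools keeps the layout the model expects.
    return tokenizer.apply_chat_template(
        messages,
        tools=tools,
        tokenize=False,
        add_generation_prompt=True,
    )
```

With the `tokenizer` from the Usage section, `build_prompt(tokenizer, messages, [READ_FILE_TOOL])` produces the prompt string to pass to `generate`. Parsing the tool call out of the response is left to the client, since its format depends on the template.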
