Instructions to use Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx with MLX:
```python
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model, processor = load("Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx")
config = load_config("Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx")

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=1
)

# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx with Pi:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx"
```
Configure the model in Pi
```shell
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add to ~/.pi/agent/models.json:
```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx" }
      ]
    }
  }
}
```
Run Pi
```shell
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx with Hermes Agent:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx"
```
Configure Hermes
```shell
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx
```
Run Hermes
hermes
qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx
MXFP4 MLX conversion of Jackrong/Qwopus3.6-35B-A3B-Coder, prepared by Shiftedx for Apple Silicon / MLX / LM Studio.
What Changed
- Converted from the upstream safetensors checkpoint with the local streaming MLX pipeline.
- Quantized primary linear weights with `mxfp4` at group size 32.
- Kept MoE router/gate modules in affine 8-bit group size 64 for compatibility.
- Removed source MTP tensors and set MTP/next-token prediction layer counts to 0 for LM Studio compatibility.
- Set `tool_parser_type` to `qwen3_coder`.
- Patched the chat template so `enable_thinking` defaults to false when a runtime honors the template variable.
- Added grafted vision tensors from `model.visual.*` as `vision_tower.*`.
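The renaming, MTP removal, and mixed-precision choices listed above can be sketched as two small rules. This is only an illustration, not the actual conversion pipeline: the `mtp.` prefix, the router/gate name pattern, and the helper names `remap_key` and `quant_params` are assumptions introduced here, and the real checkpoint layout may differ.

```python
import re

# Assumed key prefixes; the real checkpoint layout may differ.
VISION_SRC_PREFIX = "model.visual."
VISION_DST_PREFIX = "vision_tower."
MTP_PREFIX = "mtp."  # hypothetical prefix for MTP (multi-token prediction) tensors

# Hypothetical pattern for MoE router/gate modules. It matches a path whose
# last component is exactly "router" or "gate", so "gate_proj" is not matched.
ROUTER_PATTERN = re.compile(r"(^|\.)(router|gate)$")


def remap_key(name):
    """Return the converted tensor name, or None if the tensor is dropped."""
    if name.startswith(MTP_PREFIX):
        return None  # MTP tensors are removed from the converted checkpoint
    if name.startswith(VISION_SRC_PREFIX):
        return VISION_DST_PREFIX + name[len(VISION_SRC_PREFIX):]
    return name


def quant_params(module_path):
    """Pick quantization settings for a linear module path."""
    if ROUTER_PATTERN.search(module_path):
        # Router/gate modules stay in affine 8-bit, group size 64.
        return {"mode": "affine", "bits": 8, "group_size": 64}
    # Everything else uses mxfp4 at group size 32.
    return {"mode": "mxfp4", "bits": 4, "group_size": 32}
```

Keeping the rules as plain functions over names makes them easy to check against a list of checkpoint keys before any weights are touched.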
Local Validation
Validated locally on June 29, 2026 with LM Studio server on port 8080, 32k context, parallel 1, GPU max.
| Check | Result |
|---|---|
| LM Studio load | Passed; 17.18 GiB in LM Studio at 32k context. |
| Basic text completion | Passed; returned 2+2=4 and stopped. |
| Vision image smoke | Experimental only: model loaded and stopped, but a simple shapes image was not fully reliable. MXFP4 misidentified the shapes/colors; MXFP8 identified shapes/colors but not left-to-right order. |
Note: LM Studio may still report hidden `reasoning_tokens` for this checkpoint even though the upstream model is intended for thinking-off use. In smoke tests, set `max_tokens` high enough that hidden reasoning does not use up the budget before the visible answer.
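A smoke test can make the hidden-reasoning overhead visible by splitting the reported completion tokens. The sketch below assumes the server follows the OpenAI chat-completions response shape, with the count under `usage.completion_tokens_details.reasoning_tokens`; where LM Studio actually reports it may differ. The helper names and the `max_tokens=512` default are illustrative choices, not values from the validation run.

```python
import json
import urllib.request

# Assumed endpoint: the local server on port 8080 mentioned above,
# using the standard OpenAI-compatible chat path.
BASE_URL = "http://localhost:8080/v1/chat/completions"


def build_smoke_request(model, prompt, max_tokens=512):
    """Build a minimal chat payload with a generous token budget."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def token_breakdown(response):
    """Split completion tokens into hidden reasoning and visible output."""
    usage = response.get("usage") or {}
    details = usage.get("completion_tokens_details") or {}
    completion = usage.get("completion_tokens", 0)
    reasoning = details.get("reasoning_tokens", 0)
    choices = response.get("choices") or [{}]
    return {
        "completion": completion,
        "reasoning": reasoning,
        "visible": max(completion - reasoning, 0),
        "finish_reason": choices[0].get("finish_reason"),
    }


def send(payload, url=BASE_URL, timeout=120):
    """POST the payload to the local server and return the parsed JSON."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

With the server running, `token_breakdown(send(build_smoke_request("qwopus3.6-35b-a3b-coder-vision-mlx", "What is 2+2?")))` would show how much of the budget went to reasoning. A `finish_reason` of `length` with few visible tokens suggests `max_tokens` was too low.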
Vision Status
This variant includes a grafted Qwen vision tower from the source checkpoint. The tensor/key layout matches the working MLX Qwen3.5-MoE vision format, and vision_tower.patch_embed.proj.weight was transposed to MLX layout (1152, 2, 16, 16, 3).
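The transposition can be checked on shapes alone. The sketch below assumes the source weight uses the PyTorch `Conv3d` layout `(out_channels, in_channels, T, H, W)`, which would make the source shape `(1152, 3, 2, 16, 16)`. That source shape is an inference from the target shape, not something stated for this checkpoint, and `permute_shape` is a hypothetical helper.

```python
def permute_shape(shape, axes):
    """Return the shape that results from permuting axes, as transpose would."""
    if sorted(axes) != list(range(len(shape))):
        raise ValueError("axes must be a permutation of the dimensions")
    return tuple(shape[a] for a in axes)


# Assumed source layout (PyTorch Conv3d): (out, in, T, H, W).
TORCH_SHAPE = (1152, 3, 2, 16, 16)

# Channels-last target layout: (out, T, H, W, in).
TO_MLX_AXES = (0, 2, 3, 4, 1)

mlx_shape = permute_shape(TORCH_SHAPE, TO_MLX_AXES)
print(mlx_shape)
```

On a real array the same permutation would be applied with the array library's transpose, for example `weight.transpose(0, 2, 3, 4, 1)`.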
Local LM Studio image smoke testing did not fully pass, so treat the vision path as experimental. The language path loads and answers normally.
LM Studio
After downloading in LM Studio, load the model key:
```shell
lms load qwopus3.6-35b-a3b-coder-vision-mlx --context-length 32768 --parallel 1 --gpu max
```
Recommended profile defaults, matching the local Shiftedx 35B AgentWorld/Ornith profiles:
- Preset/template: Qwen3 thinking-compatible Jinja template with `<|im_end|>` stop.
- Context length: `200000` when memory allows; `32768` was used for local smoke validation.
- Sampling: temperature `0.6`, top-k `20`, top-p `0.95`, min-p enabled at `0`.
- Repeat penalty: unchecked/off, value `1.1` if enabled manually.
- Load: parallel `1`, GPU `max`.
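When calling the model through an OpenAI-compatible API rather than the LM Studio UI, the sampling defaults above can be carried as request fields. This is a sketch under assumptions: `top_k` and `min_p` are not part of the standard OpenAI schema, so whether a given server honors them is not guaranteed, and `with_sampling` is a hypothetical helper. Repeat penalty is left out because the profile has it off.

```python
# Sampling defaults from the recommended profile.
RECOMMENDED_SAMPLING = {
    "temperature": 0.6,
    "top_k": 20,       # non-standard field; server support is an assumption
    "top_p": 0.95,
    "min_p": 0.0,      # non-standard field; server support is an assumption
    "stop": ["<|im_end|>"],
}


def with_sampling(payload, overrides=None):
    """Return a copy of payload with the profile defaults, then overrides, applied."""
    return {**payload, **RECOMMENDED_SAMPLING, **(overrides or {})}
```

For example, `with_sampling({"model": "qwopus3.6-35b-a3b-coder-vision-mlx", "messages": [...]}, {"temperature": 0.2})` keeps the profile values but lowers the temperature for a single request.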
Provenance
- Source model: https://huggingface.co/Jackrong/Qwopus3.6-35B-A3B-Coder
- Source license: Apache-2.0
- Quantization date: June 29, 2026
Model tree for Shiftedx/qwopus3.6-35b-a3b-coder-mxfp4-vision-mlx
Base model
Qwen/Qwen3.6-35B-A3B