Huihui-Step3-VL-10B-abliterated MLX
MLX implementation of Huihui-Step3-VL-10B-abliterated, a vision-language model combining Qwen3-8B with the Step3 vision encoder.
Model Architecture
- LLM Backbone: Qwen3-8B-Instruct (bf16)
- Vision Encoder: Step3 ViT (47 layers, 1536 hidden dim, 12 heads, patch size 14)
- Projector: MLP (1536 -> 4096 -> 4096) with GELU
- Special Tokens: <|im_start|>, <|im_patch|>, <|im_end|>
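The dimensions in the list above imply a couple of derived quantities. The sketch below computes the per-head attention width of the vision encoder and a rough parameter count for the projector. It assumes both projector linear layers carry bias terms, which the list does not state, and the function name is illustrative rather than part of this repository.

```python
# Dimensions taken from the architecture list above.
VISION_HIDDEN = 1536
VISION_HEADS = 12
LLM_HIDDEN = 4096


def projector_param_count(vision_hidden: int, llm_hidden: int, bias: bool = True) -> int:
    """Count parameters in a two-layer MLP: vision_hidden -> llm_hidden -> llm_hidden.

    Assumption: each linear layer has a bias vector when bias=True.
    The GELU activation between the layers has no parameters.
    """
    first = vision_hidden * llm_hidden + (llm_hidden if bias else 0)
    second = llm_hidden * llm_hidden + (llm_hidden if bias else 0)
    return first + second


head_dim = VISION_HIDDEN // VISION_HEADS
print(head_dim)  # 128
print(projector_param_count(VISION_HIDDEN, LLM_HIDDEN))  # 23076864
```

Under these assumptions the projector is small (about 23M parameters) next to the LLM backbone.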
Installation
pip install -r requirements.txt
Usage
Basic Generation
from mlx_lm import load as mlx_load
from model import HuihuiStep3VL
# Load Qwen3-8B
model, tokenizer = mlx_load("mlx-community/Qwen3-8B-Instruct-bf16")
# Create VL model
vl_model = HuihuiStep3VL(
    llm_model=model,
    vision_hidden=1536,
    llm_hidden=4096,
)
# Generate with image
# (image_tensor and prompt_tokens are placeholders that the caller prepares beforehand)
response = vl_model.generate(
    images=image_tensor,
    prompt_tokens=prompt_tokens,
    max_tokens=256,
)
With Base64 Image
from sample import generate_response
response = generate_response(
    model=vl_model,
    tokenizer=tokenizer,
    image_base64=base64_encoded_image,
    prompt="Describe this image.",
)
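The snippet above leaves base64_encoded_image undefined. One way to produce it from a file on disk is sketched below using only the standard library. The helper name encode_image_base64 is hypothetical, and whether generate_response expects a bare base64 string (as produced here) or a data URI is an assumption to check against sample.py.

```python
import base64
from pathlib import Path


def encode_image_base64(path: str) -> str:
    """Read an image file and return its bytes as a base64-encoded ASCII string."""
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

For example, `base64_encoded_image = encode_image_base64("photo.jpg")`, where the path is a placeholder for your own image.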
Chat Format
from sample import generate_with_chat_messages
messages = [
    {"role": "user", "content": "What do you see in this image?"}
]
response = generate_with_chat_messages(
    model=vl_model,
    tokenizer=tokenizer,
    messages=messages,
    image=base64_image,
)
Files
- model.py - Model definition (VisionEncoder, ImageProjector, HuihuiStep3VL)
- loader.py - Weight loading utilities
- tokenizer.py - Tokenizer with Step3 special tokens
- sample.py - Sample inference scripts
- convert.py - Weight conversion and hub push script
Conversion
To convert and push to HuggingFace Hub:
python convert.py
Notes
- The abliterated bias fix (1.3K vector subtraction) is baked into the original weights
- Image tokens: <im_start> + N×<im_patch> + <im_end>, where N = (H/patch_size) × (W/patch_size)
- For 224×224 images with patch_size=14: N = 16×16 = 256 patches
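The patch arithmetic in the notes above can be sketched as follows. Both function names are illustrative and not part of this repository. The token strings follow the spelling used in the note and should be matched to whatever tokenizer.py actually registers. The sketch also assumes the image height and width are exact multiples of the patch size, since the notes do not say how other sizes are handled.

```python
def num_image_patches(height: int, width: int, patch_size: int = 14) -> int:
    """Return N = (H / patch_size) * (W / patch_size) for an H x W image."""
    if height % patch_size or width % patch_size:
        # Assumption: only sizes divisible by patch_size are supported here.
        raise ValueError("image dimensions must be multiples of patch_size")
    return (height // patch_size) * (width // patch_size)


def image_placeholder(
    height: int,
    width: int,
    patch_size: int = 14,
    start: str = "<im_start>",
    patch: str = "<im_patch>",
    end: str = "<im_end>",
) -> str:
    """Build the placeholder text: start token, N patch tokens, end token."""
    n = num_image_patches(height, width, patch_size)
    return start + patch * n + end


print(num_image_patches(224, 224))  # 256
```

A 448×448 image under the same assumptions would need 32×32 = 1024 patch tokens, so the prompt length grows quadratically with image side length.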
License
See original model repository for license information.