Qwen3.5-2B-MathParser-pro
Model Summary
Qwen3.5-2B-MathParser-pro is a compact vision-language model for handwritten mathematical formula OCR. It is optimized to transcribe single-line and multi-line handwritten mathematical expressions into LaTeX, with a focus on local deployment.
This 2B release is intended for lower-memory local deployment. The companion release is Qwen3.5-4B-MathParser-pro.
Intended Use
- Handwritten mathematical formula recognition
- Multi-line LaTeX transcription
- OCR for mathematical expressions and derivations
- Research and application prototyping around handwritten math parsing
This model is not intended to be a general mathematical reasoning model. It should be used as an OCR/transcription model.
Training Recipe
The model follows a two-stage MathParser training recipe:
- Stage 1 SFT builds a stable handwritten mathematical OCR base and teaches direct LaTeX transcription.
- Stage 2 DPO (v34) steers the model toward concise, stable, line-count-faithful transcriptions and reduces malformed outputs, repetition, max-token runaway, and very low-similarity failures.
The released weights are fully merged model weights, not LoRA adapters.
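The Stage 2 configuration is not spelled out on this page. As a minimal sketch, assuming the standard DPO objective applied to summed sequence log-probabilities, the loss for one preference pair could be computed as below. The function name `dpo_pair_loss`, the `beta` default of 0.1, and the example log-probabilities are illustrative assumptions, not values from this release.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single (chosen, rejected) transcription pair.

    Each argument is the summed log-probability of a full transcription under
    either the policy being trained or the frozen reference (here, it would be
    the Stage 1 SFT model). beta=0.1 is an assumed value.
    """
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = -beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(-z)) equals softplus(z); this form avoids overflow.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical numbers: before training the policy equals the reference,
# afterwards it favors the concise transcription over the runaway one.
loss_before = dpo_pair_loss(-40.0, -40.0, -40.0, -40.0)
loss_after = dpo_pair_loss(-35.0, -50.0, -40.0, -40.0)
print(loss_before, loss_after)
```

The loss starts at log 2 when policy and reference agree and falls as the policy assigns relatively more probability to the preferred transcription.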
Evaluation
Evaluation set: 756 multi-line handwritten mathematical formula samples.
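Metrics (count columns are numbers of samples out of 756):
- Avg Sim / Median Sim: mean and median normalized edit similarity between prediction and ground truth; higher is better.
- Line Match: samples whose predicted line count exactly matches the ground truth; higher is better.
- Within +/-1: samples whose predicted line count differs from the ground truth by at most one; higher is better.
- Runaway: samples whose generation hits the max-token limit or is obviously overlong or repetitive; lower is better.
- Bad <0.50: samples with similarity below 0.50; lower is better.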
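The evaluation script is not included on this page, so the following is only a sketch of how metrics of this kind could be computed. It assumes that normalized edit similarity means one minus the Levenshtein distance divided by the longer string's length, and that lines are separated by LaTeX row breaks (`\\`) or newlines. The helper names are hypothetical, and the Runaway count is left out because it depends on generation length.

```python
import statistics


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(pred, truth):
    """Assumed definition: 1 - distance / length of the longer string."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(pred, truth) / longest


def count_lines(latex):
    """Assumption: '\\\\' row breaks and newlines both separate lines."""
    parts = latex.replace("\\\\", "\n").split("\n")
    return sum(1 for part in parts if part.strip())


def summarize(pairs):
    """Aggregate (prediction, ground truth) pairs into table-style metrics."""
    sims = [similarity(p, t) for p, t in pairs]
    diffs = [abs(count_lines(p) - count_lines(t)) for p, t in pairs]
    return {
        "avg_sim": sum(sims) / len(sims),
        "median_sim": statistics.median(sims),
        "line_match": sum(d == 0 for d in diffs),
        "within_1": sum(d <= 1 for d in diffs),
        "bad_below_0_50": sum(s < 0.50 for s in sims),
    }


# Two made-up samples: one perfect, one missing its second line.
pairs = [
    (r"x^2 + 1 \\ = 0", r"x^2 + 1 \\ = 0"),
    (r"a + b", r"a + b \\ = c"),
]
summary = summarize(pairs)
print(summary)
```

In this toy input the second prediction drops a line, so it counts toward Within +/-1 but not Line Match, and its similarity falls below 0.50.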
| Model | Samples | Avg Sim | Median Sim | Line Match | Within +/-1 | Runaway | Bad <0.50 |
|---|---|---|---|---|---|---|---|
| Qwen3.5-0.8B Base | 756 | 0.544843 | 0.580742 | 149 | 235 | 108 | 262 |
| Qwen3.5-2B Base | 756 | 0.599258 | 0.651649 | 252 | 392 | 19 | 236 |
| Qwen3.5-4B Base | 756 | 0.534456 | 0.541674 | 264 | 368 | 5 | 295 |
| Qwen3.5-2B SFT | 756 | 0.906516 | 0.952732 | 550 | 706 | 13 | 25 |
| Qwen3.5-2B SFT+DPO | 756 | 0.916060 | 0.951464 | 569 | 714 | 3 | 15 |
| Qwen3.5-4B SFT | 756 | 0.942045 | 0.966546 | 612 | 730 | 0 | 2 |
| Qwen3.5-4B SFT+DPO | 756 | 0.942878 | 0.968560 | 611 | 730 | 0 | 1 |
For this release, the main result is:
| Release | Avg Sim | Median Sim | Line Match | Within +/-1 | Runaway | Bad <0.50 |
|---|---|---|---|---|---|---|
| Qwen3.5-2B-MathParser-pro | 0.916060 | 0.951464 | 569 | 714 | 3 | 15 |
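Because the count columns are reported out of 756 samples, they can be easier to compare as rates. The short sketch below only re-expresses the numbers from the release row; the dictionary keys are names chosen here for illustration.

```python
TOTAL_SAMPLES = 756

# Counts copied from the Qwen3.5-2B-MathParser-pro row above.
release_counts = {
    "line_match": 569,
    "within_1": 714,
    "runaway": 3,
    "bad_below_0_50": 15,
}

rates = {name: count / TOTAL_SAMPLES for name, count in release_counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

This works out to roughly 75.3% exact line-count matches, 94.4% within one line, 0.4% runaway generations, and 2.0% samples below 0.50 similarity.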
Usage
from PIL import Image
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "sugartai/Qwen3.5-2B-MathParser-pro"

# Load the processor and the merged model weights in bfloat16.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map="auto",
).eval()

image = Image.open("formula.png").convert("RGB")

messages = [
    {
        "role": "system",
        "content": "You are a handwritten mathematical OCR model. Return only the LaTeX transcription.",
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Transcribe the handwritten mathematical formula into LaTeX only."},
        ],
    },
]

# Render the chat prompt as text with thinking mode disabled.
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

# Extract the image inputs from the messages and build the model inputs.
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)

# Stop on the EOS token and, if it is a different id, on the pad token too.
eos_ids = [processor.tokenizer.eos_token_id]
pad_id = processor.tokenizer.pad_token_id
if pad_id is not None and pad_id not in eos_ids:
    eos_ids.append(pad_id)

# Greedy decoding with a cap of 1536 new tokens.
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1536,
        do_sample=False,
        num_beams=1,
        eos_token_id=eos_ids,
        pad_token_id=pad_id if pad_id is not None else eos_ids[0],
    )

# Drop the prompt tokens and decode only the generated transcription.
new_ids = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.decode(new_ids[0], skip_special_tokens=True))
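Since generation above is capped at 1536 new tokens, a caller may want to flag outputs that look like the runaway cases described under Evaluation. The helper below is a hypothetical heuristic, not the criterion used for the reported Runaway column: the function name, the `max_repeat_share` and `min_lines` thresholds, and the line-splitting rule are all assumptions.

```python
from collections import Counter


def looks_like_runaway(text, new_token_count, max_new_tokens=1536,
                       max_repeat_share=0.5, min_lines=4):
    """Heuristically flag a transcription as a possible runaway generation.

    Flags the output if it used the whole token budget, or if a single
    repeated line makes up more than max_repeat_share of all lines.
    """
    if new_token_count >= max_new_tokens:
        return True
    # Assumption: '\\\\' row breaks and newlines both separate lines.
    lines = [ln.strip() for ln in text.replace("\\\\", "\n").split("\n")]
    lines = [ln for ln in lines if ln]
    if len(lines) < min_lines:
        return False
    most_common_count = Counter(lines).most_common(1)[0][1]
    return most_common_count / len(lines) > max_repeat_share


# Made-up outputs for illustration.
hit_limit = looks_like_runaway("x = 1", new_token_count=1536)
normal = looks_like_runaway(r"a \\ b \\ c \\ d", new_token_count=20)
repetitive = looks_like_runaway(r"x \\ x \\ x \\ x \\ y", new_token_count=30)
print(hit_limit, normal, repetitive)
```

With the usage code above, `new_ids.shape[1]` would supply `new_token_count` and the decoded string would supply `text`.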
Limitations
- The model is specialized for handwritten mathematical OCR and LaTeX transcription.
- It is not a general reasoning or theorem-proving model.
- Very noisy images, unusual notation, extreme layout variation, or out-of-distribution handwriting may degrade quality.
- The reported metrics are from an internal 756-sample multi-line handwritten formula evaluation set.
License
This model is released under Apache 2.0, following the base model license of Qwen/Qwen3.5-2B.
Citation
If you use this model, please cite or link this model page and the Qwen3.5 base model.