PaddleOCR-VL-1.6 — MLX 8-bit

MLX-quantized (8-bit, group_size=64) version of PaddlePaddle/PaddleOCR-VL-1.6 for Apple Silicon inference via mlx-vlm.

Model Details

  • Base model: PaddleOCR-VL-1.6
  • OmniDocBench v1.6 score: 96.33 (#1 on the leaderboard as of June 2026)
  • Architecture: PaddleOCRVLForConditionalGeneration (18 LLM layers, 27 vision layers)
  • Quantization: 8-bit affine, group_size=64
  • Size: ~1.0 GB on disk (vs ~2 GB for the bf16 original)
  • Converted with: mlx-vlm >= 0.3.11
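
As a rough sanity check on the size figure above, the sketch below estimates the storage cost of grouped affine quantization. It assumes each group of 64 weights carries one 16-bit scale and one 16-bit bias on top of the 8-bit weights, and that every weight is quantized. In practice some layers may stay in bf16, so treat the result as an approximation. The function names are illustrative and not part of mlx-vlm.

```python
def effective_bits_per_weight(bits: int, group_size: int, overhead_bits: int = 32) -> float:
    """Average bits per weight, including the per-group scale and bias.

    overhead_bits is an assumption: one 16-bit scale plus one 16-bit bias per group.
    """
    return bits + overhead_bits / group_size


def estimated_size_gb(bf16_size_gb: float, bits: int, group_size: int) -> float:
    """Scale a bf16 checkpoint size by the ratio of effective bits to 16 bits."""
    return bf16_size_gb * effective_bits_per_weight(bits, group_size) / 16


if __name__ == "__main__":
    # 8-bit weights with group_size=64, starting from the ~2 GB bf16 figure.
    print(effective_bits_per_weight(8, 64))
    print(estimated_size_gb(2.0, 8, 64))
```

Under these assumptions the quantized weights cost 8.5 bits each, which puts the estimate slightly above 1 GB and is consistent with the ~1.0 GB listed.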

Usage

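from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the quantized weights, the processor, and the model config.
model_id = "olragon/PaddleOCR-VL-1.6-8bit"
model, processor = load(model_id)
config = load_config(model_id)

# Wrap the instruction in the model's chat template, reserving one image slot.
prompt = apply_chat_template(
    processor, config,
    "OCR the text in this image.",
    num_images=1,
)

# Run generation on a single page image.
result = generate(
    model, processor, prompt,
    image=["page.png"],
    max_tokens=6000,
    repetition_penalty=1.1,
    verbose=False,
)
# Depending on the mlx-vlm version, generate returns either a result
# object with a .text attribute or a plain string.
text = result.text if hasattr(result, "text") else str(result)
print(text)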
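
To process a folder of pages instead of a single image, the single-page call above can be wrapped in a loop. The following is a minimal sketch, not part of mlx-vlm: `ocr_directory`, `IMAGE_SUFFIXES`, and the `ocr_page` callable are hypothetical names, and the choice to write a `.txt` file next to each image is an assumption made for illustration.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumed set of page image formats; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def ocr_directory(image_dir: str, ocr_page: Callable[[str], str]) -> Dict[str, str]:
    """Run ocr_page on every image in image_dir and save the text next to it.

    ocr_page is a hypothetical callable that takes an image path and
    returns the recognized text for that page.
    """
    results: Dict[str, str] = {}
    # Sort so pages are processed in a stable, name-based order.
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        text = ocr_page(str(path))
        # page.png -> page.txt in the same directory.
        path.with_suffix(".txt").write_text(text, encoding="utf-8")
        results[path.name] = text
    return results
```

With the objects from the usage snippet, `ocr_page` could be a small function that calls `generate(model, processor, prompt, image=[path], max_tokens=6000)` and returns the text of the result, so the model is loaded once and reused for every page.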

Or via CLI:

uv run --python 3.12 --with "mlx-vlm>=0.3.11" --with pillow \
  python3 -m mlx_vlm.generate \
    --model olragon/PaddleOCR-VL-1.6-8bit \
    --image page.png \
    --prompt "OCR the text in this image." \
    --max-tokens 6000

Conversion

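Converted with the following command (group_size=64 is not passed explicitly and is left at the converter's default):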

python3 -m mlx_vlm.convert \
  --hf-path PaddlePaddle/PaddleOCR-VL-1.6 \
  --mlx-path ./PaddleOCR-VL-1.6-8bit \
  --quantize --q-bits 8
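
After conversion, it can be useful to confirm that the output directory records the intended settings. The sketch below assumes the converter writes a `quantization` entry with `bits` and `group_size` keys into `config.json`, as MLX conversions commonly do. Those key names are an assumption, and both helper functions are hypothetical.

```python
import json
from pathlib import Path


def read_quantization(model_dir: str) -> dict:
    """Return the "quantization" entry of config.json, or {} if it is absent.

    The "quantization" key name is an assumption about the converter's output.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return config.get("quantization", {})


def matches_expected(quant: dict, bits: int = 8, group_size: int = 64) -> bool:
    """True if the recorded settings match the ones stated in this card."""
    return quant.get("bits") == bits and quant.get("group_size") == group_size


if __name__ == "__main__":
    quant = read_quantization("./PaddleOCR-VL-1.6-8bit")
    print(quant, matches_expected(quant))
```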

Benchmarks

PaddleOCR-VL 1.6 improvements over 1.5:

  • OmniDocBench: 94.50 → 96.33 (+1.83)
  • Better polygon localization (quadrilateral → polygon shapes)
  • Seal/stamp recognition
  • Cross-page table merging

License

Apache 2.0 (same as the base model)
