PaddleOCR-VL-1.6 — MLX 8-bit

MLX-quantized (8-bit, group_size=64) version of PaddlePaddle/PaddleOCR-VL-1.6 for Apple Silicon inference via mlx-vlm.

Model Details

Base model: PaddleOCR-VL-1.6
OmniDocBench v1.6 score: 96.33 (#1 on the leaderboard as of June 2026)
Architecture: PaddleOCRVLForConditionalGeneration (18 LLM layers, 27 vision layers)
Quantization: 8-bit affine, group_size=64
Size: ~1.0 GB (vs. ~2 GB for the bf16 weights)
Converted with: mlx-vlm >= 0.3.11
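
To make the quantization line concrete, here is a minimal pure-Python sketch of affine quantization over one group of 64 weights, followed by a rough size estimate. It is a simplified illustration, not MLX's actual kernel. The helper names are hypothetical, and the estimate assumes every weight is quantized and that each group stores one 16-bit scale and one 16-bit bias, which may not hold for all layers.

```python
import random

GROUP_SIZE = 64  # from this card
BITS = 8         # from this card


def quantize_group(weights, bits=BITS):
    """Affine-quantize one group so that w ~= scale * q + bias, q in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 for a constant group, where max == min would give scale 0.
    scale = (w_max - w_min) / levels or 1.0
    bias = w_min
    codes = [round((w - bias) / scale) for w in weights]
    return codes, scale, bias


def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate weights."""
    return [scale * q + bias for q in codes]


def bits_per_weight(bits=BITS, group_size=GROUP_SIZE, meta_bits=16):
    """Storage per weight, assuming one scale and one bias of meta_bits each per group."""
    return bits + 2 * meta_bits / group_size


random.seed(0)
group = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(group, restored))

effective_bits = bits_per_weight()
# Scale the card's ~2 GB bf16 figure (16 bits per weight) to the quantized layout.
estimated_gb = 2.0 * effective_bits / 16

print(f"max round-trip error: {max_error:.2e} (half a step is {scale / 2:.2e})")
print(f"effective bits per weight: {effective_bits}")
print(f"estimated size: ~{estimated_gb:.2f} GB")
```

Under these assumptions the layout costs 8.5 bits per weight, which puts the estimate close to the ~1.0 GB listed above. The round-trip error per weight is bounded by half a quantization step, which is why 8-bit groups of 64 usually stay close to the original weights.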

Usage

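from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the quantized weights, the processor, and the model config.
model_id = "olragon/PaddleOCR-VL-1.6-8bit"
model, processor = load(model_id)
config = load_config(model_id)

# Wrap the instruction in the model's chat template, reserving one image slot.
prompt = apply_chat_template(
    processor, config,
    "OCR the text in this image.",
    num_images=1
)

# Run generation on a single page image.
result = generate(
    model, processor, prompt,
    image=["page.png"],
    max_tokens=6000,
    repetition_penalty=1.1,
    verbose=False,
)
# generate() may return a result object or a plain string depending on the
# mlx-vlm version, so handle both.
text = result.text if hasattr(result, "text") else str(result)
print(text)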
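
For multi-page documents it can help to load the model once and reuse it. The sketch below wraps the calls shown above in two hypothetical helpers, `make_ocr_fn` and `ocr_directory`; neither is part of mlx-vlm. The mlx-vlm imports are deferred into `make_ocr_fn` so the file can be imported on machines without the library.

```python
from pathlib import Path


def make_ocr_fn(model_id="olragon/PaddleOCR-VL-1.6-8bit", max_tokens=6000):
    """Load the model once and return a function mapping an image path to text."""
    # Imported lazily so this module imports cleanly without mlx-vlm installed.
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    model, processor = load(model_id)
    config = load_config(model_id)
    prompt = apply_chat_template(
        processor, config, "OCR the text in this image.", num_images=1
    )

    def ocr(image_path):
        result = generate(
            model, processor, prompt,
            image=[str(image_path)],
            max_tokens=max_tokens,
            repetition_penalty=1.1,
            verbose=False,
        )
        # Handle both a result object and a plain string return value.
        return result.text if hasattr(result, "text") else str(result)

    return ocr


def ocr_directory(image_dir, ocr_fn, pattern="*.png"):
    """Run ocr_fn over matching images in filename order; return {filename: text}."""
    return {p.name: ocr_fn(p) for p in sorted(Path(image_dir).glob(pattern))}
```

A call such as `ocr_directory("scans", make_ocr_fn())` would then return one text entry per PNG, where `scans` is a placeholder directory name. Passing the OCR function in as an argument also makes the directory walk easy to test with a stub.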

Or via CLI:

uv run --python 3.12 --with "mlx-vlm>=0.3.11" --with pillow \
  python3 -m mlx_vlm.generate \
    --model olragon/PaddleOCR-VL-1.6-8bit \
    --image page.png \
    --prompt "OCR the text in this image." \
    --max-tokens 6000

Conversion

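Converted with the command below. No group size is passed, so the converter's default applies (64, per the quantization settings listed above):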

python3 -m mlx_vlm.convert \
  --hf-path PaddlePaddle/PaddleOCR-VL-1.6 \
  --mlx-path ./PaddleOCR-VL-1.6-8bit \
  --quantize --q-bits 8
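
After converting, a quick sanity check is to read the settings back from the output directory. This sketch assumes the converter records them in `config.json` under a `quantization` key with `bits` and `group_size` fields; that key name is an assumption about the output layout, and both helper names are hypothetical.

```python
import json
from pathlib import Path


def read_quantization(mlx_path):
    """Return the quantization settings recorded in config.json, or None if absent."""
    config = json.loads((Path(mlx_path) / "config.json").read_text())
    # Assumption: the settings live under a top-level "quantization" key.
    return config.get("quantization")


def matches_card(quant, bits=8, group_size=64):
    """True if the recorded settings agree with the values stated in this card."""
    return (
        bool(quant)
        and quant.get("bits") == bits
        and quant.get("group_size") == group_size
    )
```

For the conversion above, `matches_card(read_quantization("./PaddleOCR-VL-1.6-8bit"))` is expected to be true if the assumed layout holds. A `None` result would suggest either an unquantized export or a different config layout.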

Benchmarks

Improvements in PaddleOCR-VL-1.6 over PaddleOCR-VL-1.5:

OmniDocBench: 94.50 → 96.33 (+1.83)
Better localization (polygon shapes instead of quadrilaterals)
Seal/stamp recognition
Cross-page table merging
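
The OmniDocBench gain can also be read as a reduction in the remaining gap to a perfect score. The arithmetic below uses only the two figures quoted above and assumes the score is on a 0 to 100 scale, which is an assumption about the benchmark rather than something this card states.

```python
old_score, new_score = 94.50, 96.33  # figures quoted in the list above

gain = round(new_score - old_score, 2)

# Assumption: 100 is a perfect score, so 100 - score is the remaining gap.
old_gap = round(100 - old_score, 2)
new_gap = round(100 - new_score, 2)
relative_reduction = round(100 * (old_gap - new_gap) / old_gap, 1)

print(f"absolute gain: +{gain}")
print(f"remaining gap: {old_gap} -> {new_gap}")
print(f"relative gap reduction: {relative_reduction}%")
```

Under that assumption, the gap shrinks from 5.5 to 3.67 points, a relative reduction of about a third.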

License

Apache 2.0 (same as the base model)
