# Surya OCR 2 MLX 8-bit G64

This repository contains an **experimental quantized** artifact derived from [datalab-to/surya-ocr-2](https://huggingface.co/datalab-to/surya-ocr-2).

This 8-bit MLX quant is the most useful Apple-side artifact from the current batch. On our mini benchmark slice it keeps perfect section scores on arxiv math, headers/footers, multi-column, old-scans-math, tables, and baseline checks, but it scores 0.0% on the old-scans mini split and only 33.3% on long tiny text.

## What is included

- Source model: `datalab-to/surya-ocr-2`
- Runtime/format: MLX / mlx-vlm
- Quantization: 8-bit affine weight quantization, group size 64
- Vision weights included: Yes. The MLX checkpoint includes the model vision weights and processor assets.
- Processor/tokenizer assets: included
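
To make "8-bit affine quantization, group size 64" concrete, here is a minimal pure-Python sketch of the general technique: each group of 64 weights is mapped to integer codes in 0..255 together with one scale and one bias for the group. This illustrates the idea only; it is not the MLX implementation, and the helper names (`quantize_group`, `dequantize_group`, `quantize_row`) and the random test weights are assumptions made for this example.

```python
import random

GROUP_SIZE = 64            # the "G64" in this artifact's name
BITS = 8
LEVELS = (1 << BITS) - 1   # 255 steps between a group's min and max


def quantize_group(weights):
    """Map one group of floats to 8-bit codes plus a (scale, bias) pair."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / LEVELS or 1.0  # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize_row(row):
    """Split a row into groups of GROUP_SIZE and quantize each independently."""
    groups = [row[i:i + GROUP_SIZE] for i in range(0, len(row), GROUP_SIZE)]
    return [quantize_group(g) for g in groups]


# Hypothetical weights, used only to exercise the helpers above.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]

packed = quantize_row(row)
restored = [v for codes, scale, bias in packed
            for v in dequantize_group(codes, scale, bias)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
```

In this scheme the reconstruction error of any weight is bounded by half of its group's scale, so groups with a wide value range lose more precision than groups with a narrow one.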

## Mini olmOCR-bench results

| Candidate | Overall | Arxiv math | Headers/footers | Long tiny text | Multi-column | Old scans | Old scans math | Tables | Baseline |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Source mini baseline | 91.0% ± 6.3% | 100.0% | 100.0% | 100.0% | 100.0% | 33.3% | 100.0% | 100.0% | 94.7% |
| Surya OCR 2 MLX 8-bit G64 | 79.2% ± 6.2% | 100.0% | 100.0% | 33.3% | 100.0% | 0.0% | 100.0% | 100.0% | 100.0% |

## How to read the benchmark table

This is an early quant release with transparent limitations. The table uses our local 40-test mini slice of allenai/olmOCR-bench, with 3 samples from each named section plus the benchmark baseline checks. It is not the full public score and it is not a claim of >98% parity.
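
The Overall figures appear to be an unweighted mean of the eight per-section scores rather than a pooled pass rate over all 40 tests. This is inferred from the numbers in the results table, not from documented scoring rules, so treat the sketch below as a consistency check. The count of 19 baseline checks is likewise an inference (40 tests minus 7 sections of 3 samples each).

```python
# Per-section scores in percent, copied from the results table.
# Order: arxiv math, headers/footers, long tiny text, multi-column,
# old scans, old scans math, tables, baseline.
source_scores = [100.0, 100.0, 100.0, 100.0, 33.3, 100.0, 100.0, 94.7]
quant_scores = [100.0, 100.0, 33.3, 100.0, 0.0, 100.0, 100.0, 100.0]


def macro_average(section_scores):
    """Unweighted mean: every section counts equally, whatever its test count."""
    return sum(section_scores) / len(section_scores)


source_overall = round(macro_average(source_scores), 1)  # 91.0
quant_overall = round(macro_average(quant_scores), 1)    # 79.2

# Inferred, not stated in the card: 40 tests - 7 sections * 3 samples = 19 baseline checks.
implied_baseline_checks = 40 - 7 * 3
# 18 of 19 passing would round to the 94.7% reported for the source baseline column.
implied_source_baseline = round(18 / implied_baseline_checks * 100, 1)
```

Because each section carries equal weight, a single 3-sample section moves the Overall figure as much as the whole baseline column does, which is why one failing split costs the quant roughly 12 points.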

The useful signal is the split behavior: this artifact is currently strong on clean academic/math, headers/footers, multi-column layouts, tables, old-scan math, and baseline OCR checks, but it should not be used for old degraded scans and is weak on long tiny text.

## Recommended use

Use this checkpoint for local experimentation and constrained OCR workloads whose documents resemble the passing sections above. Avoid using it as a production replacement for the original model on degraded historical scans, very small dense body text, or workloads requiring full benchmark parity.
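
One way to apply this guidance in an application is to route each document by category and fall back to the original model for anything outside the passing slices. The sketch below is hypothetical: the category keys, the `choose_checkpoint` helper, and the `min_score` threshold are assumptions for illustration, and how you classify documents or run the original model depends on your own stack.

```python
# Mini-slice scores for this artifact in percent, copied from the results table.
QUANT_MINI_SCORES = {
    "arxiv_math": 100.0,
    "headers_footers": 100.0,
    "long_tiny_text": 33.3,
    "multi_column": 100.0,
    "old_scans": 0.0,
    "old_scans_math": 100.0,
    "tables": 100.0,
}

SOURCE_MODEL = "datalab-to/surya-ocr-2"
QUANT_MODEL = "Reza2kn/surya-ocr-2-mlx-8bit-g64"


def choose_checkpoint(category, min_score=100.0):
    """Pick the quant only for categories whose mini-slice score met min_score.

    Unknown categories fall back to the source model, since nothing in the
    mini slice covers them.
    """
    score = QUANT_MINI_SCORES.get(category)
    if score is not None and score >= min_score:
        return QUANT_MODEL
    return SOURCE_MODEL


tables_choice = choose_checkpoint("tables")
old_scan_choice = choose_checkpoint("old_scans")
unknown_choice = choose_checkpoint("handwritten_notes")
```

Defaulting unknown categories to the source model is the conservative choice here, given that each section score rests on only three samples.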

## Loading

```python
from mlx_vlm import load, generate

# Fetch the quantized checkpoint (or reuse the local cache) along with its processor.
model, processor = load("Reza2kn/surya-ocr-2-mlx-8bit-g64")
```

Pass images/documents through the same Surya/MLX-VLM prompting path used by your app.


## Limitations

- This is not a full-parity release yet.
- Do **not** use this artifact for degraded old scans; the current mini split score is 0.0% there.
- Do **not** use this artifact for long tiny text unless you independently validate your data; the current mini split score is 33.3%.
- Math-heavy and table/layout-heavy mini examples looked good in this slice, but full olmOCR-bench is still pending.

## Provenance

Generated non-destructively from the original Hugging Face checkpoint. This is not a fine-tune. The goal of publishing this artifact now is transparency: the files are usable for the passing workload slices above, and the known failing slices are documented clearly.