Chandra OCR 2 — 8-bit MLX Quantization

This is an 8-bit MLX quantization of datalab-to/chandra-ocr-2, converted for efficient inference on Apple Silicon using the mlx-vlm framework.

Original model: datalab-to/chandra-ocr-2
Quantization: 8-bit affine, group size 64
Framework: MLX (Apple Silicon)
Modified files: The weight file (model.safetensors) has been quantized from the original bfloat16 weights. All other files are unchanged from the original repository.
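
As a rough check on what these settings cost in storage, the sketch below estimates the average number of bits per quantized weight. It assumes each group of 64 weights stores one scale and one bias at 16 bits each, which is an assumption about the storage layout rather than something stated in this card. The function name is illustrative only.

```python
def effective_bits_per_weight(bits: int, group_size: int, meta_bits: int = 16) -> float:
    """Average storage cost per weight for group-wise affine quantization.

    Each group of `group_size` weights shares one scale and one bias.
    `meta_bits` is the ASSUMED width of each of those two values.
    """
    return bits + 2 * meta_bits / group_size


bpw = effective_bits_per_weight(bits=8, group_size=64)
ratio = bpw / 16  # relative to 16-bit bfloat16 weights
print(f"{bpw} bits per weight, {ratio:.1%} of the bfloat16 size")
```

Any layers left unquantized stay at 16 bits, so the file as a whole will not shrink by exactly this ratio.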

About Chandra OCR 2

Chandra 2 is a state-of-the-art OCR model from Datalab that outputs markdown, HTML, and JSON. It is highly accurate at extracting text from images and PDFs while preserving layout information.

What's New in Chandra 2

  • 85.9% olmocr bench score (SOTA), 77.8% multilingual bench score (12% improvement over Chandra 1)
  • Significant improvements to math, tables, and complex layouts
  • Improved layout, especially on wider documents
  • Significantly better image captioning
  • 90+ language support with major accuracy gains

Features

  • Convert documents to markdown, HTML, or JSON with detailed layout information
  • Excellent handwriting support
  • Reconstructs forms accurately, including checkboxes
  • Strong performance with tables, math, and complex layouts
  • Extracts images and diagrams with captions and structured data
  • Support for 90+ languages

Usage with mlx-vlm

Installation

pip install mlx-vlm

Inference

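from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "jacobwindle/chandra-ocr-2-8bit-mlx"
model, processor = load(model_path)
config = load_config(model_path)

image = ["document.png"]  # list of image paths or URLs
prompt = "Convert this image to markdown."

# Wrap the prompt in the model's chat template, one image placeholder per image
formatted_prompt = apply_chat_template(processor, config, prompt, num_images=len(image))

# Run the full generation loop on the image and print the result
output = generate(model, processor, formatted_prompt, image, max_tokens=4096, verbose=False)
print(output)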
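
For multi-page documents, the single-image call above can be wrapped in a small batch driver. The following is a minimal sketch, not part of mlx-vlm: `find_pages`, `ocr_directory`, and the `ocr_page` callback are hypothetical names, and the callback is assumed to take an image path and return the markdown string for that page.

```python
from pathlib import Path
from typing import Callable

# Assumed set of page image formats; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_pages(input_dir: Path) -> list[Path]:
    """Return image files in input_dir, sorted by name for a stable page order."""
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_directory(
    input_dir: Path,
    output_dir: Path,
    ocr_page: Callable[[Path], str],
) -> list[Path]:
    """Run ocr_page on every image and write one .md file per page."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for page in find_pages(input_dir):
        target = output_dir / f"{page.stem}.md"
        target.write_text(ocr_page(page), encoding="utf-8")
        written.append(target)
    return written
```

Passing the OCR step in as a callback keeps model loading outside the loop, so the weights are loaded once and reused for every page.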

Command-line

python -m mlx_vlm.generate --model jacobwindle/chandra-ocr-2-8bit-mlx --image document.png --prompt "Convert this image to markdown." --max-tokens 4096

Quantization Details

Parameter        Value
Bits             8
Group size       64
Mode             Affine
Original dtype   bfloat16
Quantized size   ~4.8 GB

Converted using:

python -m mlx_vlm.convert --model datalab-to/chandra-ocr-2 --mlx-path models/chandra-ocr-2-8bit -q --q-bits 8
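
To illustrate what "8-bit affine, group size 64" means numerically, here is a minimal pure-Python sketch of group-wise affine quantization. It is a simplified model for intuition, not the MLX kernel: the function names are hypothetical, and the exact rounding and scale selection used by MLX may differ.

```python
import random


def quantize_group(weights, bits=8):
    """Affine-quantize one group: map [min, max] onto integers 0 .. 2**bits - 1."""
    levels = 2**bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo  # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights from integer codes."""
    return [c * scale + bias for c in codes]


def quantize(weights, group_size=64, bits=8):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


# Round-trip 256 synthetic weights (4 groups of 64) and measure the error.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(256)]
groups = quantize(weights)
restored = [
    w for codes, scale, bias in groups
    for w in dequantize_group(codes, scale, bias)
]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"groups: {len(groups)}, max round-trip error: {max_error:.2e}")
```

Because each group has its own scale and bias, the worst-case error is bounded by half of that group's scale, which is why smaller groups trade extra metadata for better accuracy.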

Attribution

This is a derivative work of datalab-to/chandra-ocr-2. The original model was created by Datalab. The weights in this repository have been modified (8-bit quantized) from the original release. All credit for the model architecture, training data, and original weights belongs to the original authors.

License

This model inherits the modified OpenRAIL-M license from the original datalab-to/chandra-ocr-2. As a derivative work, the same license terms apply, including the share-alike requirement (Section III, paragraph 8) and use-based restrictions (Attachment A).

Key restrictions from the original license:

  • Free for research, personal use, and startups under $2M funding/revenue
  • Cannot be used competitively with the Datalab API
  • Derivative works must retain the same license

For broader commercial licensing, see Datalab pricing.
