Next OCR 8B is an 8-billion-parameter model optimized for optical character recognition (OCR), including documents with mathematical and tabular content.
It supports multilingual OCR (Turkish, English, German, Spanish, French, Chinese, Japanese, Korean, Russian...) with high accuracy, including on structured documents such as tables, forms, and formulas.
| Model | OCR-Bench Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
|---|---|---|---|
| Next OCR | 99.0 | 96.8 | 95.3 |
| PaddleOCR | 95.2 | 93.9 | 95.3 |
| Deepseek OCR | 90.6 | 87.4 | 86.1 |
| Tesseract | 92.0 | 88.4 | 72.0 |
| EasyOCR | 90.4 | 84.7 | 78.9 |
| Google Cloud Vision / DocAI | 98.7 | 95.5 | 93.6 |
| Amazon Textract | 94.7 | 86.2 | 86.1 |
| Azure Document Intelligence | 95.1 | 93.6 | 91.4 |

| Model | Handwriting (%) | Scene Text (%) | Complex Tables (%) |
|---|---|---|---|
| Next OCR | 92 | 96 | 91 |
| PaddleOCR | 88 | 92 | 90 |
| Deepseek OCR | 80 | 85 | 83 |
| Tesseract | 75 | 88 | 70 |
| EasyOCR | 78 | 86 | 75 |
| Google Cloud Vision / DocAI | 90 | 95 | 92 |
| Amazon Textract | 85 | 90 | 88 |
| Azure Document Intelligence | 87 | 91 | 89 |

from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image
import torch

model_id = "Lamapi/next-ocr"
# The processor bundles the tokenizer and the image preprocessor.
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)

img = Image.open("image.jpg")

# ATTENTION: The content list must include both an image and text.
messages = [
    {"role": "system", "content": "You are Next-OCR, an helpful AI assistant trained by Lamapi."},
    {
        "role": "user",
        "content": [
            {"type": "image", "image": img},
            {"type": "text", "text": "Read the text in this image and summarize it."},
        ],
    },
]

# Render the chat template to a prompt string, then tokenize it together with the image.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=256)

print(processor.decode(generated[0], skip_special_tokens=True))
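The requirement that the user turn carry both an image and a text part is easy to get wrong when building requests in a loop. A minimal sketch of a helper that enforces it is shown here; `build_ocr_messages` and `SYSTEM_PROMPT` are hypothetical names introduced for illustration and are not part of the model's or the library's API.

```python
from typing import Any

# System prompt copied verbatim from the usage example above.
SYSTEM_PROMPT = "You are Next-OCR, an helpful AI assistant trained by Lamapi."


def build_ocr_messages(image: Any, instruction: str,
                       system_prompt: str = SYSTEM_PROMPT) -> list:
    """Build a chat message list whose user turn has one image and one text part.

    Hypothetical helper: it only assembles and validates the message
    structure, it does not call the model or the processor.
    """
    if image is None:
        raise ValueError("an image is required")
    if not instruction or not instruction.strip():
        raise ValueError("a non-empty text instruction is required")
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": instruction.strip()},
            ],
        },
    ]
```

The returned list can be passed to `processor.apply_chat_template` in place of the hand-written `messages` literal, for example `build_ocr_messages(img, "Read the text in this image.")`.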
| Feature | Description |
|---|---|
| 🖼️ High-Accuracy OCR | Extracts text from images, documents, and screenshots reliably. |
| 🇹🇷 Multilingual Support | Works with 30+ languages including Turkish. |
| ⚡ Lightweight & Efficient | Optimized for resource-constrained environments. |
| 📄 Layout & Math Awareness | Handles tables, forms, and mathematical formulas. |
| 🏢 Reliable Outputs | Suitable for enterprise document workflows. |

| Specification | Details |
|---|---|
| Base Model | Qwen 3 |
| Parameters | 8 Billion |
| Architecture | Vision + Transformer (OCR LLM) |
| Modalities | Image-to-text |
| Fine-Tuning | OCR datasets with multilingual and math/tabular content |
| Optimizations | Quantization-ready, FP16 support |
| Primary Focus | Text extraction, document understanding, mathematical OCR |
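To put the parameter count and the FP16 and quantization entries in context, the memory needed for the weights alone can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: exactly 8e9 parameters, weights only (no activations, KV cache, or framework overhead), and `int8`/`int4` as illustrative quantization targets that the table does not itself name.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Assumed nominal size: 8 billion parameters.
for precision in ("fp16", "int8", "int4"):
    print(f"{precision}: ~{weight_memory_gb(8e9, precision):.0f} GB")
# fp16: ~16 GB
# int8: ~8 GB
# int4: ~4 GB
```

Actual memory use at inference time will be higher than these figures, since image inputs and generated tokens add activation and cache memory on top of the weights.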
MIT License — free for commercial & non-commercial use.
Next OCR — Compact OCR + math-capable AI, blending accuracy, speed, and multilingual document intelligence.