Nanonets-OCR-s CrispEmbed GGUF

The Nanonets-OCR-s (small) vision-language model, converted to GGUF for OCR with CrispEmbed.

Models

Base: Nanonets-OCR-s (Qwen2-VL pruned fine-tune, Apache-2.0)
Params: ~1.5B (16 layers vs 28 in Qwen2-VL-2B)
Languages: 12+ including English, German, French, Spanish, Chinese, Japanese, Arabic
Task: Document OCR, multilingual text recognition

Runs on the existing qwen2vl_ocr engine in CrispEmbed (no custom engine needed):

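from crispembed import CrispOcrPipeline

# Point the pipeline at the Q8_0 GGUF file.
ocr = CrispOcrPipeline(vlm_model="nanonets-ocr-s-q8_0.gguf")
# Run OCR on a single image; the result is stored in `text`.
text = ocr.recognize("document.png")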
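To run the same call over a whole folder of scans, a small helper along these lines may be useful. This is only a sketch: `ocr_folder` and `IMAGE_SUFFIXES` are hypothetical names that are not part of CrispEmbed, and the helper assumes nothing about the pipeline beyond a callable that takes an image path and returns text, as `ocr.recognize` does in the snippet above.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical helper, not part of CrispEmbed. The suffix list is an
# assumption; adjust it to the image formats you actually have.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def ocr_folder(recognize: Callable[[str], str], folder: str) -> Dict[str, str]:
    """Apply `recognize` to every image file in `folder`.

    Returns a dict mapping file name to recognized text, in sorted
    file-name order. Non-image files and subdirectories are skipped.
    """
    results: Dict[str, str] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = recognize(str(path))
    return results


# Usage with the pipeline shown above (requires crispembed and the GGUF file):
#   from crispembed import CrispOcrPipeline
#   ocr = CrispOcrPipeline(vlm_model="nanonets-ocr-s-q8_0.gguf")
#   pages = ocr_folder(ocr.recognize, "scans/")
```

Passing the recognizer in as a callable means the model is loaded once and reused for every file, and the helper can be exercised with a stub function when the model is not available.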

Base model: nanonets/Nanonets-OCR-s, a pruned Qwen2-VL fine-tune (16 layers vs 28) covering 12+ languages including German.

License: Apache-2.0
