Nanonets-OCR-s CrispEmbed GGUF

The Nanonets-OCR-s (small) vision-language model, converted to GGUF for OCR with CrispEmbed.

Models

File                        Quant   Size
nanonets-ocr-s-f16.gguf     F16     ~3.6 GB
nanonets-ocr-s-q8_0.gguf    Q8_0    ~1.9 GB
nanonets-ocr-s-q4_k.gguf    Q4_K    ~1.0 GB
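
As a rough guide to choosing among the files in the table above, the following sketch picks the largest quantization whose file size fits a given budget. The sizes are the approximate figures from the table; `QUANT_SIZES_GB` and `pick_quant` are hypothetical names introduced for illustration and are not part of CrispEmbed. File size is only a lower bound on runtime memory, so leave headroom.

```python
from typing import Optional

# Approximate file sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "nanonets-ocr-s-f16.gguf": 3.6,
    "nanonets-ocr-s-q8_0.gguf": 1.9,
    "nanonets-ocr-s-q4_k.gguf": 1.0,
}


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest file that fits within budget_gb, or None if none fit."""
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    # Larger files are higher precision, so prefer the biggest one that fits.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(pick_quant(2.0))
```

With a 2 GB budget this selects the Q8_0 file, since the F16 file is too large and Q4_K is smaller than necessary.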

Architecture

  • Base: Nanonets-OCR-s (Qwen2-VL pruned fine-tune, Apache-2.0)
  • Params: ~1.5B (16 layers vs 28 in Qwen2-VL-2B)
  • Languages: 12+ including English, German, French, Spanish, Chinese, Japanese, Arabic
  • Task: Document OCR, multilingual text recognition

Usage

The model runs on the existing qwen2vl_ocr engine in CrispEmbed, so no custom engine is needed:

from crispembed import CrispOcrPipeline

ocr = CrispOcrPipeline(vlm_model="nanonets-ocr-s-q8_0.gguf")
text = ocr.recognize("document.png")
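
To process a folder of scans rather than a single image, one option is a small wrapper that calls a recognizer once per file. This is a minimal sketch: `ocr_directory` is a hypothetical helper, not a CrispEmbed function, and it assumes only that the recognizer takes a file path and returns text, as `ocr.recognize` does in the snippet above.

```python
from pathlib import Path
from typing import Callable, Dict


def ocr_directory(
    recognize: Callable[[str], str],
    folder: str,
    pattern: str = "*.png",
) -> Dict[str, str]:
    """Run recognize on every file in folder matching pattern.

    Returns a dict mapping file name to recognized text, in sorted file order.
    """
    results: Dict[str, str] = {}
    for path in sorted(Path(folder).glob(pattern)):
        # The recognizer is passed in, so any path -> text callable works here.
        results[path.name] = recognize(str(path))
    return results
```

Assuming the pipeline object from the snippet above, usage would look like `texts = ocr_directory(ocr.recognize, "scans/")`. Passing the recognizer as an argument keeps the helper independent of any particular model file.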

Original Model

nanonets/Nanonets-OCR-s: Qwen2-VL pruned fine-tune (16L vs 28L), 12+ languages including German.

License

Apache-2.0
