dots.ocr - CrispEmbed GGUF

GGUF conversion of rednote-hilab/dots.ocr for CrispEmbed.

dots.ocr unifies layout detection, text extraction, table parsing, and formula recognition in a single vision-language model (VLM). It supports 100+ languages and scores 88.4% on OmniDocBench.

Architecture

  • Vision: Custom ViT (42 layers, 1536d, patch 14, 2D RoPE, SwiGLU FFN, PatchMerger 2x2)
  • LLM: Qwen2 (28 layers, 1536d, GQA 12/2, standard RoPE, attention_bias=true)
  • Training: Prompt-based task switching (OCR, layout, table, formula)
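
The numbers in the list above can be turned into a quick size estimate. The sketch below is illustrative only and is not taken from the dots.ocr or CrispEmbed code: it assumes "GQA 12/2" means 12 query heads sharing 2 key/value heads, that the head dimension is the hidden size divided by the query head count, and that images are padded up to a multiple of patch size times merge factor. The helper name `vision_tokens` is hypothetical.

```python
import math

# Values taken from the architecture list above.
PATCH = 14                  # ViT patch size in pixels
MERGE = 2                   # PatchMerger combines MERGE x MERGE patches
HIDDEN = 1536               # LLM hidden size
Q_HEADS, KV_HEADS = 12, 2   # assumed reading of "GQA 12/2"


def vision_tokens(height, width):
    """Estimate tokens handed to the LLM for one image.

    Assumption: each side is padded up to a multiple of PATCH * MERGE,
    so every merged token covers a 28 x 28 pixel cell.
    """
    cell = PATCH * MERGE
    return math.ceil(height / cell) * math.ceil(width / cell)


# Assumed: head dimension = hidden size / number of query heads.
head_dim = HIDDEN // Q_HEADS
# Number of query heads that share each key/value head.
queries_per_kv = Q_HEADS // KV_HEADS

if __name__ == "__main__":
    print(vision_tokens(1400, 1008), head_dim, queries_per_kv)
```

Under these assumptions, the 2x2 merge cuts the vision token count to a quarter of the raw patch count, which is the main lever on prompt length for large page scans.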

Models

File                 Quant  Size
dots-ocr-f16.gguf    F16    2.8 GB
dots-ocr-q8_0.gguf   Q8_0   1.5 GB
dots-ocr-q4_k.gguf   Q4_K   912 MB
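
One way to choose between the three files is to take the largest one that fits a memory budget. The following is a minimal sketch, not part of CrispEmbed: the function `pick_model` and the `headroom` factor (extra space for activations and KV cache) are hypothetical, and the sizes are read from the table as decimal gigabytes and megabytes.

```python
# File sizes copied from the table above, in bytes (decimal units assumed).
MODELS = {
    "dots-ocr-f16.gguf": 2.8e9,
    "dots-ocr-q8_0.gguf": 1.5e9,
    "dots-ocr-q4_k.gguf": 912e6,
}


def pick_model(budget_bytes, headroom=1.2):
    """Return the largest file whose size times headroom fits the budget.

    headroom is an assumed safety factor, not a measured value.
    Returns None when no file fits.
    """
    fitting = [
        (size, name)
        for name, size in MODELS.items()
        if size * headroom <= budget_bytes
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    print(pick_model(2e9))
```

Actual memory use depends on image size and context length, so the headroom value should be tuned against real runs.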

Usage

License

MIT (rednote-hilab/dots.ocr)

GGUF metadata: model size 3B params, architecture dots_ocr