DeepSeek-OCR-2 CrispEmbed GGUF

DeepSeek-OCR-2 (3.4B MoE) converted to GGUF for OCR with CrispEmbed.

Models

File	Quant	Size
`deepseek-ocr2-f16.gguf`	F16	~6.5 GB
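
A quick integrity check after download is to read the fixed-size GGUF header. The sketch below assumes the little-endian GGUF layout used by version 2 and later (4-byte magic `GGUF`, a uint32 version, then uint64 tensor and metadata key-value counts). The helper name `read_gguf_header` is illustrative and not part of CrispEmbed.

```python
import struct

def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the little-endian GGUF v2+ header layout:
    4-byte magic, uint32 version, uint64 tensor count, uint64 KV count.
    """
    with open(path, "rb") as f:
        raw = f.read(24)  # 4 + 4 + 8 + 8 bytes
    if len(raw) < 24:
        raise ValueError("file too short to contain a GGUF header")
    magic, version, n_tensors, n_kv = struct.unpack("<4sIQQ", raw)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file, magic was {magic!r}")
    return version, n_tensors, n_kv

# Example (path is whatever location you downloaded the file to):
# version, n_tensors, n_kv = read_gguf_header("deepseek-ocr2-f16.gguf")
```

A wrong magic or a truncated header usually points to an incomplete download rather than a problem with the model itself.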

Vision Encoder: SAM-ViT-B (12 layers, 768d, windowed + global attention)
Visual Encoder: Qwen2-0.5B used bidirectionally (24 layers, 896d)
Projector: Linear(896, 1280)
LLM Decoder: DeepSeek-V2 MoE (12 layers, 1280d)
- Layer 0: Dense SwiGLU FFN (intermediate=6848)
- Layers 1-11: 64 routed experts (top-6) + 2 shared experts (intermediate=896 each)
Parameters: 3.4B total
License: Apache-2.0
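
To see where the parameters sit, the decoder FFN figures above can be turned into a rough count. This is a back-of-the-envelope sketch, assuming each SwiGLU block has three bias-free projections (gate, up, down). It ignores attention, router, embedding, and encoder weights, so it will not add up to the full 3.4B.

```python
HIDDEN = 1280         # decoder width
DENSE_INTER = 6848    # layer 0 dense FFN intermediate size
EXPERT_INTER = 896    # intermediate size of each expert
N_ROUTED, N_SHARED, TOP_K = 64, 2, 6
N_MOE_LAYERS = 11     # layers 1-11

def swiglu_params(hidden, inter):
    # Assumption: gate, up and down projections, no biases.
    return 3 * hidden * inter

dense_ffn = swiglu_params(HIDDEN, DENSE_INTER)
per_expert = swiglu_params(HIDDEN, EXPERT_INTER)

# All experts are stored, but only top-6 routed + 2 shared run per token.
stored_per_layer = (N_ROUTED + N_SHARED) * per_expert
active_per_layer = (TOP_K + N_SHARED) * per_expert

ffn_stored = dense_ffn + N_MOE_LAYERS * stored_per_layer
ffn_active = dense_ffn + N_MOE_LAYERS * active_per_layer

print(f"decoder FFN weights stored:     {ffn_stored / 1e9:.2f}B")
print(f"decoder FFN weights per token:  {ffn_active / 1e9:.2f}B")
```

Under these assumptions the decoder FFNs hold about 2.52B weights, of which about 0.33B are used for any single token. The F16 file size tracks the total parameter count, not the per-token active count.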

Run the model through the CrispEmbed orchestrator pipeline:

from crispembed import CrispOcrOrchestrator
# Configure orchestrator with deepseek_ocr2 engine
