XCurOS-OCR · GGUF (F16, no quantization)
GGUF build of XCurOS-OCR, a compact 0.9B-parameter vision-language OCR model that runs locally with llama.cpp on CPU or GPU. Shipped as unquantized F16 (16-bit floating point) weights.
✨ Lightweight and CPU-friendly: only 0.9B parameters, runs on an ordinary CPU (no GPU required), while staying competitive with much heavier OCR systems.
🤗 Transformers / safetensors version: XCurOS/XCurOS-OCR.
Files
| File | Role |
|---|---|
| XCurOS-OCR-F16.gguf | Language decoder (F16) |
| mmproj-XCurOS-OCR-F16.gguf | Vision projector (required for image input) |
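Both files are needed for image input, so it can help to fetch them together. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper name `fetch_model_files` and its `download` parameter are illustrative and not part of this repository.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

REPO_ID = "XCurOS/XCurOS-OCR-GGUF"
# File names taken from the table above: decoder weights plus vision projector.
REQUIRED_FILES = ("XCurOS-OCR-F16.gguf", "mmproj-XCurOS-OCR-F16.gguf")


def fetch_model_files(download: Optional[Callable[..., str]] = None) -> Dict[str, Path]:
    """Download every required file and return a mapping of name -> local path.

    `download` is a hypothetical injection point (useful for testing); by
    default it is huggingface_hub.hf_hub_download, which caches files locally.
    """
    if download is None:
        # Imported lazily so the module loads even without huggingface_hub.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    return {
        name: Path(download(repo_id=REPO_ID, filename=name))
        for name in REQUIRED_FILES
    }
```

The two returned paths can then be passed to llama.cpp as the `-m` and `--mmproj` arguments respectively.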
Quick start
# CPU-only (no GPU)
llama-mtmd-cli -m XCurOS-OCR-F16.gguf --mmproj mmproj-XCurOS-OCR-F16.gguf --image page.png -p "OCR" -ngl 0
# REST API server
llama-server -m XCurOS-OCR-F16.gguf --mmproj mmproj-XCurOS-OCR-F16.gguf -ngl 0
# Or auto-download this repo
llama-server -hf XCurOS/XCurOS-OCR-GGUF
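Once `llama-server` is running with the vision projector loaded, a page image can be sent to its OpenAI-compatible chat endpoint. The sketch below uses only the Python standard library and assumes the server listens on llama.cpp's default address, `http://localhost:8080`. The function names, the `temperature` setting, and the `page.png` file are illustrative choices, not something this repository prescribes.

```python
import base64
import json
import urllib.request

# Assumption: llama-server was started as shown above, on its default port 8080.
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_ocr_request(image_bytes: bytes, mime_type: str = "image/png",
                      prompt: str = "OCR") -> dict:
    """Build a chat-completions payload with the image embedded as a data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
                    },
                ],
            }
        ],
        # Assumption: greedy decoding is a sensible default for transcription.
        "temperature": 0,
    }


def run_ocr(image_path: str, url: str = SERVER_URL) -> str:
    """Send one image to the server and return the recognized text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_request(f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `run_ocr("page.png")` would return the model's output for that page as a string, provided the server is reachable.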
Benchmarks
XCurOS-OCR (ours) compared against leading OCR systems. Bold = best among specialized OCR VLMs. "-" = not reported.
💡 XCurOS-OCR is a lightweight 0.9B model that tracks closely behind GLM-OCR while running on an ordinary CPU, with no GPU required.
Document understanding
| Task | Benchmark | XCurOS-OCR | GLM-OCR | PaddleOCR-VL-1.5 | Deepseek-OCR2 | MinerU2.5 | dots.ocr | Gemini-3-Pro* | GPT-5.2* |
|---|---|---|---|---|---|---|---|---|---|
| Document Parsing | OmniDocBench v1.5 | 94.3 | 94.6 | 94.5 | 91.1 | 90.7 | 88.4 | 90.3 | 85.4 |
| Text Recognition | OCRBench (Text) | 93.6 | 94.0 | 75.3 | 34.7 | 75.3 | 92.1 | 91.9 | 83.7 |
| Formula Recognition | UniMERNet | 96.3 | 96.5 | 96.1 | 85.8 | 96.4 | 90.0 | 96.4 | 90.5 |
| Table Recognition | PubTabNet | 84.9 | 85.2 | 84.6 | - | 88.4 | 71.0 | 91.4 | 84.4 |
| Table Recognition | TEDS_TEST | 85.5 | 86.0 | 83.3 | - | 85.4 | 62.4 | 81.8 | 67.6 |
| Information Extraction | Nanonets-KIE | 93.3 | 93.7 | - | - | - | - | 95.2 | 87.5 |
| Information Extraction | Handwritten-Forms | 85.8 | 86.1 | - | - | - | - | 94.5 | 78.2 |
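To make "tracks closely behind GLM-OCR" concrete, the gap on each benchmark can be computed directly from the table above. This is a small sketch that only re-uses the reported scores; the variable and function names are illustrative.

```python
# (XCurOS-OCR, GLM-OCR) scores copied from the document understanding table.
SCORES = {
    "OmniDocBench v1.5": (94.3, 94.6),
    "OCRBench (Text)": (93.6, 94.0),
    "UniMERNet": (96.3, 96.5),
    "PubTabNet": (84.9, 85.2),
    "TEDS_TEST": (85.5, 86.0),
    "Nanonets-KIE": (93.3, 93.7),
    "Handwritten-Forms": (85.8, 86.1),
}


def gaps_to_glm_ocr(scores: dict) -> dict:
    """Return GLM-OCR minus XCurOS-OCR per benchmark, rounded to one decimal."""
    return {name: round(glm - ours, 1) for name, (ours, glm) in scores.items()}


gaps = gaps_to_glm_ocr(SCORES)
for name, gap in gaps.items():
    print(f"{name}: {gap:.1f} points behind GLM-OCR")
print(f"largest gap: {max(gaps.values()):.1f} points")
```

On these seven benchmarks the difference stays within half a point.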
Capability breakdown
| Category | XCurOS-OCR | GLM-OCR | PaddleOCR-VL-1.5 | Deepseek-OCR2 | MinerU2.5 | dots.ocr | Gemini-3-Pro* | GPT-5.2* |
|---|---|---|---|---|---|---|---|---|
| Code | 84.4 | 84.7 | 75.8 | 82.1 | 82.9 | 80.8 | 86.9 | 84.4 |
| Real-world Table | 91.0 | 91.5 | 86.1 | - | 70.8 | 81.8 | 90.6 | 86.7 |
| Handwriting | 86.8 | 87.0 | 87.4 | 73.8 | 54.2 | 71.7 | 90.0 | 78.0 |
| Multi-language | 68.9 | 69.3 | 54.8 | 56.1 | 27.8 | 65.1 | 86.2 | 70.1 |
| Seal | 90.2 | 90.5 | 42.2 | 40.4 | - | 63.0 | 91.3 | 58.8 |
| Receipt (KIE) | 94.1 | 94.5 | - | - | - | - | 97.3 | 83.5 |
*Gemini-3-Pro and GPT-5.2 are general-purpose VLMs, shown for reference only.
Throughput
| Method | Image Inputs (Pages/Sec) | PDF Inputs (Pages/Sec) |
|---|---|---|
| XCurOS-OCR | 0.66 | 1.83 |
| GLM-OCR | 0.67 | 1.86 |
| PaddleOCR-VL-1.5 | 0.39 | 1.22 |
| Deepseek-OCR2 | 0.32 | - |
| MinerU2.5 | 0.18 | 0.48 |
| dots.ocr | 0.10 | - |
XCurOS-OCR is optimized to run on commodity CPUs; it scores marginally below GLM-OCR while requiring no GPU.
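The pages-per-second figures translate into wall-clock estimates with simple arithmetic. The sketch below re-uses the numbers from the throughput table; the helper names are illustrative, and the estimates only hold for whatever hardware the table was measured on, which is not stated here.

```python
from typing import Optional

# Pages per second, copied from the throughput table (None = not reported).
THROUGHPUT = {
    "XCurOS-OCR": {"image": 0.66, "pdf": 1.83},
    "GLM-OCR": {"image": 0.67, "pdf": 1.86},
    "PaddleOCR-VL-1.5": {"image": 0.39, "pdf": 1.22},
    "Deepseek-OCR2": {"image": 0.32, "pdf": None},
    "MinerU2.5": {"image": 0.18, "pdf": 0.48},
    "dots.ocr": {"image": 0.10, "pdf": None},
}


def seconds_for(pages: int, pages_per_sec: float) -> float:
    """Estimated time in seconds to process `pages` at the given rate."""
    return pages / pages_per_sec


def relative_speed(method: str, kind: str) -> Optional[float]:
    """XCurOS-OCR throughput divided by `method` throughput, or None if unreported."""
    other = THROUGHPUT[method][kind]
    if other is None:
        return None
    return THROUGHPUT["XCurOS-OCR"][kind] / other


for method in THROUGHPUT:
    ratio = relative_speed(method, "image")
    print(f"{method}: {ratio:.2f}x (image inputs, relative to XCurOS-OCR)")
```

For example, `seconds_for(100, THROUGHPUT["XCurOS-OCR"]["pdf"])` estimates the time for a 100-page PDF at the reported rate.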
License
Released under the MIT License. See the LICENSE file in this repository.