Caissa-Chess-M1-GGUF
📄 Overview
| Property | Value |
|---|---|
| Base Model | open-chess/Caissa-Chess-M1 |
| Parameters | 8B |
| Dataset | MetaChess-20k |
Quant types
| Quant type | Size | Recommended Use |
|---|---|---|
| Q2_K | 3.02 GB | Ultra-low RAM (mobile, embedded) |
| Q3_K_S | 3.49 GB | Low RAM devices |
| Q3_K_M | 3.81 GB | Low RAM devices |
| Q4_K_S | 4.46 GB | Balanced quality/size |
| Q4_K_M | 4.68 GB | Recommended — best balance |
| Q5_K_S | 5.32 GB | Higher quality |
| Q5_K_M | 5.44 GB | Higher quality |
| Q6_K | 6.25 GB | Near-original quality |
| Q8_0 | 8.1 GB | Maximum quality |
| F16 | 15.2 GB | Original precision |
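As a rough guide to reading the table, the sketch below picks the largest quant whose file fits in a given memory budget. The sizes are copied from the table; the `headroom_gb` default of 1.5 GB (for context cache and runtime overhead) is an assumption for illustration, not a measured figure, and the filename pattern is inferred from the `Caissa-Chess-M1-Q4_K_M.gguf` name used in the llama.cpp example.

```python
# File sizes in GB, copied from the quant table above.
QUANT_SIZES_GB = {
    "Q2_K": 3.02,
    "Q3_K_S": 3.49,
    "Q3_K_M": 3.81,
    "Q4_K_S": 4.46,
    "Q4_K_M": 4.68,
    "Q5_K_S": 5.32,
    "Q5_K_M": 5.44,
    "Q6_K": 6.25,
    "Q8_0": 8.1,
    "F16": 15.2,
}


def pick_quant(free_memory_gb, headroom_gb=1.5):
    """Return the largest quant that fits in the budget, or None.

    headroom_gb is an assumed allowance for the context cache and
    runtime overhead; tune it for your own setup.
    """
    budget = free_memory_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return max(fitting)[1] if fitting else None


def gguf_filename(quant):
    """Build a filename following the pattern seen in the usage example (assumed)."""
    return f"Caissa-Chess-M1-{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(free_memory_gb=8)
    print(choice, gguf_filename(choice) if choice else "no quant fits")
```

File size is only a lower bound on memory use, so treat the result as a starting point and step down one quant if loading fails.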
🎯 Intended Use
This model is designed for chess position analysis with structured Chain-of-Thought reasoning. It is optimized for:
- Chess analysis — evaluating positions, calculating variations, finding best moves
- Educational applications — explaining chess concepts and strategic thinking
- On-device chess assistants — runs on mobile, Raspberry Pi, or CPU-only environments
- Chess AI research — studying reasoning patterns in small language models
- Local inference — privacy-focused chess analysis without cloud APIs
Not recommended for: general-purpose reasoning, non-chess tasks, or production systems requiring 100% move accuracy.
💬 Example Usage
Using llama.cpp
./llama-cli -m Caissa-Chess-M1-Q4_K_M.gguf \
-p "You are a chess grandmaster and analyst. Your task is to deeply analyze a position and find the best move. You MUST reason in <think> tags, then output the move in <move> tag.
FEN: r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3" \
-n 512 -t 4
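For scripted use, the same command can be assembled in Python. This is a minimal sketch: the prompt text and the `-m`, `-p`, `-n`, `-t` flags are taken from the command above, while the helper names `build_prompt` and `build_llama_cli_args` are hypothetical.

```python
# Instruction text copied verbatim from the llama-cli example above.
SYSTEM_PROMPT = (
    "You are a chess grandmaster and analyst. Your task is to deeply analyze "
    "a position and find the best move. You MUST reason in <think> tags, "
    "then output the move in <move> tag."
)


def build_prompt(fen):
    """Join the instruction and the FEN line in the layout shown above."""
    return f"{SYSTEM_PROMPT}\nFEN: {fen.strip()}"


def build_llama_cli_args(fen, model_path="Caissa-Chess-M1-Q4_K_M.gguf",
                         n_predict=512, threads=4):
    """Return an argument list suitable for subprocess.run (no shell quoting needed)."""
    return [
        "./llama-cli",
        "-m", model_path,
        "-p", build_prompt(fen),
        "-n", str(n_predict),
        "-t", str(threads),
    ]


if __name__ == "__main__":
    fen = "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"
    print(build_llama_cli_args(fen))
```

Passing the list to `subprocess.run(args, capture_output=True, text=True)` avoids shell-escaping the multi-line prompt; that call assumes `./llama-cli` and the model file are in the working directory.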
Expected Output Format:
<think>
<evaluation>Position is roughly equal (+0.00 pawns). It's White's turn to move.</evaluation>
<calculation>Depth: 60. Principal variation: d2d4 -> c8f5 -> f1e2 -> g8f6 -> e1h1 -> f8e7</calculation>
<reasoning>The move d2d4 establishes a classical center, claiming space and preparing to challenge Black's control of the e5 square...</reasoning>
</think>
<move>d2d4</move>
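Output in this format can be split into its parts with a few regular expressions. The sketch below is one possible parser, not part of the model's tooling; `parse_response` is a hypothetical helper, and the move check only confirms that the string looks like a coordinate move such as `d2d4` or `e7e8q`. It does not check that the move is legal in the position.

```python
import re

# Coordinate move: from-square, to-square, optional promotion piece.
UCI_MOVE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")


def _tag(text, name):
    """Return the stripped contents of the first <name>...</name> pair, or None."""
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    return match.group(1).strip() if match else None


def parse_response(text):
    """Extract the reasoning sections and the move from a model response."""
    think = _tag(text, "think") or ""
    move = _tag(text, "move")
    if move is not None and not UCI_MOVE.match(move):
        move = None  # reject anything that is not shaped like a coordinate move
    return {
        "evaluation": _tag(think, "evaluation"),
        "calculation": _tag(think, "calculation"),
        "reasoning": _tag(think, "reasoning"),
        "move": move,
    }


if __name__ == "__main__":
    sample = (
        "<think>\n"
        "<evaluation>Position is roughly equal (+0.00 pawns).</evaluation>\n"
        "<calculation>Depth: 60. Principal variation: d2d4</calculation>\n"
        "<reasoning>The move d2d4 establishes a classical center.</reasoning>\n"
        "</think>\n"
        "<move>d2d4</move>"
    )
    print(parse_response(sample))
```

A `None` in the `move` field signals a missing or malformed tag, which is a reasonable point to retry generation or fall back to an engine.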
⚠️ Limitations
- Chess-specific: trained exclusively on chess positions, so expect poor results on general reasoning and other non-chess tasks
- Size constraints: with 8B parameters, the model may oversimplify extremely complex positions (calculations 20+ moves deep)
- No multimodal input: text-only FEN input; no image or board recognition
- Move accuracy: may occasionally suggest suboptimal moves; always verify with Stockfish for critical analysis
- Training data: trained on 20,000 positions from Lichess (depth 28+); may not handle rare openings well
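Because the model accepts only FEN text, a malformed string reaches it unchecked. The sketch below is a lightweight structural check that can run before prompting; `fen_problems` is a hypothetical helper that checks field layout only. It does not test whether the position is legal or reachable, and it does not replace verifying the suggested move with an engine.

```python
import re


def fen_problems(fen):
    """Return a list of structural problems found in a FEN string (empty if none)."""
    fields = fen.split()
    if len(fields) != 6:
        return [f"expected 6 space-separated fields, got {len(fields)}"]
    board, side, castling, en_passant, halfmove, fullmove = fields
    problems = []

    ranks = board.split("/")
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, got {len(ranks)}")
    for index, rank in enumerate(ranks):
        squares = 0
        for ch in rank:
            if ch in "12345678":
                squares += int(ch)      # run of empty squares
            elif ch in "pnbrqkPNBRQK":
                squares += 1            # one piece
            else:
                problems.append(f"unexpected character {ch!r} in board field")
        if squares != 8:
            # FEN lists ranks from rank 8 down to rank 1.
            problems.append(f"rank {8 - index} describes {squares} squares, not 8")

    if board.count("K") != 1 or board.count("k") != 1:
        problems.append("each side needs exactly one king")
    if side not in ("w", "b"):
        problems.append("side to move must be 'w' or 'b'")
    if castling != "-" and not re.fullmatch(r"K?Q?k?q?", castling):
        problems.append("castling field must be '-' or a subset of 'KQkq' in order")
    if en_passant != "-" and not re.fullmatch(r"[a-h][36]", en_passant):
        problems.append("en passant field must be '-' or a square on rank 3 or 6")
    if not halfmove.isdecimal():
        problems.append("halfmove clock must be a non-negative integer")
    if not (fullmove.isdecimal() and int(fullmove) >= 1):
        problems.append("fullmove number must be a positive integer")
    return problems


if __name__ == "__main__":
    print(fen_problems("r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3"))
```

An empty list means the string is well-formed enough to send to the model; anything else is worth fixing first, since the model has no way to flag bad input.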
📖 Citation
@misc{Caissa-Chess-M1-GGUF,
  author = {Open Chess AI},
  title = {Caissa-Chess-M1-GGUF: Chess Reasoning Model with Chain-of-Thought},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/open-chess/Caissa-Chess-M1-GGUF}},
}
Made by OpenChess, an open-source chess AI research project ❤️