halluci-mate-v2b
Alpha release. A chess LLM fine-tuned from jspaulsen/halluci-mate-v2a on a higher-quality slice of the Lichess dataset. Uses the Qwen3-0.6B architecture and a custom UCI move tokenizer. First model in the series to score wins against Stockfish skill-5 in 100-game matches.
Source: https://github.com/jspaulsen/halluci-mate
Model details
- Architecture: Qwen3 (Qwen3ForCausalLM), ~0.6B parameters
  - 28 layers, hidden size 1024, 16 attention heads (8 KV heads), intermediate size 3072
  - bfloat16, tied word embeddings, RoPE θ = 1,000,000
- Vocabulary: 1,974 tokens: 6 special tokens (<PAD>, <UNK>, <EOS>, <WHITE>, <BLACK>, <DRAW>) + ~1,792 geometric UCI moves + 176 promotion moves
- Context: 32,768 tokens
- Base model: jspaulsen/halluci-mate-v2a
- Checkpoint: runs-v2a-ft/languid-sloth-169/checkpoint-4056
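The vocabulary counts above can be reproduced with a short enumeration. This is a sketch under one assumption: that "geometric" means every from/to square pair reachable by a queen-like or knight-like move on an empty board, and that promotion moves are the pawn steps and captures onto the last rank for both colors, each with a q/r/b/n suffix. The real ChessTokenizer may build or order its vocabulary differently; the helper names here are hypothetical.

```python
from itertools import product

FILES = "abcdefgh"
RANKS = "12345678"

def square(file_idx, rank_idx):
    """Return the algebraic name of a square, e.g. (4, 1) -> 'e2'."""
    return FILES[file_idx] + RANKS[rank_idx]

def geometric_moves():
    """All from/to pairs a queen or knight could cover on an empty board."""
    moves = []
    for f, r, tf, tr in product(range(8), repeat=4):
        df, dr = abs(tf - f), abs(tr - r)
        if (df, dr) == (0, 0):
            continue  # a move must change square
        queen_like = df == 0 or dr == 0 or df == dr
        knight_like = {df, dr} == {1, 2}
        if queen_like or knight_like:
            moves.append(square(f, r) + square(tf, tr))
    return moves

def promotion_moves():
    """Pawn moves onto the last rank (straight or capturing), per promotion piece."""
    moves = []
    for from_rank, to_rank in ((6, 7), (1, 0)):  # white 7->8, black 2->1
        for f in range(8):
            for tf in (f - 1, f, f + 1):
                if 0 <= tf < 8:
                    for piece in "qrbn":
                        moves.append(square(f, from_rank) + square(tf, to_rank) + piece)
    return moves

SPECIAL = ["<PAD>", "<UNK>", "<EOS>", "<WHITE>", "<BLACK>", "<DRAW>"]
geo = geometric_moves()
promo = promotion_moves()
vocab_size = len(SPECIAL) + len(geo) + len(promo)
print(len(geo), len(promo), vocab_size)  # 1792 176 1974
```

Under these assumptions the three groups sum to the stated 1,974 tokens.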
Tokenizer
The tokenizer is custom and is not loadable via AutoTokenizer.from_pretrained. It is defined in src/halluci_mate/chess_tokenizer.py in the source repo. Install the package and use ChessTokenizer() directly.
Inputs are conditioned on the game result: each game is prefixed with <WHITE> or <BLACK> for the winning side (or <DRAW>), followed by the sequence of UCI moves.
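As a rough illustration of that input layout, the sketch below assembles a token sequence as plain strings. `build_sequence` is a hypothetical helper written for this example, not part of the halluci_mate package; the real ChessTokenizer is what maps these tokens to ids.

```python
CONDITION_TOKENS = {"<WHITE>", "<BLACK>", "<DRAW>"}

def build_sequence(condition, uci_moves):
    """Prefix a list of UCI moves with a conditioning token (hypothetical helper)."""
    if condition not in CONDITION_TOKENS:
        raise ValueError(f"unknown condition token: {condition!r}")
    return [condition, *uci_moves]

# Opening moves of a game, conditioned with the <WHITE> token.
sequence = build_sequence("<WHITE>", ["e2e4", "e7e5", "g1f3"])
print(sequence)  # ['<WHITE>', 'e2e4', 'e7e5', 'g1f3']
```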
Usage
import chess
import torch
from transformers import AutoModelForCausalLM
from halluci_mate.chess_tokenizer import ChessTokenizer
from halluci_mate.game.game import Game
from halluci_mate.inference import ChessInferenceEngine
engine = ChessInferenceEngine.from_checkpoint(
"jspaulsen/halluci-mate-v2b",
constrained=True, # mask logits to legal moves
temperature=0.0, # greedy
)
game = Game(board=chess.Board(), condition="<WHITE>")
move = engine.predict(game)
print(move.uci())
Constrained decoding masks the logits to the set of legal UCI moves in the current position, which eliminates illegal-move hallucinations at the cost of potentially hiding model weaknesses. Unconstrained sampling (constrained=False) will occasionally produce illegal tokens — this is expected for an alpha.
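To make the masking step concrete, here is a minimal sketch in plain Python. The four-token vocabulary, the logits, and the function names are invented for illustration; the actual engine presumably operates on tensors over the full vocabulary, and its internals may differ.

```python
import math

def mask_to_legal(logits, vocab, legal_moves):
    """Set the logit of every token that is not a legal move to -inf."""
    legal = set(legal_moves)
    if not legal.intersection(vocab):
        raise ValueError("no legal move is present in the vocabulary")
    return [x if tok in legal else -math.inf for tok, x in zip(vocab, logits)]

def greedy_pick(logits, vocab):
    """Return the token with the highest logit (temperature 0 decoding)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]

# Toy example: the model's top choice is not legal in the position.
vocab = ["e2e4", "e2e5", "g1f3", "a1a8"]
logits = [1.2, 3.5, 0.7, 2.9]
legal_moves = ["e2e4", "g1f3"]

unconstrained = greedy_pick(logits, vocab)
constrained = greedy_pick(mask_to_legal(logits, vocab, legal_moves), vocab)
print(unconstrained, constrained)  # e2e5 e2e4
```

The toy case also shows the trade-off noted above: the constrained output looks fine even though the model's preferred token was illegal.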
Training
- Initialized from jspaulsen/halluci-mate-v2a weights
- 2 epochs, 4,056 optimizer steps, effective batch size 512 (per-device 128 × 2 grad-accum × 2 GPUs)
- Optimizer: paged AdamW 8-bit, peak LR 3e-5, cosine-with-min-lr schedule, warmup ratio 0.005
- bf16 + flash_attention_2, DDP across 2 GPUs, seed 4042
- Training script: scripts/train.py in the source repo
- Best eval loss 1.637 at step 4,000 (epoch 1.97)
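The batch and step figures above fit together as simple arithmetic, sketched below. Two things are assumptions: every optimizer step is taken to consume a full effective batch (so the sequence count is an upper bound), and the eval loss is taken to be a mean cross-entropy in nats, so that its exponential is a perplexity. The eval set behind that loss is not necessarily the high-elo test set reported in the evals.

```python
import math

# Values stated in the training summary.
per_device_batch = 128
grad_accum_steps = 2
num_gpus = 2
optimizer_steps = 4056
epochs = 2
warmup_ratio = 0.005
best_eval_loss = 1.637

effective_batch = per_device_batch * grad_accum_steps * num_gpus
steps_per_epoch = optimizer_steps // epochs
sequences_seen = optimizer_steps * effective_batch  # upper bound, assumes full batches
warmup_steps = optimizer_steps * warmup_ratio       # roughly 20 steps before rounding
eval_perplexity = math.exp(best_eval_loss)          # assumes loss is in nats

print(effective_batch)   # 512
print(steps_per_epoch)   # 2028
print(sequences_seen)    # 2076672
```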
Headline evals vs v2a and v1b
Same eval configs across all three: vs-stockfish at skill-5 with --sf-analyze, legal-rate over 5,000 sampled positions (seed 0), high-elo perplexity over 10,768 sequences.
| Metric | v1b | v2a | v2b |
|---|---|---|---|
| vs-stockfish score-rate, skill-5 | 0.104 (500g) | 0.065 (100g) | 0.135 (100g) |
| vs-stockfish W / L / D | 7 / 403 / 90 | 0 / 87 / 13 | 3 / 76 / 21 |
| Legal-rate (5,000 sampled positions) | 99.06% | 99.00% | 99.02% |
| High-elo perplexity (10,768 seqs) | 4.92 | 5.47 | 5.15 |
| Tactical-oversight, middle phase | 21.0% | 23.4% | 20.3% |
| Tactical-oversight, endgame | 12.4% | 12.2% | 11.9% |
| Blunder-rate (in-game) | 6.5% | 6.5% | 6.5% |
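v2b is the first model in the series to score wins against Stockfish skill-5 in a 100-game match (3 wins, 21 draws), and it posts the highest score-rate of the three (0.135); v1b's 7 wins came from a 500-game run. Middlegame and endgame tactical oversight are both the lowest of the three models (20.3% and 11.9%). High-elo perplexity (5.15) recovers most of the gap that v2a (5.47) opened up against v1b (4.92).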
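The score-rate row is consistent with the usual chess convention of one point per win and half a point per draw, divided by games played. That convention is an assumption here rather than something the eval harness documents, but it reproduces all three table values from the W / L / D row:

```python
def score_rate(wins, losses, draws):
    """Points per game, counting a win as 1 and a draw as 0.5."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

# W / L / D taken from the table above.
results = {
    "v1b": (7, 403, 90),
    "v2a": (0, 87, 13),
    "v2b": (3, 76, 21),
}

rates = {name: round(score_rate(*wld), 3) for name, wld in results.items()}
print(rates)  # {'v1b': 0.104, 'v2a': 0.065, 'v2b': 0.135}
```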
Limitations
- Alpha quality; move strength has not been benchmarked against an engine with a calibrated rating (the Stockfish skill-5 results above are not an Elo estimate)
- Constrained decoding is recommended for any real use — the raw model may emit illegal move tokens
- Trained on human games, so idiosyncrasies and blunders at lower ratings are reflected in behavior
- No support for analyzing positions from arbitrary FENs beyond what Game constructs
License
MIT. See the source repo for details.