Daimon-X π€
The lightweight local brain of Daimon β runs anywhere, even CPU-only / low-resource.
Daimon-X is the lightweight brain of Daimon β a local-first personal AI
assistant. Daimon-X = Qwen/Qwen2.5-Coder-1.5B-Instruct (quantized GGUF) + a Daimon LoRA, served
locally with llama.cpp. The weights are open β use them standalone today.
What the Daimon LoRA adds
The LoRA tunes persona and behavior, not raw coding (that stays at the base's level):
- A consistent Daimon identity (first person, local-first assistant) instead of a generic "language model" voice.
- Canvas convention: when you ask it to build a web/app/UI it replies with a single
self-contained
htmlblock that renders live β no preamble, no copy-paste. - Concise Rioplatense Spanish for chat, plus a voice-friendly register for TTS.
Benchmarks (real, measured locally)
Daimon uses this model to pick JSON tool-call actions for its browser co-pilot, computer-use,
and code-editing agents (e.g. {"action":"click","ref":3}). Measured on 198 held-out
synthetic scenarios β disjoint vocabulary/phrasing from training, zero string-level overlap
with the training set, graded objectively (not by text similarity):
| Metric | Base Qwen2.5-Coder-1.5B + persona-only LoRA | Daimon-X |
|---|---|---|
| Valid JSON | 67.7% | 98.5% |
| Correct action chosen | 31.3% | 93.9% |
| Correct action + correct fields (ref/path/command/...) | 22.2% | 93.9% |
Persona/behavior suite: 9/12 vs 10/12 for the persona-only checkpoint β within this suite's run-to-run noise (Β±1-2 on 12 questions), far smaller than the tool-calling gain.
We also tried fine-tuning on Lucas's own commit history to reproduce his exact diffs (3 attempts, different data/hyperparameter fixes each time) β it never beat the base model's overlap with held-out commits, so that LoRA was never shipped. We only publish results that actually win a real, pre-registered comparison.
Run standalone
huggingface-cli download lucas-mella/Daimon-X --local-dir ./daimon-models
# llama.cpp: load the base GGUF and apply the Daimon LoRA
llama-server -m ./daimon-models/qwen2.5-coder-1.5b-instruct-q8_0.gguf \
--lora ./daimon-models/daimon-sft-lora-f16.gguf -c 8192
Exposes an OpenAI-compatible endpoint (default http://localhost:8080/v1).
Files
qwen2.5-coder-1.5b-instruct-q8_0.ggufdaimon-sft-lora-f16.gguf
Daimon β coming soon
These weights are the brain of Daimon, a local-first personal assistant that runs on your own machine: real-time local voice, a co-pilot browser that Daimon and you share, a canvas for apps & prototypes, and hybrid local/cloud routing. The full app isn't public yet β these open models are a complement you can already build on. Watch @lucas-mella for the release.
- Downloads last month
- 26
8-bit
16-bit
Model tree for lucas-mella/Daimon-X
Base model
Qwen/Qwen2.5-1.5B