Selora AI

Qwen3 1.7B fine-tuned for Home Assistant with four specialist LoRA adapters. The answer adapter additionally emits a query_state tool envelope for live device-state queries against the Home Assistant REST API. Used by the Selora AI Home Assistant integration; also runnable directly via Ollama, llama.cpp, or vLLM.

Specialists

Adapter       | Intent                        | Output shape
--------------|-------------------------------|------------------------------------------------------
command       | "Turn off the kitchen lights" | {intent:"command",response,calls:[…]}
automation    | "Wake up lights at 6:30 AM"   | {intent:"automation",automation:{triggers,actions,…}}
answer        | Q&A / small talk              | {intent:"answer",response}
clarification | Ask the user a follow-up      | {intent:"clarification",response}
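
A client consuming these replies can branch on the intent field. The following is a minimal sketch, assuming the model returns strict JSON with the keys listed for each adapter; REQUIRED_KEYS and parse_envelope are illustrative names, not part of the integration. The query_state tool envelope is left out because its shape is not documented here.

```python
import json

# Keys each specialist is expected to emit, taken from the output shapes above.
# The inner fields of "automation" and "calls" are not validated here.
REQUIRED_KEYS = {
    "command": {"intent", "response", "calls"},
    "automation": {"intent", "automation"},
    "answer": {"intent", "response"},
    "clarification": {"intent", "response"},
}


def parse_envelope(raw: str) -> dict:
    """Parse a model reply and check it against the expected output shape."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    intent = data.get("intent")
    if intent not in REQUIRED_KEYS:
        raise ValueError(f"unknown intent: {intent!r}")
    missing = REQUIRED_KEYS[intent] - set(data)
    if missing:
        raise ValueError(f"{intent} reply is missing keys: {sorted(missing)}")
    return data
```

Rejecting a malformed reply early keeps a truncated or off-distribution generation from reaching Home Assistant as a partial service call.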

The HA integration's selora_local provider runs a cheap regex pre-classifier to assign each request to one of the four specialists, then sends the request with model: selora-v1-{specialist}. Backends that support multi-LoRA (llama-server's /lora-adapters endpoint, vLLM's --enable-lora) activate the matching adapter.
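
The routing step can be pictured with a small sketch. The patterns below are hypothetical and only illustrate the idea; the integration's actual rules are not reproduced here. The model names assume the plural specialist names used in the vLLM example (selora-v1-commands and so on), with clarification as the fallback when nothing matches.

```python
import re

# Hypothetical rules, checked in order; the real pre-classifier may differ.
_RULES = [
    ("automations", re.compile(
        r"\b(every|whenever|when|schedule)\b|\bat \d{1,2}(:\d{2})?\s*(am|pm)?\b",
        re.I)),
    ("commands", re.compile(
        r"^\s*(turn|switch|set|dim|open|close|lock|unlock)\b", re.I)),
    ("answers", re.compile(
        r"^\s*(what|who|why|how|is|are|does|do)\b|\?\s*$", re.I)),
]


def pick_model(text: str) -> str:
    """Return the model name to put in the request's model field."""
    for specialist, pattern in _RULES:
        if pattern.search(text):
            return f"selora-v1-{specialist}"
    # Nothing matched: let the clarification specialist ask a follow-up.
    return "selora-v1-clarifications"
```

Checking the automation patterns first means a request such as "Turn on the porch light when I get home" is routed to the automation specialist rather than treated as an immediate command.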

Quick start

Ollama

ollama pull selora/commands
ollama run selora/commands

Modelfiles for all four specialists live in ollama/ and are also published as separate Ollama models.
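
Once a model is pulled, it can also be called over Ollama's local HTTP API. This is a rough sketch using only the standard library; it assumes Ollama is listening on its default port 11434, and build_ollama_request and ask are illustrative helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_ollama_request(user_text: str,
                         model: str = "selora/commands") -> urllib.request.Request:
    """Build a non-streaming chat request for a pulled Selora model."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,  # ask for a single JSON reply instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_text: str) -> str:
    """Send the request and return the assistant's text (needs a running Ollama)."""
    with urllib.request.urlopen(build_ollama_request(user_text)) as resp:
        return json.load(resp)["message"]["content"]
```

The system prompt does not need to be sent here, since the Modelfile already carries it in its SYSTEM block.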

llama.cpp

llama-server \
  --model qwen3_17b_base.Q4_K_M.gguf \
  --lora-init-without-apply \
  --lora qwen3_17b_command.lora.gguf \
  --lora qwen3_17b_automation.lora.gguf \
  --lora qwen3_17b_answer.lora.gguf \
  --lora qwen3_17b_clarification.lora.gguf \
  --ctx-size 8192

POST to /lora-adapters to switch the active LoRA before each /v1/chat/completions call.
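
A sketch of that switch, under two assumptions that should be checked against your own server: adapter ids follow the order of the --lora flags above (GET /lora-adapters lists the actual ids), and a runtime scale of 1.0 is the intended setting for the active adapter. The helper names and the default port 8080 are illustrative.

```python
import json
import urllib.request

# Assumed ids: adapters numbered in --lora order. Verify with GET /lora-adapters.
ADAPTER_IDS = {"command": 0, "automation": 1, "answer": 2, "clarification": 3}


def lora_payload(active: str) -> list[dict]:
    """Scale the chosen adapter to 1.0 and every other adapter to 0.0."""
    if active not in ADAPTER_IDS:
        raise KeyError(active)
    return [
        {"id": adapter_id, "scale": 1.0 if name == active else 0.0}
        for name, adapter_id in ADAPTER_IDS.items()
    ]


def activate(active: str, base_url: str = "http://localhost:8080") -> None:
    """POST the scales to llama-server (requires a running server)."""
    req = urllib.request.Request(
        f"{base_url}/lora-adapters",
        data=json.dumps(lora_payload(active)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()
```

Because the switch happens in a separate call before each completion, requests for different specialists likely need to be serialized on the client side.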

vLLM (cloud)

python -m vllm.entrypoints.openai.api_server \
  --model ./qwen3_17b_hf \
  --enable-lora --max-loras 4 --max-lora-rank 32 \
  --lora-modules \
    selora-v1-commands=/path/to/peft/command \
    selora-v1-automations=/path/to/peft/automation \
    selora-v1-answers=/path/to/peft/answer \
    selora-v1-clarifications=/path/to/peft/clarification

vLLM activates the matching LoRA based on the request's model field; no extra routing layer needed.
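
For illustration, a request against vLLM's OpenAI-compatible endpoint might be built as follows. The base URL (vLLM's default port 8000) and the chat_request helper are assumptions; the optional system_prompt argument stands in for the specialist prompt from prompts/.

```python
import json
import urllib.request
from typing import Optional


def chat_request(model: str, user_text: str,
                 system_prompt: Optional[str] = None,
                 base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a /v1/chat/completions request; the model field selects the LoRA."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    body = {
        "model": model,  # e.g. "selora-v1-commands", as registered via --lora-modules
        "messages": messages,
        "temperature": 0.0,
        "max_tokens": 384,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen sends it; the reply text is then found under choices[0].message.content in the JSON response.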

Generation parameters

{
  "temperature": 0.0,
  "repeat_penalty": 1.15,
  "repeat_last_n": 256,
  "max_tokens": 384,
  "stop": ["<|im_end|>", "<|endoftext|>"]
}

Bump max_tokens to 1536 for automation requests (longer JSON output).
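
One way to apply that override in client code is sketched below. BASE_PARAMS mirrors the JSON above, and params_for is an illustrative helper. Parameter names vary between backends, so the keys may need mapping before they are sent.

```python
BASE_PARAMS = {
    "temperature": 0.0,
    "repeat_penalty": 1.15,
    "repeat_last_n": 256,
    "max_tokens": 384,
    "stop": ["<|im_end|>", "<|endoftext|>"],
}


def params_for(intent: str) -> dict:
    """Return generation parameters, with a larger budget for automations."""
    params = dict(BASE_PARAMS)
    params["stop"] = list(BASE_PARAMS["stop"])  # copy so callers cannot mutate the default
    if intent == "automation":
        params["max_tokens"] = 1536  # automation JSON is longer
    return params
```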

Training

Base: Qwen3 1.7B, fine-tuned with Apple mlx-lm. Each specialist has its own LoRA (rank 8–28, scale 20) trained on a curated HA-domain corpus (forum threads, HA docs, synthetic command / automation pairs). System prompts were trained per specialist; see prompts/. The answer adapter went through a sequential continuation pass that added a query_state tool envelope on top of the original answer-only training distribution; that envelope is preserved in the augmented prompts/answers.txt and the Modelfile.answers SYSTEM block.

Evaluation

10/10 parity pass rate on the four-intent suite (command, automation, answer, clarification, plus screenshot regressions). Validator and scenarios live in parity/.

Files in this bundle

Artifact                          | Purpose                                          | Distribution
----------------------------------|--------------------------------------------------|-------------------------
qwen3_17b_base.IQ4_XS.gguf        | Quantized base for Ollama / llama.cpp            | Hugging Face, ollama.com
qwen3_17b_{intent}.lora.gguf (×4) | Specialist LoRA adapters                         | Hugging Face, ollama.com
Modelfile.{intent} (×4)           | Ollama recipes (base + LoRA + system prompt)     | this repo, ollama.com
prompts/{intent}.txt (×4)         | Plain-text trained prompts (reference / testing) | this repo

The full-precision (f16) base and HF safetensors set used by vLLM / TGI / SageMaker live separately in the cloud bundle and are not yet mirrored to Hugging Face.

Citation

@misc{selora-ai-2026,
  title  = {Selora AI: Qwen3 1.7B + LoRA Specialists for Home Assistant},
  author = {{Selora Homes}},
  year   = {2026},
  url    = {https://huggingface.co/selora-homes/selora-ai}
}

Base model citation: Qwen Team, Qwen3 Technical Report (2025).

License

Apache-2.0 (matches the Qwen3 base license).
