Use from the llama-cpp-python library
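# !pip install llama-cpp-python huggingface_hub

from llama_cpp import Llama

# Download the GGUF from the Hugging Face Hub (cached locally) and load it.
llm = Llama.from_pretrained(
    repo_id="flywheel-ai/home-services",
    filename="model-q4_k_m.gguf",
)

# Send a single-turn chat request; the result is an OpenAI-style dict.
response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ]
)
print(response["choices"][0]["message"]["content"])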

Flywheel — home-services (35b-v1.1)

An open-source vertical AI-employee model from Flywheel by OpSpot, fine-tuned (LoRA) from Qwen/Qwen3.6-35B-A3B (Apache-2.0) for the home-services domain.

  • Base: Qwen/Qwen3.6-35B-A3B · License: Apache-2.0 · Version: 35b-v1.1
  • Formats: safetensors (transformers / vLLM, ~65G) + model-q4_k_m.gguf (llama.cpp / Ollama, ~20G)
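The two sizes listed above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a nominal 35 billion parameters, 16 bits per weight for the BF16 safetensors, and roughly 4.85 bits per weight for Q4_K_M; that last figure is an approximation commonly quoted for this quantization type, not a number taken from this model card, and real files also carry metadata and some tensors kept at higher precision.

```python
# Rough file-size estimate from parameter count and bits per weight.
# All inputs are nominal/assumed values, not measurements of the actual files.

GIB = 2**30  # bytes per GiB


def estimated_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate weight storage in GiB."""
    return n_params * bits_per_weight / 8 / GIB


N_PARAMS = 35e9            # nominal "35B" from the model name
BF16_BITS = 16.0           # BF16 stores each weight in 16 bits
Q4_K_M_BITS = 4.85         # assumed average for Q4_K_M (approximate)

bf16_gib = estimated_size_gib(N_PARAMS, BF16_BITS)
q4_gib = estimated_size_gib(N_PARAMS, Q4_K_M_BITS)

print(f"BF16 safetensors: ~{bf16_gib:.1f} GiB")
print(f"Q4_K_M GGUF:      ~{q4_gib:.1f} GiB")
```

Under these assumptions the estimates land close to the ~65G and ~20G figures, which suggests the listed sizes are in binary gigabytes.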

Download (one command)

pip install -U huggingface_hub
hf download flywheel-ai/home-services                      # full repo (safetensors + GGUF)
hf download flywheel-ai/home-services model-q4_k_m.gguf    # just the GGUF

Run

# llama.cpp
llama-server -m model-q4_k_m.gguf -ngl 999
# Ollama (pulls the GGUF straight from HF)
ollama run hf.co/flywheel-ai/home-services
# vLLM (serves the safetensors)
vllm serve flywheel-ai/home-services
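Once a server from the commands above is running, it can be queried over HTTP. The following is a minimal sketch using only the Python standard library. It assumes the server exposes an OpenAI-compatible /v1/chat/completions route, which both llama-server and vLLM provide, and that it listens on its default port (8080 for llama-server, 8000 for vLLM). The helper names build_chat_request and ask are hypothetical, and whether the model field is required or ignored depends on the server.

```python
import json
import urllib.request


def build_chat_request(base_url: str, prompt: str,
                       model: str = "flywheel-ai/home-services") -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible chat endpoint (hypothetical helper)."""
    payload = {
        "model": model,  # used by vLLM as the served model name; assumed default
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running server and return the assistant's reply text."""
    request = build_chat_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With llama-server running locally, a call such as ask("http://localhost:8080", "A customer reports a leaking water heater. What should I ask first?") would return the model's reply as a string; for vLLM, swap in http://localhost:8000.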

Provenance & honesty

v1.0 is trained on synthetic seed data authored by permissively licensed local models (Apache/MIT teachers only; never distilled from closed models). On general prompts it performs roughly on par with the base model; its home-services edge is expected to sharpen as consented real usage flows through the OpSpot flywheel. Built on Qwen3.6 (Apache-2.0).

Safetensors · Model size: 35B params · Tensor type: BF16