quick-add-qwen3-0.6b

A small text-only model for on-device use (unified_life_hub) that turns short EN/DK text into structured JSON. One model serves two tasks, selected by a leading tag in the user message. OCR is a separate upstream step — this model never sees images.

Full fine-tune of Qwen/Qwen3-0.6B (no LoRA). Deployable as Qwen3-0.6B.Q4_K_M.gguf (~379 MB, llama.cpp / Ollama) or the HF weights here. Held-out fitness: valid JSON 99.9%, exact 81.5%, item-F1 89.3% ([event_extract] exact 84.3% / F1 93.7%).

⚠️ Dates are emitted VERBATIM — the host resolves them

[event_extract] returns each event's date/time exactly as written in the source ("den 5. november kl 14", "next Thursday", "5/5 at 18", "Sct. Hans kl 9"); it does NOT convert to absolute dates. Your app resolves each phrase to an absolute datetime with a locale-keyed resolver, using the Today is YYYY-MM-DD line (kept in the input) as the anchor. Tokens to handle:

  • DK: den D. <month> / den D/M / på <weekday> / Sct. Hans
  • EN: <Month> D / D/M / next <weekday>

Recurring lines combine the line's date with each group's time (e.g. 7/6 at 11:15).
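As a rough illustration of the host-side resolution described above, here is a minimal sketch that handles only the numeric D/M form (with an optional "at"/"kl" time). The function name `resolve_day_month` and the rule that a date already past rolls over to the next year are assumptions for this example, not part of the model or its host app; weekday, month-name, and "Sct. Hans" phrases would need their own locale-keyed handlers.

```python
import re
from datetime import date, datetime
from typing import Optional

# Matches numeric day/month phrases such as "5/5 at 18" or "den 7/6 kl 11:15".
_DM_TIME = re.compile(
    r"(?:den\s+)?(?P<day>\d{1,2})/(?P<month>\d{1,2})"
    r"(?:\s+(?:at|kl\.?)\s*(?P<hour>\d{1,2})(?:[:.](?P<minute>\d{2}))?)?",
    re.IGNORECASE,
)


def resolve_day_month(when: str, today: date) -> Optional[datetime]:
    """Resolve a verbatim D/M phrase against the anchor date.

    Returns None for phrases this sketch does not cover, so the caller
    can fall through to other resolvers.
    """
    m = _DM_TIME.fullmatch(when.strip())
    if m is None:
        return None
    day, month = int(m["day"]), int(m["month"])
    hour = int(m["hour"] or 0)
    minute = int(m["minute"] or 0)
    # Assumption: events are upcoming, so try this year first, then next year.
    for year in (today.year, today.year + 1):
        try:
            candidate = datetime(year, month, day, hour, minute)
        except ValueError:
            continue  # impossible date or time in this year (e.g. 31/2)
        if candidate.date() >= today:
            return candidate
    return None
```

With the anchor 2026-04-01, `resolve_day_month("5/5 at 18", date(2026, 4, 1))` gives 2026-05-05 18:00, and an unsupported phrase such as "next Thursday" returns `None`.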

System prompt (use VERBATIM — the model is conditioned on this exact text)

You turn the user's text into JSON. The message begins with a mode tag.

[capture] — a short note typed by the user. Split it into items and classify each as "task", "event", or "note". Keep any time reference verbatim, fuzzy is fine (tomorrow, fredag kl 14, frokost, på torsdage). Refer to people by role/name as written. Output:
{"items":[{"type":"task|event|note","title":"...","when"?:"...","where"?:"...","priority"?:"urgent","recurring"?:true}]}

[event_extract] — a longer text, often an OCR'd chat or screenshot, that starts with "Today is YYYY-MM-DD". Extract ONLY upcoming calendar EVENTS. Keep each event's date and time EXACTLY as written in the source (verbatim) — do NOT resolve to an absolute date; the "Today is" line is context only. Ignore chit-chat, to-dos, and reference facts. Output:
{"items":[{"type":"event","title":"...","when":"<date/time exactly as written>","when_end"?:"<as written>","where"?:"..."}]}

Always output ONLY the JSON object — no prose, no markdown. Preserve the input's language in titles. If nothing fits, output {"items":[]}.

User message = tag + input

| task | user message | output |
| --- | --- | --- |
| [capture] | [capture] <short note> | {"items":[{type:task\|event\|note, title, when?(verbatim), where?, priority?:"urgent", recurring?:true}]} |
| [event_extract] | [event_extract] Today is YYYY-MM-DD\n<OCR'd text> | {"items":[{type:"event", title, when:"<as written>", when_end?, where?}]} (future-only, verbatim date, chit-chat ignored) |
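The two user-message shapes above could be assembled with a small helper along these lines. The name `build_user_message` and its signature are hypothetical. The tag strings and the "Today is" line follow the formats given in this card.

```python
from datetime import date
from typing import Optional


def build_user_message(task: str, text: str, today: Optional[date] = None) -> str:
    """Prefix the input with the mode tag the model expects."""
    if task == "capture":
        return f"[capture] {text}"
    if task == "event_extract":
        # The anchor line must come first; default to the current date.
        anchor = (today or date.today()).isoformat()
        return f"[event_extract] Today is {anchor}\n{text}"
    raise ValueError(f"unknown task: {task!r}")
```

Raising on an unknown task, rather than relying on the model's untagged-input fallback, is a design choice of this sketch that keeps caller mistakes visible.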

Examples:

  • [capture] Call mom tomorrow → {"items":[{"type":"task","title":"Call mom","when":"tomorrow"}]}
  • [capture] Nice weather today → {"items":[]}
  • [event_extract] Today is 2026-04-01\nMette: Birthday party on 5/5 at 18 at Café Nord → {"items":[{"type":"event","title":"Birthday party","when":"5/5 at 18","where":"Café Nord"}]} → host resolves 5/5 at 18 + anchor 2026-04-01 → 2026-05-05 18:00.
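Since valid JSON is reported at 99.9% rather than 100%, a host may want to sanity-check the parsed object against the output shapes shown above before using it. The following is a minimal sketch under that assumption. `validate_payload` and `ALLOWED_TYPES` are hypothetical names, and the checks simply mirror the field list in the system prompt.

```python
from typing import Any, List

# Item types each task is expected to emit, per the system prompt.
ALLOWED_TYPES = {
    "capture": {"task", "event", "note"},
    "event_extract": {"event"},
}


def validate_payload(payload: Any, task: str) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    if not isinstance(payload, dict) or not isinstance(payload.get("items"), list):
        return ['top level must be {"items": [...]}']
    problems: List[str] = []
    for i, item in enumerate(payload["items"]):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        if item.get("type") not in ALLOWED_TYPES[task]:
            problems.append(f"item {i}: unexpected type {item.get('type')!r}")
        if not isinstance(item.get("title"), str) or not item["title"]:
            problems.append(f"item {i}: missing title")
        # "when" is required for [event_extract] and optional for [capture].
        if task == "event_extract" and not isinstance(item.get("when"), str):
            problems.append(f"item {i}: missing when")
        if "priority" in item and item["priority"] != "urgent":
            problems.append(f"item {i}: priority must be 'urgent' when present")
        if "recurring" in item and item["recurring"] is not True:
            problems.append(f"item {i}: recurring must be true when present")
    return problems
```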

Inference

  • Greedy (deterministic) decoding; stop on <|im_end|>.
  • max_new_tokens ≥ 768 (a chunked recurring-event block can emit ~16 events / ~600 tokens; a smaller budget truncates into invalid JSON).
  • The model prefixes an empty thinking block (<think>\n\n</think>). Strip up to and including </think>, then parse the JSON. Untagged input defaults to [capture].
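The post-processing described in the bullets above might look like the sketch below. `parse_model_output` is a hypothetical helper. Returning `None` on unparseable output (for example, JSON truncated by too small a token budget) is an assumption of this example, and a host could equally retry or raise.

```python
import json
from typing import Optional


def parse_model_output(raw: str) -> Optional[dict]:
    """Strip the thinking block and parse the remaining JSON object."""
    # Keep only what follows the last </think>; if the tag is absent,
    # rpartition leaves the whole string in the final element.
    _, _, tail = raw.rpartition("</think>")
    # Drop the stop token in case the decoder left it in the text.
    text = tail.split("<|im_end|>")[0].strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return None  # truncated or otherwise invalid JSON
    if not isinstance(payload, dict) or not isinstance(payload.get("items"), list):
        return None
    return payload
```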

Chunking (host responsibility)

Split long input into ~2000-character fragments and call the model once per chunk, prefixing each chunk with [event_extract] Today is YYYY-MM-DD\n. The model extracts only events that lie fully inside the chunk and returns {"items":[]} otherwise. Merge the per-chunk items and de-duplicate on (title, when).
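One way the host-side chunk, call, and merge loop could be written is sketched below. `chunk_text`, `extract_events`, and the `call_model` callable (assumed to take the full user message and return the parsed JSON object) are hypothetical names for this example. Splitting on line boundaries is also an assumption, chosen so that a chat line is not cut in half when that can be avoided.

```python
from typing import Callable, Dict, List


def chunk_text(text: str, max_chars: int = 2000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring line breaks."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single over-long line is hard-split into max_chars pieces.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def extract_events(
    text: str, today_iso: str, call_model: Callable[[str], Dict]
) -> List[dict]:
    """Run [event_extract] per chunk and merge items, de-duped on (title, when)."""
    merged: List[dict] = []
    seen = set()
    for chunk in chunk_text(text):
        # Every chunk gets the tag and the anchor line again.
        message = f"[event_extract] Today is {today_iso}\n{chunk}"
        for item in call_model(message).get("items", []):
            key = (item.get("title"), item.get("when"))
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged
```

Because the model only extracts events that sit fully inside a chunk, an event split across a boundary can be missed. Overlapping chunks would be one possible mitigation, which this sketch does not attempt.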

import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

SYSTEM = "..."  # placeholder: paste the system prompt above, verbatim
tok = AutoTokenizer.from_pretrained("kstorm77/quick-add-qwen3-0.6b")
model = AutoModelForCausalLM.from_pretrained("kstorm77/quick-add-qwen3-0.6b", dtype=torch.bfloat16, device_map="auto")
msgs = [{"role": "system", "content": SYSTEM}, {"role": "user", "content": "[capture] Call mom tomorrow"}]
ids = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt").to(model.device)
# Greedy decoding; decode only the newly generated tokens.
gen = model.generate(ids, max_new_tokens=768, do_sample=False)
out = tok.decode(gen[0][ids.shape[1]:], skip_special_tokens=True)
# Strip everything up to and including </think>, then parse the JSON.
result = json.loads(out.split("</think>")[-1].strip())