Clario · Gemma 4 E4B · Symptom-Diary Extractor (GGUF · Q4_K_M)

CPU-friendly GGUF build of the Clario v2 LoRA merged into unsloth/gemma-4-e4b-it. Quantized to Q4_K_M (~5 GB) for single-step use through Ollama or llama.cpp — no Python sidecar, no bitsandbytes, no CUDA requirement. Runs on Mac (Metal), Linux, Windows, and CPU.

This is the same model published as a LoRA adapter in m0rtyddd/clario-gemma4-e4b-lora-v2, just merged and quantized. For QLoRA / PEFT workflows on a CUDA box, prefer that repo. For everything else, prefer this one.

Quick start (Ollama)

# 1. download the GGUF + Modelfile to a folder
huggingface-cli download m0rtyddd/clario-gemma4-e4b-extract-gguf \
    --local-dir ./clario-extract --include "*.gguf" "Modelfile"

# 2. register with Ollama
cd clario-extract
ollama create clario-extract -f Modelfile

# 3. run
ollama run clario-extract "Diary entry: My eyes are gritty and my mouth is so dry I can't swallow toast."

Or invoke from any Ollama-compatible HTTP client at http://127.0.0.1:11434/api/chat with model: "clario-extract".

Quick start (llama.cpp / llama-cli)

llama-cli -m clario-extract-q4_k_m.gguf \
    -p "Diary entry: My eyes are gritty and my mouth is so dry I can't swallow toast." \
    --temp 0 -n 512

What the model does

Converts colloquial patient diary text into a structured JSON list of medical entities with canonical names suitable for HPO (Human Phenotype Ontology) lookup. Trained on 411 distilled (diary, target_json) pairs derived from Orphanet rare-disease phenotypes and HPO synonyms.

diary text  →  [Gemma 4 E4B + Clario LoRA, Q4_K_M]  →  JSON entities
            →  deterministic HPO synonym lookup     →  HPO IDs + canonical names

The model is not asked to memorise the ~17k HPO IDs from 411 examples. Its job is to extract and canonicalise symptom mentions; IDs are resolved afterwards from a versioned ontology. The reference HPO lookup is built by Clario's backend/scripts/build_knowledge.py from HPO hp.obo + Orphanet en_product4.xml. See m0rtyddd/clario-synthetic-diary for the training corpus and the model card of the unquantized LoRA for the full discussion of limitations.

Performance vs the LoRA adapter

Q4_K_M is the recommended quantization — best size/quality tradeoff across the K-quant family. On Gemma-family models Q4_K_M typically preserves >97% of the BF16 quality on extraction tasks. Reference metrics from the unquantized LoRA (held-out by disease, 68 examples):

Metric Vanilla Gemma 4 E4B Clario LoRA Δ
JSON schema correctness 0%¹ 100% +100 pp
Name F1 (synonym-aware) 0.209 0.524 +151%
HPO ID F1 (via name→lookup) 0.349 0.524 +50%

¹ Without normalising baseline's ad-hoc schema, every baseline score would be 0.

This GGUF was not re-measured against the held-out set after quantization in the hackathon window. Expect a small drop relative to the BF16/4-bit-NF4 numbers above; the system prompt and two few-shot demonstrations are baked into the Modelfile and stack the same way they do for the LoRA path.

Files

File Size What
clario-extract-q4_k_m.gguf ~5.0 GB the model
Modelfile <1 KB Ollama Modelfile with SYSTEM_PROMPT + 2 few-shot demos
README.md this file

Intended use, out of scope, limitations

See the matching sections in m0rtyddd/clario-gemma4-e4b-lora-v2. TL;DR: extract symptoms, do not trust the model's hpo_id field (always post-resolve via deterministic lookup), human-in-the-loop review, English only, not a diagnostic tool.

Licence

CC-BY-4.0, propagating the licences of training data sources (HPO CC-BY-4.0, Orphanet free with attribution) and the Gemma Terms of Use for the base model.

Cite as:

@misc{okulov2026clario_gguf,
  title  = {{Clario} {Gemma 4 E4B} Symptom-Diary Extractor (GGUF, Q4\_K\_M)},
  author = {Okulov, Maksim},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/m0rtyddd/clario-gemma4-e4b-extract-gguf}}
}
Downloads last month
202
GGUF
Model size
8B params
Architecture
gemma4
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Dataset used to train m0rtyddd/clario-gemma4-e4b-extract-gguf