Clario · Gemma 4 E4B · Symptom-Diary LoRA (v2)

LoRA adapter on unsloth/gemma-4-e4b-it that converts colloquial patient diary text into a structured JSON list of medical entities with canonical names suitable for HPO (Human Phenotype Ontology) lookup. Trained on 411 distilled (diary, target_json) pairs derived from Orphanet rare-disease phenotypes and HPO synonyms.

The adapter is one stage of a deterministic pipeline:

diary text  →  [LoRA-adapted Gemma 4 E4B, 4-bit]  →  JSON entities
            →  deterministic HPO synonym lookup  →  HPO IDs + canonical names

The model is not asked to memorise the ~17k HPO IDs from 411 examples. Its job is to extract and canonicalise symptom mentions; IDs are resolved afterwards from a versioned ontology (see Limitations §1).
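
The two stages above can be sketched as follows. This is a minimal illustration, not the sidecar's code: `resolve_entities`, `fake_extract` and `toy_index` are hypothetical names, the stub stands in for the LoRA-adapted model, and the single index entry should be checked against the hp.obo release in use.

```python
from typing import Callable, Dict, List


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so lookups ignore trivial variation."""
    return " ".join(name.lower().split())


def resolve_entities(
    diary: str,
    extract: Callable[[str], Dict],
    index: Dict[str, str],
) -> List[Dict]:
    """Stage 1: model extraction. Stage 2: deterministic name -> HPO ID lookup."""
    resolved = []
    for entity in extract(diary).get("entities", []):
        entity = dict(entity)
        entity.pop("hpo_id", None)  # drop whatever ID the model generated
        key = normalise(entity.get("name_canonical", ""))
        entity["hpo_id"] = index.get(key)  # None when the name is not in the index
        resolved.append(entity)
    return resolved


def fake_extract(diary: str) -> Dict:
    """Stand-in for the model; returns a deliberately wrong ID."""
    return {"entities": [{
        "name_colloquial": "mouth so dry",
        "name_canonical": "Xerostomia",
        "hpo_id": "HP:0001211",
        "type": "symptom",
    }]}


toy_index = {"xerostomia": "HP:0000217"}  # illustrative entry only
result = resolve_entities("Mouth's been so dry.", fake_extract, toy_index)
print(result)
```

Keeping the lookup outside the model means the ontology can be upgraded without retraining the adapter.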

Headline results

Held-out 68-example validation set (split by disease — 651 Orphanet disorders never appear in training). Baseline: the same unsloth/gemma-4-e4b-it in 4-bit without the adapter, apples-to-apples, same system prompt.

Metric                                  Baseline             Fine-tuned         Δ
JSON schema correctness                 0%¹                  100%               +100 pp
Entity type-field accuracy              89.1%                100%               +10.9 pp
Name F1 (synonym-aware)                 0.209                0.524              +151% (relative)
HPO ID F1 (via name→HPO lookup)         0.349                0.524              +50% (relative)
Avg entities per example (gold = 2.93)  5.37 (over-extract)  2.91 (calibrated)

¹ Vanilla Gemma 4 emits its own ad-hoc shape ({symptoms, triggers, body_parts, medications, lab_values}), so none of its raw outputs match the requested schema. The other baseline figures in the table are computed after normalising those outputs to the requested schema; without normalisation, every baseline score would be 0.
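
The card does not spell out the exact scoring procedure, so the following is only one plausible reading of "synonym-aware" Name F1: names are normalised, mapped through a synonym table to a shared key, and scored as micro-averaged set F1. The function names and the toy synonym table are assumptions. The `relative_gain` helper reproduces the arithmetic behind the Δ column.

```python
from typing import Dict, Iterable, List, Tuple


def canonical_key(name: str, synonyms: Dict[str, str]) -> str:
    """Normalise a name and map known synonyms to one shared key."""
    key = " ".join(name.lower().split())
    return synonyms.get(key, key)


def micro_f1(
    pairs: Iterable[Tuple[List[str], List[str]]],
    synonyms: Dict[str, str],
) -> float:
    """Micro-averaged F1 over (predicted names, gold names) pairs."""
    tp = fp = fn = 0
    for predicted, gold in pairs:
        pred_set = {canonical_key(n, synonyms) for n in predicted}
        gold_set = {canonical_key(n, synonyms) for n in gold}
        tp += len(pred_set & gold_set)
        fp += len(pred_set - gold_set)
        fn += len(gold_set - pred_set)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_gain(baseline: float, tuned: float) -> float:
    """Relative improvement in percent, as used for the F1 rows."""
    return (tuned - baseline) / baseline * 100


synonyms = {"dry mouth": "xerostomia", "joint pain": "arthralgia"}  # toy table
examples = [(["Dry mouth", "Arthralgia", "Fatigue"], ["Xerostomia", "Joint pain"])]
score = micro_f1(examples, synonyms)  # 2 true positives, 1 false positive

print(round(score, 3))
print(round(relative_gain(0.209, 0.524)), round(relative_gain(0.349, 0.524)))
```

In the toy example the over-extracted "Fatigue" lowers precision without affecting recall, which is the same failure mode the baseline shows with 5.37 entities per example against a gold average of 2.93.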

Quick start

import json
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE = "unsloth/gemma-4-e4b-it"
ADAPTER = "m0rtyddd/clario-gemma4-e4b-lora-v2"

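# NF4 matches the quantisation used in training (see Training); the library default is fp4.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)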
tok = AutoTokenizer.from_pretrained(ADAPTER)
base = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

SYSTEM = (
    "Extract medical entities from the diary. Return strict JSON: "
    '{"entities":[{"name_colloquial":"…","name_canonical":"…",'
    '"hpo_id":"HP:…","type":"symptom|lab_marker|med|trigger|behavior"}]}. '
    "Use canonical HPO names where possible. Output JSON only."
)
diary = (
    "My eyes have been gritty, like there's sand in them. Mouth's been so "
    "dry I can't swallow toast without water. Fingers ache when I type for long."
)

prompt = tok.apply_chat_template(
    [{"role": "system", "content": SYSTEM}, {"role": "user", "content": diary}],
    tokenize=False, add_generation_prompt=True,
)
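# Gemma chat templates normally emit <bos> already; skip special tokens to avoid a duplicate.
ids = tok(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)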
out = model.generate(**ids, max_new_tokens=512, do_sample=False)
response = tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)
print(json.loads(response))
# {"entities": [
#   {"name_colloquial": "gritty eyes",  "name_canonical": "Keratoconjunctivitis sicca", ...},
#   {"name_colloquial": "mouth so dry", "name_canonical": "Xerostomia", ...},
#   {"name_colloquial": "fingers ache", "name_canonical": "Arthralgia",  ...},
# ]}
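
Calling `json.loads` directly assumes the response is bare JSON. A slightly more defensive parse might look like the sketch below. The allowed `type` values are taken from the system prompt above; `parse_entities`, the required-key set and the decision to treat `hpo_id` as optional are assumptions made for illustration.

```python
import json
from typing import Dict, List

ALLOWED_TYPES = {"symptom", "lab_marker", "med", "trigger", "behavior"}
REQUIRED_KEYS = {"name_colloquial", "name_canonical", "type"}  # hpo_id is discarded anyway


def parse_entities(response: str) -> List[Dict]:
    """Return the well-formed entities; raise ValueError if the top-level shape is wrong."""
    # Take the outermost {...} span so stray text around the JSON is tolerated.
    start, end = response.find("{"), response.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in response")
    payload = json.loads(response[start:end + 1])  # JSONDecodeError is a ValueError
    entities = payload.get("entities") if isinstance(payload, dict) else None
    if not isinstance(entities, list):
        raise ValueError("missing 'entities' list")
    valid = []
    for item in entities:
        if (
            isinstance(item, dict)
            and REQUIRED_KEYS.issubset(item)
            and isinstance(item["type"], str)
            and item["type"] in ALLOWED_TYPES
        ):
            valid.append(item)
    return valid


sample = (
    '{"entities":[{"name_colloquial":"mouth so dry","name_canonical":"Xerostomia",'
    '"hpo_id":"HP:0001211","type":"symptom"},'
    '{"name_canonical":"Something","type":"other"}]}'
)
print(parse_entities(sample))
```

Malformed entities are dropped rather than raising, on the assumption that a partially usable extraction is better than none; a stricter consumer could raise instead.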

Production note. Discard the model's hpo_id field and resolve from name_canonical via a synonym index built from HPO hp.obo (42k normalised name → HP:ID entries). See Limitations §1.
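
A synonym index of this kind could be built roughly as below. This is a sketch under assumptions: the normalisation rule (lower-casing, whitespace collapsing) and the first-writer-wins collision policy are guesses rather than the sidecar's behaviour, the inline sample stands in for the real hp.obo, and its IDs should be verified against the ontology release in use. The last stanza is a made-up obsolete term.

```python
import re
from typing import Dict, Iterable, Iterator, List

SYNONYM_RE = re.compile(r'^synonym:\s*"((?:[^"\\]|\\.)*)"')


def normalise(name: str) -> str:
    return " ".join(name.lower().split())


def iter_term_stanzas(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the tag lines of each [Term] stanza in an OBO file."""
    stanza: List[str] = []
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):  # any stanza header closes the previous stanza
            if in_term and stanza:
                yield stanza
            stanza, in_term = [], line == "[Term]"
        elif in_term and line:
            stanza.append(line)
    if in_term and stanza:
        yield stanza


def build_index(lines: Iterable[str]) -> Dict[str, str]:
    """Map every normalised name and synonym to its HP: identifier."""
    index: Dict[str, str] = {}
    for stanza in iter_term_stanzas(lines):
        if "is_obsolete: true" in stanza:
            continue
        term_id, names = None, []
        for line in stanza:
            if line.startswith("id: "):
                term_id = line[4:].strip()
            elif line.startswith("name: "):
                names.append(line[6:])
            else:
                match = SYNONYM_RE.match(line)
                if match:
                    names.append(match.group(1))
        if term_id:
            for name in names:
                index.setdefault(normalise(name), term_id)  # first writer wins
    return index


# Tiny stand-in for hp.obo; HP:9999999 is a hypothetical obsolete term.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: HP:0000217
name: Xerostomia
synonym: "Dry mouth" EXACT layperson []

[Term]
id: HP:0002829
name: Arthralgia
synonym: "Joint pain" EXACT layperson []

[Term]
id: HP:9999999
name: obsolete Example term
is_obsolete: true
"""

index = build_index(SAMPLE_OBO.splitlines())
print(index.get(normalise("  dry   MOUTH ")))
```

With the real file, `build_index(open("hp.obo", encoding="utf-8"))` would be the equivalent call, since an open file iterates line by line.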

Pipeline integration

Reference integration (FastAPI sidecar + Clario backend) is open-source:

  • Sidecar (this adapter + HPO post-lookup + few-shot SLE demos): scripts/clario_extractor_service.py.
  • Consumer: backend/diary/extraction.py::process_one calls the sidecar when CLARIO_EXTRACTOR_URL is set, otherwise falls back to vanilla Gemma via Ollama.
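
The consumer's fallback behaviour can be pictured with the sketch below. Only the `CLARIO_EXTRACTOR_URL` variable comes from the description above; the `/extract` path, the `{"text": ...}` payload, and the names `call_sidecar`, `select_extractor` and `vanilla_stub` are assumptions, and the real `process_one` may differ.

```python
import json
import os
import urllib.request
from typing import Callable, Dict, Mapping

Extractor = Callable[[str], Dict]


def call_sidecar(url: str, diary: str, timeout: float = 30.0) -> Dict:
    """POST the diary to the sidecar (endpoint path and payload are assumed)."""
    request = urllib.request.Request(
        url.rstrip("/") + "/extract",
        data=json.dumps({"text": diary}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def select_extractor(env: Mapping[str, str], fallback: Extractor) -> Extractor:
    """Use the sidecar when CLARIO_EXTRACTOR_URL is set, else the fallback."""
    url = env.get("CLARIO_EXTRACTOR_URL", "").strip()
    if url:
        return lambda diary: call_sidecar(url, diary)
    return fallback


def vanilla_stub(diary: str) -> Dict:
    """Placeholder for the vanilla-Gemma-via-Ollama path."""
    return {"entities": []}


extract = select_extractor(os.environ, vanilla_stub)
```

Passing the environment in as a mapping keeps the selection logic testable without touching real environment variables or the network.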

Training

Base                 unsloth/gemma-4-e4b-it (4-bit NF4, BF16 compute)
Adapter              LoRA, r=16, α=32, dropout=0.05
Target modules       q/k/v/o/gate/up/down_proj on all language-model layers
Optimiser            adamw_8bit
Learning rate        5e-5
Epochs               1 (on top of a resumed checkpoint-50 from the v1 run)
Max sequence length  1536
Total steps          22 optimiser steps over 343 train examples
Train loss (final)   0.478
Hardware             single RTX 5060 Ti 16 GB (Blackwell sm_120)
Wall time            ~22 min
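
As a rough cross-check of the figures above, the sketch below restates the adapter hyperparameters using the argument names of `peft.LoraConfig` and derives two quantities. The effective batch size is not stated in this card; 16 is an assumption chosen because it reproduces 22 steps over 343 examples when the final partial batch is counted.

```python
import math

# Hyperparameters from the table, keyed by peft.LoraConfig argument names.
lora_hparams = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

TRAIN_EXAMPLES = 343
EFFECTIVE_BATCH = 16  # assumption: per-device batch x gradient accumulation

# Optimiser steps in one epoch, counting the last partial batch.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / EFFECTIVE_BATCH)

# Standard LoRA scales the low-rank update by alpha / r.
lora_scale = lora_hparams["lora_alpha"] / lora_hparams["r"]

print(steps_per_epoch, lora_scale)
```

Under the same assumption, three epochs would give 66 steps, which would be consistent with the 50/66 progress reported for the first run.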

The first training run hung at step 50/66 due to a known interaction between paged_adamw_8bit and Windows NVIDIA driver 596.36. The resumed run with adamw_8bit completed cleanly.

Training data is published as m0rtyddd/clario-synthetic-diary.
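
The validation split described under Headline results is held out by disease rather than by example. A group-wise split of that kind could be sketched as follows; the `disease_id` field name, the one-disease-per-row assumption and the placeholder identifiers are all hypothetical and may not match the published dataset's columns.

```python
import random
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_by_disease(
    rows: List[Row], val_fraction: float, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Assign whole diseases to train or validation so no disease spans both."""
    diseases = sorted({row["disease_id"] for row in rows})
    random.Random(seed).shuffle(diseases)
    n_val = max(1, round(len(diseases) * val_fraction))
    val_diseases = set(diseases[:n_val])
    train = [row for row in rows if row["disease_id"] not in val_diseases]
    val = [row for row in rows if row["disease_id"] in val_diseases]
    return train, val


# Placeholder rows: 5 made-up diseases with 4 diary entries each.
rows = [
    {"disease_id": f"disease-{d}", "diary": f"entry {i} for disease {d}"}
    for d in "abcde"
    for i in range(4)
]
train_rows, val_rows = split_by_disease(rows, val_fraction=0.2)
print(len(train_rows), len(val_rows))
```

Splitting on the disease key instead of the row index is what prevents near-duplicate phenotype descriptions of one disorder from leaking across the split.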

Intended use

  • Extracting symptom, lab-marker, medication, trigger, and behaviour entities from short English-language patient diary entries.
  • As the extraction stage of a longer pipeline that includes deterministic HPO-ID resolution and downstream graph / hypothesis-engine logic.
  • Research and educational use around rare-disease symptom canonicalisation.

Out of scope

  • Direct clinical decision-making. This is an extraction model, not a diagnostic one. Outputs require human-in-the-loop review.
  • Non-English diaries. All training data is English (see §5 below).
  • HPO ID generation. Use the deterministic lookup; do not consume entities[].hpo_id from the model (see §1 below).
  • Rationale / explanation generation for downstream hypothesis ranking — this LoRA does not improve that path.

Limitations

1. The model does not faithfully generate HPO IDs. The 411 distilled pairs cover ~1.5k unique HPO terms out of the ontology's ~17k. In free-form generation the model hallucinates sequential dummy IDs (HP:0001211, HP:0001212, HP:0001213, …). The +50% relative HPO ID F1 over baseline therefore reflects better name extraction feeding the deterministic lookup, not ID generation by the model. Consumers must discard the model's hpo_id field.

2. Eval is synthetic and held out by disease, but not clinical. Both train and validation come from the same gpt-oss:20b teacher pipeline. 651 Orphanet disorders never appear in train, so the validation set contains genuinely unseen diseases, but it is not a clinical gold standard. A manual 200-pair gold-standard set is the next measurement gate; it was not completed within the hackathon window.

3. For the SLE pattern, two few-shot demonstrations are stacked on top of the LoRA. The LoRA alone correctly extracts the MCAS-style (Tom) and Hashimoto-style (Anna) personas with no prompt scaffolding. For SLE specifically, the upstream sidecar prepends two demonstrations to the chat history ("reddened band across cheekbones → Malar rash" and "burning skin where sun hit → Photosensitivity", with surface forms different from any test diary) to push the model toward those two textbook SLE phenotypes. SLE results therefore reflect the LoRA and the demonstrations combined.

4. Rationale generation is unchanged. The adapter is scoped to extraction. Downstream hypothesis rationales still run on vanilla Gemma 4 weights with templated phrasing.

5. English only. No Russian, Italian, or other-language training data. Frontend i18n slots exist upstream, but the extractor would need a translate-pass or additional fine-tuning for non-English diaries.
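
The demonstration stacking described in §3 amounts to inserting user/assistant turns before the real diary. A minimal sketch is shown below: the two colloquial-to-canonical pairs come from §3, while the demo diary sentences, the helper names and the omission of `hpo_id` from the demo answers are assumptions rather than the sidecar's actual prompts.

```python
import json
from typing import Dict, List, Sequence

Message = Dict[str, str]


def demo(diary: str, colloquial: str, canonical: str) -> List[Message]:
    """One demonstration: a user diary turn and the expected assistant JSON."""
    answer = {"entities": [{
        "name_colloquial": colloquial,
        "name_canonical": canonical,
        "type": "symptom",
    }]}
    return [
        {"role": "user", "content": diary},
        {"role": "assistant", "content": json.dumps(answer)},
    ]


# Demo diary sentences are invented for illustration.
SLE_DEMOS = (
    demo("There has been a reddened band across my cheekbones since Sunday.",
         "reddened band across cheekbones", "Malar rash"),
    demo("After the walk I had burning skin where the sun hit my arms.",
         "burning skin where sun hit", "Photosensitivity"),
)


def build_messages(
    system: str, diary: str, demos: Sequence[List[Message]] = SLE_DEMOS
) -> List[Message]:
    """System prompt, then the demonstrations, then the diary to extract from."""
    messages: List[Message] = [{"role": "system", "content": system}]
    for pair in demos:
        messages.extend(pair)
    messages.append({"role": "user", "content": diary})
    return messages


messages = build_messages("Extract medical entities. Output JSON only.", "My joints ache.")
print([m["role"] for m in messages])
```

The resulting list can be passed to `tok.apply_chat_template` in place of the two-message list used in the quick start.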

Licence and attribution

This adapter is released under CC-BY-4.0, propagating the licence of its training-data sources:

  • HPO (Human Phenotype Ontology) phenotype names and synonyms — CC-BY-4.0. Köhler S. et al. (2021) The Human Phenotype Ontology in 2021. Nucleic Acids Research, 49(D1):D1207–D1217. https://hpo.jax.org/
  • Orphanet rare-disease ↔ phenotype annotations (en_product4.xml, 2026-05-13 snapshot) — free for academic and commercial use with attribution. Orphadata: Free access products on rare diseases and orphan drugs. INSERM 1978. https://www.orphadata.com/
  • Base model: unsloth/gemma-4-e4b-it, used under the Gemma Terms of Use.
  • Teacher: gpt-oss:20b (Apache-2.0).

Cite as:

@misc{okulov2026clario_lora_v2,
  title  = {{Clario} {Gemma 4 E4B} Symptom-Diary {LoRA} (v2)},
  author = {Okulov, Maksim},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/m0rtyddd/clario-gemma4-e4b-lora-v2}}
}

Frameworks

peft 0.19.1 · trl 1.4.0 · transformers 5.8.0.dev0 · torch 2.11.0.dev20260108+cu128 · datasets 4.8.5 · tokenizers 0.22.2 · unsloth (QLoRA path) · bitsandbytes (4-bit NF4)

Evaluation results

  • Name F1 (synonym-aware) on clario-synthetic-diary (val split, held-out by disease): 0.524 (self-reported)
  • HPO ID F1 (via name to HPO lookup) on clario-synthetic-diary (val split, held-out by disease): 0.524 (self-reported)
  • JSON schema correctness on clario-synthetic-diary (val split, held-out by disease): 1.000 (self-reported)
  • Entity type accuracy on clario-synthetic-diary (val split, held-out by disease): 1.000 (self-reported)