Zekra — Memory Care for Alzheimer's

A digital life-story book for Alzheimer's, running entirely on-device.

Website: zekra.live
App source: github.com/aelhajj/zekra-ai
Submission: Kaggle Gemma 4 Good Hackathon, Impact Track (Health & Sciences), with eligibility for the LiteRT and Unsloth Special Technology Tracks. Deadline: 2026-05-18.

Zekra helps people with Alzheimer's recognize the faces around them and recall the stories that go with those faces. A caregiver builds a small graph of the family — who everyone is, how they are related, the photos and memories that matter. The patient lifts the phone, points the camera at someone, and Zekra tells the story warmly:

"This is your grandson Ahmad. Last summer you took him to the park and he laughed at the ducks."

No "do you remember." No quiz. Just sharing. Everything runs on the phone: no backend, no cloud, no account. Photos never leave the device.

What's in this repo

Two on-device models powering the Zekra Flutter app:

File	Size	Purpose
`model-dyn-wi8-afp32.litertlm`	5.1 GB	Fine-tuned Gemma 4 E2B care-dialogue model, exported as a `.litertlm` bundle for LiteRT-LM (GPU full-delegation on Pixel 9 Pro).
`arcface_zekra_r50_fp16.onnx`	83 MB	Fine-tuned InsightFace ArcFace face embedder (ResNet50, 512-dim, fp16) for cross-age + kinship-aware face verification, served via `onnxruntime`.

Both files are copied into the Flutter app's private documents folder on the device (see the deployment recipe under "How to use").

Gemma 4 E2B — care-dialogue fine-tune

Fine-tuned from litert-community/gemma-4-E2B-it-litert-lm (itself a LiteRT-LM port of google/gemma-4-E2B-it) with Unsloth LoRA and a hand-curated SFT corpus of multi-turn care dialogues. The model is wired as a router, not a free agent loop: it picks from twelve tools, among them identify_face, tell_me_about, whos_with_me, where_am_i, check_meds, log_dose_taken, create_reminder, make_note, and get_help. Pre-routers on the Flutter side deterministically catch distress signals and face-recognition intents before the model sees the turn.
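The deterministic pre-routing described above can be pictured with a short Python sketch. It is an illustration under assumptions, not the app's code (the shipping pre-routers live on the Flutter side): the function name pre_route, both regular expressions, the two-hit distress threshold, and the choice to send distress to get_help are hypothetical. The word list is a small subset of the absolutist dictionary in Al-Mosaiwi & Johnstone (2018).

```python
import re
from typing import Optional

# Illustrative subset of absolutist words; the app's real pattern list is not shown here.
ABSOLUTIST = re.compile(
    r"\b(always|never|nothing|everything|everyone|completely|totally|entire|constantly)\b",
    re.IGNORECASE,
)
# Hypothetical face-recognition phrasings, for illustration only.
FACE_INTENT = re.compile(
    r"\bwho\s+(is|are)\s+(this|that|these|they|he|she)\b", re.IGNORECASE
)


def pre_route(utterance: str, distress_threshold: int = 2) -> Optional[str]:
    """Return a tool name when a deterministic rule fires, else None.

    None means the turn falls through to the model, which picks a tool itself.
    """
    # Distress is checked first so that it wins over any other intent.
    if len(ABSOLUTIST.findall(utterance)) >= distress_threshold:
        return "get_help"
    if FACE_INTENT.search(utterance):
        return "identify_face"
    return None


print(pre_route("Who is this?"))
print(pre_route("Tell me about the park"))
```

Keeping these rules outside the model means the two cases where a wrong tool choice would matter most do not depend on sampling.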

Training

Base: Gemma 4 E2B (instruction-tuned)
Method: Unsloth LoRA — all-linear, r=16, α=32 — then merged into the base for export
Corpus: 3,198 hand-curated turns across 19 waves, generated by ~9 Flutter instances running in parallel on an M4 MacBook (Claude played the patient, Gemma played the companion), LLM-judged per batch
Doctrine: never quiz, just tell — backed by Tom Kitwood's Dementia Reconsidered (1997) and the DAWN Method's guidance against recall-testing
Distress detection: Al-Mosaiwi & Johnstone (2018) absolutist-word patterns
Clinical grounding: Cochrane review (Woods et al., 2018) on reminiscence therapy + DEMENTIA-PLAN (2025) on knowledge-graph-grounded dialogue
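For readers unfamiliar with the merge step in the Method line, here is a minimal sketch of what folding a LoRA adapter into a base weight means in the standard formulation, W' = W + (α/r)·BA. With r=16 and α=32 the scale factor is 2. The helper names and the toy rank-2 matrices are illustrative assumptions; the actual merge operates on full-size tensors inside the training tooling.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, a, b, alpha):
    """Fold a LoRA update into a base weight: W' = W + (alpha / r) * B @ A.

    w: d_out x d_in base weight, b: d_out x r, a: r x d_in.
    The rank r is read from the number of rows of a.
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example with rank 2 and alpha 4, which gives the same scale (2.0)
# as the r=16, alpha=32 setting listed above.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0, 0.0], [0.0, 1.0]]
a = [[0.5, 0.0], [0.0, 0.25]]
print(merge_lora(w, a, b, alpha=4))
```

After merging, the adapter no longer exists as a separate set of tensors, which is what lets the result be exported as a single model.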

Export

Quantization: dynamic int8 weights, fp32 activations (wi8 afp32). wi4 collapsed output quality; wi8 was the sweet spot that kept the persona while still loading on the GPU.
Format: .litertlm bundle, 12-section layout, SP_Tokenizer spliced from the base bundle so prefill/decode token IDs stay aligned with the LiteRT-LM runtime
Runtime: flutter_gemma on Android/macOS, GPU backend, full delegation (865/865 + 1533/1533 nodes, single partition each), warm-up ~2.5 s on Pixel 9 Pro
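The gap between 4-bit and 8-bit weights can be illustrated with a generic symmetric quantizer. This is a sketch of the general idea only: it assumes one scale per weight row and is not claimed to match what the LiteRT converter does internally. The function names and sample weights are made up for the example.

```python
def quantize_symmetric(weights, bits):
    """Quantize one weight row to signed integers sharing a single float scale."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for int8, 7 for int4
    scale = max(abs(w) for w in weights) / qmax or 1.0   # 1.0 guards an all-zero row
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to floats, as happens when the weights are used."""
    return [v * scale for v in q]


def max_abs_error(weights, bits):
    """Worst-case round-trip error for one row at the given bit width."""
    q, scale = quantize_symmetric(weights, bits)
    return max(abs(w - d) for w, d in zip(weights, dequantize(q, scale)))


row = [0.82, -0.41, 0.07, -0.93, 0.15]
print("int8 error:", max_abs_error(row, 8))
print("int4 error:", max_abs_error(row, 4))
```

The int4 grid has 15 levels per row against 255 for int8, so small weights get rounded much more coarsely. Whether that alone explains the quality collapse reported above is not something this sketch can show.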

The full export recipe (including the dead-ends — --task text_generation SEGVs in embedding_lookup, --experimental_lightweight_conversion causes MUL(const, const) GPU rejection, add_hf_tokenizer produces token salad) is on our blog and in the working Kaggle notebook: Fine-Tuned Gemma 4 on Mobile — LiteRT Solution.

ArcFace — cross-age face embedder

Fine-tuned from the InsightFace w600k_r50 ArcFace ResNet50 backbone (the buffalo_l model pack, from insightface/python-package).

Training

Loss: CosFace classification (s=15, m=0.35) — pair-contrastive collapsed; switching loss families fixed it
Data: 68,000 images across 8,651 identities from six datasets — FG-NET (cross-age), Families in the Wild (kinship hard-negatives), CFP-FP and CPLFW (frontal-vs-profile), RMFRD and MLFW (masked faces)
Tricks: BN frozen, head warmup, full backbone unfrozen after warmup
Output: 512-dim L2-normalized embeddings, fp16 ONNX
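The CosFace objective named in the Loss line subtracts a margin m from the true-class cosine and scales all logits by s before a softmax cross-entropy. Below is a minimal single-sample sketch with the listed s=15 and m=0.35 as defaults. It is an illustration written in plain Python, not the training code; the function name and sample cosines are assumptions.

```python
import math


def cosface_loss(cosines, target, s=15.0, m=0.35):
    """Large-margin cosine loss for one sample.

    cosines: cos(theta_j) between the L2-normalized embedding and each
    L2-normalized class weight; target: index of the true identity.
    """
    logits = [s * (c - m) if j == target else s * c for j, c in enumerate(cosines)]
    peak = max(logits)                                # subtract the max for stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]                   # cross-entropy for the true class


cosines = [0.8, 0.1, -0.2]
print("with margin:   ", cosface_loss(cosines, target=0))
print("without margin:", cosface_loss(cosines, target=0, m=0.0))
```

The margin makes the loss strictly larger for the same cosines, so the true identity has to win by at least m rather than just rank first.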

Verification accuracy (held-out 13.8K pairs, cos ≥ 0.40 threshold):

Model	Accuracy	Separation
`w600k_r50` (stock)	75.1%	1.0×
Zekra fine-tune (this file)	95.5%	2.8×
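To make the verification protocol concrete, the sketch below shows how a pair accuracy at a fixed cosine threshold can be computed: a pair counts as correct when "cos ≥ 0.40" agrees with the ground-truth label. The helper names and the toy 3-dim vectors are assumptions for illustration; the real evaluation uses 512-dim embeddings and the held-out pairs described above.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity, computed as a dot product of unit vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def same_person(a, b, threshold=0.40):
    """Verification decision for one pair of embeddings."""
    return cosine(a, b) >= threshold


def pair_accuracy(pairs, threshold=0.40):
    """pairs: list of (embedding_a, embedding_b, is_same_identity)."""
    results = [same_person(a, b, threshold) == label for a, b, label in pairs]
    return sum(results) / len(results)


toy_pairs = [
    ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0], True),    # close match, accepted
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], False),   # orthogonal, rejected
    ([1.0, 0.0, 0.0], [0.3, 1.0, 0.0], True),    # same person but below threshold
]
print(pair_accuracy(toy_pairs))
```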

Runtime: onnxruntime on Android/macOS. Embeddings are stored in a 512-dim HNSW vector index in ObjectBox on the device and queried by cosine similarity at recognition time.

How the two models work together

camera frame
   ↓  Google ML Kit (on-device face detection)
face crop
   ↓  ArcFace ONNX  (this repo — 83 MB)
512-dim embedding
   ↓  ObjectBox HNSW cosine search
PersonEntity match  ──→  graph lookup (relationship, recent memories, photos)
                            ↓  prompt assembly + Gemma 4 call
                       Gemma 4 LiteRT (this repo — 5.1 GB)
                            ↓
                       warm sentence (TTS optional)

No network. Photos never leave the device.
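The last two stages of the diagram (match, then prompt assembly) can be sketched as follows. Everything here is a stand-in under stated assumptions: Person is a hypothetical simplification of PersonEntity plus its graph data, the brute-force search replaces the ObjectBox HNSW index, reusing 0.40 as the lookup threshold is an assumption, and the prompt wording is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical stand-in for the app's PersonEntity plus its graph data."""
    name: str
    relationship: str
    embedding: list                       # assumed already L2-normalized
    memories: list = field(default_factory=list)


def best_match(query, people, threshold=0.40):
    """Brute-force cosine search; on unit vectors cosine is a dot product."""
    def score(p):
        return sum(x * y for x, y in zip(query, p.embedding))

    best = max(people, key=score, default=None)
    if best is None or score(best) < threshold:
        return None                       # nobody close enough: no story is told
    return best


def build_prompt(person):
    """Assemble a grounded prompt that asks the model to tell, never to quiz."""
    lines = [
        "Introduce the person in view warmly. Do not ask the user to recall anything.",
        f"Name: {person.name}",
        f"Relationship to the user: {person.relationship}",
    ]
    lines += [f"Memory: {m}" for m in person.memories[:2]]
    return "\n".join(lines)


family = [
    Person("Ahmad", "grandson", [1.0, 0.0, 0.0], ["Fed the ducks at the park last summer."]),
    Person("Layla", "daughter", [0.0, 1.0, 0.0]),
]
match = best_match([0.96, 0.28, 0.0], family)
if match is not None:
    print(build_prompt(match))
```

Grounding the prompt in graph facts keeps the model's job to phrasing, so names and relationships come from what the caregiver entered rather than from generation.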

How to use

The shipping integration is the Zekra Flutter app — flutter_gemma for the LLM, onnxruntime for ArcFace, ObjectBox for the on-device knowledge graph + vector search. Recipe to drop both files onto a Pixel:

# Bundle (5.1 GB) — push, then run-as cp into the app's private dir
adb push model-dyn-wi8-afp32.litertlm /data/local/tmp/model.litertlm
adb shell "run-as com.zekra.zekra cp /data/local/tmp/model.litertlm app_flutter/model.litertlm"

# ArcFace (83 MB)
adb push arcface_zekra_r50_fp16.onnx /data/local/tmp/w600k_r50.onnx
adb shell "chmod 0666 /data/local/tmp/w600k_r50.onnx"
adb shell "run-as com.zekra.zekra cp /data/local/tmp/w600k_r50.onnx app_flutter/w600k_r50.onnx"

Both files land in /data/user/0/com.zekra.zekra/app_flutter/. The app picks them up on the next launch.

License

MIT (this repo). Upstream licenses still apply to the base models: for the LLM base, Apache 2.0 on the litert-community mirror and the Gemma terms on the Google original; for the ArcFace backbone, the InsightFace license.

Citation

@misc{zekra2026,
  title  = {Zekra: A Digital Life Story Book for Alzheimer's, Running Entirely On-Device},
  author = {El Hajj, Amanie and El Hajj, Hadi},
  year   = {2026},
  url    = {https://zekra.live},
  note   = {Kaggle Gemma 4 Good Hackathon submission — Health \& Sciences track}
}

Zekra (ذكرة) means memento in Arabic. Built for our grandmother, and one day perhaps for ourselves.
