Zekra: Memory Care for Alzheimer's

A digital life-story book for Alzheimer's, running entirely on-device.

Website: zekra.live
App source: github.com/aelhajj/zekra-ai
Submission: Kaggle Gemma 4 Good Hackathon, Impact Track (Health & Sciences), with eligibility for the LiteRT and Unsloth Special Technology Tracks. Deadline 2026-05-18.

Zekra helps people with Alzheimer's recognize the faces around them and recall the stories that go with those faces. A caregiver builds a small graph of the family: who everyone is, how they are related, and the photos and memories that matter. The patient lifts the phone, points the camera at someone, and Zekra tells the story warmly:

"This is your grandson Ahmad. Last summer you took him to the park and he laughed at the ducks."

No "do you remember." No quiz. Just sharing. Everything runs on the phone: no backend, no cloud, no account, and photos never leave the device.

What's in this repo

Two on-device models powering the Zekra Flutter app:

| File | Size | Purpose |
|---|---|---|
| model-dyn-wi8-afp32.litertlm | 5.1 GB | Fine-tuned Gemma 4 E2B care-dialogue model, exported as a .litertlm bundle for LiteRT-LM (GPU full-delegation on Pixel 9 Pro). |
| arcface_zekra_r50_fp16.onnx | 83 MB | Fine-tuned InsightFace ArcFace face embedder (ResNet50, 512-dim, fp16) for cross-age + kinship-aware face verification, served via onnxruntime. |

Both files are dropped into the Flutter app's private documents folder at runtime (see deployment recipe below).

Gemma 4 E2B: care-dialogue fine-tune

Fine-tuned from litert-community/gemma-4-E2B-it-litert-lm (itself a LiteRT-LM port of google/gemma-4-E2B-it) using Unsloth LoRA and a hand-curated SFT corpus of multi-turn care dialogues. The model is wired as a router, not a free agent loop: on each turn it picks one of twelve tools (identify_face, tell_me_about, whos_with_me, where_am_i, check_meds, log_dose_taken, create_reminder, make_note, get_help, etc.). Pre-routers on the Flutter side catch distress signals and face-recognition intents deterministically before the model sees the turn.
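The pre-router idea can be sketched in a few lines of Python. This is only an illustration of the control flow: the real pre-routers live in the Flutter app, and the word list, the regex patterns, the two-word threshold, and the mapping of distress to get_help are all assumptions made for this example.

```python
import re
from typing import Callable, Optional

# Illustrative subset of absolutist words (assumption; the app's list is not shown here).
ABSOLUTIST_WORDS = {"always", "never", "nothing", "everything", "completely", "totally"}

# Hypothetical phrasings that signal a face-recognition intent.
FACE_PATTERNS = (r"\bwho is (this|that)\b", r"\bwho am i looking at\b")


def pre_route(utterance: str) -> Optional[str]:
    """Return a tool name when a deterministic rule fires, otherwise None."""
    text = utterance.lower()
    words = re.findall(r"[a-z']+", text)
    # Distress is checked first; the threshold of two words is an assumption.
    if sum(w in ABSOLUTIST_WORDS for w in words) >= 2:
        return "get_help"
    if any(re.search(pattern, text) for pattern in FACE_PATTERNS):
        return "identify_face"
    return None


def route(utterance: str, model_router: Callable[[str], str]) -> str:
    """Try the deterministic rules, then fall back to the model's tool choice."""
    return pre_route(utterance) or model_router(utterance)
```

Here model_router stands in for the fine-tuned Gemma call; it only runs when no deterministic rule matches, which keeps safety-critical intents out of the model's hands.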

Training

  • Base: Gemma 4 E2B (instruction-tuned)
  • Method: Unsloth LoRA (all-linear, r=16, α=32), then merged into the base for export
  • Corpus: 3,198 hand-curated turns across 19 waves, generated by ~9 Flutter instances running in parallel on an M4 MacBook (Claude played the patient, Gemma played the companion), LLM-judged per batch
  • Doctrine: never quiz, just tell. Backed by Tom Kitwood's Dementia Reconsidered (1997) and the DAWN Method's guidance against recall-testing
  • Distress detection: absolutist-word patterns from Al-Mosaiwi & Johnstone (2018)
  • Clinical grounding: Cochrane review (Woods et al., 2018) on reminiscence therapy, and DEMENTIA-PLAN (2025) on knowledge-graph-grounded dialogue
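For readers unfamiliar with the "merged into the base" step, the arithmetic behind a LoRA merge is small enough to show directly. The sketch below is plain Python on toy matrices, not the Unsloth merge routine; it assumes the standard LoRA formulation, where the update is scaled by alpha / r, so the r=16, α=32 setting above corresponds to a scale of 2.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product, enough for toy-sized examples."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank.

    Shapes: W is (out, in), B is (out, r), A is (r, in).
    """
    r = len(lora_a)  # rank = number of rows of A
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]
```

After the merge the adapter matrices are no longer needed at inference time, which is why the exported bundle contains a single set of weights.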

Export

  • Quantization: dynamic int8 weights, fp32 activations (wi8 afp32). wi4 collapsed output quality; wi8 was the sweet spot that kept the persona while still loading on GPU
  • Format: .litertlm bundle, 12-section layout, SP_Tokenizer spliced from the base bundle so prefill/decode token IDs stay aligned with the LiteRT-LM runtime
  • Runtime: flutter_gemma on Android/macOS, GPU backend, full delegation (865/865 + 1533/1533 nodes, single partition each), warm-up ~2.5 s on Pixel 9 Pro
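To make "int8 weights, fp32 activations" concrete, here is a generic sketch of symmetric per-row int8 weight quantization with floating-point activations. It is not the LiteRT converter's code, and the per-row granularity and the symmetric [-127, 127] range are assumptions; Python floats are also 64-bit, so this shows the scheme rather than the exact numerics.

```python
from typing import List, Tuple


def quantize_row_int8(row: List[float]) -> Tuple[List[int], float]:
    """Symmetric int8 quantization of one weight row (one output channel)."""
    max_abs = max(abs(v) for v in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float weights."""
    return [v * scale for v in q]


def linear_wi8(x: List[float], q_rows: List[List[int]], scales: List[float]) -> List[float]:
    """y = W x with int8 weights rescaled per row; activations x stay in float."""
    return [
        scale * sum(x_i * q_i for x_i, q_i in zip(x, q))
        for q, scale in zip(q_rows, scales)
    ]
```

Each weight is recovered to within half a quantization step (scale / 2). A 4-bit scheme has far fewer levels per row, which is one plausible reason the wi4 export lost quality.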

The full export recipe, including the dead ends (--task text_generation SEGVs in embedding_lookup, --experimental_lightweight_conversion causes MUL(const, const) GPU rejection, add_hf_tokenizer produces token salad), is on our blog and in the working Kaggle notebook: Fine-Tuned Gemma 4 on Mobile - LiteRT Solution.

ArcFace: cross-age face embedder

Fine-tuned from the InsightFace w600k_r50 ArcFace ResNet50 backbone (the buffalo_l model pack, from insightface/python-package).

Training

  • Loss: CosFace classification (s=15, m=0.35). A pair-contrastive loss collapsed; switching loss families fixed it
  • Data: 68,000 images across 8,651 identities from six datasets: FG-NET (cross-age), Families in the Wild (kinship hard-negatives), CFP-FP and CPLFW (frontal-vs-profile), RMFRD and MLFW (masked faces)
  • Tricks: BN frozen, head warmup, full backbone unfrozen after warmup
  • Output: 512-dim L2-normalized embeddings, fp16 ONNX
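The CosFace objective named in the list above can be written out as follows. This is a minimal pure-Python sketch of the standard large-margin cosine loss with the stated s=15 and m=0.35, not the training code used for this model; the function names are illustrative.

```python
import math
from typing import List


def l2_normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosface_logits(embedding: List[float], class_weights: List[List[float]],
                   target: int, s: float = 15.0, m: float = 0.35) -> List[float]:
    """Scaled cosine logits, with margin m subtracted from the target class only."""
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        logits.append(s * (cos - m if j == target else cos))
    return logits


def cross_entropy(logits: List[float], target: int) -> float:
    """Numerically stable softmax cross-entropy for a single example."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]
```

Because the margin lowers only the target logit, the loss stays non-trivial even for embeddings that are already classified correctly, which pushes identities further apart in cosine space.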

Verification accuracy (13.8K held-out pairs, cosine similarity ≥ 0.40 threshold):

| Model | Accuracy | Separation |
|---|---|---|
| w600k_r50 (stock) | 75.1% | 1.0× |
| Zekra fine-tune (this file) | 95.5% | 2.8× |

Runtime: onnxruntime on Android/macOS. Embeddings are stored in a 512-dim HNSW vector index in ObjectBox on the device and matched by cosine lookup at recognition time.
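The verification rule and the recognition-time lookup reduce to a cosine comparison against the 0.40 threshold. The sketch below uses a brute-force scan as a stand-in for the ObjectBox HNSW index, and the function names and the dictionary gallery are illustrative, not the app's API.

```python
import math
from typing import Dict, List, Optional, Tuple


def l2_normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector has no direction."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two embeddings."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def verify(a: List[float], b: List[float], threshold: float = 0.40) -> bool:
    """Same-person decision for a pair of embeddings."""
    return cosine(a, b) >= threshold


def best_match(query: List[float], gallery: Dict[str, List[float]],
               threshold: float = 0.40) -> Optional[Tuple[str, float]]:
    """Brute-force nearest neighbour; returns None when nobody clears the threshold."""
    best: Optional[Tuple[str, float]] = None
    for name, embedding in gallery.items():
        score = cosine(query, embedding)
        if best is None or score > best[1]:
            best = (name, score)
    if best is None or best[1] < threshold:
        return None
    return best
```

Returning None for sub-threshold matches matters here: telling a patient the wrong name is worse than saying nothing, so an unknown face should fall through rather than be forced onto the closest family member.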

How the two models work together

camera frame
   ↓  Google ML Kit (on-device face detection)
face crop
   ↓  ArcFace ONNX  (this repo, 83 MB)
512-dim embedding
   ↓  ObjectBox HNSW cosine search
PersonEntity match  ──→  graph lookup (relationship, recent memories, photos)
                            ↓  prompt assembly + Gemma 4 call
                       Gemma 4 LiteRT (this repo, 5.1 GB)
                            ↓
                       warm sentence (TTS optional)

No network. Photos never leave the device.
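The graph-lookup and prompt-assembly steps in the diagram above might look roughly like the following. Everything here is hypothetical: the PersonEntity fields, the prompt wording, and the two-memory cap are assumptions for illustration, and the app's real prompt is not reproduced in this README.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PersonEntity:
    """Hypothetical shape of a matched person in the family graph."""
    name: str
    relationship: str
    memories: List[str] = field(default_factory=list)


def build_prompt(person: PersonEntity, max_memories: int = 2) -> str:
    """Assemble a tell-don't-quiz prompt from the matched person's graph data."""
    lines = [
        "You are a warm companion. Tell, never quiz: do not ask the user to recall anything.",
        f"The person in view is {person.name}, the user's {person.relationship}.",
    ]
    # Only a couple of memories are included to keep the on-device prompt short.
    for memory in person.memories[:max_memories]:
        lines.append(f"Shared memory: {memory}")
    lines.append("Introduce this person in one or two gentle sentences.")
    return "\n".join(lines)
```

For example, build_prompt(PersonEntity("Ahmad", "grandson", ["Last summer at the park he laughed at the ducks."])) yields a prompt that names the person, states the relationship, and supplies one memory for the model to retell.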

How to use

The shipping integration is the Zekra Flutter app: flutter_gemma for the LLM, onnxruntime for ArcFace, and ObjectBox for the on-device knowledge graph and vector search. To copy both files onto a Pixel:

# Bundle (5.1 GB): push, then run-as cp into the app's private dir
adb push model-dyn-wi8-afp32.litertlm /data/local/tmp/model.litertlm
adb shell "run-as com.zekra.zekra cp /data/local/tmp/model.litertlm app_flutter/model.litertlm"

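# ArcFace (83 MB): pushed under the name w600k_r50.onnx, the filename this recipe installs into the app's private dir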
adb push arcface_zekra_r50_fp16.onnx /data/local/tmp/w600k_r50.onnx
adb shell "chmod 0666 /data/local/tmp/w600k_r50.onnx"
adb shell "run-as com.zekra.zekra cp /data/local/tmp/w600k_r50.onnx app_flutter/w600k_r50.onnx"

Both files land in /data/user/0/com.zekra.zekra/app_flutter/. The app picks them up on the next launch.

License

MIT (this repo). Upstream licenses apply to the base models: see the Gemma terms for the LLM base (Apache 2.0 on the litert-community mirror, Gemma terms on the Google original) and the InsightFace license for the ArcFace backbone.

Citation

@misc{zekra2026,
  title  = {Zekra: A Digital Life Story Book for Alzheimer's, Running Entirely On-Device},
  author = {El Hajj, Amanie and El Hajj, Hadi},
  year   = {2026},
  url    = {https://zekra.live},
  note   = {Kaggle Gemma 4 Good Hackathon submission, Health \& Sciences track}
}

Zekra (ذكرة) means memento in Arabic. Built for our grandmother, and one day perhaps for ourselves.
