Gemma-4-E2B-SOS-LoRA

Fine-tuned LoRA adapter for disaster triage and emergency response.
Built on Gemma 4 E2B using Unsloth + TRL SFTTrainer. Trained on a synthetic dataset of ~2000 START Protocol triage and FEMA emergency scenarios.

Base Model

unsloth/gemma-4-e2b-it-unsloth-bnb-4bit — Gemma 4 E2B (5.15B total, 2.3B effective parameters) in 4-bit NF4 quantization.

Training Details

Hardware: Kaggle T4 (16 GB VRAM, NVIDIA Tesla T4)
Framework: Unsloth 2026.5.2 + PEFT 0.18.1 + TRL SFTTrainer
Quantization: 4-bit NF4 (bitsandbytes)
LoRA Rank: 16
LoRA Alpha: 16
LoRA Dropout: 0
Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Batch Size: 2 (gradient accumulation 4 → effective batch 8)
Learning Rate: 2e-4 (cosine schedule, 10 warmup steps)
Optimizer: AdamW 8-bit
Precision: BF16 mixed precision
Training Steps: 500 (2 epochs, 2000 examples)
Final Loss: 0.1424
Total Runtime: 826 seconds (13.8 min on T4)
Trainable Parameters: 31,006,720 (0.60% of 5.15B)
Adapter Size: 124 MB (safetensors format)

Dataset

Synthetic dataset with 2000 examples across three categories:

START Protocol triage (1200 examples) — Victim assessment based on respiratory rate, pulse, capillary refill, and mental status. Outputs RED (Immediate), YELLOW (Delayed), GREEN (Minor), or BLACK (Deceased) per the START triage system.
FEMA emergency response (500 examples) — Protocols for earthquake, fire, flood, tornado, tsunami, CPR, bleeding control, burns, hypothermia, heat stroke, snake bites, chemical exposure, choking, allergic reactions, and more.
Triage edge cases (300 examples) — Respiratory distress, mass casualty scenarios, pediatric victims, pregnant patients, amputations, and multi-casualty sorting.

Usage

With Unsloth (recommended)

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name='unsloth/gemma-4-e2b-it-unsloth-bnb-4bit',
    max_seq_length=2048,
    load_in_4bit=True,
)
model.load_adapter('agp9/gemma-4-e2b-sos-lora')
FastLanguageModel.for_inference(model)

messages = [{'role': 'user', 'content': [{'type': 'text', 'text': 'START triage: Adult male found crushed under rubble. RR=6, pulse=absent, cap_refill=4s, mental=unresponsive.'}]}]
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors='pt').to('cuda')
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))

With PEFT

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("unsloth/gemma-4-e2b-it-unsloth-bnb-4bit", device_map="auto")
model = PeftModel.from_pretrained(base, "agp9/gemma-4-e2b-sos-lora")
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-4-e2b-it-unsloth-bnb-4bit")

Inference Results

Example outputs from the fine-tuned model:

Input	Output
RR=6, pulse=absent, unresponsive	RED - Immediate
RR=22, pulse=present, cap_refill=2s, alert	YELLOW - Delayed
"What to do during an earthquake?"	DROP, COVER, HOLD ON.
"How do I stop severe bleeding?"	Direct pressure. Elevate. Tourniquet as last resort.

Training Notebook

Kaggle Notebook

Project

Gemma-SOS — An offline Android app for disaster response. Runs Gemma 4 E2B on-device via LiteRT-LM. Features:

START Protocol triage (instant local engine + LLM)
SOS beacon with GPS coordinates
QR-based mesh sync for patient data
Offline maps with resource finder
Wreckage analyzer (camera-based structural assessment)

Competition

This adapter was created for the Gemma 4 Good Hackathon (Google DeepMind x Kaggle) — Unsloth Special Technology Track.

Environmental Impact

Hardware: NVIDIA Tesla T4 (Kaggle)
Training Time: ~14 minutes
Cloud Provider: Kaggle (Google Cloud)
Estimated CO₂: <0.1 kg CO₂eq

Downloads last month: 102