Scam Sentinel — Fine-tuned Gemma 4 E2B for Multimodal Scam Risk Detection
LoRA adapter fine-tuned on Gemma 4 E2B-it for the Gemma 4 Good Hackathon (2026), Safety & Trust + Unsloth tracks.
This is not a final forensic deepfake detector. It is a multimodal scam risk assistant that combines phone call transcript analysis, conversation patterns, and verification workflows.
Headline Results (300-sample real evaluation, apples-to-apples)
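All three rows use the same 300-sample real test set, no RAG, and an identical v3 system prompt. The intended variables are base-model size and the presence of the LoRA adapter; the inference stacks also differ slightly (see the note on comparison fairness under Evaluation).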
| Setup | Size | Accuracy | Precision | Recall | F1 | FPR |
|---|---|---|---|---|---|---|
| Gemma 4 E4B base | ~8B | 53.0% | 46.9% | 97.6% | 63.4% | 78.9% |
| Gemma 4 E2B base | ~5B | 41.7% | 41.4% | 96.8% | 58.0% | 97.7% |
| Gemma 4 E2B + QLoRA (this adapter) | ~5B | 89.7% | 98.0% | 76.8% | 86.1% | 1.1% |
Key findings
- Same-size apples-to-apples (E2B base → E2B + QLoRA): F1 rises +28.1 pt (58.0 → 86.1), FPR falls by roughly 88× (97.7% → 1.1%), and Precision more than doubles (41.4% → 98.0%).
- The Gemma 4 base models are unusable for this task without fine-tuning: both flag the vast majority of normal messages as suspicious (FPR 78.9% and 97.7%). The instruction-tuned base appears to have no domain prior for scam vs. normal text.
- Fine-tuning beats raw scale: the fine-tuned 5B model outperforms the larger 8B base by +22.7 F1 points (63.4 → 86.1).
- Recall trade-off is intentional: 96.8% (E2B base) → 76.8% (fine-tuned). See "Design rationale" below — the production cascade's Stage 1 retains high recall.
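The headline percentages can be cross-checked against a binary confusion matrix. The counts below are not published in this card: they are an assumption, inferred from the reported rates and from a test set of 175 safe and 125 non-safe samples. A minimal sketch of the arithmetic:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts; 'positive' means flagged as dangerous."""
    tp: int
    fp: int
    fn: int
    tn: int

    def metrics(self) -> dict[str, float]:
        total = self.tp + self.fp + self.fn + self.tn
        precision = self.tp / (self.tp + self.fp)
        recall = self.tp / (self.tp + self.fn)
        return {
            "accuracy": (self.tp + self.tn) / total,
            "precision": precision,
            "recall": recall,
            "f1": 2 * precision * recall / (precision + recall),
            "fpr": self.fp / (self.fp + self.tn),
        }


# ASSUMPTION: counts inferred from the reported percentages, not taken from the card.
SETUPS = {
    "E4B base": Confusion(tp=122, fp=138, fn=3, tn=37),
    "E2B base": Confusion(tp=121, fp=171, fn=4, tn=4),
    "E2B + QLoRA": Confusion(tp=96, fp=2, fn=29, tn=173),
}

for name, cm in SETUPS.items():
    row = {k: round(100 * v, 1) for k, v in cm.metrics().items()}
    print(name, row)
```

Under these assumed counts the rounded values match every cell of the table, which suggests the reported metrics are internally consistent.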
Model Details
- Developed by: Alice0914 (Gemma 4 Good Hackathon submission)
- Base model: unsloth/gemma-4-E2B-it (~5B params, MatFormer architecture)
- Adapter type: LoRA (PEFT) — 28.7M trainable params (0.56% of base)
- Training framework: Unsloth + TRL SFTTrainer
- Quantization at training: 4-bit NF4 (QLoRA)
- License: Apache 2.0
- Language: English
- Project: Scam Sentinel GitHub repo
Intended Use
Direct Use
Analyze SMS, email, or transcribed phone-call messages and output structured JSON containing:
- risk_level: safe / low / medium / high / critical
- patterns: Detected scam patterns (urgency, impersonation, secrecy, etc.)
- user_message: Plain-language explanation answering "Is this a scam? Why? What to do? How to verify?"
- tool_calls: Function calls into 12 protective tools (notify family, suggest callback, block payment, etc.)
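Because the model emits its reasoning before the JSON, a caller typically needs to pull the JSON object out of the raw text and validate it. The sketch below is one way to do that, not the project's own parser. The four key names and the five risk levels come from this card; the shape of each `tool_calls` entry (`name`, `arguments`) and the sample output are hypothetical.

```python
import json

RISK_LEVELS = ("safe", "low", "medium", "high", "critical")
REQUIRED_KEYS = ("risk_level", "patterns", "user_message", "tool_calls")


def parse_analysis(raw: str) -> dict:
    """Return the first JSON object in `raw` that carries all expected keys.

    Raises ValueError if no such object exists or risk_level is unknown.
    """
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and all(k in obj for k in REQUIRED_KEYS):
            if obj["risk_level"] not in RISK_LEVELS:
                raise ValueError(f"unknown risk_level: {obj['risk_level']!r}")
            return obj
        start = raw.find("{", start + 1)
    raise ValueError("no valid analysis JSON found")


# Hypothetical model output: free-text reasoning followed by the JSON payload.
sample = (
    "IDENTIFY: urgent money request from a claimed family member.\n"
    '{"risk_level": "high", "patterns": ["urgency", "impersonation"], '
    '"user_message": "Call your child on a known number first.", '
    '"tool_calls": [{"name": "suggest_callback", "arguments": {}}]}'
)
result = parse_analysis(sample)
print(result["risk_level"], result["patterns"])
```

Since the card reports a JSON parsing success rate of 95.3%, callers should expect and handle the `ValueError` path rather than assume every response parses.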
Out-of-Scope Use
- Voice authenticity / deepfake audio detection (use a dedicated audio model)
- Languages other than English
- Real-time telephony interception (requires phone-system integration)
- Replacement for human judgment in financial decisions
How to Use
from unsloth import FastLanguageModel

# Load the adapter on top of its base model in 4-bit.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Alice0914/gemma4-e2b-scam-sentinel",
    max_seq_length=1024,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

# (Load the full system prompt from the project repo)
system_prompt = "..."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "ANALYZE THIS INPUT:\n\nTEXT: Mom, send $500 right now\nMETADATA: {\"channel\": \"sms\"}"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text=text, return_tensors="pt").to("cuda")
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.3, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
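The user message in the example above embeds hand-escaped JSON metadata. A small helper can build the same string with `json.dumps`, which avoids escaping mistakes. This is a sketch: `build_user_message` is a hypothetical name, and only the `channel` key appears in this card, so any other metadata keys would be an assumption about what the system prompt expects.

```python
import json


def build_user_message(text: str, metadata: dict) -> str:
    """Format one input in the same layout as the example prompt."""
    return f"ANALYZE THIS INPUT:\n\nTEXT: {text}\nMETADATA: {json.dumps(metadata)}"


msg = build_user_message("Mom, send $500 right now", {"channel": "sms"})
print(msg)
```

The result can be passed as the `content` of the `user` entry in `messages`.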
Training Details
- Training Data
- 3,100 chat-formatted samples (system + user + assistant)
- Generated from 80 hand-written seeds and 571 real UCI SMS Spam samples, expanded with Gemma 4 paraphrased variants
- 8 categories: family_impersonation, prosecutor_scam, bec_scam, romance_scam, package_scam, bank_phishing, phishing_link, normal
- Assistant responses follow a 5-step Chain-of-Thought (IDENTIFY → ASSESS → EXPLAIN → DECIDE TOOLS → ANSWER) + JSON output format
- Train / dev split: 3,100 / 771 (stratified by category)
- Training Hyperparameters
- Method: QLoRA (4-bit NF4 base + LoRA r=16)
- LoRA: r=16, alpha=32, dropout=0.05, target_modules="all-linear"
- Batch: 1 × grad_accum 8 (effective batch 8)
- Epochs: 2 (~775 steps)
- Learning rate: 2e-4, cosine schedule, warmup_ratio=0.03
- Optimizer: paged_adamw_8bit
- Precision: bf16 (compute) / NF4 (base weights)
- Max sequence length: 1024
- Random seed: 3407
- Hardware
- GPU: Google Colab Pro L4 (22.5 GB VRAM)
- Training time: ~50 minutes for 2 epochs
- Framework versions: PEFT 0.19.1, transformers ≥4.50, trl ≥1.4, Unsloth (latest from GitHub)
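The step count and parameter fraction quoted in the training details follow from simple arithmetic. The sketch below reproduces it using only numbers from this card; the variable names are illustrative.

```python
# Values taken from the training details.
num_train_samples = 3_100
per_device_batch = 1
grad_accum_steps = 8
epochs = 2
trainable_params = 28.7e6
trainable_fraction = 0.0056  # 0.56% of the base model

effective_batch = per_device_batch * grad_accum_steps
optimizer_steps = num_train_samples * epochs / effective_batch
implied_total_params = trainable_params / trainable_fraction

print(effective_batch)                        # 8
print(optimizer_steps)                        # 775.0
print(f"{implied_total_params / 1e9:.1f}B")   # 5.1B
```

The implied total of about 5.1B parameters agrees with the "~5B" size quoted for the base model.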
Evaluation
- Testing Data
- Held-out set of 300 hand-labeled real samples
- Distribution: 175 safe / 7 low / 79 medium / 26 high / 13 critical
- Sources: FTC consumer-fraud reports, UCI SMS Spam Collection (training-disjoint subset), custom edge cases
- The evaluation set is disjoint from training via the seeds_real.jsonl filter, verified by hash check
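A hash-based disjointness check of the kind mentioned above could look like the following. This is an illustrative sketch, not the project's actual filter: the `text` field name, the normalization (lowercasing and whitespace collapsing), and the toy records are all assumptions.

```python
import hashlib
import json
from typing import Iterable


def text_hash(text: str) -> str:
    """SHA-256 of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def load_hashes(jsonl_lines: Iterable[str], field: str = "text") -> set[str]:
    """Hash the `field` of every non-empty JSONL record."""
    return {text_hash(json.loads(line)[field]) for line in jsonl_lines if line.strip()}


def overlap(train_lines: Iterable[str], eval_lines: Iterable[str]) -> set[str]:
    """Hashes present in both sets; an empty result means the sets are disjoint."""
    return load_hashes(train_lines) & load_hashes(eval_lines)


# Toy records, not from the real data.
train = ['{"text": "Your parcel is held, pay the fee here"}']
evalset = [
    '{"text": "Lunch at noon tomorrow?"}',
    '{"text": "your parcel  is held, pay the fee here"}',
]
print(len(overlap(train, evalset)))  # 1: the second eval record duplicates a training record
```

Exact hashing only catches duplicates that survive normalization; paraphrased near-duplicates would need a fuzzier comparison.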
- Metrics (this adapter)
Binary danger-vs-safe (matches the project's baseline reporting protocol):
| Metric | Value |
|---|---|
| Accuracy | 89.7% |
| Precision | 98.0% |
| Recall | 76.8% |
| F1 | 86.1% |
| FPR | 1.1% |
| JSON parsing success | 95.3% (286/300) |
- Strict 5-class match: 69.0%. The model occasionally over-classifies within the dangerous range (e.g., medium → high), which is the preferred failure mode for a safety-critical app.
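The binary and strict scores differ only in how the five risk levels are compared. The sketch below shows both views side by side. It assumes that every level other than `safe` counts as dangerous, which is inferred from the test distribution (7 + 79 + 26 + 13 = 125 non-safe samples) rather than stated in the card; the toy pairs are not evaluation data.

```python
LEVELS = ("safe", "low", "medium", "high", "critical")


def is_dangerous(level: str) -> bool:
    """Binary view: anything other than 'safe' counts as dangerous (assumed)."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return level != "safe"


def score(pairs):
    """pairs: iterable of (gold, predicted) levels -> (binary_acc, strict_acc)."""
    pairs = list(pairs)
    binary = sum(is_dangerous(g) == is_dangerous(p) for g, p in pairs)
    strict = sum(g == p for g, p in pairs)
    return binary / len(pairs), strict / len(pairs)


# Toy data, not from the evaluation set.
toy = [("safe", "safe"), ("medium", "high"), ("high", "high"), ("low", "safe")]
print(score(toy))  # (0.75, 0.5)
```

A medium → high prediction counts as correct in the binary view but as a miss in the strict view, which is why the strict score is lower.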
- What Fine-tuning Changed (E2B base → E2B + QLoRA)
| Behavior | Base (E2B ~5B) | Fine-tuned (E2B ~5B) | Δ |
|---|---|---|---|
| FPR | 97.7% | 1.1% | 88× reduction |
| Precision | 41.4% | 98.0% | +56.6 pt |
| Accuracy | 41.7% | 89.7% | +48.0 pt |
| F1 | 58.0% | 86.1% | +28.1 pt |
| Recall | 96.8% | 76.8% | −20.0 pt (intentional trade-off) |
- The base instruction-tuned model has no in-domain prior for "what does a normal message look like?" — it flags 97.7% of safe messages as suspicious
- Fine-tuning re-calibrates the decision boundary using 3,100 in-domain examples
- The recall reduction is a deliberate trade-off favoring user trust over raw catch rate
- Fine-tuning vs Raw Scale (E4B base → E2B + QLoRA)
| Behavior | E4B base (~8B) | E2B + QLoRA (~5B) | Δ |
|---|---|---|---|
| F1 | 63.4% | 86.1% | +22.7 pt |
| FPR | 78.9% | 1.1% | 72× reduction |
| Precision | 46.9% | 98.0% | +51.1 pt |
- A fine-tuned smaller model decisively outperforms a larger base model on this task
- Suggests that, on this task, domain adaptation matters more than scale for safety-critical classification with limited training compute
- Total cost: one Colab L4 session, ~50 minutes
Note on comparison fairness: All three setups use the same 300-sample test set and an identical v3 system prompt, with no RAG. Base models use Ollama Q4_K_M quantization; the fine-tune uses Unsloth NF4. Both are 4-bit, so quantization differences should contribute only marginally; the +28.1 F1 / 88× FPR delta is attributed to the adapter rather than to quantization or size.
- Design Rationale: Precision over Recall
In the Scam Sentinel production system, this adapter is Stage 2 of a two-stage cascade:
- Stage 1: a fast classifier (e.g., gemma3:4b) is tuned to escalate nearly every potentially dangerous message (recall 99%+)
- Stage 2: this fine-tuned adapter provides high-confidence reasoning and tool calls only when action is warranted (precision 98%)
- Stage 1 handles "catch everything"
- Stage 2's job is to justify action: blocking payments, alerting family, demanding callback verification
- With 1.1% FPR, users can trust the downstream actions triggered when this model flags a message
- Pushing for higher recall at this stage would risk re-introducing the user-trust collapse seen in the base model (FPR 97.7%), making the product unusable in real deployment regardless of recall
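The two-stage routing described above can be sketched as a small function. Everything here is illustrative: the stage callables, the keyword list, the `decided_by` field, and the `unknown` fallback are hypothetical stand-ins, not the production implementation.

```python
from typing import Callable, Optional

Stage1 = Callable[[str], bool]            # True -> escalate to Stage 2
Stage2 = Callable[[str], Optional[dict]]  # parsed analysis, or None on failure


def run_cascade(message: str, stage1: Stage1, stage2: Stage2) -> dict:
    """Route a message through both stages and report which one decided."""
    if not stage1(message):
        return {"decided_by": "stage1", "risk_level": "safe", "tool_calls": []}
    analysis = stage2(message)
    if analysis is None:
        # Stage 2 output could not be parsed: keep the escalation visible
        # rather than silently treating the message as safe.
        return {"decided_by": "stage2", "risk_level": "unknown", "tool_calls": []}
    return {"decided_by": "stage2", **analysis}


# Stand-in stages for illustration only.
def keyword_stage1(message: str) -> bool:
    return any(w in message.lower() for w in ("send", "urgent", "verify", "gift card"))


def fake_stage2(message: str) -> dict:
    return {"risk_level": "high", "tool_calls": [{"name": "suggest_callback"}]}


print(run_cascade("See you at dinner", keyword_stage1, fake_stage2)["decided_by"])
print(run_cascade("Mom, send $500 right now", keyword_stage1, fake_stage2)["risk_level"])
```

In this layout Stage 2 is only invoked for escalated messages, so its lower recall applies to what Stage 1 passes on, while Stage 1 carries the "catch everything" responsibility.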
Bias, Risks, and Limitations
- Language
- English-only: Trained on English text; performance on other languages is not validated
- Classification Behavior
- Over-classification bias within the dangerous range: The model leans toward "more dangerous" classifications (e.g., medium → high)
- This is intentional: once false positives on safe messages are nearly eliminated, the safer error mode within non-safe messages is to over-classify
- Downstream tools (wait timer, callback verification) make over-classification cheap to recover from
- Recall Trade-off
- Some borderline messages may be missed
- Recommended deployment pairs this adapter with a high-recall first-pass classifier (cascade Stage 1)
- Training Data Provenance
- Synthetic origin: 80% of training data was Gemma-paraphrased from hand-written seeds and real UCI SMS spam
- Evaluation uses only real held-out data, so that overfitting to the synthetic style would show up in the results
- Tool Calls Are Advisory
- The 12 protective tools are recommended actions
- Downstream systems must enforce safety policies independently
- The model does not execute actions — it returns structured intent
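Since the model only returns intent, the host system decides what actually runs. One way to enforce that is an explicit policy table applied to each proposed call. The sketch below is an assumption-laden illustration: the tool names, the three policy outcomes, and the `name` key are hypothetical, and the real tool set lives in the project repo.

```python
from dataclasses import dataclass

# Hypothetical tool names and policy, for illustration only.
AUTO_ALLOWED = {"suggest_callback", "notify_family"}
NEEDS_CONFIRMATION = {"block_payment"}


@dataclass(frozen=True)
class Decision:
    tool: str
    action: str  # "execute", "ask_user", or "reject"


def review_tool_calls(tool_calls: list[dict]) -> list[Decision]:
    """Map each model-proposed call to a decision made by the host system."""
    decisions = []
    for call in tool_calls:
        name = call.get("name", "")
        if name in AUTO_ALLOWED:
            decisions.append(Decision(name, "execute"))
        elif name in NEEDS_CONFIRMATION:
            decisions.append(Decision(name, "ask_user"))
        else:
            decisions.append(Decision(name, "reject"))  # unknown or malformed
    return decisions


proposed = [{"name": "notify_family"}, {"name": "block_payment"}, {"name": "wire_funds"}]
for d in review_tool_calls(proposed):
    print(d.tool, d.action)
```

Rejecting anything outside the known sets keeps a malformed or unexpected model output from triggering an action by default.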
Citation
Project: Scam Sentinel — submission for the Gemma 4 Good Hackathon (2026).
@misc{scam-sentinel-2026,
author = {Alice0914},
title = {Scam Sentinel: Multimodal Scam Risk Assistant with Fine-tuned Gemma 4 E2B},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Alice0914/gemma4-e2b-scam-sentinel}},
note = {Submission for the Gemma 4 Good Hackathon}
}
Framework Versions
- PEFT 0.19.1
- Unsloth (latest from GitHub)
- transformers ≥4.50
- trl ≥1.4