Araras-Gemma4-E4B — Clinical Decision Support for Brazilian Rare-Disease Care
The first Gemma 4 fine-tune purpose-built as offline clinical decision support for healthcare professionals working in Brazil's SUS (Sistema Único de Saúde).
"It took me twenty years to get my rare-disease diagnosis. That can't be changed for me. But for the next extremely rare child who walks into a UBS tomorrow, the doctor who sees her doesn't have to be alone." (Dimas, founder of Raras, rare-disease patient; myoclonic dystonia, ORPHA:36899)
Submission to the Gemma 4 Good Hackathon.
⚕️ Intended use
Clinical decision support for licensed healthcare professionals (physicians, nurses, community health workers, geneticists) caring for patients in Brazil, especially in remote regions that rare-disease expertise does not reach.
- Not a patient-facing diagnostic tool
- Not a replacement for clinical judgment
- Augments the physician with structured differentials, PCDT lookups, and CEAF medication info
- Runs offline on smartphone, laptop, or any device that runs llama.cpp / Ollama / MLX
Compliance posture
Positioned as Software as a Medical Device (SaMD) for clinical decision support under ANVISA Res. 657/2022, 751/2022, and 830/2023. Aligned with CFM guidance: AI augments physicians and never replaces them. Every clinical claim is grounded in verifiable public sources (gov.br/conitec, bvsms, PubMed).
Model details
- Base: unsloth/gemma-4-E4B-it (8B total / 4.5B effective parameters)
- Method: QLoRA SFT via Unsloth with r=8, α=8, dropout=0, NEFTune α=5, train_on_responses_only
- Training data: 120,740 train / 5,137 val examples (deduplicated), all PT-BR rare-disease content:
- 108k curated rare-disease Q&A (HPO/OMIM/ORPHA-grounded)
- 26k MedPT PT-BR clinical pairs
- 10k ultra-rare disease longitudinal cases
- 10k tool-calling examples (HPO normalize, PCDT lookup, CID-10 map)
- 5.7k knowledge-graph triples (RarasNet Neo4j — 10,468 rare diseases)
- 3.2k FindZebra hard cases + 1.3k ReDis-QA
- Compute: A100 80GB on Vertex AI, ~2 hours, ~$25 total cost
- Native Gemma 4 features used:
- <|channel>thought thinking blocks (toggleable per request)
- 128K context (fits the full patient report (laudo) + PCDT + retrieved cases)
- System role for safety constraints
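To make the LoRA settings listed under Method more concrete, here is a minimal sketch of what r=8 and α=8 imply, assuming standard LoRA scaling (α / r, not the rank-stabilized variant). The 2048 × 2048 layer shape is hypothetical and chosen only for illustration; the model card does not list layer dimensions.

```python
def lora_scale(alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update: W' = W + (alpha / r) * B @ A."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)


# Values from the model card: r=8, alpha=8.
scale = lora_scale(alpha=8, r=8)

# Hypothetical 2048 x 2048 projection, for illustration only.
extra = lora_param_count(d_in=2048, d_out=2048, r=8)

print(scale, extra)
```

With α equal to r the low-rank update is applied unscaled, and each adapted matrix adds only r · (d_in + d_out) trainable weights, which is why the adapter stays small relative to the base model.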
Companion artifacts
| Repo | Format | Size | Use case |
|---|---|---|---|
| Raras-AI/araras-gemma4-e4b-v4-sota | PEFT LoRA adapter | 147 MB | This — apply to base Gemma 4 E4B |
| Raras-AI/araras-gemma4-e4b-v4-gguf | GGUF Q4_K_M / Q5_K_M | 5.3 / 5.7 GB | llama.cpp, Ollama, edge |
| Raras-AI/araras-hpo-brasil | sentence-transformers | 340 MB | PT-BR clinical → HPO matcher (companion) |
| Raras-AI/araras-hpo-brasil-int8 | ONNX INT8 | 85 MB | Edge HPO matching |
| 🤗 RareBench-BR (public) | JSONL dataset | 2.5 MB | 835 SUS-grounded validated cases |
End-to-end pipeline
PT-BR free text (medical report, patient chart, consultation transcript)
↓
[1] 🧬 araras-hpo-brasil (BioLORD-2023 fine-tune for PT-BR)
Normalizes regional clinical idiom → HPO codes
"amarelão" → HP:0000952 · "bebê molinho" → HP:0001252
↓
[2] 🧠 araras-gemma4-e4b Q4_K_M (this model, 5.3 GB, llama.cpp)
Generates the top-5 ranked differentials in PT-BR
↓
[3] ✅ Hybrid canonical ORPHA resolver (production technique, e.g. MedCAT/scispaCy)
Tier 1: strict substring match on 10,468-keyword PT-BR dict (~30ms)
Tier 2: BioLORD semantic fallback @ cosine ≥ 0.78 (~50ms, only if Tier 1 fails)
Returns None (honest abstention) if neither tier matches confidently
↓
[4] 📋 PCDT overlay: 24 structured Ministry of Health PCDTs → CEAF + referral center
↓
Output: structured clinical decision support (differentials + PCDT + SUS clinical management + referral center)
Total stack footprint: 5.5 GB. Runs offline on iPhone, Android, or a laptop with no cloud dependency, so patient data stays on the device and LGPD exposure is minimized.
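The two-tier ORPHA resolver in step [3] can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the one-entry keyword dictionary is a toy stand-in for the 10,468-keyword dictionary, and `embed` is a hypothetical character-trigram embedding used in place of BioLORD so that the example runs without any model download. Only the 0.78 threshold, the tier order, and the abstention behaviour come from the pipeline description.

```python
import math
from collections import Counter
from typing import Optional

# Toy stand-in for the PT-BR keyword dictionary (keyword -> ORPHA code).
KEYWORDS = {"distonia mioclônica": "ORPHA:36899"}

COSINE_THRESHOLD = 0.78  # threshold quoted in the pipeline description


def embed(text: str) -> Counter:
    """Hypothetical embedding: character-trigram counts (not BioLORD)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve_orpha(name: str) -> Optional[str]:
    text = name.lower()
    # Tier 1: strict substring match against the keyword dictionary.
    for keyword, code in KEYWORDS.items():
        if keyword in text:
            return code
    # Tier 2: semantic fallback, accepted only at or above the threshold.
    query = embed(text)
    best_code, best_score = None, 0.0
    for keyword, code in KEYWORDS.items():
        score = cosine(query, embed(keyword))
        if score > best_score:
            best_code, best_score = code, score
    if best_score >= COSINE_THRESHOLD:
        return best_code
    return None  # abstain rather than guess


print(resolve_orpha("Suspeita de distonia mioclônica"))  # Tier 1 match
print(resolve_orpha("distonia mioclonica"))              # Tier 2 (missing accent)
print(resolve_orpha("febre amarela"))                    # no confident match
```

Returning `None` instead of the nearest candidate keeps a low-confidence match from being presented as a canonical code.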
Evaluation — RareBench-BR
We built RareBench-BR, the first benchmark to combine diagnosis (Dx) and SUS clinical management (conduta) in PT-BR. It is anchored in 52,343 real, anonymized SUS patient trajectories (APAC, CNS-linked, 2017-2020+).
Results on full unified RareBench-BR_SUS (833 cases, 0 errors)
| Metric | Araras-Gemma4 (Q4_K_M offline, 4.5B) | DeepSeek V4 (cloud, ~600B, 36-case subsample) |
|---|---|---|
| R@1 (clinical name) | 41.2% | 86.1% |
| R@3 | 47.1% | 91.7% |
| 🔥 Track B PCDT-correct | 76.8% (331/431) | 91.7% (33/36) |
| Errors | 0 | n/a |
| Latency p50 | 6.5s | 4.1s |
| Cost per query | $0 (local) | ~$0.001 |
| Params | 4.5B | ~600B |
| Device | phone | data center |
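For readers unfamiliar with the R@1 and R@3 rows, the sketch below shows one common way to compute recall at k, assuming a case counts as a hit when the ground-truth diagnosis name appears among the top-k ranked differentials (compared case-insensitively). The three cases are made up for illustration and are not drawn from RareBench-BR; the benchmark's actual name-matching rules are not described in this card.

```python
def recall_at_k(predictions, truths, k):
    """Fraction of cases whose ground-truth name is in the top-k ranked list."""
    hits = sum(
        truth.casefold() in (p.casefold() for p in ranked[:k])
        for ranked, truth in zip(predictions, truths)
    )
    return hits / len(truths)


# Made-up ranked differentials, one list per case (illustrative only).
predictions = [
    ["Fenilcetonúria", "Tirosinemia", "Homocistinúria"],
    ["Doença de Gaucher", "Doença de Fabry", "Anemia falciforme"],
    ["Doença de Pompe", "Distrofia muscular de Duchenne", "AME tipo 2"],
]
truths = ["Fenilcetonúria", "Anemia falciforme", "AME tipo 1"]

print(recall_at_k(predictions, truths, k=1))
print(recall_at_k(predictions, truths, k=3))
```

In this toy set, only the first case is a hit at k=1, while the second case becomes a hit at k=3 because the correct diagnosis is ranked third.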
Per-layer breakdown (full 833 cases):
| Layer | N | R@1 | R@3 |
|---|---|---|---|
| L3_v2 — PCDT atypical presentations | 135 | 27.4% | 32.6% |
| L4 — Hard BR (tropical phenocopies, founder, IEI) | 79 | 17.7% | 19.0% |
| L5_v2 — SUS-grounded synthesis | 619 | 47.2% | 53.8% |
| L5_realsus standalone (common SUS rare diseases) | 240 | 70.4% | 78.3% |
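As a consistency check, the headline R@1 and R@3 can be recovered from the per-layer table as a case-weighted average. This sketch assumes the three layers L3_v2, L4, and L5_v2 make up the 833-case unified benchmark and that the L5_realsus row is reported separately (its 240 cases are not added to the total).

```python
# (layer, number of cases, R@1 %, R@3 %) copied from the per-layer table.
layers = [
    ("L3_v2", 135, 27.4, 32.6),
    ("L4", 79, 17.7, 19.0),
    ("L5_v2", 619, 47.2, 53.8),
]

total = sum(n for _, n, _, _ in layers)
r1 = sum(n * r for _, n, r, _ in layers) / total  # case-weighted R@1
r3 = sum(n * r for _, n, _, r in layers) / total  # case-weighted R@3

print(total, round(r1, 1), round(r3, 1))
```

The weighted averages round to 41.2% and 47.1% over 833 cases, matching the unified results table.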
Why Track B = 76.8% is the headline: in roughly 3 out of every 4 cases where the ground truth specifies a CEAF-dispensed medication, Araras recommends the exact molecule SUS actually pays for. To our knowledge, no other model reports this number, because none has been trained on this data. The cloud frontier model (DeepSeek, ~600B) reaches 91.7% on its 36-case subsample; Araras reaches 76.8% with roughly 130× fewer parameters, fully offline, at $0 marginal cost.
Compared with the prior Qwen3.5-9B baseline on L1 (16.6% R@1 at 64 s per query), Araras-Gemma4 is about 2.5× more accurate, 9× faster, and smaller.
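The ratios quoted in this section follow from the reported figures. The sketch below recomputes them, assuming the comparison uses the 41.2% R@1 and 6.5 s p50 latency from the results table, the ~600B and 4.5B parameter counts as listed, and the 16.6% / 64 s baseline figures.

```python
track_b_local = 331 / 431     # Araras, evaluable Track B cases
track_b_cloud = 33 / 36       # DeepSeek, 36-case subsample
param_ratio = 600e9 / 4.5e9   # ~600B vs 4.5B effective parameters
accuracy_ratio = 41.2 / 16.6  # R@1: Araras vs Qwen3.5-9B baseline
speed_ratio = 64 / 6.5        # latency: baseline 64 s vs Araras p50 6.5 s

print(f"{track_b_local:.1%} {track_b_cloud:.1%}")
print(f"{param_ratio:.0f}x {accuracy_ratio:.1f}x {speed_ratio:.1f}x")
```

Because the cloud model was scored on a 36-case subsample, its Track B figure rests on far fewer cases than the 431 evaluable cases behind the Araras figure.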
Example clinical conversations
These are real outputs from the deployed model, translated here from PT-BR. Multi-turn, with HPO normalization upstream.
Example 1: SMA type 1 (UBS in Petrolina-PE, no internet)
Physician: Female infant, 4 months old, born in Petrolina (PE), parents are cousins. Profound generalized hypotonia since birth, no head control, weak cry, difficulty feeding, absent deep tendon reflexes. SpO2 95%.
Araras (4.5s): "Hello! 👶 Based on the clinical findings and history, the picture suggests a diagnosis of SMA type 1. The classic triad (profound generalized hypotonia since birth + absent deep tendon reflexes + difficulty feeding) is highly suggestive. I recommend referral to pediatric neurology and specific laboratory workup (electroneuromyography, SMN1 genetic testing)."
Physician turn 2: How can access to treatment through SUS be expedited?
Araras (5.0s): "To speed up diagnostic and therapeutic access in SUS: (1) Urgency/Emergency Protocol: if the condition worsens, call SAMU/UPA. (2) Coordination with the Primary Care Team: ask the UBS physician for a detailed report, emphasizing the 'profound generalized hypotonia since birth' and the lost milestones. (3) Direct Consultation: request a priority appointment with Pediatric Neurology in the referral municipality/region."
Example 2: Vaso-occlusive crisis in sickle cell disease (rural UBS, BA)
Physician: Male adolescent, 15 years old, rural area of Conceição do Coité (BA). Admitted with severe chest pain + dyspnea. Positive newborn screening for HbSS, frequent painful vaso-occlusive crises. Hb 6.8, retic 12%, WBC 18k. Sat 89%.
Araras (1.8s): "🚨 ALERT: Picture highly suggestive of a severe vaso-occlusive crisis in a patient with sickle cell anemia. Awaiting the hematologist's response for crisis management and urgent transfusion assessment. Maintain maximal respiratory monitoring, oxygen therapy, and continuous analgesia."
Physician turn 2: Is this acute chest syndrome? What is the urgent management?
Araras (3.5s): "⚠️ IMMEDIATE management: (1) Oxygen 10 L/min, keep saturation >92%. (2) Potent analgesia (IV opioids) and shock management (fluids with caution). (3) Emergency transfusion of packed red blood cells if Hb <6 g/dL or high lactate. (4) Maintain cardiorespiratory monitoring and notify Hematology."
Example 3: Phenylketonuria (newborn screening, Manaus-AM)
Physician: Male newborn, 12 days old, Manaus (AM). Heel-prick test with serum phenylalanine 18 mg/dL (normal <2). Consanguineous parents.
Araras (4.3s): "🚨 HIGH-RISK ALERT. A heel-prick test with serum phenylalanine of 18 mg/dL is extremely elevated. Picture of phenylketonuria (PKU) requiring immediate confirmatory testing. Do not wait for the results to come back: hyperphenylalaninemia leads to irreversible encephalotoxicity within hours. IMMEDIATE ACTION: (1) Notify Neonatal ICU/Pediatric Neurology. (2) Start metabolic control (Phe-free formula). (3) Collect plasma and urinary amino acids."
How to use
Option A — llama.cpp (recommended for clinical edge deployment)
hf download Raras-AI/araras-gemma4-e4b-v4-gguf araras-gemma4-e4b-v4-Q4_K_M.gguf --local-dir ./
llama-server -m araras-gemma4-e4b-v4-Q4_K_M.gguf -ngl 99 -c 8192 --jinja
curl http://127.0.0.1:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"messages": [
{"role": "system", "content": "Você é ARARAS, copiloto de apoio à decisão clínica em doenças raras..."},
{"role": "user", "content": "<caso clínico em PT-BR>"}
],
"temperature": 0.5, "top_p": 0.95, "top_k": 64, "repeat_penalty": 1.15
}'
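The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above: same endpoint, same sampling parameters. The helper names `build_payload` and `ask` are hypothetical, and the system prompt is taken as a parameter because the card shows it only in truncated form. It assumes `llama-server` is already running locally on its default port.

```python
import json
import urllib.request

# Endpoint used in the curl example (llama-server, OpenAI-compatible route).
SERVER_URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_payload(case_text: str, system_prompt: str) -> dict:
    """Assemble the chat request body with the card's sampling settings."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": case_text},
        ],
        "temperature": 0.5,
        "top_p": 0.95,
        "top_k": 64,
        "repeat_penalty": 1.15,
    }


def ask(case_text: str, system_prompt: str,
        url: str = SERVER_URL, timeout: float = 120.0) -> str:
    """POST the case to the local server and return the assistant reply."""
    data = json.dumps(build_payload(case_text, system_prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the text in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Calling `ask("<caso clínico em PT-BR>", system_prompt)` with the full system prompt returns the model's reply as a string; nothing is sent over the network until `ask` is called.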
Option B — transformers + PEFT
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
base = "unsloth/gemma-4-E4B-it"
adapter = "Raras-AI/araras-gemma4-e4b-v4-sota"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
model.eval()
Option C — Full pipeline (BioLORD + Gemma + ORPHA lookup + PCDT)
git clone https://github.com/rarasAI/araras-gemma4
cd araras-gemma4 && pip install -r requirements.txt
python -m araras_gemma4.pipeline "<caso clínico>"
Limitations
- Quantization Q4 ceiling: Q4_K_M loses ~3-5% accuracy vs full bf16. Use Q5_K_M or full adapter for research.
- Ultra-rare disease coverage: training data heavily skewed toward the 12 CEAF-covered diseases. Performance drops on ultra-rare conditions (<100 cases described in BR).
- SMA subtype confusion: distinguishing SMA type 1 vs 2 vs 3 from phenotype text alone is genuinely hard without genetic typing. We document this honestly.
- ORPHA code hallucination: like all rare-disease LLMs, may emit invented ORPHA numbers. Always use the canonical lookup post-processor in the pipeline.
- Not for emergency triage: this is for longitudinal-care decision support, not ER triage.
- PT-BR primary: model trained primarily for Brazilian Portuguese clinical idiom. English/Spanish performance not characterized.
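The ORPHA-code limitation above is the reason a canonical lookup should run after generation. As a minimal sketch of that idea (not the pipeline's actual post-processor), the function below extracts every `ORPHA:<number>` mention from model output and flags those absent from a canonical set. The one-entry set and the sample output string are illustrative, and `ORPHA:0000000` is a deliberately invalid placeholder.

```python
import re

ORPHA_PATTERN = re.compile(r"ORPHA:(\d+)")


def check_orpha_codes(text: str, canonical: set) -> dict:
    """Map each ORPHA code found in `text` to whether it is canonical."""
    return {
        f"ORPHA:{number}": f"ORPHA:{number}" in canonical
        for number in ORPHA_PATTERN.findall(text)
    }


# Toy canonical set; a real deployment would load the full ORPHA list.
canonical = {"ORPHA:36899"}

# Illustrative model output containing one valid and one invented code.
output = "Hipóteses: distonia mioclônica (ORPHA:36899); outra (ORPHA:0000000)."

result = check_orpha_codes(output, canonical)
print(result)
```

Codes mapped to `False` can then be dropped or replaced by the resolver's result before anything is shown to the clinician.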
Companion: RareBench-BR_Public
Together with this model we release RareBench-BR_Public: 835 validated cases spanning 24 official Ministry of Health PCDTs plus hard Brazilian rare-disease cases. It is the first benchmark in the world to score rare-disease LLMs on Brazilian SUS conduta (PCDT + CEAF medication).
Citation
@misc{raras_gemma4_2026,
author = {Raras Team},
title = {Araras-Gemma4-E4B: Open Gemma 4 Clinical Decision Support for Brazilian Rare-Disease Care},
year = {2026},
month = {may},
publisher = {Hugging Face},
url = {https://huggingface.co/Raras-AI/araras-gemma4-e4b-v4-sota},
howpublished = {Submitted to Gemma 4 Good Hackathon, Google DeepMind / Kaggle},
}
Acknowledgments
Built on the work of Google DeepMind (Gemma 4), Unsloth (training recipe), FremyCompany (BioLORD-2023), Chen et al. (RareBench L1), Conitec / Ministry of Health Brazil (PCDTs), Casa dos Raros (model of local-clinician intervention), and the rare-disease open-science community.
Built by a Brazilian rare-disease patient for the 13 million Brazilians whom existing AI does not know exist, and for the doctors who care for them in UBSs across the country.