teach-multilingual-gemma-4-e2b-r3

LoRA adapter for Gemma 4 E2B (4-bit Unsloth base), fine-tuned for Socratic K–12 tutoring in realistic student register. Part of Canis Teach for the Gemma 4 Good hackathon (Future of Education · Unsloth · llama.cpp).

Trained on the full 51,870-dialogue multi-turn split of CanisAI/teach-r3-multilingual (dialogue field, default config). A separate adapter was trained on the 161k single-turn export; this repository publishes only the adapter trained on the 51,870 multi-turn dialogues.

Model details

Field               Value
Developed by        CanisAI (Marko Nedilko)
Adapter type        LoRA (PEFT / QLoRA)
Base model          unsloth/gemma-4-E2B-unsloth-bnb-4bit
Training framework  Unsloth + TRL
LoRA config         r=16, alpha=16, dropout=0
Training data       51,870 multi-turn dialogues
Data generation     Gemma 4 26B A4B-IT on DGX Spark + Canis.lab
Training hardware   RTX 4080 Super (full run impractical) → rented A6000, ~12 h
Languages           en, de, uk, fr, es, it (dataset coverage)
Code                https://github.com/crasyK/canis-gemma4good
Live Studio demo    https://canis.appwrite.network
Video               https://www.youtube.com/watch?v=QbxPs0jLiZY
License             Apache-2.0 (adapter). Base + derivatives: Gemma Terms of Use

Training run telemetry (loss CSVs) was not retained. The run can be reproduced from the training config in the public repo.

Intended use

  • Local / on-device Socratic tutoring (llama-server + Canis CLI)
  • Messy, short, informal student messages (slang, typos, fragments)
  • Research on small specialised tutors vs general chatbots

Out of scope: medical or legal advice, high-stakes assessment, unsupervised classroom deployment.

Training data

CanisAI/teach-r3-multilingual

Config               Rows (approx.)  Role
default              51,870          This adapter: multi-turn dialogue
adapted-hybrid       4,958           Adaption-enhanced hybrid slice
adapted-hybrid-flat  4,958           Flat single-turn variant
chat-pilot-source    300             Small adaptation experiment
chat-pilot-adapted   298             Small chat-column pilot (not this adapter's training set)

Gemma 4 on Gemma 4: 26B for synthetic data generation, E2B for this student-facing adapter.

How to use

Transformers + PEFT

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = "unsloth/gemma-4-E2B-unsloth-bnb-4bit"
adapter = "CanisAI/teach-multilingual-gemma-4-e2b-r3"

# Load the 4-bit base model, then attach the LoRA adapter on top of it.
tok = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter)

messages = [
    {"role": "system", "content": "You are a Socratic K-12 tutor. Do not give the final answer directly."},
    {"role": "user", "content": "hey kannst du mir einfach die lösung für aufgabe 3 schicken"},
]
# Tokenize directly from the chat template so special tokens are not added twice.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# temperature only takes effect when sampling is enabled.
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

See model/load_adapter.py in the submission repo.
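Because the adapter was trained on multi-turn dialogues, it is usually called with the whole conversation so far rather than a single message. The sketch below shows one way to keep that history. TutorSession and generate_fn are hypothetical names for illustration, not part of the submission repo; generate_fn stands in for whatever function turns a message list into a reply (for example, a wrapper around the generate call above).

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

SYSTEM_PROMPT = "You are a Socratic K-12 tutor. Do not give the final answer directly."


class TutorSession:
    """Hypothetical helper: keeps the running chat history for one student."""

    def __init__(
        self,
        generate_fn: Callable[[List[Message]], str],
        system_prompt: str = SYSTEM_PROMPT,
    ) -> None:
        # generate_fn is an assumption: any callable mapping messages -> reply text.
        self.generate_fn = generate_fn
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]

    def ask(self, student_text: str) -> str:
        """Append the student turn, generate a reply, and store it in the history."""
        self.messages.append({"role": "user", "content": student_text})
        reply = self.generate_fn(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Each call to ask() passes the full history to the model, so later tutor turns can refer back to what the student already tried.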

Unsloth (faster)

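from unsloth import FastModel

# Loading by adapter repo name lets Unsloth resolve the base model and attach the adapter.
model, tokenizer = FastModel.from_pretrained(
    model_name="CanisAI/teach-multilingual-gemma-4-e2b-r3",
    max_seq_length=2048,
    load_in_4bit=True,  # matches the 4-bit base the adapter was trained on
)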

Local deployment (llama.cpp / Ollama)

Merge or serve with llama.cpp / Ollama. See model/inference_ollama.md and cli/ in the submission repo.
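Once a model is being served by llama-server, it can be queried over the server's OpenAI-compatible chat endpoint. The following is a minimal sketch using only the Python standard library. It assumes llama-server is already running locally on its default port 8080 with this model loaded; build_chat_payload and ask_tutor are hypothetical helper names, and the Canis CLI in the submission repo may work differently.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
LLAMA_SERVER_URL = "http://localhost:8080/v1/chat/completions"
SYSTEM_PROMPT = "You are a Socratic K-12 tutor. Do not give the final answer directly."


def build_chat_payload(student_text, system_prompt=SYSTEM_PROMPT,
                       max_tokens=256, temperature=0.7):
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": student_text},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def ask_tutor(student_text, url=LLAMA_SERVER_URL, timeout=60):
    """POST one student message to the server and return the tutor's reply text."""
    data = json.dumps(build_chat_payload(student_text)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, ask_tutor("can u just give me the answer to q3") returns the tutor's reply as a string.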

Training procedure

  • Method: QLoRA (4-bit) via PEFT + Unsloth, supervised fine-tuning (SFT)
  • Base: unsloth/gemma-4-E2B-unsloth-bnb-4bit
  • LoRA: r=16, alpha=16
  • Data: 51,870 dialogues, default config, dialogue field
  • Hardware: rented A6000, ~12 hours (a full run on the RTX 4080 Super was projected to take ~1 week)
  • Code / configs: https://github.com/crasyK/canis-gemma4good/tree/main/training
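To make the LoRA settings above concrete, the sketch below works through what r=16, alpha=16 implies under the standard LoRA formulation: the low-rank update is scaled by alpha / r, and each adapted linear layer adds r * (in_features + out_features) trainable weights. LoraHyperparams is a hypothetical helper, not the repo's config class, and the 2048 × 2048 layer shape in the usage note is an illustrative assumption rather than a Gemma 4 dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Hypothetical container for the LoRA settings listed in this card."""

    r: int = 16
    alpha: int = 16
    dropout: float = 0.0

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / r.
        return self.alpha / self.r

    def params_for_linear(self, in_features: int, out_features: int) -> int:
        # A has shape (r, in_features); B has shape (out_features, r).
        return self.r * (in_features + out_features)


cfg = LoraHyperparams()
```

With these values cfg.scaling is 1.0, so the low-rank update is applied unscaled, and cfg.params_for_linear(2048, 2048) gives 65,536 added weights for one hypothetical 2048 × 2048 projection.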

Evaluation (honest)

  • No formal benchmark suite published for this adapter at submission time.
  • No classroom field trial of this R3 adapter or Canis Studio.
  • R1 research A/B (Qwen3-era ELMs; tools in lesson/, Canis paper) informed dataset design (e.g. short student messages, generalist vs math-only). That is not validation of this Gemma 4 checkpoint in production.
  • Re-run training/train.py for your own loss curves.
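Since the original loss CSVs were not retained, anyone re-running training may want to save them. The sketch below assumes the run exposes a Hugging Face Trainer-style log history, that is, a list of dicts in which training-step entries carry "step" and "loss" keys, as trainer.state.log_history does. Whether training/train.py exposes its trainer object this way is an assumption, and extract_losses and write_loss_csv are hypothetical helper names.

```python
import csv
from typing import Dict, Iterable, List, Tuple


def extract_losses(log_history: Iterable[Dict]) -> List[Tuple[int, float]]:
    """Keep only entries that report a training loss at a given step."""
    return [
        (int(entry["step"]), float(entry["loss"]))
        for entry in log_history
        if "step" in entry and "loss" in entry
    ]


def write_loss_csv(log_history: Iterable[Dict], path: str) -> int:
    """Write (step, loss) rows to a CSV file and return the number of rows written."""
    rows = extract_losses(log_history)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["step", "loss"])
        writer.writerows(rows)
    return len(rows)
```

After training, a call such as write_loss_csv(trainer.state.log_history, "loss.csv") would produce a file that can be plotted as a loss curve; entries without a "loss" key (evaluation or end-of-run summaries) are skipped.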

Limitations and risks

  • Small E2B model → limited factual depth; can confabulate.
  • Synthetic R3 data → generator biases; uneven per-language coverage.
  • Optimises tutoring style, not factual correctness — use teacher oversight and/or RAG.
  • Not a child-safety certification; test before school deployment.

Citation

@software{canis_teach_r3_gemma4_2026,
  author  = {Nedilko, Marko},
  title   = {teach-multilingual-gemma-4-e2b-r3: Socratic LoRA for Gemma 4 E2B},
  year    = {2026},
  url     = {https://huggingface.co/CanisAI/teach-multilingual-gemma-4-e2b-r3},
  note    = {Canis Gemma 4 Good submission, github.com/crasyK/canis-gemma4good}
}

Acknowledgments

Google DeepMind (Gemma 4), Unsloth, llama.cpp / Ollama communities, Appwrite (Studio hosting), Ollama × NVIDIA (GTC Golden Ticket / DGX Spark for R3 generation), participants in the R1 research study (Canis paper).
