Gemma4-E2B Darija North — Merged (RAFT Ready)

This is a fully merged (LoRA weights fused into base model) version of google/gemma-4-E2B-it
fine-tuned on ~60K Northern Moroccan Darija (Jebli dialect) samples.

Training

  • Base model: google/gemma-4-E2B-it
  • Dataset: Northern Darija dialect data (~60K samples)
  • Method: LoRA fine-tuning via Unsloth on Kaggle T4x2
  • LoRA rank: 16, alpha: 32
  • This model: LoRA adapter merged into base → full model, no adapter needed

RAFT Ready

Because the adapter is merged, this model can be used directly as a policy model
for RAFT (Reward rAnked Fine-Tuning) without any special adapter handling.

Dialect Features

  • Northern Darija / Jebli phonology: preserved Qaf (ق), not shifted to Gaf
  • Pronouns: uses شمالي gender-neutral forms (نتينا)
  • Spanish loanwords: النيبيرا (nevera), etc.
  • Diminutives: عايل ستيتو / عايلة ستيتوة
  • Time expressions: القايلة, etc.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ChamalyAI/gemma4-E2B-chamaliya", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("ChamalyAI/gemma4-E2B-chamaliya")
Downloads last month
12
Safetensors
Model size
4B params
Tensor type
F32
·
F16
·
U8
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ChamalyAI/gemma4-E2B-chamaliya

Finetuned
(232)
this model