Gemma4-E2B Darija North — Merged (RAFT Ready)

This is a fully merged (LoRA weights fused into base model) version of google/gemma-4-E2B-it
fine-tuned on ~60K Northern Moroccan Darija (Jebli dialect) samples.

Training

Base model: google/gemma-4-E2B-it
Dataset: Northern Darija dialect data (~60K samples)
Method: LoRA fine-tuning via Unsloth on Kaggle T4x2
LoRA rank: 16, alpha: 32
This model: LoRA adapter merged into base → full model, no adapter needed

RAFT Ready

Because the adapter is merged, this model can be used directly as a policy model
for RAFT (Reward rAnked Fine-Tuning) without any special adapter handling.

Dialect Features

Northern Darija / Jebli phonology: preserved Qaf (ق), not shifted to Gaf
Pronouns: uses شمالي gender-neutral forms (نتينا)
Spanish loanwords: النيبيرا (nevera), etc.
Diminutives: عايل ستيتو / عايلة ستيتوة
Time expressions: القايلة, etc.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ChamalyAI/gemma4-E2B-chamaliya", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("ChamalyAI/gemma4-E2B-chamaliya")

Downloads last month: 12

Safetensors

Model size

4B params

Tensor type

F32

F16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ChamalyAI/gemma4-E2B-chamaliya

Base model

google/gemma-4-E2B

Finetuned

google/gemma-4-E2B-it

Finetuned

(232)

this model