Luganda Gemma 1B IT

A fine-tuned version of Google Gemma 3 1B Instruct for English ↔ Luganda translation and Luganda conversational AI.

Highlights

  • BLEU 13.85 on English→Luganda translation, up from 0.06 on the base model
  • chrF++ 46.59, up from 8.84 on the base model (a character n-gram metric that suits morphologically rich Luganda)
  • Trained with QLoRA (4-bit quantization + LoRA adapters), so it runs on consumer GPUs
  • Adapter is only 52 MB on top of the 1B base model

Results

Model | BLEU | chrF++
Gemma 3 1B base (no fine-tuning) | 0.06 | 8.84
This model | 13.85 | 46.59

Usage

Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_id = "google/gemma-3-1b-it"
adapter_id = "AmplifiedAccess/Luganda-gemma-1b-it"

# The tokenizer comes from the base model, not the adapter repo
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the base model with the same 4-bit NF4 quantization used during training
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=True,
    ),
    device_map={"": 0},  # place the whole model on GPU 0
)

# Attach the LoRA adapter to the quantized base model and switch to inference mode
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

Translation (English → Luganda)

prompt = "Translate to Luganda:\nThe farmers need better seeds to improve their harvest."

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", return_dict=True, add_generation_prompt=True
).to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Drop the prompt tokens so that only the newly generated reply is decoded
prompt_len = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True).strip()
print(response)

Supported prompt formats

The model was trained on varied prompt templates, and any of the following should work. In each template, \n stands for a newline and {text} for the input text.

English → Luganda:

  • Translate to Luganda:\n{text}
  • Convert this to Luganda:\n{text}
  • English to Luganda:\n{text}
  • How do you say this in Luganda?\n{text}

Luganda → English:

  • Translate to English:\n{text}
  • Convert this to English:\n{text}
  • Luganda to English:\n{text}
  • What does this mean in English?\n{text}

Conversational (respond in Luganda):

  • Respond in Luganda: {text}
  • Answer in Luganda:\n{text}
  • Yogera mu Luganda: {text}
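
The templates above can be collected into a small helper so that callers pick a task instead of typing a prompt by hand. This is a minimal sketch: the names PROMPT_TEMPLATES, build_prompt and build_messages are hypothetical, the task keys are made up for illustration, and only the first template of each group is used.

```python
# Hypothetical helper: build a user prompt from one of the templates listed on this card.
PROMPT_TEMPLATES = {
    "en2lg": "Translate to Luganda:\n{text}",
    "lg2en": "Translate to English:\n{text}",
    "chat": "Respond in Luganda: {text}",
}


def build_prompt(text: str, task: str = "en2lg") -> str:
    """Fill the template for `task` with `text`; raise ValueError on an unknown task."""
    if task not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(PROMPT_TEMPLATES)}")
    return PROMPT_TEMPLATES[task].format(text=text.strip())


def build_messages(text: str, task: str = "en2lg") -> list:
    """Wrap the prompt in the single-turn chat format used in the quick start."""
    return [{"role": "user", "content": build_prompt(text, task)}]
```

The list returned by build_messages("Five prisoners were released yesterday.") can be passed to tokenizer.apply_chat_template in place of the messages list in the translation example.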

Example translations

English | Model output
Five prisoners were released yesterday. | Abasibe bataano baateereddwa ku kkomera jjo.
The government take long to pay doctors in public hospitals | Gavumenti etwala obudde bungi okusasula abasawo mu malwaliro ga gavumenti.
The professor is the presidential advisor. | Omusomesa ye munnamateeka wa pulezidenti.
My father inspired me to become a surgeon. | Taata wange yamukubiriza okufuuka omusawo.
How much money did they share amongst themselves? | Baawaddeyo ssente mmeka?

Training details

Data

  • 94,542 training examples derived from ~24,000 clean English-Luganda parallel sentence pairs
  • Primary source: Sunbird SALT dataset — professionally translated, multi-way parallel corpus covering agriculture, health, society, and other locally relevant topics
  • Three task types: English→Luganda translation, Luganda→English translation, and Luganda conversation
  • Split: 90% train / 5% validation / 5% test

Configuration

Parameter | Value
Base model | google/gemma-3-1b-it
Method | QLoRA (4-bit NF4 + LoRA)
LoRA rank | 16
LoRA alpha | 32
LoRA dropout | 0.05
Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable parameters | 13.05M (2.0% of total)
Epochs | 1
Effective batch size | 16 (8 × 2 gradient accumulation)
Learning rate | 2e-4 (cosine schedule)
Max sequence length | 256
Precision | bf16
Optimizer | paged AdamW 8-bit
Training time | ~8 hours on Tesla T4
Final training loss | 1.55
Final validation loss | 1.26
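
As a rough cross-check of the trainable-parameter count and the 52 MB adapter size, the LoRA parameters can be counted from the layer shapes. The dimensions below (hidden size 1152, 26 layers, 4 query heads, 1 key/value head, head dimension 256, MLP width 6912) are assumptions taken from the public Gemma 3 1B configuration, not from this card, and the adapter is assumed to be stored as 32-bit floats.

```python
# Sketch: count LoRA parameters for rank-16 adapters on the seven target modules.
# All model dimensions are assumed from the public Gemma 3 1B config, not stated on this card.
HIDDEN = 1152
LAYERS = 26
Q_OUT = 4 * 256   # assumed query heads * head dimension
KV_OUT = 1 * 256  # assumed key/value heads * head dimension
MLP = 6912
RANK = 16

# (in_features, out_features) of each targeted linear layer
MODULE_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank) per adapted layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in MODULE_SHAPES.values())
total = per_layer * LAYERS
size_mb = total * 4 / 1e6  # 4 bytes per parameter, assuming fp32 storage

print(f"{total:,} trainable parameters, about {size_mb:.1f} MB")
# prints: 13,045,760 trainable parameters, about 52.2 MB
```

Under these assumptions the count agrees with the 13.05M in the table and with the 52 MB adapter size quoted in the highlights.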

Training loss curve

Step | Train loss | Val loss
500 | 2.07 | 2.01
1000 | 1.74 | 1.73
2000 | 1.49 | 1.48
3000 | 1.38 | 1.35
4000 | 1.29 | 1.28
5000 | 1.24 | 1.26
5500 | 1.29 | 1.26

Important notes

  • Use 4-bit quantization for inference — the LoRA weights were trained against 4-bit base weights. Loading in 8-bit or full precision will produce degraded translations.
  • Use do_sample=False for deterministic, highest-quality translations.
  • Use the base model tokenizer (google/gemma-3-1b-it) for best results.
  • The model was trained for 1 epoch. Further training (3+ epochs) and additional data sources (JW300, Kimera corpus) would likely improve scores.

Limitations

  • Trained primarily on Sunbird SALT data, which covers agriculture, health, and society topics. Performance may be weaker on highly specialized domains (legal, medical, technical).
  • Luganda→English translation quality may lag behind English→Luganda since the base model already has strong English capabilities.
  • Short sentences (under 5 words) may produce less accurate translations.
  • The model may occasionally produce Luganda synonyms or dialectal variants that differ from a specific reference translation while still being semantically correct.

Intended use

  • English ↔ Luganda machine translation
  • Luganda conversational AI and chatbot applications
  • Educational tools for Luganda language learning
  • Research on low-resource African language NLP

Acknowledgments

Framework versions

  • PEFT 0.18.1
  • Transformers 4.x
  • TRL 0.x
  • PyTorch 2.x
  • bitsandbytes 0.45.x

Citation

If you use this model, please cite:

@misc{luganda-gemma-1b-2026,
  title={Luganda Gemma 1B IT: Fine-tuned Gemma 3 1B for English-Luganda Translation},
  author={Amplified Access},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/AmplifiedAccess/Luganda-gemma-1b-it}
}