gemma3-4b-ukrainian

LoRA adapter fine-tuned on Gemma 3 4B (pre-trained, google/gemma-3-4b-pt) for improved Ukrainian language support.

Base Model

google/gemma-3-4b-pt

Training Details

| Parameter | Value |
|---|---|
| Method | LoRA (Low-Rank Adaptation) |
| LoRA rank | r=16 |
| LoRA alpha | 32 |
| Target modules | All attention + MLP modules |
| Dataset | UKID (Ukrainian Instruction Dataset) |
| Epochs | 3 |
| Best checkpoint | step 5800 |
| Best eval_loss | 1.1435 |
| Hardware | RTX 5090 (32 GB VRAM) |
| Training time | ~21 hours |
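
For readers who want to reproduce a similar setup, the hyperparameters above can be sketched as a PEFT configuration. This is a minimal sketch, not the author's training script: the module names in `target_modules` are an assumption (the projection names commonly found in Gemma-style decoder blocks), `task_type` is assumed, and the layer size passed to the hypothetical helper `lora_param_count` is illustrative only.

```python
# LoRA hyperparameters from the training details (r=16, alpha=32).
# ASSUMPTION: the module names below are the usual attention/MLP projection
# names in Gemma-style decoder blocks; the card only says "all attention + MLP".
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumed task type
}

# Standard LoRA scales the low-rank update B @ A by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)


# Optional: build the PEFT config object when peft is installed.
try:
    from peft import LoraConfig

    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    lora_config = None

print(scaling)                           # 2.0
print(lora_param_count(1024, 1024, 16))  # 32768 (illustrative layer size)
```

With alpha set to twice the rank, the adapter update is applied at a scale of 2.0, a common choice for LoRA fine-tuning.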

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_id = "google/gemma-3-4b-pt"
adapter_id = "aigensa/gemma3-4b-ukrainian"

# Load the tokenizer from the adapter repository and the base model in bfloat16.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Prompt: "Tell me about Ukraine:"
inputs = tokenizer("Розкажи про Україну:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
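
The loading code above keeps the adapter separate from the base weights. If a standalone model is more convenient for deployment, the adapter can be merged into the base model with PEFT's `merge_and_unload()`. The following is a minimal sketch under that assumption; the function name `merge_adapter` and the `output_dir` argument are hypothetical, and the card does not say that the author distributes or tested a merged model.

```python
def merge_adapter(base_model_id: str, adapter_id: str, output_dir: str) -> None:
    """Merge LoRA weights into the base model and save a standalone copy."""
    # Imports are local so that defining this helper does not require
    # the heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,
    )
    # Fold the low-rank updates into the base weights and drop the PEFT wrappers.
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

    # Save the merged weights together with the tokenizer from the adapter repo.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(output_dir)
```

A call such as `merge_adapter("google/gemma-3-4b-pt", "aigensa/gemma3-4b-ukrainian", "./gemma3-4b-ukrainian-merged")` would download both repositories, so it needs enough disk space and memory for the full 4B model.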

Notes

  • This repository contains only the LoRA adapter weights; the base model (google/gemma-3-4b-pt) must be loaded separately, as in the Usage example.
  • Checkpoint 5800 was selected as the best checkpoint because it had the lowest eval_loss (1.1435). The final checkpoint at step ~8000 showed mild overfitting (eval_loss 1.2011).
  • The base model (gemma-3-4b-pt) is pre-trained, not instruction-tuned.
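
The checkpoint choice described in the notes amounts to picking the step with the lowest evaluation loss. A minimal sketch of that selection, using only the two values reported in this card (the final step is approximate, and the variable names are illustrative):

```python
# (step, eval_loss) pairs reported in this card.
# Step 8000 is approximate; the card gives it as "~8000".
checkpoints = [
    (5800, 1.1435),
    (8000, 1.2011),
]

# Best checkpoint = lowest eval_loss.
best_step, best_loss = min(checkpoints, key=lambda item: item[1])

# Final checkpoint = highest step.
final_step, final_loss = max(checkpoints, key=lambda item: item[0])

# A positive gap means eval_loss rose after the best step (a sign of overfitting).
gap = final_loss - best_loss

print(f"best: checkpoint-{best_step} (eval_loss {best_loss})")
print(f"final checkpoint is worse by {gap:.4f}")
```

When training with the Hugging Face `Trainer`, setting `load_best_model_at_end=True` with `metric_for_best_model="eval_loss"` and `greater_is_better=False` applies the same rule automatically; whether that was used here is not stated in the card.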