gemma3-4b-ukrainian

A LoRA adapter for Gemma 3 4B (the pre-trained checkpoint, google/gemma-3-4b-pt), fine-tuned to improve Ukrainian language support.

Base Model

google/gemma-3-4b-pt

Training Details

Parameter	Value
Method	LoRA (Low-Rank Adaptation)
LoRA rank	r=16
LoRA alpha	32
Target modules	All attention + MLP modules
Dataset	UKID (Ukrainian Instruction Dataset)
Epochs	3
Best checkpoint	step 5800
Best eval_loss	1.1435
Hardware	RTX 5090 (32 GB VRAM)
Training time	~21 hours
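
To make the rank and alpha values in the table concrete, here is a minimal sketch of the arithmetic behind a LoRA update. In the standard LoRA formulation the adapted weight is W + (alpha / r) * B A, so r=16 and alpha=32 give a scaling factor of 2. The layer dimensions below are hypothetical placeholders chosen for illustration, not values taken from this card, and the helper name `lora_trainable_params` is likewise illustrative.

```python
# LoRA hyperparameters from the training table above.
LORA_RANK = 16
LORA_ALPHA = 32

# Standard LoRA scaling: the low-rank update B @ A is multiplied by alpha / r.
scaling = LORA_ALPHA / LORA_RANK


def lora_trainable_params(d_in: int, d_out: int, rank: int = LORA_RANK) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


# Hypothetical layer size, for illustration only (not from the model card).
d_in, d_out = 2048, 2048
full = d_in * d_out
added = lora_trainable_params(d_in, d_out)

print(f"scaling factor: {scaling}")
print(f"full layer params: {full}, LoRA params: {added}")
print(f"fraction trained: {added / full:.4f}")
```

For a square layer of this size, the adapter trains under 2% of the parameters a full fine-tune of that layer would touch, which helps explain why training fits on a single GPU.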

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

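# Base model that the adapter was trained on.
base_model_id = "google/gemma-3-4b-pt"
# Adapter repository ID as given in this card. The hosting page lists the model
# as TheSolutionArchitect/gemma3-4b-ukrainian; try that ID if this one does not resolve.
adapter_id = "aigensa/gemma3-4b-ukrainian"

# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the base model in bfloat16 and let accelerate place it on available devices.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Prompt: "Tell me about Ukraine:"
inputs = tokenizer("Розкажи про Україну:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))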
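
If a single standalone checkpoint is more convenient than loading the base model and the adapter separately, the adapter weights can be folded into the base model. The following is a minimal sketch, assuming the same model IDs as the usage example above. The function name `merge_adapter` and the output directory are hypothetical, and the card does not describe merging as part of the author's workflow.

```python
def merge_adapter(base_model_id: str, adapter_id: str, output_dir: str) -> None:
    """Fold LoRA weights into the base model and save a standalone copy."""
    # Imported lazily so that defining this helper does not load heavy libraries.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16
    )
    model = PeftModel.from_pretrained(base, adapter_id)

    # merge_and_unload() applies the LoRA update to the base weights and
    # returns a plain transformers model without the PEFT wrappers.
    merged = model.merge_and_unload()

    # Save the merged weights and the tokenizer side by side.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(output_dir)


# Example call (downloads the full base model; output directory is hypothetical):
# merge_adapter(
#     "google/gemma-3-4b-pt",
#     "aigensa/gemma3-4b-ukrainian",
#     "gemma3-4b-ukrainian-merged",
# )
```

The merged directory can then be loaded with `AutoModelForCausalLM.from_pretrained` alone, at the cost of storing a full copy of the model weights.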

Notes

This repository contains only the LoRA adapter weights; the base model must be loaded separately.
Checkpoint-5800 was selected as the best checkpoint (lowest eval_loss). The final checkpoint at step ~8000 showed mild overfitting (eval_loss 1.2011).
The base model (gemma-3-4b-pt) is pre-trained, not instruction-tuned.
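
The checkpoint choice described in these notes amounts to picking the step with the lowest evaluation loss. Below is a minimal sketch using only the two values reported in this card. The final step is stated only as roughly 8000, so the key `8000` is an approximation, and the variable names are illustrative.

```python
# eval_loss per checkpoint step, as reported in this card.
# The final step is stated only as "~8000", so 8000 here is approximate.
eval_loss_by_step = {
    5800: 1.1435,
    8000: 1.2011,
}

# Pick the step whose eval_loss is lowest.
best_step = min(eval_loss_by_step, key=eval_loss_by_step.get)
best_loss = eval_loss_by_step[best_step]

# How much worse the final checkpoint is than the best one.
final_step = max(eval_loss_by_step)
gap = eval_loss_by_step[final_step] - best_loss

print(f"best checkpoint: step {best_step} (eval_loss {best_loss})")
print(f"final checkpoint is worse by {gap:.4f}")
```

With a full evaluation log, the same `min` call would apply to every logged step rather than just these two.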

