lora_gemma3_4b_lt_road_signs
LoRA adapter for Gemma 3 4B-IT fine-tuned to generate single-sentence Lithuanian captions for Lithuanian road signs.
Developed by Einstein Files as part of a university research project at the Faculty of Mathematics and Informatics, Vilnius University.
Model details
| Property | Value |
|---|---|
| Base model | google/gemma-3-4b-it |
| Fine-tuning method | LoRA (r=8, α=16) |
| Fine-tuned modules | Attention + MLP (language decoder only) |
| Quantization | NF4 (4-bit) |
| Training epochs | 4 |
| Learning rate | 2e-4 (cosine schedule, 3% warmup) |
| Effective batch size | 4 |
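The learning-rate row can be made concrete with a small sketch. It assumes linear warmup followed by cosine decay to zero, a warmup length rounded up from 3% of the total steps, and a step count derived from 115 training images at an effective batch size of 4 over 4 epochs with no dropped last batch. The function name `lr_at_step` and these rounding choices are illustrative assumptions, not the project's training code.

```python
import math

PEAK_LR = 2e-4        # from the table above
WARMUP_RATIO = 0.03   # 3% warmup, from the table above

# Assumed step count: ceil(115 / 4) optimizer steps per epoch, 4 epochs.
STEPS_PER_EPOCH = math.ceil(115 / 4)
TOTAL_STEPS = STEPS_PER_EPOCH * 4


def lr_at_step(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR,
               warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step (hypothetical helper)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 2, 4, 60, TOTAL_STEPS):
        print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions the run has 116 optimizer steps, of which the first 4 are warmup.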
Training data
159 photographs of Lithuanian road signs taken in urban environments, each paired with a human-authored Lithuanian caption. Split: 115 train / 30 val / 14 test. Captions follow a structured convention: sign category and shape, symbolic content, and background context.
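A split with these sizes can be reproduced with a seeded shuffle. The sketch below is only one way to obtain a 115 / 30 / 14 partition; the file names, the seed, and the `split_dataset` helper are hypothetical, and the card does not state how the actual split was drawn.

```python
import random


def split_dataset(items, n_train=115, n_val=30, n_test=14, seed=0):
    """Shuffle deterministically and cut into train/val/test (hypothetical helper)."""
    if n_train + n_val + n_test != len(items):
        raise ValueError("split sizes must add up to the dataset size")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # the seed is an arbitrary assumption
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder file names standing in for the 159 photographs.
image_ids = [f"sign_{i:03d}.jpg" for i in range(159)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))  # 115 30 14
```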
Evaluation (test set, n=14)
| Metric | Base model | This adapter | Δ |
|---|---|---|---|
| chrF++ ↑ | 22.09 | 45.31 | +23.22 |
| BERTScore-F1 ↑ | 85.71 | 92.24 | +6.53 |
| CLIPScore ↑ | 28.52 | 29.95 | +1.43 |
The base model consistently produces markdown-formatted, multi-paragraph explanations that mix languages. The fine-tuned adapter produces concise, plain-text, single-sentence descriptions in standard Lithuanian.
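To show what the chrF++ row measures, here is a simplified sketch of a character and word n-gram F-score. It averages precision and recall over character n-grams of order 1 to 6 and word n-grams of order 1 to 2, then combines them with β = 2. It is an approximation for illustration only: it omits details of reference implementations such as sacreBLEU, so it will not reproduce the numbers in the table, and the function names are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, n):
    """Count n-grams of a string (characters) or a list (words)."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def chrf_pp_sketch(hypothesis, reference, char_order=6, word_order=2, beta=2.0):
    """Simplified chrF++-style score in the range 0..100 (not sacreBLEU-exact)."""
    # Character n-grams ignore spaces; word n-grams use whitespace tokens.
    hyp_chars, ref_chars = hypothesis.replace(" ", ""), reference.replace(" ", "")
    units = [(hyp_chars, ref_chars, n) for n in range(1, char_order + 1)]
    units += [(hypothesis.split(), reference.split(), n)
              for n in range(1, word_order + 1)]

    precisions, recalls = [], []
    for hyp, ref, n in units:
        hyp_counts, ref_counts = ngram_counts(hyp, n), ngram_counts(ref, n)
        if not hyp_counts or not ref_counts:
            continue  # skip orders longer than either text
        overlap = sum((hyp_counts & ref_counts).values())
        precisions.append(overlap / sum(hyp_counts.values()))
        recalls.append(overlap / sum(ref_counts.values()))

    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta = 2 weights recall more heavily than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Because the score works on character n-grams, a caption that shares word stems with the reference but differs in inflection still earns partial credit, which suits a morphologically rich language such as Lithuanian.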
Usage
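from unsloth import FastVisionModel
from PIL import Image

# Load the adapter together with its base model in 4-bit precision.
model, tokenizer = FastVisionModel.from_pretrained(
    "AKrasavcev/lora_gemma3_4b_lt_road_signs",
    load_in_4bit=True,
)
FastVisionModel.for_inference(model)  # switch Unsloth to inference mode

image = Image.open("road_sign.jpg").convert("RGB")
# Lithuanian prompt: "Describe this Lithuanian road sign in one sentence in Lithuanian."
prompt = "Aprašyk šį Lietuvos kelio ženklą vienu sakiniu lietuvių kalba."

messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": prompt},
    ],
}]

# Build the chat-formatted prompt and tokenize it together with the image.
inputs = tokenizer(
    image,
    tokenizer.apply_chat_template(messages, add_generation_prompt=True),
    add_special_tokens=False,
    return_tensors="pt",
).to("cuda")

# Greedy decoding; a single-sentence caption fits well within 128 new tokens.
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
caption = tokenizer.decode(
    output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
).strip()
print(caption)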
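Since the adapter is meant to return a plain-text, single-sentence caption, a lightweight check on the decoded string can flag outputs that drift back towards the base model's markdown style. The helper below is a hypothetical heuristic, not part of the released code: it rejects empty or multi-line text, common markdown markers, and any sentence-ending punctuation followed by more text. Abbreviations containing full stops would trip it, and the sample strings are made up for illustration rather than taken from model output.

```python
import re

# Bold markers, backticks, leading heading or bullet markers.
MARKDOWN_PATTERN = re.compile(r"\*\*|`|^#|^[-*]\s")
# Sentence-ending punctuation followed by whitespace and further text.
INNER_BOUNDARY_PATTERN = re.compile(r"[.!?]\s+\S")


def looks_like_plain_single_sentence(caption):
    """Heuristic check for a plain-text, single-sentence caption."""
    text = caption.strip()
    if not text or "\n" in text:
        return False
    if MARKDOWN_PATTERN.search(text):
        return False
    if INNER_BOUNDARY_PATTERN.search(text):
        return False
    return True


# Made-up strings for illustration only.
print(looks_like_plain_single_sentence(
    "Įspėjamasis trikampis ženklas su pėsčiųjų perėjos simboliu."))  # True
print(looks_like_plain_single_sentence(
    "**Ženklas:** pėsčiųjų perėja.\n\nThis sign warns drivers."))  # False
```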
Limitations
- Trained on a small dataset (115 images); performance on rare sign types may be unreliable.
- Vision encoder is frozen; visual grounding improvements are limited by the base model's SigLIP vision encoder.
- Optimised for Lithuanian road sign conventions; not suitable for general image captioning.