Candi Sailor2 8B

Fine-tuned version of sail/Sailor2-8B for Indonesian cultural heritage, especially temples (candi).

Model Details

  • Base model: sail/Sailor2-8B (Qwen2 architecture, 8.5B parameters)
  • Fine-tuning method: LoRA (r=16, alpha=32) merged into base model
  • Training: 2 epochs, L40S GPU, ~10 minutes
  • Languages: Indonesian, Javanese

Training Config

Parameter         Value
Learning rate     2e-4
Batch size        1 × 8 gradient accumulation
Max seq length    2048
Quantization      4-bit NF4 (training)
Optimizer         paged_adamw_8bit
Scheduler         cosine with warmup
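
For readers who want to reproduce a similar setup, the settings above could be expressed roughly as follows. This is a minimal sketch, not the actual training script: the function name build_configs, the bfloat16 compute dtype, and the use of the library-default LoRA target modules are assumptions, and the warmup amount is left as a required argument because the card does not state it.

```python
# Values taken from the Model Details and Training Config sections.
LORA_R = 16
LORA_ALPHA = 32
PER_DEVICE_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
LEARNING_RATE = 2e-4
NUM_EPOCHS = 2
MAX_SEQ_LENGTH = 2048  # applied when tokenizing/packing the training data

# Derived quantities (per device).
EFFECTIVE_BATCH_SIZE = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS  # sequences per optimizer step
LORA_SCALING = LORA_ALPHA / LORA_R  # standard LoRA scaling factor alpha / r


def build_configs(output_dir, warmup_ratio):
    """Hypothetical helper: build quantization, LoRA, and trainer configs.

    `warmup_ratio` must be supplied by the caller; the card only says
    "cosine with warmup" without giving a value.
    """
    # Imported lazily so the constants above are usable without these packages.
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig, TrainingArguments

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption, not stated in the card
    )
    lora_config = LoraConfig(
        r=LORA_R,
        lora_alpha=LORA_ALPHA,
        task_type="CAUSAL_LM",
        # target_modules is left at the library default (assumption).
    )
    training_args = TrainingArguments(
        output_dir=output_dir,
        learning_rate=LEARNING_RATE,
        per_device_train_batch_size=PER_DEVICE_BATCH_SIZE,
        gradient_accumulation_steps=GRAD_ACCUM_STEPS,
        num_train_epochs=NUM_EPOCHS,
        optim="paged_adamw_8bit",
        lr_scheduler_type="cosine",
        warmup_ratio=warmup_ratio,
    )
    return quant_config, lora_config, training_args
```

With a per-device batch size of 1 and 8 accumulation steps, each optimizer step sees 8 sequences, and alpha / r gives a LoRA scaling factor of 2.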

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "junwatu/candi-sailor2-8b",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("junwatu/candi-sailor2-8b")

messages = [{"role": "user", "content": "Apa itu Candi Borobudur?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sampling must be enabled for temperature and top_p to take effect.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)

Example Prompts

  • "Apa itu Candi Borobudur?"
  • "Jelaskan sejarah kerajaan Majapahit"
  • "Apa beda candi Hindu dan Budha?"
  • "Ceritakan legenda Roro Jonggrang"
  • "Bagaimana cara memelihara warisan budaya Indonesia?"
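
To try all of the prompts above in one go, the Usage snippet can be wrapped in a small helper. This is a sketch that assumes model and tokenizer have already been loaded as in the Usage section; the names build_messages, ask, and run_examples are hypothetical and not part of any library.

```python
EXAMPLE_PROMPTS = [
    "Apa itu Candi Borobudur?",
    "Jelaskan sejarah kerajaan Majapahit",
    "Apa beda candi Hindu dan Budha?",
    "Ceritakan legenda Roro Jonggrang",
    "Bagaimana cara memelihara warisan budaya Indonesia?",
]


def build_messages(prompt):
    """Wrap a single user prompt in the chat-message format."""
    return [{"role": "user", "content": prompt}]


def ask(model, tokenizer, prompt, max_new_tokens=256):
    """Generate one response using the same settings as the Usage section."""
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # needed for temperature/top_p to apply
        temperature=0.7,
        top_p=0.9,
    )
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


def run_examples(model, tokenizer, prompts=EXAMPLE_PROMPTS):
    """Return a dict mapping each prompt to the model's response."""
    return {prompt: ask(model, tokenizer, prompt) for prompt in prompts}
```

Calling run_examples(model, tokenizer) then returns one response per prompt; because sampling is enabled, the responses will vary between runs.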

LoRA Adapter

The LoRA adapter is also available separately: junwatu/candi-sailor2-8b-lora
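
Loading the adapter on top of the base model might look like the sketch below. It assumes the adapter repository is stored in standard PEFT format and that the base model's tokenizer is the right one to use; neither is confirmed by this card, and load_with_adapter is a hypothetical helper name.

```python
BASE_ID = "sail/Sailor2-8B"
ADAPTER_ID = "junwatu/candi-sailor2-8b-lora"


def load_with_adapter(base_id=BASE_ID, adapter_id=ADAPTER_ID, merge=False):
    """Load the base model and attach the LoRA adapter with PEFT.

    Assumes the adapter repo is in PEFT format (not confirmed by the card).
    """
    # Imported lazily so the module can be inspected without these packages.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype="auto", device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    if merge:
        # Fold the LoRA weights into the base weights and drop the PEFT wrapper.
        model = model.merge_and_unload()
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Keeping the adapter separate is useful when the base model is already cached locally or when switching between several adapters; passing merge=True should give a model comparable to the merged checkpoint described above.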

License

Apache 2.0 (same as base model)

Citation

@misc{junwatu2026candi,
  title={Candi Sailor2: Fine-tuned Model for Indonesian Cultural Heritage},
  author={junwatu},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/junwatu/candi-sailor2-8b}
}
@article{sailor2,
  title={Sailor2: Advancing Multilingual Large Language Models for Southeast Asian Languages},
  author={Sea AI Lab},
  journal={arXiv preprint arXiv:2502.12982},
  year={2025}
}