---
language:
- jv
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
---
Open models for indigenous Indonesian languages
Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.
Beta preview
Bakpia V1 is a family of Javanese language models. The models are fine-tuned from existing open models on a large synthetic dataset in Krama Javanese, with prompts generated by GPT-4o and responses generated by Claude 3 Haiku.
This repository contains the fp16 version of Bakpia V1 1.5B.
| Version | Base Model | URL |
|---------|------------|-----|
| V1 0.5B | Qwen 2 0.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) |
| V1 1.5B | Qwen 2 1.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese/) |
| V1 9B | Gemma 2 9B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) |
## Version 1.0
This is the first version of Bakpia.
✨ Training
- 36K input-output pairs
- LoRA rank/alpha: 64/128
- Rank-stabilized LoRA
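
To give a feel for what the rank/alpha setting and rank stabilization mean together, here is a minimal sketch of the adapter scaling factor. It assumes the usual definitions: standard LoRA scales the low-rank update by `alpha / r`, while rank-stabilized LoRA scales it by `alpha / sqrt(r)`. The function name `lora_scaling` is hypothetical and is not part of the Bakpia training code.

```python
import math


def lora_scaling(r: int, alpha: float, rank_stabilized: bool = False) -> float:
    """Return the factor that multiplies the low-rank update B @ A.

    Assumption: standard LoRA uses alpha / r, rank-stabilized LoRA
    uses alpha / sqrt(r). This helper is illustrative only.
    """
    if r <= 0:
        raise ValueError("r must be a positive integer")
    if rank_stabilized:
        return alpha / math.sqrt(r)
    return alpha / r


# Values from the training list above: r = 64, alpha = 128
standard = lora_scaling(64, 128)
stabilized = lora_scaling(64, 128, rank_stabilized=True)
print(standard, stabilized)  # 2.0 16.0
```

With these settings the rank-stabilized factor is eight times the standard one, which is the point of the technique: the update is not shrunk as aggressively when the rank is large.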
✨ Features
- Single-turn QA across various domains.
- Ngoko Javanese is not currently supported.
## Use
```
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; the model must be on the same device as the inputs
tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese")
model = AutoModelForCausalLM.from_pretrained("afrizalha/Bakpia-V1-1.5B-Javanese").to("cuda")

# Single-turn prompt template with an empty system message
template = """<|im_start|>system
<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

# Fill in the user question (Krama Javanese) and tokenize it
prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

# Sample a response and print the full decoded sequence (prompt plus answer)
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.5, use_cache=False, do_sample=True)
print(tokenizer.batch_decode(outputs)[0])
```
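
The decoded output above contains the prompt and special tokens as well as the answer. As a minimal sketch, the helpers below build the same single-turn prompt and pull out only the assistant's reply. The names `build_prompt` and `extract_reply` are hypothetical, and the sketch assumes the decoded text keeps the `<|im_start|>assistant` marker and ends the reply with `<|im_end|>`.

```python
# Same template as in the usage example, written as a single string
TEMPLATE = (
    "<|im_start|>system\n<|im_end|>\n"
    "<|im_start|>user\n{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_prompt(question: str) -> str:
    """Wrap a single user question in the prompt template."""
    return TEMPLATE.format(prompt=question)


def extract_reply(decoded: str) -> str:
    """Return only the assistant's text from a decoded generation.

    Assumption: the reply follows the last assistant marker and is
    terminated by <|im_end|> (if the model emitted one).
    """
    marker = "<|im_start|>assistant\n"
    _, found, reply = decoded.rpartition(marker)
    if not found:
        # Marker missing: fall back to the whole string
        return decoded.strip()
    return reply.split("<|im_end|>", 1)[0].strip()


# Simulated decoded output, standing in for tokenizer.batch_decode(outputs)[0]
decoded = build_prompt("Kados pundi?") + "Inggih, saged.<|im_end|>"
print(extract_reply(decoded))  # Inggih, saged.
```

In practice, `extract_reply(tokenizer.batch_decode(outputs)[0])` would be used in place of the simulated string.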
## Acknowledgments
- **Developed by:** Afrizal Hasbi Azizy
- **License:** apache-2.0