Model Details
Model Description
This model answers health-related questions in Burmese. It was trained on a small dataset of 800 health-related question–answer pairs. The base model is WYNN747/Burmese-GPT-v3, fine-tuned with QLoRA: the base model was too large to fully fine-tune on Kaggle's free GPUs, so QLoRA is used for parameter-efficient fine-tuning (a sketch of this setup follows the details below).
- Developed by: La Min Ko Ko
- Model type: GPT causal language model
- Finetuned from model: WYNN747/Burmese-GPT-v3
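
As context, a minimal QLoRA setup of this kind might look like the sketch below. The quantization settings and LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`, etc.) are illustrative assumptions, not the exact values used to train this model:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization is the core of QLoRA (illustrative settings)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base = AutoModelForCausalLM.from_pretrained(
    "WYNN747/Burmese-GPT-v3",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Hypothetical LoRA hyperparameters; target_modules assumes a GPT-2-style attention block
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["c_attn"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```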
Uses
Load the LoRA adapter on top of the base model:

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

config = PeftConfig.from_pretrained("la-min/myanmar-gpt-health-faq")
model = AutoModelForCausalLM.from_pretrained("WYNN747/Burmese-GPT-v3")
model = PeftModel.from_pretrained(model, "la-min/myanmar-gpt-health-faq")
```
For inference, use the simple snippet below. Before running, install the necessary packages:

```
%pip install accelerate peft bitsandbytes transformers datasets==2.16.0
```
```python
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

PEFT_MODEL = "la-min/myanmar-gpt-health-faq"
config = PeftConfig.from_pretrained(PEFT_MODEL)

# 4-bit quantization config (assumed; the card does not show the exact one used)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token
model = PeftModel.from_pretrained(model, PEFT_MODEL)
```
The generation config below can be adjusted as needed:
```python
generation_config = model.generation_config
generation_config.max_new_tokens = 300
generation_config.temperature = 0.9
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id
```
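
One caveat: in Hugging Face `generate`, `temperature` and `top_p` only take effect when sampling is enabled. If the base model's config does not already enable it, you may need to set it explicitly (an assumption about the intended decoding mode):

```python
# temperature/top_p are ignored under greedy decoding (assumed intent: sampling)
generation_config.do_sample = True
```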
```python
def generate_text(model, sequence):
    # Tokenize the prompt and move it to the model's device
    ids = tokenizer.encode(sequence, return_tensors="pt").to(model.device)
    final_outputs = model.generate(
        input_ids=ids,
        generation_config=generation_config,
    )
    response = tokenizer.decode(final_outputs[0], skip_special_tokens=True)
    print(response)

# Burmese, roughly: "Why do people take vitamin supplements?"
sequence = "[Q] အားဆေး ဘာလို့သောက်ကြတာလဲ"
generate_text(model, sequence)
```
Training Results & Parameters

These are the parameters passed to `TrainingArguments`:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=2e-5,
    fp16=True,
    save_total_limit=3,
    logging_steps=10,
    output_dir=".....",
    max_steps=260,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    report_to="none",
)
```
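
For reference, a minimal sketch of wiring these arguments into a `Trainer`. Here `tokenized_dataset` is a hypothetical name for the tokenized 800 question–answer pairs, and the data collator choice is an assumption:

```python
from transformers import Trainer, DataCollatorForLanguageModeling

trainer = Trainer(
    model=model,  # the QLoRA-wrapped base model
    args=training_args,
    train_dataset=tokenized_dataset,  # hypothetical: tokenized Q&A pairs
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```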
Summary
Due to the small amount of training data, the model does not yet produce satisfactory results; more data is needed for training.
Framework versions
- PEFT 0.7.1