TAIDE-7B-Thai-Pretrain-LoRA
Model Description
TAIDE-7B-Thai-Pretrain-LoRA is a 7-billion-parameter machine translation model for translating between Traditional Chinese and Thai. It is built upon TAIDE (Trustworthy AI Dialogue Engine by Taiwan) and uses a two-stage fine-tuning process to strengthen its proficiency in Thai and align its translation capabilities with the nuances of both languages.
Training Methodology
Training follows the Advanced Language Model-based Translator (ALMA) strategy of Xu et al. (2024):
- Initial Pre-Training Stage:
  - Objective: Build a robust foundational understanding of the Thai language.
  - Method: Continued pre-training on a comprehensive dataset of one million Thai instances.
- Fine-Tuning Stage:
  - Objective: Align the model's translation capabilities with the specific nuances of both languages.
  - Method: Fine-tuned on a smaller set of high-quality Traditional Chinese-Thai parallel data.
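The fine-tuning stage expects parallel sentence pairs formatted as translation prompts. As a rough illustration only, the helper below builds such prompts in the same template as the quick-start example; the function names and the idea of appending the reference translation during training are assumptions, not the model card's actual data pipeline.

```python
def build_prompt(src_text: str, src_lang: str = "Chinese", tgt_lang: str = "Thai") -> str:
    """Format a source sentence into the translation prompt template."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {src_text}\n{tgt_lang}:"


def build_training_example(src_text: str, tgt_text: str) -> str:
    """Hypothetical fine-tuning example: prompt followed by the reference translation."""
    return build_prompt(src_text) + " " + tgt_text
```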
A quick start for using our model:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Wilailack/TAIDE-7B-Thai-Pretrain-LoRA", use_fast=True)
model = AutoModelForCausalLM.from_pretrained("Wilailack/TAIDE-7B-Thai-Pretrain-LoRA", torch_dtype=torch.bfloat16, device_map="auto")

# Add the source sentence into the prompt template
prompt = "Translate this from Chinese to Thai:\nChinese: 我最愛的就是你!\nThai:"
# Move the input to the same device as the model
input_ids = tokenizer(prompt, return_tensors="pt", padding=True, max_length=200, truncation=True).input_ids.to(model.device)

# Translation
with torch.no_grad():
    generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_new_tokens=200, do_sample=True, temperature=0.6, top_p=0.9)
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
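Because `generate` returns the prompt tokens together with the continuation, the decoded string still begins with the input prompt. A minimal sketch of a post-processing helper (hypothetical, not part of the model's API) that keeps only the text after the final `Thai:` marker:

```python
def extract_translation(decoded: str) -> str:
    """Return the text after the last 'Thai:' marker, i.e. the generated translation."""
    return decoded.rsplit("Thai:", 1)[-1].strip()
```

For example, applying it to the decoded output above would strip the `Translate this from Chinese to Thai: ...` preamble and leave just the Thai sentence.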