
TAIDE-7B-Thai-Pretrain-LoRA

Model Description

TAIDE-7B-Thai-Pretrain-LoRA is a machine translation model for translating between Traditional Chinese and Thai. The model has 7 billion parameters and is built on the Trustworthy AI Dialogue Engine by Taiwan (TAIDE). It uses a two-stage training process, continued pre-training followed by translation fine-tuning, to strengthen its Thai proficiency and align its translations with the nuances of both Traditional Chinese and Thai.

Training Methodology

Training follows the Advanced Language Model-based Translator (ALMA) strategy of Xu et al. (2024):

  1. Continued Pre-Training Stage:
  • Objective: To build a robust foundational understanding of the Thai language.
  • Method: Continued pre-training on a corpus of one million Thai monolingual examples.
  2. Fine-Tuning Stage:
  • Objective: To align the model's translation capabilities with the specific nuances of both languages.
  • Method: Fine-tuned on a smaller set of high-quality Traditional Chinese-Thai parallel data (see the sketch after this list).
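Since the model name indicates a LoRA adapter, below is a minimal sketch of how the continued pre-training stage could be wired up with Hugging Face transformers and peft. The base checkpoint name, adapter rank, and target modules here are illustrative assumptions, not the authors' published recipe.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed TAIDE base checkpoint; the card does not state which one was used.
base_model_name = "taide/TAIDE-LX-7B"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.bfloat16)

# Stage 1 trains only low-rank adapter matrices on Thai monolingual text with
# the ordinary causal-LM objective; stage 2 reuses the same setup on
# Chinese-Thai parallel pairs formatted as translation prompts.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumption)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections (assumption)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 7B weights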

Quick Start

The following example loads the model and translates a Traditional Chinese sentence into Thai.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Wilailack/TAIDE-7B-Thai-Pretrain-LoRA", use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    "Wilailack/TAIDE-7B-Thai-Pretrain-LoRA",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Add the source sentence into the prompt template.
prompt = "Translate this from Chinese to Thai:\nChinese: 我最愛的就是你!\nThai:"
# Move the tokenized inputs to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt", max_length=200, truncation=True).to(model.device)

# Translation
with torch.no_grad():
    generated_ids = model.generate(
        **inputs,
        num_beams=5,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.6,
        top_p=0.9,
    )
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
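Note that batch_decode returns the whole sequence, prompt included. If you want only the Thai translation, one small convenience (not part of the card's original example) is to decode just the newly generated tokens:

# Decode only the tokens generated after the prompt.
prompt_length = inputs["input_ids"].shape[1]
translation = tokenizer.batch_decode(
    generated_ids[:, prompt_length:], skip_special_tokens=True
)[0].strip()
print(translation)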