
TAIDE-7B-Thai-Pretrain-LoRA

Model Description

TAIDE-7B-Thai-Pretrain-LoRA is a machine translation model for translating between Traditional Chinese and Thai. The model has 7 billion parameters and is built on the Trustworthy AI Dialogue Engine by Taiwan (TAIDE). It uses a two-stage training process, continued pre-training followed by translation fine-tuning, to strengthen its Thai proficiency and align its translations with the nuances of both Traditional Chinese and Thai.

Training Methodology

Training follows the Advanced Language Model-based Translator (ALMA) strategy of Xu et al. (2024):

  1. Continued Pre-Training Stage:
  • Objective: To build a robust foundational understanding of the Thai language.
  • Method: Continued pre-training on a corpus of one million Thai monolingual examples.
  2. Fine-Tuning Stage:
  • Objective: To align the model's translation capabilities with the specific nuances of both languages.
  • Method: Fine-tuned on a smaller set of high-quality Traditional Chinese-Thai parallel data (see the sketch after this list).
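Since the model name indicates a LoRA adapter, below is a minimal sketch of how the continued pre-training stage could be wired up with Hugging Face transformers and peft. The base checkpoint name, adapter rank, and target modules here are illustrative assumptions, not the authors' published recipe.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed TAIDE base checkpoint; the card does not state which one was used.
base_model_name = "taide/TAIDE-LX-7B"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.bfloat16)

# Stage 1 trains only low-rank adapter matrices on Thai monolingual text with
# the ordinary causal-LM objective; stage 2 reuses the same setup on
# Chinese-Thai parallel pairs formatted as translation prompts.
lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumption)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections (assumption)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 7B weights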

Quick Start

The following example loads the model and translates a Traditional Chinese sentence into Thai.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Wilailack/TAIDE-7B-Thai-Pretrain-LoRA", use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    "Wilailack/TAIDE-7B-Thai-Pretrain-LoRA",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Add the source sentence into the prompt template.
prompt = "Translate this from Chinese to Thai:\nChinese: 我最愛的就是你!\nThai:"
# Move the tokenized inputs to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt", max_length=200, truncation=True).to(model.device)

# Translation
with torch.no_grad():
    generated_ids = model.generate(
        **inputs,
        num_beams=5,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.6,
        top_p=0.9,
    )
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
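Note that batch_decode returns the whole sequence, prompt included. If you want only the Thai translation, one small convenience (not part of the card's original example) is to decode just the newly generated tokens:

# Decode only the tokens generated after the prompt.
prompt_length = inputs["input_ids"].shape[1]
translation = tokenizer.batch_decode(
    generated_ids[:, prompt_length:], skip_special_tokens=True
)[0].strip()
print(translation)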