
This repository provides a Pegasus X model fine-tuned to summarize Thai text. The architecture is unchanged from the base Pegasus X model.

Library

pip install transformers torch

Example

from transformers import PegasusXForConditionalGeneration, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = PegasusXForConditionalGeneration.from_pretrained("satjawat/pegasus-x-thai-sum")
tokenizer = AutoTokenizer.from_pretrained("satjawat/pegasus-x-thai-sum")

# Replace the placeholder ("ข้อความ" means "text") with the Thai article to summarize
new_input_string = "ข้อความ"
# Lowercase any Latin characters, then tokenize into PyTorch tensors
new_input_ids = tokenizer(new_input_string.lower(), return_tensors="pt").input_ids
# Beam search with 6 beams; the summary is capped at 50 tokens
summary_ids = model.generate(new_input_ids, max_length=50, num_beams=6, length_penalty=2.0, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Input:", new_input_string)
print("Generated Summary:", summary)
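To make the effect of length_penalty=2.0 in the generate call above more concrete, here is a minimal sketch of how a length penalty reranks finished beams. It assumes the scoring rule used by the transformers beam search, where a hypothesis's summed log-probability is divided by its length raised to length_penalty. The function names and the two candidate beams are hypothetical and chosen only for illustration.

```python
def length_normalized_score(sum_log_probs, length, length_penalty):
    """Summed log-probability divided by length ** length_penalty."""
    return sum_log_probs / (length ** length_penalty)


def preferred_candidate(candidates, length_penalty):
    """Return the name of the candidate with the highest normalized score."""
    return max(
        candidates,
        key=lambda name: length_normalized_score(*candidates[name], length_penalty),
    )


# Hypothetical beams: name -> (summed log-probability, length in tokens)
candidates = {"short": (-4.0, 5), "long": (-9.0, 10)}

for penalty in (0.0, 1.0, 2.0):
    print(penalty, preferred_candidate(candidates, penalty))
```

With these made-up numbers the shorter beam ranks first at penalties 0.0 and 1.0, while at 2.0 the longer beam wins. A larger positive penalty therefore pushes beam search toward longer summaries, bounded by max_length.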

Training hyperparameters

The following hyperparameters were used during training:

  • accumulation_steps: 2
  • num_epochs: 20
  • num_beams: 6
  • learning_rate: 5e-5
  • optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
  • activation_function: gelu
  • add_bias_logits: True
  • normalize_embedding: True
  • add_final_layer_norm: False
  • normalize_before: False
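As a rough illustration of how accumulation_steps, learning_rate, and the AdamW settings listed above interact, the sketch below trains a single scalar weight in plain Python. It is not the authors' training script: the toy model (y = weight * x), the data, the class and function names, and the weight decay of 0.0 are all assumptions, since the card does not state them.

```python
import math

# Values taken from the list above
LEARNING_RATE = 5e-5
BETAS = (0.9, 0.999)
EPSILON = 1e-08
ACCUMULATION_STEPS = 2
# Assumption: weight decay is not stated in the card
WEIGHT_DECAY = 0.0


class ScalarAdamW:
    """AdamW for one scalar parameter (decoupled weight decay)."""

    def __init__(self, lr, betas, eps, weight_decay):
        self.lr, self.betas, self.eps = lr, betas, eps
        self.weight_decay = weight_decay
        self.m = 0.0  # running mean of gradients
        self.v = 0.0  # running mean of squared gradients
        self.t = 0    # number of optimizer steps taken

    def step(self, param, grad):
        self.t += 1
        beta1, beta2 = self.betas
        self.m = beta1 * self.m + (1 - beta1) * grad
        self.v = beta2 * self.v + (1 - beta2) * grad * grad
        m_hat = self.m / (1 - beta1 ** self.t)  # bias correction
        v_hat = self.v / (1 - beta2 ** self.t)
        param -= self.lr * self.weight_decay * param
        param -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
        return param


def train(weight, micro_batches, optimizer, accumulation_steps):
    """Fit y = weight * x, updating once every accumulation_steps micro-batches."""
    accumulated_grad = 0.0
    updates = 0
    for index, (x, y) in enumerate(micro_batches, start=1):
        grad = 2.0 * (weight * x - y) * x             # d/dw of (w*x - y)^2
        accumulated_grad += grad / accumulation_steps  # average over the group
        if index % accumulation_steps == 0:
            weight = optimizer.step(weight, accumulated_grad)
            accumulated_grad = 0.0
            updates += 1
    return weight, updates


optimizer = ScalarAdamW(LEARNING_RATE, BETAS, EPSILON, WEIGHT_DECAY)
weight, updates = train(0.0, [(1.0, 2.0)] * 4, optimizer, ACCUMULATION_STEPS)
print(updates, weight)
```

Four micro-batches with accumulation_steps of 2 produce two optimizer updates, so the effective batch size is twice the per-step batch size. Because AdamW normalizes the gradient, each early update moves the weight by roughly the learning rate.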

Score

The model was evaluated on the ThaiSum test set, which contains 11,000 articles, and achieved the following scores:

  • Rouge1: 0.490279
  • Rouge2: 0.289839
  • RougeL: 0.489334
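For readers unfamiliar with these metrics, the following is a simplified sketch of ROUGE-1, ROUGE-2, and ROUGE-L computed as F1 scores over pre-tokenized text. The card does not say which scoring package, Thai word segmentation, or score variant (precision, recall, or F1) produced the numbers above, so this code only illustrates the metric definitions and will not necessarily reproduce those values. All function names and the sample tokens are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1_score(matches, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap."""
    cand, ref = ngram_counts(candidate, n), ngram_counts(reference, n)
    overlap = sum((cand & ref).values())  # multiset intersection
    return f1_score(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    return f1_score(lcs_length(candidate, reference), len(candidate), len(reference))


# Hypothetical pre-tokenized texts; Thai input would first need word segmentation
reference = ["the", "cat", "sat", "on", "the", "mat"]
candidate = ["the", "cat", "on", "the", "mat"]

print(rouge_n(candidate, reference, 1))
print(rouge_n(candidate, reference, 2))
print(rouge_l(candidate, reference))
```

Thai is written without spaces between words, so the choice of tokenizer has a direct effect on n-gram overlap and should be kept fixed when comparing scores across models.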

Resource Funding

We thank the NSTDA Supercomputer Center (ThaiSC) and the National e-Science Infrastructure Consortium for providing computing facilities.

Citation

If you use "satjawat/pegasus-x-thai-sum" in your project or publication, please cite the model as follows:

ปรีชานนท์ ชาติไทย and สัจจวัจน์ ส่งเสริม. (2567 B.E. / 2024 C.E.),
Thai News Text Summarization Using Neural Network (การสรุปข้อความข่าวภาษาไทยด้วยโครงข่ายประสาทเทียม),
Bachelor of Science (B.Sc.): Khon Kaen, Khon Kaen University