metadata

license: other
license_name: qwen
language:
  - th
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - openthaigpt
  - qwen

🇹🇭 OpenThaiGPT 7b 1.5 Instruct

More Info

🇹🇭 OpenThaiGPT 7b Version 1.5 is an advanced 7-billion-parameter Thai language chat model based on Qwen v2.5. Released on September 30, 2024, it has been fine-tuned on over 2,000,000 Thai instruction pairs and can answer questions in Thai-specific domains.

Online Demo:

https://demo72b.aieat.or.th/

Example code for API Calling

https://github.com/OpenThaiGPT/openthaigpt1.5_api_examples

Highlights

State-of-the-art Thai language LLM, achieving the highest average scores across various Thai language exams compared to other open-source Thai LLMs.
Multi-turn conversation support for extended dialogues.
Retrieval Augmented Generation (RAG) compatibility for enhanced response generation.
Impressive context handling: Processes up to 131,072 tokens of input and generates up to 8,192 tokens, enabling detailed and complex interactions.

Benchmark on OpenThaiGPT Eval

** Please take a look at openthaigpt/openthaigpt1.5-7b-instruct for this model's evaluation result.

Exam names	scb10x/llama-3-typhoon-v1.5x-8b-instruct	meta-llama/Llama-3.1-8B-Instruct	Qwen/Qwen2.5-7B-Instruct_stat	openthaigpt/openthaigpt1.5-7b
01_a_level	46.67%	47.50%	58.33%	60.00%
02_tgat	32.00%	36.00%	32.00%	36.00%
03_tpat1	52.50%	55.00%	57.50%	57.50%
04_investment_consult	56.00%	48.00%	68.00%	76.00%
05_facebook_beleble_th_200	78.00%	73.00%	79.00%	81.00%
06_xcopa_th_200	79.50%	69.00%	80.50%	81.00%
07_xnli2.0_th_200	56.50%	55.00%	53.00%	54.50%
08_onet_m3_thai	48.00%	32.00%	72.00%	64.00%
09_onet_m3_social	75.00%	50.00%	90.00%	80.00%
10_onet_m3_math	25.00%	18.75%	31.25%	31.25%
11_onet_m3_science	46.15%	42.31%	46.15%	46.15%
12_onet_m3_english	70.00%	76.67%	86.67%	83.33%
13_onet_m6_thai	47.69%	29.23%	46.15%	53.85%
14_onet_m6_math	29.41%	17.65%	29.41%	29.41%
15_onet_m6_social	50.91%	43.64%	56.36%	58.18%
16_onet_m6_science	42.86%	32.14%	57.14%	57.14%
17_onet_m6_english	65.38%	71.15%	78.85%	80.77%
Micro Average	60.65%	55.60%	64.41%	65.78%

Thai-language multiple-choice exams, evaluated zero-shot on an unseen test set. Benchmark source code and exam information: https://github.com/OpenThaiGPT/openthaigpt_eval

(Updated on: 30 September 2024)
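
The "Micro Average" row pools all questions across exams, so it cannot be recomputed from the per-exam percentages alone (the number of questions per exam is not listed in this card). As a rough cross-check, the sketch below computes the unweighted (macro) average of the openthaigpt/openthaigpt1.5-7b column. The function names are illustrative and not part of the openthaigpt_eval code. The macro figure comes out lower than the reported micro average, which suggests that the larger exams carry more weight in the pooled score.

```python
# Per-exam accuracies (%) of openthaigpt/openthaigpt1.5-7b, copied from the table above.
per_exam_accuracy = [
    60.00, 36.00, 57.50, 76.00, 81.00, 81.00, 54.50, 64.00, 80.00,
    31.25, 46.15, 83.33, 53.85, 29.41, 58.18, 57.14, 80.77,
]


def macro_average(accuracies):
    """Unweighted mean of per-exam accuracies (every exam counts equally)."""
    return sum(accuracies) / len(accuracies)


def micro_average(correct_and_total):
    """Pooled accuracy in percent: total correct answers / total questions.

    `correct_and_total` is a list of (correct, total) pairs, one per exam.
    These counts are not given in the model card and must come from the eval run.
    """
    correct = sum(c for c, _ in correct_and_total)
    total = sum(t for _, t in correct_and_total)
    return 100.0 * correct / total


print(f"macro average: {macro_average(per_exam_accuracy):.2f}%")
```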

Benchmark on scb10x/thai_exam

Models	Thai Exam (Acc)
api/claude-3-5-sonnet-20240620	69.2
openthaigpt/openthaigpt1.5-72b-instruct*	64.07
api/gpt-4o-2024-05-13	63.89
hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4	63.54
Qwen/Qwen2-72B-Instruct	58.23
meta-llama/Meta-Llama-3.1-70B-Instruct	58.23
scb10x/llama-3-typhoon-v1.5x-70b-instruct	58.76
Qwen/Qwen2.5-14B-Instruct	57.35
api/gpt-4o-mini-2024-07-18	54.51
openthaigpt/openthaigpt1.5-7b-instruct*	52.04
SeaLLMs/SeaLLMs-v3-7B-Chat	51.33
openthaigpt/openthaigpt-1.0.0-70b-chat	50.09

* Evaluated by OpenThaiGPT team using scb10x/thai_exam.

Licenses

Built with Qwen
Qwen License: research and commercial use are allowed, but if your user base exceeds 100 million monthly active users, you need to negotiate a separate commercial license. Please see the LICENSE file for more information.

Support

Official website: https://openthaigpt.aieat.or.th
Facebook page: https://web.facebook.com/groups/openthaigpt
A Discord server for discussion and support here
E-mail: kobkrit@aieat.or.th

Prompt Format

The prompt format is based on ChatML.

<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n

System prompt (in English: "You are a smart and honest question-answering assistant."):

คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์
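
The template above can be assembled in plain Python when the tokenizer's chat template is not available, for example when calling a raw completions endpoint. This is a minimal sketch: `build_chatml_prompt` and `DEFAULT_SYSTEM_PROMPT` are illustrative names, not part of any OpenThaiGPT package. Note that the `\n` sequences in the template stand for real newline characters.

```python
DEFAULT_SYSTEM_PROMPT = "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"


def build_chatml_prompt(turns, system_prompt=DEFAULT_SYSTEM_PROMPT):
    """Render (role, content) turns as a ChatML prompt string.

    The result ends with an open assistant header so the model
    continues with the next assistant reply.
    """
    parts = [f"<|im_start|>system\n{system_prompt}<|im_end|>\n"]
    for role, content in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([("user", "สวัสดีครับ")])
print(repr(prompt))
```

Passing earlier user and assistant turns in order produces the multi-turn form of the same prompt.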

Examples

Single Turn Conversation Example

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n

Single Turn Conversation with Context (RAG) Example

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nกรุงเทพมหานคร เป็นเมืองหลวง นครและมหานครที่มีประชากรมากที่สุดของประเทศไทย กรุงเทพมหานครมีพื้นที่ทั้งหมด 1,568.737 ตร.กม. มีประชากรตามทะเบียนราษฎรกว่า 8 ล้านคน\nกรุงเทพมหานครมีพื้นที่เท่าไร่<|im_end|>\n<|im_start|>assistant\n
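
In the RAG example above, the retrieved context is placed in the user turn, followed by a newline and then the question. A minimal sketch of that assembly step is shown below. `build_rag_user_message` is a hypothetical helper, and the retrieval itself (search index, embedding model, and so on) is assumed to happen elsewhere.

```python
def build_rag_user_message(context_passages, question):
    """Join retrieved passages and the question with newlines, context first."""
    passages = [p.strip() for p in context_passages if p.strip()]
    return "\n".join(passages + [question.strip()])


# Context and question taken from the example above.
context = [
    "กรุงเทพมหานคร เป็นเมืองหลวง นครและมหานครที่มีประชากรมากที่สุดของประเทศไทย "
    "กรุงเทพมหานครมีพื้นที่ทั้งหมด 1,568.737 ตร.กม. มีประชากรตามทะเบียนราษฎรกว่า 8 ล้านคน"
]
question = "กรุงเทพมหานครมีพื้นที่เท่าไร่"

user_message = build_rag_user_message(context, question)
print(user_message)
```

The resulting string is used as the content of the user turn, with the system prompt and assistant header unchanged.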

Multi Turn Conversation Example

First turn

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n

Second turn

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\nสวัสดีครับ ยินดีต้อนรับครับ คุณต้องการให้ฉันช่วยอะไรครับ?<|im_end|>\n<|im_start|>user\nกรุงเทพมหานคร ชื่อเต็มยาวๆคืออะไร<|im_end|>\n<|im_start|>assistant\n

Result

<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\nสวัสดีครับ ยินดีต้อนรับครับ คุณต้องการให้ฉันช่วยอะไรครับ?<|im_end|>\n<|im_start|>user\nกรุงเทพมหานคร ชื่อเต็มยาวๆคืออะไร<|im_end|>\n<|im_start|>assistant\nชื่อเต็มของกรุงเทพมหานครคือ \"กรุงเทพมหานคร อมรรัตนโกสินทร์ มหินทรายุธยา มหาดิลกภพ นพรัตนราชธานีบูรีรมย์ อุดมราชนิเวศน์มหาสถาน อมรพิมานอวตารสถิต สักกะทัตติยวิษณุกรรมประสิทธิ์\"
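
When working with raw ChatML strings like the ones above, the latest assistant reply has to be separated from the rest of the transcript before it can be shown to the user or appended to the history for the next turn. The sketch below shows one way to do that with plain string operations. `parse_chatml` and `last_assistant_reply` are illustrative helpers, not part of any library.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def parse_chatml(transcript):
    """Split a ChatML transcript into a list of (role, content) pairs."""
    turns = []
    for chunk in transcript.split(IM_START):
        if not chunk.strip():
            continue
        role, _, content = chunk.partition("\n")
        # The final reply may lack <|im_end|> when generation stopped on it.
        content = content.split(IM_END)[0]
        turns.append((role.strip(), content.rstrip("\n")))
    return turns


def last_assistant_reply(transcript):
    """Return the content of the final turn if it is an assistant turn, else None."""
    turns = parse_chatml(transcript)
    if turns and turns[-1][0] == "assistant":
        return turns[-1][1]
    return None


transcript = (
    "<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n"
    "<|im_start|>user\nสวัสดีครับ<|im_end|>\n"
    "<|im_start|>assistant\nสวัสดีครับ ยินดีต้อนรับครับ"
)
print(last_assistant_reply(transcript))
```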

How to use

Huggingface

from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository name of this model (the 7b instruct variant)
model_name = "openthaigpt/openthaigpt1.5-7b-instruct"

# Load the weights in the checkpoint's dtype and spread them over the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "ประเทศไทยคืออะไร"
messages = [
    {"role": "system", "content": "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"},
    {"role": "user", "content": prompt}
]
# Render the messages as a ChatML string that ends with the assistant header
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

vLLM

Install vLLM (https://github.com/vllm-project/vllm)
Run server

vllm serve openthaigpt/openthaigpt1.5-7b-instruct --tensor-parallel-size 4

Note: set --tensor-parallel-size to the number of available GPUs (4 in this example).

Run inference (curl example)

curl -X POST 'http://127.0.0.1:8000/v1/completions' \
-H 'Content-Type: application/json' \
-d '{
  "model": "openthaigpt/openthaigpt1.5-7b-instruct",
  "prompt": "<|im_start|>system\nคุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์<|im_end|>\n<|im_start|>user\nสวัสดีครับ<|im_end|>\n<|im_start|>assistant\n",
  "max_tokens": 512,
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 40,
  "stop": ["<|im_end|>"]
}'
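
The same request can be sent from Python using only the standard library. This is a sketch under a few assumptions: the server listens on the address used in the curl example, the `model` argument matches the name the server was started with, and the response follows the OpenAI-style completions shape (`choices[0].text`). `build_payload` and `complete` are illustrative names.

```python
import json
import urllib.request

VLLM_URL = "http://127.0.0.1:8000/v1/completions"  # endpoint from the curl example
SYSTEM_PROMPT = "คุณคือผู้ช่วยตอบคำถามที่ฉลาดและซื่อสัตย์"


def build_payload(user_text, model):
    """Build the JSON body with the same sampling settings as the curl example."""
    prompt = (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_text}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 512,
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 40,
        "stop": ["<|im_end|>"],
    }


def complete(user_text, model, url=VLLM_URL):
    """POST the payload to the server and return the generated text."""
    data = json.dumps(build_payload(user_text, model)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("สวัสดีครับ", model=...)` returns the assistant's reply as a string, where `model` is the name passed to `vllm serve`.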

Processing Long Texts

The current config.json is set for context length up to 32,768 tokens. To handle extensive inputs exceeding 32,768 tokens, we utilize YaRN, a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts.

For supported frameworks, you could add the following to config.json to enable YaRN:

{
  ...
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
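
Rather than editing the file by hand, the `rope_scaling` entry can be added with a short script. The sketch below assumes a local copy of the model directory containing `config.json`. `enable_yarn` is a hypothetical helper, and the path passed to it is up to the caller. With the values shown above, the scaled window is 4.0 × 32,768 = 131,072 tokens, which matches the maximum input length quoted in the highlights.

```python
import json
from pathlib import Path


def enable_yarn(config_path, factor=4.0, original_max_position_embeddings=32768):
    """Add the YaRN rope_scaling entry to a config.json and return the scaled context length."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["rope_scaling"] = {
        "factor": factor,
        "original_max_position_embeddings": original_max_position_embeddings,
        "type": "yarn",
    }
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    # Context window implied by the scaling factor
    return int(factor * original_max_position_embeddings)
```

All other keys in the file are left unchanged; only `rope_scaling` is added or overwritten.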

GPU Memory Requirements

Number of Parameters	FP16 (16 bits)	8 bits (Quantized)	4 bits (Quantized)	Example Graphics Card for 4 bits
7b	24 GB	12 GB	6 GB	Nvidia RTX 4060 8GB
13b	48 GB	24 GB	12 GB	Nvidia RTX 4070 16GB
72b	192 GB	96 GB	48 GB	Nvidia RTX 4090 24GB x 2 cards
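
The quantized columns in the table scale linearly with bit width: the 8-bit figure is half of the FP16 figure and the 4-bit figure is a quarter. The sketch below reproduces that relationship from the FP16 column. `scaled_memory_gb` is an illustrative helper, and the result is only a rough estimate, since actual usage also depends on context length, KV cache size, and runtime overhead.

```python
def scaled_memory_gb(fp16_gb, bits):
    """Scale an FP16 memory figure linearly with the weight bit width."""
    if bits not in (4, 8, 16):
        raise ValueError(f"unsupported bit width: {bits}")
    return fp16_gb * bits / 16


# FP16 figures (GB) copied from the table above.
FP16_GB = {"7b": 24, "13b": 48, "72b": 192}

for size, fp16_gb in FP16_GB.items():
    row = [scaled_memory_gb(fp16_gb, bits) for bits in (16, 8, 4)]
    print(size, row)
```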

Authors

Sumeth Yuenyong (sumeth.yue@mahidol.edu)
Kobkrit Viriyayudhakorn (kobkrit@aieat.or.th)
Apivadee Piyatumrong (apivadee.piy@nectec.or.th)
Jillaphat Jaroenkantasima (autsadang41@gmail.com)
Thaweewat Rugsujarit (thaweewr@scg.com)
Norapat Buppodom (new@norapat.com)
Koravich Sangkaew (kwankoravich@gmail.com)
Peerawat Rojratchadakorn (peerawat.roj@gmail.com)
Surapon Nonesung (nonesungsurapon@gmail.com)
Chanon Utupon (chanon.utupon@gmail.com)
Sadhis Wongprayoon (sadhis.tae@gmail.com)
Nucharee Thongthungwong (nuchhub@hotmail.com)
Chawakorn Phiantham (mondcha1507@gmail.com)
Patteera Triamamornwooth (patt.patteera@gmail.com)
Nattarika Juntarapaoraya (natt.juntara@gmail.com)
Kriangkrai Saetan (kraitan.ss21@gmail.com)
Pitikorn Khlaisamniang (pitikorn32@gmail.com)

Disclaimer: Provided responses are not guaranteed.
