Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


llama-3-typhoon-v1.5x-70b-instruct - GGUF
- Model creator: https://huggingface.co/scb10x/
- Original model: https://huggingface.co/scb10x/llama-3-typhoon-v1.5x-70b-instruct/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [llama-3-typhoon-v1.5x-70b-instruct.Q2_K.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.Q2_K.gguf) | Q2_K | 24.56GB |
| [llama-3-typhoon-v1.5x-70b-instruct.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.IQ3_XS.gguf) | IQ3_XS | 27.29GB |
| [llama-3-typhoon-v1.5x-70b-instruct.IQ3_S.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.IQ3_S.gguf) | IQ3_S | 28.79GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.Q3_K_S.gguf) | Q3_K_S | 28.79GB |
| [llama-3-typhoon-v1.5x-70b-instruct.IQ3_M.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.IQ3_M.gguf) | IQ3_M | 29.74GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q3_K.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.Q3_K.gguf) | Q3_K | 31.91GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.Q3_K_M.gguf) | Q3_K_M | 31.91GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.Q3_K_L.gguf) | Q3_K_L | 34.59GB |
| [llama-3-typhoon-v1.5x-70b-instruct.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.IQ4_XS.gguf) | IQ4_XS | 35.64GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q4_0.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/blob/main/llama-3-typhoon-v1.5x-70b-instruct.Q4_0.gguf) | Q4_0 | 37.22GB |
| [llama-3-typhoon-v1.5x-70b-instruct.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | IQ4_NL | 37.58GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q4_K_S | 37.58GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q4_K.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q4_K | 39.6GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q4_K_M | 39.6GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q4_1.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q4_1 | 41.27GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q5_0.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q5_0 | 45.32GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q5_K_S | 45.32GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q5_K.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q5_K | 46.52GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q5_K_M | 46.52GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q5_1.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q5_1 | 49.36GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q6_K.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q6_K | 53.91GB |
| [llama-3-typhoon-v1.5x-70b-instruct.Q8_0.gguf](https://huggingface.co/RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf/tree/main/) | Q8_0 | 69.83GB |
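
These GGUF files run on any llama.cpp-based runtime. Below is a minimal sketch, assuming `huggingface_hub` and `llama-cpp-python` are installed; the Q2_K quant is chosen only for illustration, so pick whichever file fits your hardware.

```python
# Minimal sketch (not an official recipe): download one quant and chat with it
# via llama-cpp-python. Requires: pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="RichardErkhov/scb10x_-_llama-3-typhoon-v1.5x-70b-instruct-gguf",
    filename="llama-3-typhoon-v1.5x-70b-instruct.Q2_K.gguf",
)

llm = Llama(model_path=model_path, n_ctx=8192, n_gpu_layers=-1)  # -1: offload all layers
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "สวัสดีครับ"}],
    max_tokens=256,
    temperature=0.4,
)
print(out["choices"][0]["message"]["content"])
```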




Original model description:
---
language:
- th
- en
pipeline_tag: text-generation
license: llama3
---
**Llama-3-Typhoon-1.5X-70B-instruct: Thai Large Language Model (Instruct)**

**Llama-3-Typhoon-1.5X-70B-instruct** is a 70-billion-parameter instruct model designed for the Thai 🇹🇭 language. It demonstrates performance competitive with GPT-4-0612 and is optimized for **application** use cases, **Retrieval-Augmented Generation (RAG)**, **constrained generation**, and **reasoning** tasks.

Built on Typhoon 1.5 70B (not yet released) and Llama 3 70B Instruct, this model is the result of our experiment on **cross-lingual transfer**. It uses the [task-arithmetic model editing](https://arxiv.org/abs/2212.04089) technique, combining the Thai understanding capability of Typhoon with the human alignment performance of Llama 3 Instruct.
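
For intuition, task arithmetic treats the difference between a fine-tuned model and its base as an editable "task vector". A minimal sketch of the idea on plain PyTorch state dicts follows; the variable names are hypothetical, and the actual merge recipe used for Typhoon-1.5X is not published here.

```python
# Sketch of task-arithmetic model editing (arXiv:2212.04089).
# State dicts map parameter names to torch.Tensors.
def task_vector(base_sd, finetuned_sd):
    """A task vector is the element-wise difference: fine-tuned minus base."""
    return {k: finetuned_sd[k] - base_sd[k] for k in base_sd}

def apply_task_vector(base_sd, vector, scale=0.5):
    """Add a scaled task vector onto another model built on the same base."""
    return {k: base_sd[k] + scale * vector[k] for k in base_sd}

# Conceptually (hypothetical names):
# thai_vector = task_vector(llama3_base_sd, typhoon_sd)
# merged_sd   = apply_task_vector(llama3_instruct_sd, thai_vector, scale=0.5)
```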

Remark: To acknowledge Meta's efforts in creating the foundation model and comply with the license, we explicitly include "llama-3" in the model name.

## **Model Description**

- **Model type**: A 70B instruct decoder-only model based on the Llama architecture
- **Requirement**: Transformers 4.38.0 or newer
- **Primary Language(s)**: Thai 🇹🇭 and English 🇬🇧
- **License**: [**Llama 3 Community License**](https://llama.meta.com/llama3/license/)

## **Performance**

We evaluated the model's performance in **Language & Knowledge Capabilities**, **Instruction Following Capabilities**, and **Agentic Capabilities**.

- **Language & Knowledge Capabilities**:
    - Assessed using multiple-choice question-answering datasets such as ThaiExam and MMLU.
- **Instruction Following Capabilities**:
    - Evaluated based on beta users' feedback, focusing on two factors:
        - **Human Alignment & Reasoning**: Ability to generate responses that are clear and logically structured across multiple steps.
            - Evaluated using [MT-Bench](https://arxiv.org/abs/2306.05685) — how well LLMs align with human needs.
        - **Instruction-following**: Ability to adhere to the constraints specified in the instructions.
            - Evaluated using [IFEval](https://arxiv.org/abs/2311.07911) — how well LLMs follow specified constraints, such as formatting and brevity.
- **Agentic Capabilities**:
    - Evaluated in agent use-cases using [Hugging Face's Transformer Agents](https://huggingface.co/blog/agents) and the associated [benchmark](https://huggingface.co/blog/open-source-llms-as-agents).

Remark: We created the Thai (TH) versions by translating the original datasets into Thai using both machine and human translation.

### ThaiExam

| Model | ONET | IC | TGAT | TPAT-1 | A-Level | Average (ThaiExam) | MMLU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Typhoon-1.5X 70B | **0.565** | 0.68 | **0.778** | **0.517** | 0.56 | **0.620** | 0.7945 |
| gpt-4-0612 | 0.493 | **0.69** | 0.744 | 0.509 | **0.616** | 0.610 | **0.864**\*\* |
| gpt-4o | 0.62 | 0.63 | 0.789 | 0.56 | 0.623 | 0.644 | 0.887\*\* |

\*\* MMLU scores for gpt-4-0612 and gpt-4o are as reported in the [GPT-4o Tech Report](https://openai.com/index/hello-gpt-4o/).

### MT-Bench

| Model | MT-Bench Thai | MT-Bench English |
| --- | --- | --- |
| Typhoon-1.5X 70B | **8.029** | **8.797** |
| gpt-4-0612 | 7.801 | 8.671 |
| gpt-4o | 8.514 | 9.184 |

### IFEval

| Model | IFEval Thai | IFEval English |
| --- | --- | --- |
| Typhoon-1.5X 70B | **0.645** | **0.810** |
| gpt-4-0612 | 0.612 | 0.793\* |
| gpt-4o | 0.737 | 0.871 |

\* Number as reported in the IFEval paper.

### Agent

| Model | GAIA - Thai/English | GSM8K - Thai/English | HotpotQA - Thai/English |
| --- | --- | --- | --- |
| gpt-3.5-turbo-0125 | **18.42**/37.5 | 70/80 | 39.56/59 |
| Typhoon-1.5X 70B | 17.10/36.25 | 80/95 | 52.7/65.83 |
| gpt-4-0612 | 17.10/**38.75** | **90**/**100** | **56.41**/**76.25** |
| gpt-4o | 44.73/57.5 | 100/100 | 71.64/76.58 |

## Insight

We applied **model editing** techniques and found that the features most critical for generating accurate Thai answers reside in the upper layers of the transformer stack. Accordingly, we used a high ratio of Typhoon components in these upper layers to enhance the model's performance.
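
The exact per-layer ratios are not published; purely as a hypothetical illustration of the idea, one could interpolate each layer's weights with a Typhoon share that grows toward the upper layers:

```python
# Hypothetical layer-wise merge schedule; the 0.2/0.8 endpoints are
# illustrative, not the recipe actually used for Typhoon-1.5X.
def typhoon_ratio(layer_idx: int, n_layers: int = 80,
                  low: float = 0.2, high: float = 0.8) -> float:
    """Linearly increase the Typhoon share toward the upper layers."""
    return low + (high - low) * layer_idx / max(n_layers - 1, 1)

def merge_layer(llama_w, typhoon_w, layer_idx: int, n_layers: int = 80):
    r = typhoon_ratio(layer_idx, n_layers)
    return (1.0 - r) * llama_w + r * typhoon_w  # per-tensor interpolation
```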

## **Usage Example**

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "scb10x/llama-3-typhoon-v1.5x-70b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Note: we don't recommend BNB 4-bit (load_in_4bit) for this model. Use the AWQ
# version instead: https://huggingface.co/scb10x/llama-3-typhoon-v1.5x-70b-instruct-awq
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # Illustrative message; replace with your own conversation.
    {"role": "user", "content": "สวัสดีครับ"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.4,
    top_p=0.95,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
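
If GPU memory is tight, the comment in the snippet above points to the official AWQ checkpoint. A short sketch of loading it instead, assuming `autoawq` is installed (recent Transformers versions read the quantization config from the repo and dispatch automatically):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

awq_id = "scb10x/llama-3-typhoon-v1.5x-70b-instruct-awq"
tokenizer = AutoTokenizer.from_pretrained(awq_id)
# Quantized weights are loaded directly; no torch_dtype needed.
model = AutoModelForCausalLM.from_pretrained(awq_id, device_map="auto")
```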

## **Chat Template**

We use the Llama 3 chat template.

```jinja
{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}
```
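
To inspect the exact string this template produces, render it without tokenizing (the message here is illustrative):

```python
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "สวัสดีครับ"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
# <|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nสวัสดีครับ<|eot_id|>
# <|start_header_id|>assistant<|end_header_id|>\n\n
```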

## **Intended Uses & Limitations**

This model is experimental and might not be fully evaluated for all use cases. Developers should assess risks in the context of their specific applications.

## **Follow us**

[**https://twitter.com/opentyphoon**](https://twitter.com/opentyphoon)

## **Support**

[**https://discord.gg/CqyBscMFpg**](https://discord.gg/CqyBscMFpg)

## **SCB 10X Typhoon Team**

- Kunat Pipatanakul, Potsawee Manakul, Sittipong Sripaisarnmongkol, Natapong Nitarach, Pathomporn Chokchainant, Kasima Tharnpipitchai
- If you find Typhoon-1.5X useful for your work, please cite it using:

```bibtex
@article{pipatanakul2023typhoon,
    title={Typhoon: Thai Large Language Models}, 
    author={Kunat Pipatanakul and Phatrasek Jirabovonvisut and Potsawee Manakul and Sittipong Sripaisarnmongkol and Ruangsak Patomwong and Pathomporn Chokchainant and Kasima Tharnpipitchai},
    year={2023},
    journal={arXiv preprint arXiv:2312.13951},
    url={https://arxiv.org/abs/2312.13951}
}
```

## **Contact Us**

- General & Collaboration: [**kasima@scb10x.com**](mailto:kasima@scb10x.com), [**pathomporn@scb10x.com**](mailto:pathomporn@scb10x.com)
- Technical: [**kunat@scb10x.com**](mailto:kunat@scb10x.com)