RichardErkhov committed on
Commit
a04823c
1 Parent(s): b25a244

uploaded readme

Files changed (1)
  1. README.md +146 -0
README.md ADDED
@@ -0,0 +1,146 @@
1
+ Quantization made by Richard Erkhov.
2
+
3
+ [Github](https://github.com/RichardErkhov)
4
+
5
+ [Discord](https://discord.gg/pvy7H8DZMG)
6
+
7
+ [Request more models](https://github.com/RichardErkhov/quant_request)
8
+
9
+
10
+ SeaLLMs-v3-1.5B - GGUF
11
+ - Model creator: https://huggingface.co/SeaLLMs/
12
+ - Original model: https://huggingface.co/SeaLLMs/SeaLLMs-v3-1.5B/
13
+
14
+
15
+ | Name | Quant method | Size |
16
+ | ---- | ---- | ---- |
17
+ | [SeaLLMs-v3-1.5B.Q2_K.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q2_K.gguf) | Q2_K | 0.63GB |
18
+ | [SeaLLMs-v3-1.5B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.IQ3_XS.gguf) | IQ3_XS | 0.68GB |
19
+ | [SeaLLMs-v3-1.5B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.IQ3_S.gguf) | IQ3_S | 0.71GB |
20
+ | [SeaLLMs-v3-1.5B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q3_K_S.gguf) | Q3_K_S | 0.71GB |
21
+ | [SeaLLMs-v3-1.5B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.IQ3_M.gguf) | IQ3_M | 0.72GB |
22
+ | [SeaLLMs-v3-1.5B.Q3_K.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q3_K.gguf) | Q3_K | 0.77GB |
23
+ | [SeaLLMs-v3-1.5B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q3_K_M.gguf) | Q3_K_M | 0.77GB |
24
+ | [SeaLLMs-v3-1.5B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q3_K_L.gguf) | Q3_K_L | 0.82GB |
25
+ | [SeaLLMs-v3-1.5B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.IQ4_XS.gguf) | IQ4_XS | 0.84GB |
26
+ | [SeaLLMs-v3-1.5B.Q4_0.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q4_0.gguf) | Q4_0 | 0.87GB |
27
+ | [SeaLLMs-v3-1.5B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.IQ4_NL.gguf) | IQ4_NL | 0.88GB |
28
+ | [SeaLLMs-v3-1.5B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q4_K_S.gguf) | Q4_K_S | 0.88GB |
29
+ | [SeaLLMs-v3-1.5B.Q4_K.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q4_K.gguf) | Q4_K | 0.92GB |
30
+ | [SeaLLMs-v3-1.5B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q4_K_M.gguf) | Q4_K_M | 0.92GB |
31
+ | [SeaLLMs-v3-1.5B.Q4_1.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q4_1.gguf) | Q4_1 | 0.95GB |
32
+ | [SeaLLMs-v3-1.5B.Q5_0.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q5_0.gguf) | Q5_0 | 1.02GB |
33
+ | [SeaLLMs-v3-1.5B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q5_K_S.gguf) | Q5_K_S | 1.02GB |
34
+ | [SeaLLMs-v3-1.5B.Q5_K.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q5_K.gguf) | Q5_K | 1.05GB |
35
+ | [SeaLLMs-v3-1.5B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q5_K_M.gguf) | Q5_K_M | 1.05GB |
36
+ | [SeaLLMs-v3-1.5B.Q5_1.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q5_1.gguf) | Q5_1 | 1.1GB |
37
+ | [SeaLLMs-v3-1.5B.Q6_K.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q6_K.gguf) | Q6_K | 1.19GB |
38
+ | [SeaLLMs-v3-1.5B.Q8_0.gguf](https://huggingface.co/RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf/blob/main/SeaLLMs-v3-1.5B.Q8_0.gguf) | Q8_0 | 1.53GB |
39
+
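A minimal sketch of how one of the files in the table might be fetched and run locally. It assumes the `huggingface_hub` and `llama-cpp-python` packages are installed; the helper names (`gguf_filename`, `largest_quant_within`, `download_and_complete`) are hypothetical, and only a handful of the sizes from the table are copied in. The budget here refers to file size on disk; memory use at runtime will be higher.

```python
# Repository name and file-name pattern are taken from the table links above.
REPO_ID = "RichardErkhov/SeaLLMs_-_SeaLLMs-v3-1.5B-gguf"

# A few (quant method -> file size in GB) pairs copied from the table.
SIZES_GB = {"Q2_K": 0.63, "Q4_K_M": 0.92, "Q5_K_M": 1.05, "Q6_K": 1.19, "Q8_0": 1.53}


def gguf_filename(quant: str) -> str:
    """File name for a quant method, following the table's naming pattern."""
    return f"SeaLLMs-v3-1.5B.{quant}.gguf"


def largest_quant_within(budget_gb: float) -> str:
    """Pick the largest listed file whose size does not exceed the budget."""
    fitting = {q: s for q, s in SIZES_GB.items() if s <= budget_gb}
    if not fitting:
        raise ValueError(f"no listed quant fits in {budget_gb} GB")
    return max(fitting, key=fitting.get)


def download_and_complete(quant: str, prompt: str, max_tokens: int = 64) -> str:
    """Download one GGUF file and run a plain text completion (needs network)."""
    # Imported lazily so the helpers above work without these packages.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))
    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

For example, `download_and_complete(largest_quant_within(1.0), "Jakarta is the capital of")` would download the Q4_K_M file and return the model's continuation.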
40
+
41
+
42
+
43
+ Original model description:
44
+ ---
45
+ license: other
46
+ license_name: seallms
47
+ license_link: https://huggingface.co/SeaLLMs/SeaLLM-13B-Chat/blob/main/LICENSE
48
+ language:
49
+ - en
50
+ - zh
51
+ - id
52
+ - vi
53
+ - th
54
+ - ms
55
+ - tl
56
+ - ta
57
+ - jv
58
+ tags:
59
+ - sea
60
+ - multilingual
61
+
62
+ ---
63
+
64
+ # *SeaLLMs-v3* - Large Language Models for Southeast Asia
65
+
66
+ <p align="center">
67
+ <a href="https://damo-nlp-sg.github.io/SeaLLMs/" target="_blank" rel="noopener">Website</a>
68
+ &nbsp;&nbsp;
69
+ <a href="https://huggingface.co/SeaLLMs/SeaLLMs-v3-1.5B" target="_blank" rel="noopener">Model</a>
70
+ &nbsp;&nbsp;
71
+ <a href="https://huggingface.co/spaces/SeaLLMs/SeaLLM-Chat" target="_blank" rel="noopener"> 🤗 DEMO</a>
72
+ &nbsp;&nbsp;
73
+ <a href="https://github.com/DAMO-NLP-SG/SeaLLMs" target="_blank" rel="noopener">Github</a>
74
+ &nbsp;&nbsp;
75
+ <a href="https://arxiv.org/pdf/2407.19672" target="_blank" rel="noopener">[NEW] Technical Report</a>
76
+ </p>
77
+
78
+
79
+ We introduce **SeaLLMs-v3**, the latest series of the SeaLLMs (Large Language Models for Southeast Asian languages) family. It achieves state-of-the-art performance among models of similar size, excelling across a diverse array of tasks such as world knowledge, mathematical reasoning, translation, and instruction following. It has also been specifically enhanced to be more trustworthy, exhibiting reduced hallucination and providing safe responses, particularly for queries closely related to Southeast Asian culture.
80
+
81
+ ## 🔥 Highlights
82
+
83
+ - State-of-the-art performance compared to open-source models of similar sizes, evaluated across various dimensions such as human exam questions, instruction-following, mathematics, and translation.
84
+ - Significantly enhanced instruction-following capability, especially in multi-turn settings.
85
+ - Safer to use, with significantly fewer hallucinations and greater sensitivity to local contexts.
86
+
87
+ ## Uses
88
+
89
+ SeaLLMs is tailored for handling a wide range of languages spoken in the SEA region, including English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese.
90
+
91
+ This page introduces the **SeaLLMs-v3-1.5B** model, which can be easily fine-tuned for your specific downstream tasks, especially in SEA languages.
92
+ Note that this is a base model. If you are looking for a model that can be applied directly to your downstream applications, you may want to check the chat version: **[SeaLLMs-v3-1.5B-Chat](https://huggingface.co/SeaLLMs/SeaLLMs-v3-1.5B-Chat)**.
93
+
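Because this is a base model rather than a chat model, it is typically prompted with plain text to continue instead of a chat template. The sketch below shows one way this could look with the `transformers` library; the `Question:`/`Answer:` layout and both function names are illustrative assumptions, not a format prescribed by the model authors.

```python
def build_few_shot_prompt(examples, query):
    """Format (question, answer) pairs plus a new question as one completion prompt."""
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    # The final block is left open so the model continues with the answer.
    blocks.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(blocks)


def complete_with_base_model(prompt, max_new_tokens=32):
    """Greedy continuation of a prompt with the original model (downloads weights)."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "SeaLLMs/SeaLLMs-v3-1.5B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A prompt built with `build_few_shot_prompt([("What is the capital of Thailand?", "Bangkok")], "What is the capital of Vietnam?")` can then be passed to `complete_with_base_model`.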
94
+
95
+ ## Evaluation
98
+
99
+ We evaluate SeaLLMs-v3-1.5B mainly using human exam questions.
100
+
101
+ #### Multilingual World Knowledge - M3Exam
102
+
103
+ [M3Exam](https://arxiv.org/abs/2306.05179) consists of local exam questions collected from each country. It reflects the model's world knowledge (e.g., with language or social science subjects) and reasoning abilities (e.g., with mathematics or natural science subjects).
104
+
105
+ | Model | en | zh | id | th | vi | avg | avg_sea |
106
+ | :------------------ | --------: | --------: | --------: | --------: | --------: | --------: | --------: |
107
+ | Gemma-2B | 0.411 | 0.267 | 0.296 | 0.283 | 0.313 | 0.314 | 0.297 |
108
+ | Sailor-1.8B | 0.270 | 0.239 | 0.250 | 0.261 | 0.260 | 0.256 | 0.257 |
109
+ | Sailor-4B | 0.387 | 0.295 | 0.275 | 0.296 | 0.311 | 0.313 | 0.294 |
110
+ | Qwen2-1.5B | 0.628 | **0.753** | 0.409 | 0.352 | 0.443 | 0.517 | 0.401 |
111
+ | **SeaLLMs-v3-1.5B** | **0.635** | 0.745 | **0.424** | **0.371** | **0.465** | **0.528** | **0.420** |
112
+
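The table does not define its `avg` and `avg_sea` columns. The sketch below assumes that `avg` is the unweighted mean over all five languages and `avg_sea` the unweighted mean over `id`, `th`, and `vi`; under that assumption the values recomputed from the per-language scores match the M3Exam table above. The function name `averages` is hypothetical.

```python
# Per-language M3Exam scores copied from the table above.
M3EXAM = {
    "Gemma-2B": {"en": 0.411, "zh": 0.267, "id": 0.296, "th": 0.283, "vi": 0.313},
    "Sailor-1.8B": {"en": 0.270, "zh": 0.239, "id": 0.250, "th": 0.261, "vi": 0.260},
    "Sailor-4B": {"en": 0.387, "zh": 0.295, "id": 0.275, "th": 0.296, "vi": 0.311},
    "Qwen2-1.5B": {"en": 0.628, "zh": 0.753, "id": 0.409, "th": 0.352, "vi": 0.443},
    "SeaLLMs-v3-1.5B": {"en": 0.635, "zh": 0.745, "id": 0.424, "th": 0.371, "vi": 0.465},
}

# Assumption: the SEA average covers only these three columns.
SEA_LANGS = ("id", "th", "vi")


def averages(scores):
    """Return (avg, avg_sea) as unweighted means rounded to three decimals."""
    avg = sum(scores.values()) / len(scores)
    avg_sea = sum(scores[lang] for lang in SEA_LANGS) / len(SEA_LANGS)
    return round(avg, 3), round(avg_sea, 3)


for model, scores in M3EXAM.items():
    print(model, *averages(scores))
```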
113
+ #### Multilingual World Knowledge - MMLU
114
+
115
+ [MMLU](https://arxiv.org/abs/2009.03300) questions are translated to SEA languages for evaluation, which primarily tests the cross-lingual alignment of the model as the required knowledge is still mainly Western-focused.
116
+
117
+ | Model | en | zh | id | th | vi | avg | avg_sea |
118
+ | :------------------ | --------: | --------: | --------: | --------: | --------: | --------: | --------: |
119
+ | Gemma-2B | 0.374 | 0.304 | 0.315 | 0.292 | 0.305 | 0.318 | 0.304 |
120
+ | Sailor-1.8B | 0.293 | 0.251 | 0.268 | 0.256 | 0.256 | 0.265 | 0.260 |
121
+ | Sailor-4B | 0.333 | 0.267 | 0.299 | 0.278 | 0.282 | 0.292 | 0.286 |
122
+ | Qwen2-1.5B | 0.552 | **0.491** | 0.426 | 0.366 | 0.398 | 0.447 | 0.397 |
123
+ | **SeaLLMs-v3-1.5B** | **0.553** | 0.487 | **0.443** | **0.377** | **0.423** | **0.456** | **0.414** |
124
+
125
+ ## Acknowledgement to Our Linguists
126
+
127
+ We would like to express our special thanks to our professional and native linguists, Tantong Champaiboon, Nguyen Ngoc Yen Nhi and Tara Devina Putri, who helped build, evaluate, and fact-check our sampled pretraining and SFT datasets, and who evaluated our models across different aspects, especially safety.
128
+
129
+
130
+ ## Citation
131
+
132
+ If you find our project useful, we hope you would kindly star our repo and cite our work as follows:
133
+
134
+ ```
135
+ @article{damonlp2024seallm3,
136
+ author = {Wenxuan Zhang* and Hou Pong Chan* and Yiran Zhao* and Mahani Aljunied* and
137
+ Jianyu Wang* and Chaoqun Liu and Yue Deng and Zhiqiang Hu and Weiwen Xu and
138
+ Yew Ken Chia and Xin Li and Lidong Bing},
139
+ title = {SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages},
140
+ year = {2024},
141
+ url = {https://arxiv.org/abs/2407.19672}
142
+ }
143
+ ```
144
+
145
+ Corresponding Author: l.bing@alibaba-inc.com
146
+