Update README.md
Browse files
README.md
CHANGED
@@ -30,8 +30,8 @@ We introduce [SeaLLM-7B-v2](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2), the st
|
|
30 |
- Technical report: [Arxiv: SeaLLMs - Large Language Models for Southeast Asia](https://arxiv.org/pdf/2312.00738.pdf).
|
31 |
- Model weights:
|
32 |
- [SeaLLM-7B-v2](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2).
|
33 |
-
- [SeaLLM-7B-v2-gguf](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2-gguf).
|
34 |
-
- [SeaLLM-7B-v2-GGUF (
|
35 |
|
36 |
|
37 |
<blockquote style="color:red">
|
@@ -155,7 +155,9 @@ You are a helpful assistant.</s><|im_start|>user
|
|
155 |
Hello world</s><|im_start|>assistant
|
156 |
Hi there, how can I help?</s>"""
|
157 |
|
158 |
-
# NOTE previous commit has \n between </s> and <|im_start|>, that was incorrect!
|
|
|
|
|
159 |
|
160 |
# ! ENSURE 1 and only 1 bos `<s>` at the beginning of sequence
|
161 |
print(tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt)))
|
@@ -171,6 +173,7 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
|
|
171 |
|
172 |
device = "cuda" # the device to load the model onto
|
173 |
|
|
|
174 |
model = AutoModelForCausalLM.from_pretrained("SeaLLMs/SeaLLM-7B-v2", torch_dtype=torch.bfloat16, device_map=device)
|
175 |
tokenizer = AutoTokenizer.from_pretrained("SeaLLMs/SeaLLM-7B-v2")
|
176 |
|
@@ -201,6 +204,8 @@ from vllm import LLM, SamplingParams
|
|
201 |
TURN_TEMPLATE = "<|im_start|>{role}\n{content}</s>"
|
202 |
TURN_PREFIX = "<|im_start|>{role}\n"
|
203 |
|
|
|
|
|
204 |
def seallm_chat_convo_format(conversations, add_assistant_prefix: bool, system_prompt=None):
|
205 |
# conversations: list of dict with key `role` and `content` (openai format)
|
206 |
if conversations[0]['role'] != 'system' and system_prompt is not None:
|
|
|
30 |
- Technical report: [Arxiv: SeaLLMs - Large Language Models for Southeast Asia](https://arxiv.org/pdf/2312.00738.pdf).
|
31 |
- Model weights:
|
32 |
- [SeaLLM-7B-v2](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2).
|
33 |
+
- [SeaLLM-7B-v2-gguf](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2-gguf). Run with LM Studio: [SeaLLM-7B-v2-q4_0](https://huggingface.co/SeaLLMs/SeaLLM-7B-v2-gguf/blob/main/SeaLLM-7B-v2.q4_0.gguf) and SeaLLM-7B-v2-q8_0.
|
34 |
+
- [SeaLLM-7B-v2-GGUF (thanks LoneStriker)](https://huggingface.co/LoneStriker/SeaLLM-7B-v2-GGUF). NOTE: LoneStriker's GGUF uses an old and incorrect chat format (see below).
|
35 |
|
36 |
|
37 |
<blockquote style="color:red">
|
|
|
155 |
Hello world</s><|im_start|>assistant
|
156 |
Hi there, how can I help?</s>"""
|
157 |
|
158 |
+
# NOTE: the previous commit had \n between </s> and <|im_start|>; that was incorrect!
|
159 |
+
# <|im_start|> is not a special token.
|
160 |
+
# Transformers chat_template should be consistent with vLLM format below.
|
161 |
|
162 |
# ! ENSURE 1 and only 1 bos `<s>` at the beginning of sequence
|
163 |
print(tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt)))
|
|
|
173 |
|
174 |
device = "cuda" # the device to load the model onto
|
175 |
|
176 |
+
# Use bfloat16 to ensure the best performance.
|
177 |
model = AutoModelForCausalLM.from_pretrained("SeaLLMs/SeaLLM-7B-v2", torch_dtype=torch.bfloat16, device_map=device)
|
178 |
tokenizer = AutoTokenizer.from_pretrained("SeaLLMs/SeaLLM-7B-v2")
|
179 |
|
|
|
204 |
TURN_TEMPLATE = "<|im_start|>{role}\n{content}</s>"
|
205 |
TURN_PREFIX = "<|im_start|>{role}\n"
|
206 |
|
207 |
+
# There is no \n between </s> and <|im_start|>.
|
208 |
+
|
209 |
def seallm_chat_convo_format(conversations, add_assistant_prefix: bool, system_prompt=None):
|
210 |
# conversations: list of dict with key `role` and `content` (openai format)
|
211 |
if conversations[0]['role'] != 'system' and system_prompt is not None:
|