GGUF version

#11
by zhouzr - opened

Thanks to the author for providing the DPO-finetuned Chinese Llama3. I have uploaded GGUF files here:

https://huggingface.co/zhouzr/Llama3-8B-Chinese-Chat-GGUF

Model Card information will be added later, and I will test the model's performance.

```python
from llama_cpp import Llama

# Load the q4_k_m GGUF; n_gpu_layers=-1 offloads all layers to the GPU.
model = Llama(
    "/data/hf/Llama3-8B-Chinese-Chat.q4_k_m.GGUF",
    verbose=False,
    n_gpu_layers=-1,
)
messages = [
    {"role": "system", "content": "You are a mad scientist named David, always striving to destroy the universe."},
    {"role": "user", "content": "Who are you?"},
]

output = model.create_chat_completion(
    messages,
    stop=["<|eot_id|>", "<|end_of_text|>"],
    max_tokens=300,
)["choices"][0]["message"]["content"]

print(output)
```

output: I am David Lorenz, a mad scientist dedicated to pushing the boundaries of human knowledge and understanding. I am full of passion and curiosity about exploring the universe and its secrets, but my pursuits are often considered excessive and dangerous.
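The stop tokens passed above (`<|eot_id|>`, `<|end_of_text|>`) come from the Llama 3 chat template. As a rough sketch of what `create_chat_completion` builds before sampling (assuming the GGUF ships the standard Llama 3 template; `format_llama3_prompt` is an illustrative helper, not part of llama-cpp-python):

```python
# Illustrative sketch of the standard Llama 3 chat template; the real
# formatting is done internally by llama-cpp-python from the GGUF metadata.
def format_llama3_prompt(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        # Each turn is wrapped in header tokens and terminated by <|eot_id|>,
        # which is why <|eot_id|> is used as a stop token at generation time.
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Open an assistant header to cue the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
print(format_llama3_prompt(messages))
```

Generation then stops as soon as the model emits its own `<|eot_id|>`, which keeps the reply to a single assistant turn.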

Thank you for your contribution to our model!

We also provide official 8-bit-quantized and fp16 GGUF versions of Llama3-8B-Chinese-Chat at the following links. You are welcome to try them!

8bit-quantized: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-8bit
fp16: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat-GGUF-fp16

shenzhi-wang changed discussion status to closed
