Llama3-8B-Chinese-Chat-32k

Training Method

Usage

Same as the original model:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "yuyijiong/Llama3-8B-Chinese-Chat-32k"

# Load the tokenizer and model; torch_dtype="auto" keeps the checkpoint's
# native dtype (BF16) and device_map="auto" places weights on available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "写一首诗吧"},  # "Write a poem"
]

# Build the Llama-3 chat prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=32768,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Strip the prompt tokens and decode only the newly generated reply.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))

Long-Context Performance

Compared with the original version, this model has stronger long-context capability.
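
As an illustration of how the extended window can be used (a sketch, not part of the original card), an entire document can be placed inside a single chat turn, reusing the model and tokenizer loaded above; the file name long_document.txt is a hypothetical placeholder for any text that fits within the 32k-token window:

# Hypothetical placeholder file; any text fitting the 32k window works.
with open("long_document.txt", encoding="utf-8") as f:
    document = f.read()

messages = [
    # "Summarize the key points of the document below"
    {"role": "user", "content": "请总结以下文档的要点:\n\n" + document},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Keep prompt length plus new tokens within the 32k context window.
outputs = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))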

LongBench (en)

| model                      | hotpotqa | multifieldqa_en | passage_retrieval_en | qmsum | trec  |
|----------------------------|----------|-----------------|----------------------|-------|-------|
| llama3-8b-chinese-chat     | 45.88    | 50.56           | 68.00                | 22.52 | 73.00 |
| llama3-8b-chinese-chat-32k | 47.64    | 49.98           | 100.00               | 25.13 | 75.00 |

LongBench (zh)

| model                      | dureader | multifieldqa_zh | passage_retrieval_zh | vcsum | lsht  |
|----------------------------|----------|-----------------|----------------------|-------|-------|
| llama3-8b-chinese-chat     | 29.08    | 58.40           | 93.50                | 14.61 | 28.25 |
| llama3-8b-chinese-chat-32k | 32.31    | 58.66           | 82.50                | 16.15 | 38.50 |
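
For context on how such numbers are produced, the sketch below runs the model on a few samples of one LongBench subset, reusing the model and tokenizer from the usage example. The dataset id THUDM/LongBench and its fields (context, input, answers) come from the public LongBench release, but the prompt format and scoring here are deliberately simplified relative to the official evaluation scripts:

from datasets import load_dataset

# LongBench ships a dataset loading script, so recent versions of the
# datasets library require trust_remote_code=True to load it.
data = load_dataset(
    "THUDM/LongBench", "hotpotqa", split="test", trust_remote_code=True
)

for sample in data.select(range(3)):  # a few samples only, for illustration
    # Simplified prompt; the official scripts use per-task templates and
    # truncate over-long contexts in the middle.
    prompt = sample["context"] + "\n\nQuestion: " + sample["input"] + "\nAnswer:"
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=64, do_sample=False)
    prediction = tokenizer.decode(
        outputs[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
    print(prediction, "| gold:", sample["answers"])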
