---
license: llama3
datasets:
- yuyijiong/Long-Instruction-with-Paraphrasing
language:
- zh
- en
pipeline_tag: text-generation
---
# Llama3-8b-chinese-chat-32k
* 📄[Paper](https://arxiv.org/abs/2312.11193)
* 📚[Dataset Download](https://huggingface.co/datasets/yuyijiong/Long-Instruction-with-Paraphrasing)
* ✨[GitHub](https://github.com/yuyijiong/train_with_paraphrasing)
## Training Method
* Context length extended to **32k** with the NTK-aware method.
* Starting from [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat), fine-tuned with QLoRA for 1 epoch on the [Long-Instruction-with-Paraphrasing](https://huggingface.co/datasets/yuyijiong/Long-Instruction-with-Paraphrasing) dataset (a hedged sketch of this setup follows the list).
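
The card does not publish the exact training configuration, so the following is only a minimal sketch of what an NTK-aware (dynamic RoPE scaling) QLoRA setup typically looks like with `transformers` and `peft`. The scaling factor, LoRA rank, alpha, and target modules below are illustrative assumptions, not the author's recorded hyperparameters.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantized base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# NTK-aware context extension: Llama3's native 8k window * factor 4 = 32k.
# The factor is an assumption inferred from the advertised 32k length.
model = AutoModelForCausalLM.from_pretrained(
    "shenzhi-wang/Llama3-8B-Chinese-Chat",
    quantization_config=bnb_config,
    rope_scaling={"type": "dynamic", "factor": 4.0},
    device_map="auto",
)

# LoRA adapters on the attention projections; rank/alpha are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```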
## Usage
Same as the original model:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "yuyijiong/Llama3-8B-Chinese-Chat-32k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "写一首诗吧"},  # "Write a poem"
]

# Build the Llama3 chat prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=32768,  # prompt + output must still fit in the 32k window
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Strip the prompt tokens and decode only the newly generated response.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
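
To exercise the extended window, you can pack a long document into a single turn. This sketch reuses `tokenizer` and `model` from the block above; `long_report.txt` and the summarization prompt are hypothetical placeholders, not part of the original card.

```python
# Hypothetical long-context example: summarize a document that would
# overflow the original 8k window but fits within 32k.
with open("long_report.txt", encoding="utf-8") as f:
    document = f.read()

messages = [
    # "Please summarize the following document:"
    {"role": "user", "content": f"请总结以下文档:\n\n{document}"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```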
## Long-Context Performance
Stronger long-context capability than the original model.
### LongBench (en)
| model                      | hotpotqa  | multifieldqa_en | passage_retrieval_en | qmsum     | trec      |
|----------------------------|-----------|-----------------|----------------------|-----------|-----------|
| llama3-8b-chinese-chat     | 45.88     | 50.56           | 68.00                | 22.52     | 73.00     |
| llama3-8b-chinese-chat-32k | **47.64** | 49.98           | **100.00**           | **25.13** | **75.00** |
### LongBench (zh)
| model | dureader | multifieldqa_zh | passage_retrieval_zh | vcsum | lsht |
|----------------------------|-----------|-----------------|----------------------|-----------|----------|
| llama3-8b-chinese-chat | 29.08 | 58.4 | 93.5 | 14.61 | 28.25 |
| llama3-8b-chinese-chat-32k | **32.31** | **58.66** | 82.5 | **16.15** | **38.5** |