---
license: cc-by-nc-4.0
datasets:
- yuyijiong/Long-Instruction-Chinese
language:
- zh
- en
pipeline_tag: text-generation
---
|
* [LongAlpaca](https://huggingface.co/Yukang/LongAlpaca-7B) fine-tunes llama2-chat on a small amount of long-text data and shows strong long-context conversational ability.

* LongAlpaca-7b-chinese follows a similar training recipe to LongAlpaca: first apply linear position interpolation, then fine-tune on a small amount of long-text data to obtain strong long-context conversational ability.

* The training data is similar to LongAlpaca's, with additional multi-document question-answering data.

* This model was obtained by LoRA fine-tuning of atom-7b-chat. Linear position interpolation extends the context length from 4k to 32k tokens (see the sketch below), so the model can handle tasks such as multi-document retrieval and paper summarization over inputs of tens of thousands of characters. This covers the vast majority of use cases, with almost no degradation in short-conversation ability.
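A minimal sketch of what linear position interpolation looks like in `transformers` (the released checkpoint presumably already ships the scaled RoPE config, so this would only be needed when extending the 4k base model yourself; the base-model path and the scaling factor 8.0 = 32k / 4k are assumptions, not from the model card):

```python
from transformers import AutoModelForCausalLM

# Linear RoPE interpolation via the standard `rope_scaling` config option
# (supported for LLaMA-family models in transformers >= 4.31).
# factor = target context / original context = 32k / 4k = 8.0 (assumed here).
base_model = AutoModelForCausalLM.from_pretrained(
    "FlagAlpha/Atom-7B-Chat",  # assumed base-model path, for illustration only
    rope_scaling={"type": "linear", "factor": 8.0},
    device_map="auto",
)
```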
|
Usage:
|
```python
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

model_path = "yuyijiong/LongAlpaca-7b-chinese"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# device_map="auto" automatically places the model on the available devices;
# load_in_8bit=True quantizes the weights to 8 bit to reduce memory usage.
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", load_in_8bit=True).eval()

question = "中国的首都是什么?"  # "What is the capital of China?"
input_text = "<s>Human: " + question + "\n</s><s>Assistant: "
input_ids = tokenizer(input_text, return_tensors='pt').input_ids.to(model.device)

with torch.no_grad():
    with torch.autocast('cuda'):
        output = model.generate(input_ids=input_ids,
                                max_new_tokens=512,  # adjust to taste
                                do_sample=True,
                                temperature=0.85,
                                top_k=None,
                                top_p=0.9,
                                use_cache=True)

reply = tokenizer.decode(output[0], skip_special_tokens=False)
reply_return = reply.split('Assistant:')[-1].replace('</s>', '')

print('Model answer:', reply_return)
```
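
For the long-context tasks the card advertises (paper summarization, multi-document QA), the prompt follows the same `Human:`/`Assistant:` template; a minimal sketch, reusing `model` and `tokenizer` from above (the file name and instruction wording are illustrative assumptions, not from the model card):

```python
# Long-document summarization with the same chat template (illustrative sketch).
with open("paper.txt", encoding="utf-8") as f:  # hypothetical input file
    document = f.read()  # up to ~32k tokens after position interpolation

question = "请总结以下论文:\n" + document  # "Please summarize the following paper:"
input_text = "<s>Human: " + question + "\n</s><s>Assistant: "
input_ids = tokenizer(input_text, return_tensors='pt').input_ids.to(model.device)

with torch.no_grad():
    with torch.autocast('cuda'):
        output = model.generate(input_ids=input_ids, max_new_tokens=512,
                                do_sample=True, temperature=0.85, top_p=0.9,
                                use_cache=True)

# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```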