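---
language:
- zh
pipeline_tag: text-generation
---

FP16 model converted from the AquilaChat-7b v0.6 PyTorch model: https://github.com/FlagAI-Open/FlagAI/tree/master/examples/Aquila/Aquila-chat

Supports inference with AutoModelForCausalLM, ORTModelForCausalLM and OVModelForCausalLM.

```python
# Quote the requirement specifiers so the shell does not treat ">" as a redirect.
#!pip install "transformers>=4.29.2"
#!pip install "optimum>=1.8.7" "optimum-intel[openvino]==1.9.1"

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained('sammysun0711/aquilachat-7b-hf')
model = AutoModelForCausalLM.from_pretrained('sammysun0711/aquilachat-7b-hf', trust_remote_code=True)
model = model.eval()

# Alternative backend: ONNX Runtime via Optimum
# from optimum.onnxruntime import ORTModelForCausalLM
# model = ORTModelForCausalLM.from_pretrained('sammysun0711/aquilachat-7b-hf', export=True, use_cache=True, trust_remote_code=True)

# Alternative backend: OpenVINO via Optimum Intel
# from optimum.intel import OVModelForCausalLM
# model = OVModelForCausalLM.from_pretrained('sammysun0711/aquilachat-7b-hf', export=True, use_cache=True, trust_remote_code=True)

# "Why is Beijing the capital of China?"
question = '北京为什么是中国的首都?'
# The two adjacent literals are concatenated with no separator between them.
prompt = (
    '''A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.'''
    f'''###Human: {question}###Assistant:'''
)

with torch.no_grad():
    ret = model.generate(
        **tokenizer(prompt, return_tensors='pt').to('cpu'),
        do_sample=False,  # greedy decoding
        max_new_tokens=200,
        use_cache=True
    )

# The decoded text contains the prompt followed by the generated answer.
print(tokenizer.decode(ret.tolist()[0]))
```

> 北京是中国的首都,是因为它在中国历史和文化中具有重要的地位,被选中作为中国的政治中心。在中国古代,北京是几个朝代的首都,如辽、金、元、明、清朝。在这些朝代,北京都是政治、经济、文化中心和军事重镇。此外,北京还是现代中国的政治中心,有着重要的国际地位。
>
> (English translation: Beijing is the capital of China because it holds an important place in Chinese history and culture and was chosen as China's political center. In ancient China, Beijing was the capital of several dynasties, such as the Liao, Jin, Yuan, Ming, and Qing. During these dynasties, Beijing was a political, economic, and cultural center and a key military stronghold. In addition, Beijing is the political center of modern China and has important international standing.)

The AquilaChat-7B open-source model is released under the BAAI Aquila Model License Agreement (《智源Aquila系列模型许可协议》); the original code is licensed under the Apache License 2.0.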
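
Because `generate` on a causal language model returns the prompt tokens followed by the continuation, the decoded string in the example above includes the prompt as well as the answer. The following is a minimal sketch for building the prompt and separating out the answer; it is not part of the original model card. `build_prompt` and `extract_answer` are hypothetical helper names, the template is copied from the example above, and treating `###` as a turn delimiter and `</s>` as an end-of-sequence marker in the decoded text are assumptions about the model's output format.

```python
# Prompt template copied from the example above; there is intentionally no
# separator between the system text and "###Human:".
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)


def build_prompt(question: str) -> str:
    """Hypothetical helper: wrap a question in the chat template."""
    return f"{SYSTEM}###Human: {question}###Assistant:"


def extract_answer(decoded: str) -> str:
    """Hypothetical helper: keep only the assistant's reply from decoded text."""
    # Keep the text after the last "###Assistant:" marker.
    answer = decoded.rsplit("###Assistant:", 1)[-1]
    # Assumption: a further "###" would begin a new turn, so cut there.
    answer = answer.split("###", 1)[0]
    # Assumption: an end-of-sequence marker such as "</s>" may be present.
    return answer.replace("</s>", "").strip()


# Demonstration with a made-up decoded string (not real model output).
fake_decoded = build_prompt("Hi") + " Hello.</s>"
print(extract_answer(fake_decoded))  # prints "Hello."
```

With the model loaded as in the example, `extract_answer(tokenizer.decode(ret.tolist()[0]))` would be the corresponding call; whether the delimiters match the real output should be checked against an actual generation.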