|
--- |
|
license: mit |
|
tags: |
|
- text-generation
|
- RAG |
|
- baichuan2 |
|
--- |
|
|
|
This model is a 7B Chinese version of [Self-RAG](https://huggingface.co/selfrag/selfrag_llama2_7b). |
|
|
|
It is fine-tuned from Baichuan2-7B-Chat on a sample of [BELLE](https://github.com/LianjiaTech/BELLE) SFT data, augmented with interleaved passages retrieved from zhwiki. The reflection tokens are aligned with the original (English) version, so the usage is the same. Hope you enjoy it.
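
Because the reflection tokens follow the original Self-RAG vocabulary, outputs can be post-processed the same way. Below is a minimal sketch, assuming the token set documented for the original Self-RAG carries over unchanged; verify the exact strings against this model's tokenizer.

```python
import re

# Reflection tokens documented for the original Self-RAG (assumed unchanged in this model).
REFLECTION_TOKENS = [
    "[Retrieval]", "[No Retrieval]", "[Continue to Use Evidence]",
    "[Relevant]", "[Irrelevant]",
    "[Fully supported]", "[Partially supported]", "[No support / Contradictory]",
] + ["[Utility:{0}]".format(i) for i in range(1, 6)]

def strip_reflection_tokens(text):
    # Drop reflection tokens and the end-of-sequence marker, then collapse whitespace,
    # leaving only the plain answer text.
    for token in REFLECTION_TOKENS + ["</s>"]:
        text = text.replace(token, "")
    return re.sub(r"\s+", " ", text).strip()

# e.g. strip_reflection_tokens("[Relevant] 太和殿、中和殿、保和殿 [Utility:5] </s>")
# -> "太和殿、中和殿、保和殿"
```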
|
|
|
### Usage |
|
```python
|
from vllm import LLM, SamplingParams
|
|
|
model = LLM(YOUR_MODEL_PATH, dtype="half")  # replace YOUR_MODEL_PATH with the local path or Hub id of this model
|
sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False)  # keep skip_special_tokens=False so the reflection tokens appear in the output
|
|
|
def format_prompt(input, paragraph=None): |
|
prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input) |
|
if paragraph is not None: |
|
prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph) |
|
return prompt |
|
|
|
query_1 = "你好呀"  # "Hi there"
|
query_2 = "故宫三大殿是哪些?"  # "What are the three great halls of the Forbidden City?"
|
queries = [query_1, query_2] |
|
|
|
preds = model.generate([format_prompt(query) for query in queries], sampling_params) |
|
for pred in preds: |
|
print("Model prediction: {0}".format(pred.outputs[0].text)) |
|
# Model prediction: [No Retrieval] 你好!有什么我可以帮你解答的问题吗? [Utility:5] </s> |
|
# Model prediction: [Relevant] 太和殿、中和殿、保和殿 [Utility:5] </s> |
|
``` |
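
To condition generation on a retrieved passage, pass it through the `paragraph` argument of `format_prompt`, as in the original Self-RAG. A minimal sketch follows; the passage text and the commented output are illustrative, not verified model outputs.

```python
passage = "太和殿、中和殿、保和殿位于北京故宫中轴线上,合称故宫三大殿。"  # illustrative retrieved passage
preds = model.generate([format_prompt(query_2, paragraph=passage)], sampling_params)
print("Model prediction: {0}".format(preds[0].outputs[0].text))
# With a passage supplied, the output is expected to carry relevance/support reflection tokens,
# e.g. [Relevant] ... [Fully supported] ... [Utility:5], mirroring the original Self-RAG.
```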
|
|
|
|