
This model is a 7B Chinese version of Self-RAG.

It is trained from Baichuan2-7B-Chat on a sample of BELLE SFT data, interleaved with passages retrieved from Chinese Wikipedia (zhwiki). The reflection tokens are aligned with the original English version of Self-RAG, so the usage is the same. Hope you enjoy it.

Usage

# Inference runs through vLLM; keep special tokens so the reflection tokens stay visible.
from vllm import LLM, SamplingParams

model = LLM(YOUR_MODEL_PATH, dtype="half")  # YOUR_MODEL_PATH: local path or hub repo id
sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False)

def format_prompt(input, paragraph=None):
    prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input)
    if paragraph is not None:
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt

query_1 = "你好呀"  # "Hi there" -- chit-chat, no retrieval needed
query_2 = "故宫三大殿是哪些?"  # "Which are the three great halls of the Forbidden City?" -- factual
queries = [query_1, query_2]

preds = model.generate([format_prompt(query) for query in queries], sampling_params)
for pred in preds:
    print("Model prediction: {0}".format(pred.outputs[0].text))
# Model prediction: [No Retrieval] 你好!有什么我可以帮你解答的问题吗? [Utility:5] </s>
# Model prediction: [Retrieval] <paragraph> ... (this query requires factual grounding, call a retriever) </paragraph> [Relevant] 太和殿、中和殿、保和殿 [Utility:5] </s>
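When the model emits [Retrieval], you call your own retriever and feed the passage back through format_prompt; the final answer can then be recovered by stripping the reflection tokens. A minimal sketch (the passage text and the stripping regex are illustrative, not part of the released code):

```python
import re

def format_prompt(input, paragraph=None):
    # Same helper as above.
    prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input)
    if paragraph is not None:
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt

# Illustrative passage; in practice this comes from your retriever.
passage = "太和殿、中和殿、保和殿合称故宫三大殿。"
prompt = format_prompt("故宫三大殿是哪些?", paragraph=passage)
print(prompt)

# Strip Self-RAG reflection tokens (e.g. [Relevant], [Utility:5]) and the EOS
# marker to get plain text from a raw generation.
raw = "[Relevant] 太和殿、中和殿、保和殿 [Utility:5] </s>"
clean = re.sub(r"\[.*?\]|</s>", "", raw).strip()
print(clean)  # 太和殿、中和殿、保和殿
```

The second call to the model with the retrieved paragraph uses the same model.generate(...) pattern shown above.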

Data

The training data is also released (FINAL_OUTPUT_4w.jsonl), constructed from BELLE SFT data and Chinese Wikipedia documents. Hope you enjoy it!
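Since the file is JSON Lines, each line is an independent JSON object and can be streamed record by record. A minimal sketch of loading it; the field names below are hypothetical stand-ins, since the schema is not documented here:

```python
import json

# Hypothetical two-record sample standing in for FINAL_OUTPUT_4w.jsonl;
# the real field names may differ, so the code only counts records and
# inspects the keys of the first one.
sample = (
    '{"instruction": "...", "output": "..."}\n'
    '{"instruction": "...", "output": "..."}\n'
)

records = [json.loads(line) for line in sample.splitlines() if line.strip()]
print(len(records))              # 2
print(sorted(records[0].keys()))
```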
