Uploaded model
- Developed by: jingwang
- License: apache-2.0
- Finetuned from model: unsloth/mistral-7b-v0.3-bnb-4bit
This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Install dependencies in Google Colab
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
Inference
from unsloth import FastLanguageModel
from typing import Dict, List, Tuple, Union, Any
import json  # used to serialize the answer/citation target
import logging
import pandas
from tqdm import trange, tqdm
import torch
class FormatPrompt_QA_with_citation():
    '''Format prompts for question answering with citation.'''

    def __init__(self, eos_token: str = '</s>') -> None:
        self.inputs = ['context', 'question']  # required input fields
        self.outputs = ['answer', 'citation']  # training targets, and the fields of the model's output
        self.eos_token = eos_token

    def __call__(self, instance: Dict[str, Any]) -> str:
        '''
        Format one instance as a prompt.
        Args:
            instance: dictionary with required keys 'context' and 'question';
                'answer' and 'citation' are optional and only used for training
        Returns:
            prompt: formatted prompt
        '''
        return self.formatting_prompt_func(instance)

    def formatting_prompt_func(self, instance: dict) -> str:
        '''Format a prompt for domain-specific QA.
        Note: this is for fine-tuning a pre-trained model;
        if starting from an instruct-tuned model, use `tokenizer.apply_chat_template(messages)` instead.
        '''
        assert all(item in instance for item in self.inputs), f"instance must have {self.inputs}!"
        # The continuation lines of the template stay flush left so that no indentation ends up in the prompt.
        prompt = f"""<s> [INST] Context: {str(instance["context"])}\
Question: {str(instance["question"])}
Answer: [/INST]"""
        if 'answer' in instance:
            # Training example: append the target as JSON, followed by the EOS token.
            answer = {"answer": str(instance['answer']), "citation": str(instance.get('citation', ''))}
            prompt += json.dumps(answer, ensure_ascii=False) + self.eos_token
        return prompt
formatting_func = FormatPrompt_QA_with_citation()
# pull the model from the Hugging Face Hub
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "jingwang/mistral_qa_citation",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
# inference
FastLanguageModel.for_inference(model)
example = {'context': 'John Gadsby Chapman , The Baptism of Pocahontas (1840). A copy is on display in the Rotunda of the United States Capitol . During her stay at Henricus, Pocahontas met John Rolfe. Rolfe\'s English-born wife Sarah Hacker and child Bermuda had died on the way to Virginia after the wreck of the ship Sea Venture on the Summer Isles, now known as Bermuda. He established the Virginia plantation Varina Farms , where he cultivated a new strain of tobacco . Rolfe was a pious man and agonized over the potential moral repercussions of marrying a heathen, though in fact Pocahontas had accepted the Christian faith and taken the baptismal name Rebecca. In a long letter to the governor requesting permission to wed her, he expressed his love for Pocahontas and his belief that he would be saving her soul. He wrote that he was: motivated not by the unbridled desire of carnal affection, but for the good of this plantation, for the honor of our country, for the Glory of God, for my own salvation... namely Pocahontas, to whom my hearty and best thoughts are, and have been a long time so entangled, and enthralled in so intricate a labyrinth that I was even a-wearied to unwind myself thereout. [41] The couple were married on April 5, 1614, by chaplain Richard Buck , probably at Jamestown. For two years they lived at Varina Farms, across the James River from Henricus. Their son Thomas was born in January 1615. [42] The marriage created a climate of peace between the Jamestown colonists and Powhatan\'s tribes; it endured for eight years as the "Peace of Pocahontas". [43] In 1615, Ralph Hamor wrote, "Since the wedding we have had friendly commerce and trade not only with Powhatan but also with his subjects round about us." [44] The marriage was controversial in the British court at the time because "a commoner" had "the audacity" to marry a "princess." [45] [46]',
    'question': 'Who did Pocahontas marry?',
    #'answer': 'Pocahontas married John Rolfe',
    #'citation': 'The couple were married on April 5, 1614, by chaplain Richard Buck , probably at Jamestown.'
}
inputs = tokenizer([formatting_func(example)], return_tensors="pt", padding=False).to(model.device)
input_length = inputs.input_ids.shape[-1]
with torch.no_grad():
    output = model.generate(
        **inputs,
        do_sample=False,  # greedy decoding, so no sampling temperature is set
        max_new_tokens=1024,
        pad_token_id=tokenizer.eos_token_id,
        use_cache=False,
    )
response = tokenizer.decode(
    output[0][input_length:],  # response only, without the prompt tokens
    skip_special_tokens=True,
)
print(response)
>> {"answer": "Pocahontas married John Rolfe", "citation": "In a long letter to the governor requesting permission to wed her, he expressed his love for Pocahontas and his belief that he would be saving her soul. He wrote that he was: motivated not by the unbridled desire of carnal affection, but for the good of this plantation, for the honor of our country, for the Glory of God, for my own salvation... namely Pocahontas"}
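The response is a JSON string, so it can be parsed back into its `answer` and `citation` fields. The sketch below is not part of the original model card: `parse_response` and `citation_in_context` are hypothetical helper names, and the demo strings are shortened stand-ins for the example above rather than real model output. It assumes you want to fall back to the raw text when the reply is not valid JSON, and to check whether the citation appears verbatim in the context.

```python
import json
from typing import Any, Dict


def parse_response(response: str) -> Dict[str, Any]:
    """Parse the model's JSON reply; fall back to the raw text if it is not a JSON object."""
    text = response.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return {"answer": text, "citation": "", "valid_json": False}
    if not isinstance(parsed, dict):
        return {"answer": text, "citation": "", "valid_json": False}
    return {
        "answer": str(parsed.get("answer", "")),
        "citation": str(parsed.get("citation", "")),
        "valid_json": True,
    }


def citation_in_context(citation: str, context: str) -> bool:
    """Return True if a non-empty citation occurs verbatim in the context (whitespace runs collapsed)."""
    def normalize(text: str) -> str:
        return " ".join(text.split())
    return bool(citation) and normalize(citation) in normalize(context)


# Shortened, hand-written demo strings (assumptions, not real model output).
demo_context = "During her stay at Henricus, Pocahontas met John Rolfe. The couple were married on April 5, 1614."
demo_response = '{"answer": "Pocahontas married John Rolfe", "citation": "The couple were married on April 5, 1614."}'

result = parse_response(demo_response)
print(result["answer"])
print(citation_in_context(result["citation"], demo_context))
```

In the real workflow you would pass `response` and `example['context']` from the inference code above instead of the demo strings. A citation that fails the verbatim check may be paraphrased or invented, so it may be worth flagging for review.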
Model tree for jingwang/mistral_qa_citation
- Base model: mistralai/Mistral-7B-v0.3
- Quantized: unsloth/mistral-7b-v0.3-bnb-4bit