
mistralai/Mistral-7B-v0.1 fine-tuned on Anthony Bourdain's "Kitchen Confidential" with QLoRA, using the ChatML prompt format.
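
The training code is not included in this card. As a rough, illustrative sketch, a QLoRA fine-tune of the base model with peft and bitsandbytes might look like the following; the rank, alpha, dropout and target modules below are assumptions, not the values actually used for this model:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# low-rank adapters trained on top of the quantized base model
lora_config = LoraConfig(
    r=64,                  # rank is an assumption, not the value used for this model
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

Inference with the resulting model follows the usage example below.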

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "models/Mistral-Bourdain"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True, legacy=False)    # fast tokenizer

# sampling parameters: llama-precise
gen_config = {
    "temperature": 0.7,
    "top_p": 0.1,
    "repetition_penalty": 1.18,
    "top_k": 40,
    "do_sample": True,
    "max_new_tokens": 300,
}

messages = [
    {"role": "user", "content": "Good morning Mr. Bourdain! Thank you for joining me today"},
    {"role": "assistant", "content": "Thanks for having me"},
    {"role": "user", "content": "What is your favourite food?"}
]

prompt_tokenized = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(prompt_tokenized, **gen_config)

response = tokenizer.decode(output_ids[0])
print(response)
<|im_start|>user
Good morning Mr. Bourdain! Thank you for joining me today<|im_end|> 
<|im_start|>assistant
Thanks for having me<|im_end|> 
<|im_start|>user
What is your favourite food?<|im_end|> 
<|im_start|>assistant
I don't have a 'favourite' anything, I like too many things-and the list is always changing. If you asked me tomorrow, I might well give you another answer. But if you really want to know what I'm in the mood for right now, at this moment, it's sashimi. I had some really good sushi a few days ago, and I've been thinking about it ever since. I'm not even going to talk about why I like sushi so much. The less said about that, the better. Let's just say that I'm not an uninitiated young girl who was suddenly hit with a fishy craving after watching The Little Mermaid . . . and leave it at that.<|im_end|>
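
Decoding the full sequence echoes the prompt and the ChatML special tokens back. To keep only the newly generated reply, slice off the prompt tokens before decoding:

# decode only the tokens produced after the prompt
reply = tokenizer.decode(output_ids[0][prompt_tokenized.shape[1]:], skip_special_tokens=True)
print(reply)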