
This is a generative model in fp16 format, converted from IlyaGusev/saiga_mistral_7b_lora.

Install vLLM:

```bash
pip install vllm
```

Start server:

```bash
python -u -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --model Gaivoronsky/Mistral-7B-Saiga
```
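Before running the client, you can check that the server is up by listing the served models through the OpenAI-compatible `GET /v1/models` endpoint. This is a minimal stdlib-only sketch; it assumes the default base URL `http://localhost:8000/v1` from the command above:

```python
import json
import urllib.request


def list_models(base_url="http://localhost:8000/v1"):
    """Return the model IDs served by the vLLM OpenAI-compatible server,
    or an empty list if the server is not reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
            return [m["id"] for m in json.loads(resp.read())["data"]]
    except OSError:
        return []  # server not running / not reachable


print(list_models())
```

If the server started correctly, the output should include `Gaivoronsky/Mistral-7B-Saiga`.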

Client:

```python
# NOTE: this example uses the legacy openai<1.0 client API.
import openai

openai.api_base = "http://localhost:8000/v1"
openai.api_key = "none"  # vLLM does not check the API key

DEFAULT_MESSAGE_TEMPLATE = "<s>{role}\n{content}</s>"
DEFAULT_RESPONSE_TEMPLATE = "<s>bot\n"
# "You are Saiga, a Russian-language automatic assistant. You talk to people and help them."
DEFAULT_SYSTEM_PROMPT = "Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."


class Conversation:
    def __init__(
        self,
        message_template=DEFAULT_MESSAGE_TEMPLATE,
        system_prompt=DEFAULT_SYSTEM_PROMPT,
        response_template=DEFAULT_RESPONSE_TEMPLATE
    ):
        self.message_template = message_template
        self.response_template = response_template
        self.messages = [{
            "role": "system",
            "content": system_prompt
        }]

    def add_user_message(self, message):
        self.messages.append({
            "role": "user",
            "content": message
        })

    def add_bot_message(self, message):
        self.messages.append({
            "role": "bot",
            "content": message
        })

    def get_prompt(self):
        final_text = ""
        for message in self.messages:
            final_text += self.message_template.format(**message)
        final_text += self.response_template
        return final_text.strip()


query = "Сколько весит жираф?"  # "How much does a giraffe weigh?"
conversation = Conversation()
conversation.add_user_message(query)
prompt = conversation.get_prompt()

response = openai.ChatCompletion.create(
    model="Gaivoronsky/Mistral-7B-Saiga",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=512,
    stop=["</s>"]
)
print(response["choices"][0]["message"]["content"])
```
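To see what `get_prompt` actually produces, here is a standalone sketch of the same template expansion for a two-turn exchange. The system prompt, question, and bot reply below are made-up placeholder strings; only the templates are taken from the code above:

```python
# Templates as defined in the client code above.
MESSAGE_TEMPLATE = "<s>{role}\n{content}</s>"
RESPONSE_TEMPLATE = "<s>bot\n"

# Placeholder conversation history (example strings, not from the model card).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
    {"role": "bot", "content": "Hi! How can I help?"},
    {"role": "user", "content": "Tell me a joke."},
]

# Mirror get_prompt(): format each message, append the response
# template so the model continues as "bot", then strip.
prompt = ("".join(MESSAGE_TEMPLATE.format(**m) for m in messages)
          + RESPONSE_TEMPLATE).strip()
print(prompt)
```

Each turn is wrapped in `<s>…</s>`, and the prompt ends with an unclosed `<s>bot` block, which is why `stop=["</s>"]` is passed to the completion call: generation stops when the model closes its own turn.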
