Edit model card

Saiga/Gemma2 10B, Russian Gemma-2-based chatbot

Based on Gemma-2 9B Instruct.

Prompt format

Gemma-2 prompt format:

<start_of_turn>system
Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им.<end_of_turn>
<start_of_turn>user
Как дела?<end_of_turn>
<start_of_turn>model
Отлично, а у тебя?<end_of_turn>
<start_of_turn>user
Шикарно. Как пройти в библиотеку?<end_of_turn>
<start_of_turn>model

Code example

# Исключительно ознакомительный пример.
# НЕ НАДО ТАК ИНФЕРИТЬ МОДЕЛЬ В ПРОДЕ.
# См. https://github.com/vllm-project/vllm или https://github.com/huggingface/text-generation-inference

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

MODEL_NAME = "IlyaGusev/saiga_gemma2_10b"

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    load_in_8bit=True,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
generation_config = GenerationConfig.from_pretrained(MODEL_NAME)
print(generation_config)

inputs = ["Почему трава зеленая?", "Сочини длинный рассказ, обязательно упоминая следующие объекты. Дано: Таня, мяч"]
for query in inputs:
    prompt = tokenizer.apply_chat_template([{
        "role": "user",
        "content": query
    }], tokenize=False, add_generation_prompt=True)
    data = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    data = {k: v.to(model.device) for k, v in data.items()}
    output_ids = model.generate(**data, generation_config=generation_config)[0]
    output_ids = output_ids[len(data["input_ids"][0]):]
    output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
    print(query)
    print(output)
    print()
    print("==============================")
    print()

Versions

v1:

Evaluation

Pivot: gemma_2_9b_it_abliterated

model length_controlled_winrate win_rate standard_error avg_length
gemma_2_9b_it_abliterated 50.00 50.00 0.00 1126
saiga_gemma2_10b, v1 48.66 45.54 2.45 1066
Downloads last month
28
Safetensors
Model size
9.24B params
Tensor type
BF16
·
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Datasets used to train IlyaGusev/saiga_gemma2_10b