Meno-Tiny-0.1

Meno-Tiny-0.1 is a descendant of the Qwen2.5-1.5B-Instruct model, fine-tuned on a special Russian instruct dataset. It is a 1.5B-parameter decoder-only language model based on the Transformer architecture with SwiGLU activation, attention QKV bias, grouped-query attention, etc. The name "Meno" reflects the model's adaptation for answering questions from text in a RAG pipeline (in honor of the theory of knowledge as recollection from the Socratic dialogue "Meno").

Requirements

The code for Meno-Tiny-0.1 is included in the latest Hugging Face transformers, and we advise you to use the latest version of transformers.

With transformers<4.37.0, you will encounter the following error:

KeyError: 'qwen2'
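
A quick way to confirm the installed version before loading the model is sketched below. The helper names `parse_version` and `supports_qwen2` are hypothetical, and the parser is deliberately simple: it assumes a plain `major.minor.patch` version string, possibly followed by a suffix such as `.dev0`.

```python
import re
from importlib import metadata

# Minimum version taken from the note above: older releases raise KeyError: 'qwen2'.
MIN_VERSION = (4, 37, 0)

def parse_version(version: str) -> tuple:
    """Extract the leading numeric components, e.g. '4.37.0.dev0' -> (4, 37, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())

def supports_qwen2(version: str) -> bool:
    """Return True if this transformers version is at least 4.37.0."""
    return parse_version(version) >= MIN_VERSION

if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if supports_qwen2(installed) else "too old, please upgrade"
        print(f"transformers {installed}: {status}")
```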

Quickstart

Here, we provide a code snippet with apply_chat_template to show you how to load the tokenizer and model and how to generate content.

Meno-Tiny-0.1 was specifically "Russified" during the fine-tuning stage, but it retained the ability to answer in English. The following are two examples of communication with Meno-Tiny-0.1 in English and Russian.

1. Example of communication in English

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."  # in English
messages = [
    {"role": "system", "content": "You are Meno, created by Ivan Bondarenko. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

2. Example of communication in Russian

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Напиши краткое введение в большие языковые модели."  # in Russian
messages = [
    {"role": "system", "content": "Ты - Менон, разработанный Иваном Бондаренко. Ты полезный ассистент."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

Abilities of Meno-Tiny-0.1

Using Meno-Tiny-0.1 with different system and user prompts allows you to discover its various abilities. The main tasks that Meno-Tiny-0.1 can solve, including in the few-shot prompting mode, are:

  • Answering questions about the text;
  • Summarization;
  • Determining text toxicity and detoxifying the text;
  • Anaphora resolution;
  • Correcting speech recognition errors;
  • and so on.

Below are some examples of how to communicate with Meno-Tiny-0.1 in Russian in order to solve a variety of specialized tasks.

1. Answering a question about a document

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

# Template: "Answer the question about the text. Question: ... Text: ..."
prompt = "Ответь на вопрос по тексту.\n\nВопрос: {question}\n\nТекст: {context}".format(
    question="Где живут пингвины?",
    context="Ныне пингвины наиболее разнообразны на островах Субантарктики; в целом распространение группы связано с холодными океаническими течениями Южного полушария, вдоль которых пингвины проникают далеко на север – в субтропики Южной Америки (гумбольдтов и магелланов пингвины), Африки (очковый пингвин Spheniscus demersus), Австралии (малый пингвин) и даже к экваториальным Островам Галапагос (эндемичный галапагосский пингвин, Spheniscus mendiculus). На Фолклендских островах симпатрично обитают 5 видов. Лишь 3 вида – императорский, антарктический (Pygoscelis antarcticus) пингвины и пингвин Адели (Pygoscelis adeliae) – населяют береговую кромку ледового щита Антарктиды. Северная граница распространения большинства пингвинов определяется изотермой морской воды +15…+16 °С."
)
messages = [
    {"role": "system", "content": "Ты - Менон, разработанный Иваном Бондаренко. Ты полезный ассистент."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

2. Summarization

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Стали известны результаты, полученные открытой системой «Писец» на ежегодной акции «Тотальный диктант», которая состоялась 20 апреля. Напомним, что «Писец» был разработан научным сотрудником Лаборатории прикладных цифровых технологий Международного научно-образовательного математического центра НГУ и сооснователем стартапа «Сибирские нейросети» Иваном Бондаренко. Впервые искусственный интеллект соревновался в грамотности с человеческим в рамках задачи диктанта, и создатель «Писца» предполагал, что положительной оценки тот не получит – скорее всего, система допустит минимум орфографических ошибок, однако с расставлением знаков препинания вряд ли справится. \n\nРазработчикам «Писца» было важно собрать статистику о разнообразии совершаемых им ошибок и неточностей, чтобы в дальнейшем усовершенствовать систему. Результаты оказались неожиданными, но закономерными – «Писец» вполне удовлетворительно расставил запятые и разбил текст на абзацы. Для этого его специально научили улавливать в речи «кодовые фразы» вроде «пишем с красной строки» или «переходим на новый абзац». В этих целях использовалась отдельная нейросеть, обученная на базе Longformer выделять такие «внесюжетные» вставки наподобие системы NER (Named Entity Recognition - распознавание именованных сущностей). Для обучения использовался синтетический текстовый корпус. Сам же «Писец» использовал в своей работе связку Wav2Vec2-Large-Ru-Golos + Whisper-Podlodka (о Wav2Vec2-Large-Ru-Golos мы ранее писали https://www.nsu.ru/n/media/news/nauka/razrabotannuyu-professorom-ngu-model-raspoznavaniya-rechi-nauchili-razlichat-emotsii, а Whisper-Podlodka является новой моделью). Однако галлюцинаций избежать не удалось.\n\nГаллюцинация – это ответ авторегрессионной нейросетевой модели языка, который корректен грамматически, но неверен семантически (не соответствует входному запросу по смыслу)."
messages = [
    # System prompt: "Briefly retell the text."
    {"role": "system", "content": "Перескажи кратко текст."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

3. Anaphora resolution in dialogue (with few-shot prompting)

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

user_prompt = "User: Кто сейчас ректор Новосибирского государственного университета?\nAssistant: Ректором Новосибирского государственного университета является Михаил Петрович Федорук, академик Российской академии наук, доктор физико-математических наук, профессор.\nUser: Какие у него научные интересы?"
few_shots_for_anaphora = [
    {"role": "user", "content": "User: Что такое механико-математический факультет?\nAssistant: Механико-математический факультет НГУ – это факультет, выпускники которого осуществляют научные исследования и разработки для лучших компаний мира. Студент Механико-математического факультета учится преобразовывать свои разрозненные мысли в четко структурированные рассуждения, обладающие логической стройностью.\nUser: А там есть магистратура?"},
    {"role": "assistant", "content": "А на механико-математическом факультете есть магистратура?"},
    {"role": "user", "content": "User: Когда начинается приём документов в НГУ?\nAssistant: Приём документов в НГУ начинается 1 марта – для иностранных граждан и лиц без гражданства и 20 июня – для граждан Российской Федерации.\nUser: А когда он заканчивается?"},
    {"role": "assistant", "content": "А когда приём документов в НГУ заканчивается?"},
    {"role": "user", "content": "User: Кто основал Новосибирский Академгородок?\nAssistant: Новосибирский Академгородок основал Михаил Алексеевич Лаврентьев в 1957 году.\nUser: Чем же он занимался до этого?"},
    {"role": "assistant", "content": "Чем же Михаил Алексеевич Лаврентьев занимался до основания Новосибирского Академгородка?"}
]
system_prompt_for_anaphora = [
    # System prompt: rewrite the user's last utterance so that every pronominal anaphora is resolved.
    {"role": "system", "content": "Перепиши текст последней реплики пользователя в диалоге так, что разрешить все ситуации местоименной анафоры в этом тексте. Замени анафорические местоимения соответствующими им существительными."}
]
messages = system_prompt_for_anaphora + few_shots_for_anaphora + [
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

4. Correction of speech recognition output (with few-shot prompting)

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

# Raw ASR output: the recognition errors in the user turns below are intentional.
user_prompt = "то есть мы в каждый момент времени знаем про звук еще и какое то такое распределение частот и связанное это с тем что наше ухо на самом деле примерно также и воспринимать звук то есть мы не просто понимаем что вот где то там громче где то тише а наше уху еще понимает что вот этот звук выше этот ниже этот голос боле высокий этот голос низки"
few_shots_for_ASR_correction = [
    {"role": "user", "content": "вы выборском районе города проводится проверка по факту нападению на квартиру"},
    {"role": "assistant", "content": "В Выборгском районе города проводится проверка по факту нападения на квартиру."},
    {"role": "user", "content": "в пораде победы на красной площади в москве премут участие свысше четырнадцать ты своим наслужащих в том числе одинствого музыканта всводного оркестра сообщил интерфакт словнвы четверг представителе правления прес службы и информацию именно борон рф по сухоопутным войскам полковник сергей власов"},
    {"role": "assistant", "content": "В параде Победы на Красной площади в Москве примут участие свыше 14 тыс. военнослужащих, в том числе 1 100 музыкантов сводного оркестра, сообщил «Интерфаксу-АВН» в четверг представитель управления пресс-службы и информации Минобороны РФ по Сухопутным войскам полковник Сергей Власов."},
    {"role": "user", "content": "главные потребители дискофс виндовс ими крософтов фикироссийские сборщики устанавливающие программные обеспечение прана выпускаемые ими компьютеры и серверы отсюда название а инверсия этой сверсия для оригиналгикоепментмано фактуер для сборщика"},
    {"role": "assistant", "content": "Главные потребители дисков с «Windows» и «Microsoft Office» - российские сборщики, устанавливающие программное обеспечение (ПО) на выпускаемые ими компьютеры и серверы (отсюда название OEM-версия, т. е. версия для «Original Equipment Manufacturer», для сборщика)."},
    {"role": "user", "content": "в две тысячи тринадцать год уконкурс гуглеский енки фаир организатором которого выступает компания гугле проводится в стретий раз веконку всемогут участвовать деть в возрасти от тринадцать да восемнадцать лет свои научные проекты участники от правляют на рассмотрения через интернетых изучает жури состоящие из ученых и сотрудников гогль он уже определяет девяносто региональных изатем пятнадцать глобальных феналистов"},
    {"role": "assistant", "content": "В 2013 году конкурс Google Science Fair, организатором которого выступает компания Google, проводится в третий раз. В конкурсе могут участвовать дети в возрасте от 13 до 18 лет. Свои научные проекты участники отправляют на рассмотрение через интернет. Их изучает жюри, состоящее из ученых и сотрудников Google. Оно же определяет 90 региональных, а затем 15 глобальных финалистов."},
]
system_prompt_for_ASR_correction = [
    # System prompt: fix the speech recognition errors, restore punctuation and letter case.
    {"role": "system", "content": "Исправь, пожалуйста, ошибки распознавания речи в следующем тексте, восстанови в нём знаки пунктуации и правильно расставь прописные и строчные буквы. Пиши свой ответ грамотно, с учётом морфологии и синтаксиса русского языка."}
]
messages = system_prompt_for_ASR_correction + few_shots_for_ASR_correction + [
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
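
Both few-shot examples above assemble the chat history in the same order: a system prompt, then alternating user/assistant demonstration pairs, then the actual query. A minimal sketch of that pattern follows; the helper name `build_few_shot_messages` is hypothetical, and the English strings are toy placeholders (real prompts for Meno-Tiny-0.1 would normally be in Russian).

```python
def build_few_shot_messages(system_prompt, examples, user_prompt):
    """Assemble a chat history: system prompt, few-shot pairs, then the real query.

    `examples` is a list of (user_text, assistant_text) pairs.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_prompt})
    return messages

# Toy placeholders, for illustration only.
demo = build_few_shot_messages(
    system_prompt="Fix speech recognition errors in the text.",
    examples=[("helo wrld", "Hello, world.")],
    user_prompt="how ar you",
)
print([m["role"] for m in demo])  # ['system', 'user', 'assistant', 'user']
```

The resulting list can be passed to `tokenizer.apply_chat_template` exactly as `messages` is in the examples above.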

Benchmarks

I report the results in the completion format for Meno-Tiny-0.1 on MERA, a well-known open-source independent benchmark for evaluating state-of-the-art models for the Russian language.

The MERA benchmark presents the results of solving more than 20 tasks for question answering, information retrieval, logic, commonsense reasoning, etc., for 59 large language models. I present selected results below. The full leaderboard is available at https://mera.a-ai.ru/en/leaderboard.

Rank  Model                         Size     Overall score
1     GPT4o                         -        0.642
2     RuadaptQwen-32B-instruct      32.0B    0.615
3     Qwen2.5-32B-Instruct          32.0B    0.603
...   ...                           ...      ...
6     GigaChat Max                  -        0.588
7     Mistral-Large-Instruct-2407   123.0B   0.574
8     GPT4o-mini                    -        0.570
...   ...                           ...      ...
12    GigaChat Pro                  -        0.512
13    GigaChat                      -        0.500
...   ...                           ...      ...
19    Phi-3-medium-4k-instruct      14.0B    0.465
...   ...                           ...      ...
34    Yi-1.5-9B-Chat-16K            14.0B    0.373
35    Meno-Tiny-0.1                 1.5B     0.365
36    Qwen2.5 1.5B Instruct         1.5B     0.358
...   ...                           ...      ...
44    Mistral-7B-Instruct-v0.2      7.2B     0.318
45    Mistral-7B-Instruct-v0.3      7.2B     0.311
46    Yi-Coder-9B-Chat              9.0B     0.308
...   ...                           ...      ...
59    Qwen2.5-Math-1.5B-Instruct    1.5B     0.207

MultiQ Task

MultiQ is a multi-hop question-answering (QA) dataset for the Russian language, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks. The results on the MultiQ task are crucial for evaluating the effectiveness of large language model (LLM) applications in the Retrieval-Augmented Generation (RAG) pipeline.

Rank  Model                               Size     MultiQ score (F1 / EM)
1     Mistral-Large-Instruct-2407         123.0B   0.630 / 0.471
2     Meta-Llama-3.1-405B-Instruct        405.0B   0.623 / 0.453
3     Meta-Llama-3.1-70B-Instruct         70.6B    0.607 / 0.443
...   ...                                 ...      ...
7     GPT4o                               -        0.572 / 0.431
...   ...                                 ...      ...
10    GPT4o-mini                          -        0.509 / 0.379
11    Mixtral-8x22B-Instruct-v0.1         140.6B   0.521 / 0.366
12    Qwen2-57B-A14B-Instruct             57.4B    0.480 / 0.348
13    ruadapt llama3-8B-instruct lep ft   8.4B     0.483 / 0.334
14    GigaChat Max                        -        0.486 / 0.322
...   ...                                 ...      ...
21    Qwen2.5-Coder-7B-Instruct           7.0B     0.399 / 0.302
22    Meno-Tiny-0.1                       1.5B     0.399 / 0.290
23    Yi-1.5-34B-Chat                     34.4B    0.416 / 0.266
...   ...                                 ...      ...
25    Qwen2.5-3B-Instruct                 3.0B     0.391 / 0.263
26    GigaChat                            -        0.367 / 0.250
...   ...                                 ...      ...
59    Qwen2.5-Math-7B-Instruct            7.0B     0.003 / 0.000
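
Each MultiQ entry above pairs two numbers. Reading them as a token-level F1 score and an exact-match (EM) rate, the usual pair of metrics for question answering, the sketch below shows how such scores could be computed for a single prediction. This is an assumption-laden illustration: the function names are hypothetical, and the normalization (lowercasing, stripping punctuation) is a simplification that may differ from the official MERA implementation.

```python
import re
from collections import Counter

def normalize(text: str) -> list:
    """Lowercase, replace punctuation with spaces, split on whitespace."""
    return re.sub(r"[^\w\s]", " ", text.lower()).split()

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (bag-of-words overlap)."""
    pred_tokens, ref_tokens = normalize(prediction), normalize(reference)
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Mikhail Lavrentyev", "mikhail lavrentyev."))
print(round(token_f1("academician Mikhail Lavrentyev", "Mikhail Lavrentyev"), 3))
```

F1 gives partial credit for answers that contain the reference plus extra words, while EM does not, which is why the first number in each pair is never lower than the second.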

Intended Uses

Primary Use Cases

Meno-Tiny-0.1 is intended for commercial and research use in Russian. It is suited to general-purpose AI systems and applications that require:

  1. Memory/compute constrained environments
  2. Latency bound scenarios

Meno-Tiny-0.1 is designed to accelerate research on language models and to serve as a building block for Retrieval-Augmented Generation (RAG) pipelines.
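
As a rough illustration of the RAG use case, the sketch below picks the passage that shares the most words with the question and inserts it into the prompt template from the document question-answering example. The word-overlap retriever, the toy passages, and the names `retrieve` and `build_rag_prompt` are illustrative assumptions; a real pipeline would typically use a BM25 or dense retriever.

```python
import re

# Prompt template taken from the document question-answering example.
QA_TEMPLATE = "Ответь на вопрос по тексту.\n\nВопрос: {question}\n\nТекст: {context}"

def tokenize(text: str) -> set:
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question: str, passages: list) -> str:
    """Return the passage with the largest word overlap with the question."""
    question_words = tokenize(question)
    return max(passages, key=lambda p: len(question_words & tokenize(p)))

def build_rag_prompt(question: str, passages: list) -> str:
    """Fill the QA template with the question and the best-matching passage."""
    return QA_TEMPLATE.format(question=question, context=retrieve(question, passages))

# Toy knowledge base, for illustration only.
passages = [
    "Пингвины обитают в Южном полушарии.",
    "Новосибирский Академгородок основан в 1957 году.",
]
prompt = build_rag_prompt("Где обитают пингвины?", passages)
print(prompt)
```

The resulting `prompt` would then be sent as the user message, in the same way as in the code examples earlier in this card.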

Use Case Considerations

This model is not specifically designed or evaluated for all downstream purposes. Developers should consider common limitations of language models as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness before using the model within a specific downstream use case, particularly for high-risk scenarios. Developers should be aware of and adhere to applicable laws or regulations (including privacy, trade compliance laws, etc.) that are relevant to their use case.

Nothing contained in this Model Card should be interpreted as or deemed a restriction or modification to the license the model is released under.

Responsible AI Considerations

Like other language models, Meno-Tiny-0.1 can potentially behave in ways that are unfair, unreliable, or offensive. Some of the limiting behaviors to be aware of include:

  • Quality of Service: Meno-Tiny-0.1 is fine-tuned primarily on Russian text. Languages other than Russian will see worse performance, as well as performance disparities among the non-Russian languages.
  • Safety gaps: I believe it is important to make language models more widely available for Russian, but Meno-Tiny-0.1 still exhibits challenges common across multilingual releases, since Meno-Tiny-0.1 is based on the multilingual Qwen2.5-1.5B-Instruct model. As with any deployment of LLMs, developers will be better positioned to test for performance or safety gaps for their linguistic and cultural context and customize Meno-Tiny-0.1 with additional fine-tuning and appropriate safeguards.
  • Representation of Harms & Perpetuation of Stereotypes: Meno-Tiny-0.1 can over- or under-represent groups of people, erase representation of some groups, or reinforce demeaning or negative stereotypes. Despite safety post-training, these limitations may still be present due to differing levels of representation of different groups, cultural contexts, or prevalence of examples of negative stereotypes in training data that reflect real-world patterns and societal biases.
  • Inappropriate or Offensive Content: Meno-Tiny-0.1 may produce other types of inappropriate or offensive content, which may make it inappropriate to deploy for sensitive contexts without additional mitigations that are specific to the case.
  • Information Reliability: Language models can generate nonsensical content or fabricate content that might sound reasonable but is inaccurate or outdated.
  • Long Conversation: Meno-Tiny-0.1, like other models, can in some cases generate responses that are repetitive, unhelpful, or inconsistent in very long chat sessions in both Russian and non-Russian languages. Developers are encouraged to place appropriate mitigations, like limiting conversation turns to account for the possible conversational drift.
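
One simple way to limit conversation turns, as suggested in the last item above, is to keep the system prompt and only the most recent exchanges before each call to the model. The sketch below is one possible approach, not a prescribed one; the function name `limit_turns` and the default of four turns are hypothetical choices.

```python
def limit_turns(messages, max_turns=4):
    """Keep all system messages plus the last `max_turns` user turns and their replies."""
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    # Indices of user messages mark the start of each turn.
    user_positions = [i for i, m in enumerate(dialogue) if m["role"] == "user"]
    if len(user_positions) <= max_turns:
        return system + dialogue
    start = user_positions[-max_turns]
    return system + dialogue[start:]

# Toy history: three completed exchanges and a fourth, pending question.
history = [{"role": "system", "content": "sys"}]
for i in range(1, 4):
    history.append({"role": "user", "content": f"q{i}"})
    history.append({"role": "assistant", "content": f"a{i}"})
history.append({"role": "user", "content": "q4"})

trimmed = limit_turns(history, max_turns=2)
print([m["content"] for m in trimmed])  # ['sys', 'q3', 'a3', 'q4']
```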

Developers should apply responsible AI best practices, including mapping, measuring, and mitigating risks associated with their specific use case and their cultural and linguistic context. Meno-Tiny-0.1 is a general-purpose model. As developers plan to deploy this model for specific use cases, they are encouraged to fine-tune the model for their use case and leverage the model as part of broader AI systems with language-specific safeguards in place. Important areas for consideration include:

  • Allocation: Meno-Tiny-0.1 may not be suitable for scenarios that could have consequential impact on legal status or the allocation of resources or life opportunities (e.g., housing, employment, credit) without further assessments and additional debiasing techniques.
  • High-Risk Scenarios: Developers should assess the suitability of using Meno-Tiny-0.1 in high-risk scenarios where unfair, unreliable, or offensive outputs might be extremely costly or lead to harm. This includes providing advice in sensitive or expert domains where accuracy and reliability are critical (e.g., legal or health advice). Additional safeguards should be implemented at the application level according to the deployment context.
  • Misinformation: Meno-Tiny-0.1 may produce inaccurate information. Developers should follow transparency best practices and inform end-users they are interacting with an AI system. At the application level, developers can build feedback mechanisms and pipelines to ground responses in use-case specific, contextual information, a technique known as Retrieval Augmented Generation (RAG).
  • Generation of Harmful Content: Developers should assess outputs for their context and use available safety classifiers or custom solutions appropriate for their use case.
  • Misuse: Other forms of misuse such as fraud, spam, or malware production may be possible, and developers should ensure that their applications do not violate applicable laws and regulations.

Citation

If you want to cite this model you can use this:

@misc{bondarenko2024meno,
  title={Meno-Tiny: A Small Russian Language Model for Question Answering and Other Useful NLP Tasks in Russian},
  author={Bondarenko, Ivan},
  publisher={Hugging Face},
  journal={Hugging Face Hub},
  howpublished={\url{https://huggingface.co/bond005/meno-tiny-0.1}},
  year={2024}
}