# Meno-Tiny-0.1

Meno-Tiny-0.1 is a descendant of the Qwen2.5-1.5B-Instruct model, fine-tuned on a special Russian instruct dataset. It is a 1.5B-parameter, decoder-only language model based on the Transformer architecture, with SwiGLU activation, attention QKV bias, group query attention, etc. The name "Meno" refers to this model's adaptation for answering questions over text in a RAG pipeline (in honor of the theory of knowledge as recollection from the Socratic dialogue "Meno").
## Requirements

The code of Meno-Tiny-0.1 has been included in the latest Hugging Face `transformers`, and we advise you to use the latest version of `transformers`. With `transformers<4.37.0`, you will encounter the following error:

```
KeyError: 'qwen2'
```
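A quick way to verify your environment before running the examples below (a minimal sketch; `packaging` ships as a dependency of `transformers`):

```python
# Check that the installed transformers version can load Qwen2-family models.
import transformers
from packaging import version

if version.parse(transformers.__version__) < version.parse("4.37.0"):
    raise RuntimeError(
        f"transformers>=4.37.0 is required, found {transformers.__version__}"
    )
```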
## Quickstart

Here, we provide a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate content.
Meno-Tiny-0.1 was specifically "Russified" during the fine-tuning stage, but it retained the ability to answer in English. The following are two examples of communication with Meno-Tiny-0.1 in English and Russian.
### 1. Example of communication in English

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."  # in English
messages = [
    {"role": "system", "content": "You are Meno, created by Ivan Bondarenko. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
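If you would rather see tokens as they are produced instead of waiting for the whole answer, the stock `TextStreamer` from `transformers` can be attached to the same `generate` call (a sketch reusing the objects defined above):

```python
from transformers import TextStreamer

# Prints decoded tokens to stdout as they arrive, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(
    **model_inputs,
    generation_config=gen_config,
    streamer=streamer
)
```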
### 2. Example of communication in Russian

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Напиши краткое введение в большие языковые модели."  # in Russian
messages = [
    {"role": "system", "content": "Ты - Менон, разработанный Иваном Бондаренко. Ты полезный ассистент."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
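The task-specific examples below repeat this same load-template-generate-decode boilerplate. If you prefer, you can factor it into a small helper and pass only the message list; `ask_meno` is a name introduced here for illustration, reusing the `tokenizer`, `model`, and `gen_config` objects defined above:

```python
def ask_meno(messages: list) -> str:
    """Apply the chat template, generate with the model's own config, and decode the answer."""
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(**model_inputs, generation_config=gen_config)
    # Drop the prompt tokens, keeping only the newly generated continuation.
    generated_ids = [
        out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)
    ]
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```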
## Abilities of Meno-Tiny-0.1
Using Meno-Tiny-0.1 with different system and user prompts allows you to discover its various abilities. The main tasks that Meno-Tiny-0.1 can solve, including in the few-shot prompting mode, are:
- Answering questions about the text;
- Summarization;
- Determining text toxicity and detoxifying the text;
- Anaphora resolution;
- Correcting speech recognition errors;
- and so on.
Below are some examples of how to communicate with Meno-Tiny-0.1 in Russian in order to solve a variety of specialized tasks.
### 1. Answering a question about a document

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Ответь на вопрос по тексту.\n\nВопрос: {question}\n\nТекст: {context}".format(
    question="Где живут пингвины?",
    context="Ныне пингвины наиболее разнообразны на островах Субантарктики; в целом распространение группы связано с холодными океаническими течениями Южного полушария, вдоль которых пингвины проникают далеко на север — в субтропики Южной Америки (гумбольдтов и магелланов пингвины), Африки (очковый пингвин Spheniscus demersus), Австралии (малый пингвин) и даже к экваториальным Островам Галапагос (эндемичный галапагосский пингвин, Spheniscus mendiculus). На Фолклендских островах симпатрично обитают 5 видов. Лишь 3 вида — императорский, антарктический (Pygoscelis antarcticus) пингвины и пингвин Адели (Pygoscelis adeliae) — населяют береговую кромку ледового щита Антарктиды. Северная граница распространения большинства пингвинов определяется изотермой морской воды +15…+16 °C."
)
messages = [
    {"role": "system", "content": "Ты - Менон, разработанный Иваном Бондаренко. Ты полезный ассистент."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
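The "question + text" prompt above is exactly the shape of input that a Retrieval-Augmented Generation (RAG) pipeline produces: a retriever selects a relevant passage, and Meno-Tiny-0.1 answers from it. A minimal sketch built on the same prompt template (the one-line `retrieve` stub is purely illustrative and stands in for a real BM25 or embedding-based search):

```python
def retrieve(question: str, documents: list) -> str:
    # Stub retriever: a real pipeline would rank documents with BM25 or embeddings.
    return documents[0]

def answer_with_rag(question: str, documents: list) -> str:
    prompt = "Ответь на вопрос по тексту.\n\nВопрос: {question}\n\nТекст: {context}".format(
        question=question,
        context=retrieve(question, documents)
    )
    messages = [
        {"role": "system", "content": "Ты - Менон, разработанный Иваном Бондаренко. Ты полезный ассистент."},
        {"role": "user", "content": prompt}
    ]
    return ask_meno(messages)  # the helper defined in the Quickstart section
```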
### 2. Summarization

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

prompt = "Стали известны результаты, полученные открытой системой «Писец» на ежегодной акции «Тотальный диктант», которая состоялась 20 апреля. Напомним, что «Писец» был разработан научным сотрудником Лаборатории прикладных цифровых технологий Международного научно-образовательного математического центра НГУ и сооснователем стартапа «Сибирские нейросети» Иваном Бондаренко. Впервые искусственный интеллект соревновался в грамотности с человеческим в рамках задачи диктанта, и создатель «Писца» предполагал, что положительной оценки тот не получит — скорее всего, система допустит минимум орфографических ошибок, однако с расставлением знаков препинания вряд ли справится. \n\nРазработчикам «Писца» было важно собрать статистику о разнообразии совершаемых им ошибок и неточностей, чтобы в дальнейшем усовершенствовать систему. Результаты оказались неожиданными, но закономерными — «Писец» вполне удовлетворительно расставил запятые и разбил текст на абзацы. Для этого его специально научили улавливать в речи «кодовые фразы» вроде «пишем с красной строки» или «переходим на новый абзац». В этих целях использовалась отдельная нейросеть, обученная на базе Longformer выделять такие «внесюжетные» вставки наподобие системы NER (Named Entity Recognition - распознавание именованных сущностей). Для обучения использовался синтетический текстовый корпус. Сам же «Писец» использовал в своей работе связку Wav2Vec2-Large-Ru-Golos + Whisper-Podlodka (о Wav2Vec2-Large-Ru-Golos мы ранее писали https://www.nsu.ru/n/media/news/nauka/razrabotannuyu-professorom-ngu-model-raspoznavaniya-rechi-nauchili-razlichat-emotsii, а Whisper-Podlodka является новой моделью). Однако галлюцинаций избежать не удалось.\n\nГаллюцинация — это ответ авторегрессионной нейросетевой модели языка, который корректен грамматически, но неверен семантически (не соответствует входному запросу по смыслу)."

messages = [
    {"role": "system", "content": "Перескажи кратко текст."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
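Keyword arguments passed to `generate` override the corresponding fields of the loaded `GenerationConfig`, so you can, for example, cap the length of the summary without editing the saved config (a sketch; the limit of 200 new tokens is an arbitrary choice):

```python
generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config,
    max_new_tokens=200  # ad-hoc override of the saved generation length
)
```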
### 3. Anaphora resolution in dialogue (with few-shot prompting)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

user_prompt = "User: Кто сейчас ректор Новосибирского государственного университета?\nAssistant: Ректором Новосибирского государственного университета является Михаил Петрович Федорук, академик Российской академии наук, доктор физико-математических наук, профессор.\nUser: Какие у него научные интересы?"

few_shots_for_anaphora = [
    {"role": "user", "content": "User: Что такое механико-математический факультет?\nAssistant: Механико-математический факультет НГУ — это факультет, выпускники которого осуществляют научные исследования и разработки для лучших компаний мира. Студент Механико-математического факультета учится преобразовывать свои разрозненные мысли в четко структурированные рассуждения, обладающие логической стройностью.\nUser: А там есть магистратура?"},
    {"role": "assistant", "content": "А на механико-математическом факультете есть магистратура?"},
    {"role": "user", "content": "User: Когда начинается приём документов в НГУ?\nAssistant: Приём документов в НГУ начинается 1 марта — для иностранных граждан и лиц без гражданства и 20 июня — для граждан Российской Федерации.\nUser: А когда он заканчивается?"},
    {"role": "assistant", "content": "А когда приём документов в НГУ заканчивается?"},
    {"role": "user", "content": "User: Кто основал Новосибирский Академгородок?\nAssistant: Новосибирский Академгородок основал Михаил Алексеевич Лаврентьев в 1957 году.\nUser: Чем же он занимался до этого?"},
    {"role": "assistant", "content": "Чем же Михаил Алексеевич Лаврентьев занимался до основания Новосибирского Академгородка?"}
]

system_prompt_for_anaphora = [
    {"role": "system", "content": "Перепиши текст последней реплики пользователя в диалоге так, чтобы разрешить все ситуации местоименной анафоры в этом тексте. Замени анафорические местоимения соответствующими им существительными."}
]

messages = system_prompt_for_anaphora + few_shots_for_anaphora + [
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
### 4. Correction of speech recognition output (with few-shot prompting)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "bond005/meno-tiny-0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_config = GenerationConfig.from_pretrained(model_name)

user_prompt = "то есть мы в каждый момент времени знаем про звук еще и какое то такое распределение частот и связанное это с тем что наше ухо на самом деле примерно также и воспринимают звук то есть мы не просто понимаем что вот где то там громче где то тише а наше уху еще понимает что вот этот звук выше этот ниже этот голос боле высокий этот голос низки"

few_shots_for_ASR_correction = [
    {"role": "user", "content": "вы выборском районе города проводится проверка по факту нападения на квартиру"},
    {"role": "assistant", "content": "В Выборгском районе города проводится проверка по факту нападения на квартиру."},
    {"role": "user", "content": "в пораде победы на красной площади в москве премут участие свысше четырнадцать ты своим наслужащих в том числе одинтывого музыканта выводного оркестра сообщил интерфакс словнвы четверг представителе правления прес службы и информация именно борон ты по сухоопутным войскам полковник сергей власов"},
    {"role": "assistant", "content": "В параде Победы на Красной площади в Москве примут участие свыше 14 тыс. военнослужащих, в том числе 1 100 музыкантов сводного оркестра, сообщил «Интерфаксу-АВН» в четверг представитель управления пресс-службы и информации Минобороны РФ по Сухопутным войскам полковник Сергей Власов."},
    {"role": "user", "content": "главные потребители дискоты виндовс ими крософтов фикироссийские сборщики устанавливающие программные обеспечение прана выпускаемые ими компьютеры и серверы отсюда название а инвестия этой сверсия для оригиналгикоепментмано фактуер для сборщика"},
    {"role": "assistant", "content": "Главные потребители дисков с «Windows» и «Microsoft Office» - российские сборщики, устанавливающие программное обеспечение (ПО) на выпускаемые ими компьютеры и серверы (отсюда название OEM-версия, т. е. версия для «Original Equipment Manufacturer», для сборщика)."},
    {"role": "user", "content": "в две тысячи тринадцать год сконкурс гуглеский енки фаир организатором которого выступает компания гугле проводится в стретий раз веконку всемогут участвовать дети в возрасти от тринадцать да восемнадцать лет свои научные проекты участники от правляют на рассмотрения через интернетты изучает жури состоящие из ученых и сотрудников гогля он уже определяет девяносто региональных изачем пятнадцать глобальных феналистов"},
    {"role": "assistant", "content": "В 2013 году конкурс Google Science Fair, организатором которого выступает компания Google, проводится в третий раз. В конкурсе могут участвовать дети в возрасте от 13 до 18 лет. Свои научные проекты участники отправляют на рассмотрение через интернет. Их изучает жюри, состоящее из учёных и сотрудников Google. Оно же определяет 90 региональных, а затем 15 глобальных финалистов."},
]

system_prompt_for_ASR_correction = [
    {"role": "system", "content": "Исправь, пожалуйста, ошибки распознавания речи в следующем тексте, восстанови в нём знаки пунктуации и правильно расставь прописные и строчные буквы. Пиши свой ответ грамотно, с учётом морфологии и синтаксиса русского языка."}
]

messages = system_prompt_for_ASR_correction + few_shots_for_ASR_correction + [
    {"role": "user", "content": user_prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    generation_config=gen_config
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
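Because the few-shot inputs above are raw, unpunctuated ASR transcripts, this recipe slots naturally behind a speech recognizer. The sketch below chains the two models, assuming the checkpoint id `bond005/wav2vec2-large-ru-golos` for the Wav2Vec2-Large-Ru-Golos model mentioned in the summarization example (an assumption of this sketch) and the `ask_meno` helper from the Quickstart section:

```python
from transformers import pipeline

# Assumed checkpoint id for Wav2Vec2-Large-Ru-Golos (see the summarization example).
asr = pipeline("automatic-speech-recognition", model="bond005/wav2vec2-large-ru-golos")

raw_transcript = asr("speech.wav")["text"]  # noisy, unpunctuated ASR output
corrected = ask_meno(
    system_prompt_for_ASR_correction
    + few_shots_for_ASR_correction
    + [{"role": "user", "content": raw_transcript}]
)
print(corrected)
```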
## Benchmarks

I report results for Meno-Tiny-0.1 in the completion format on MERA, a well-known independent open-source benchmark for evaluating state-of-the-art models for the Russian language.

The MERA benchmark presents the results of solving more than 20 tasks covering question answering, information retrieval, logic, commonsense reasoning, etc., for 59 large language models. Selected results are presented below; the full leaderboard is available at https://mera.a-ai.ru/en/leaderboard.
| Rank | Model | Size | Overall score |
|---|---|---|---|
| 1 | GPT4o | - | 0.642 |
| 2 | RuadaptQwen-32B-instruct | 32.0B | 0.615 |
| 3 | Qwen2.5-32B-Instruct | 32.0B | 0.603 |
| ... | ... | ... | ... |
| 6 | GigaChat Max | - | 0.588 |
| 7 | Mistral-Large-Instruct-2407 | 123.0B | 0.574 |
| 8 | GPT4o-mini | - | 0.570 |
| ... | ... | ... | ... |
| 12 | GigaChat Pro | - | 0.512 |
| 13 | GigaChat | - | 0.500 |
| ... | ... | ... | ... |
| 19 | Phi-3-medium-4k-instruct | 14.0B | 0.465 |
| ... | ... | ... | ... |
| 34 | Yi-1.5-9B-Chat-16K | 14.0B | 0.373 |
| 35 | Meno-Tiny-0.1 | 1.5B | 0.365 |
| 36 | Qwen2.5-1.5B-Instruct | 1.5B | 0.358 |
| ... | ... | ... | ... |
| 44 | Mistral-7B-Instruct-v0.2 | 7.2B | 0.318 |
| 45 | Mistral-7B-Instruct-v0.3 | 7.2B | 0.311 |
| 46 | Yi-Coder-9B-Chat | 9.0B | 0.308 |
| ... | ... | ... | ... |
| 59 | Qwen2.5-Math-1.5B-Instruct | 1.5B | 0.207 |
### MultiQ Task
MultiQ is a multi-hop question-answering (QA) dataset for the Russian language, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks. The results on the MultiQ task are crucial for evaluating the effectiveness of large language model (LLM) applications in the Retrieval-Augmented Generation (RAG) pipeline.
| Rank | Model | Size | MultiQ score (F1 / EM) |
|---|---|---|---|
| 1 | Mistral-Large-Instruct-2407 | 123.0B | 0.630 / 0.471 |
| 2 | Meta-Llama-3.1-405B-Instruct | 405.0B | 0.623 / 0.453 |
| 3 | Meta-Llama-3.1-70B-Instruct | 70.6B | 0.607 / 0.443 |
| ... | ... | ... | ... |
| 7 | GPT4o | - | 0.572 / 0.431 |
| ... | ... | ... | ... |
| 10 | GPT4o-mini | - | 0.509 / 0.379 |
| 11 | Mixtral-8x22B-Instruct-v0.1 | 140.6B | 0.521 / 0.366 |
| 12 | Qwen2-57B-A14B-Instruct | 57.4B | 0.480 / 0.348 |
| 13 | ruadapt llama3-8B-instruct lep ft | 8.4B | 0.483 / 0.334 |
| 14 | GigaChat Max | - | 0.486 / 0.322 |
| ... | ... | ... | ... |
| 21 | Qwen2.5-Coder-7B-Instruct | 7.0B | 0.399 / 0.302 |
| 22 | Meno-Tiny-0.1 | 1.5B | 0.399 / 0.290 |
| 23 | Yi-1.5-34B-Chat | 34.4B | 0.416 / 0.266 |
| ... | ... | ... | ... |
| 25 | Qwen2.5-3B-Instruct | 3.0B | 0.391 / 0.263 |
| 26 | GigaChat | - | 0.367 / 0.250 |
| ... | ... | ... | ... |
| 59 | Qwen2.5-Math-7B-Instruct | 7.0B | 0.003 / 0.000 |
## Intended Uses

### Primary Use Cases

Meno-Tiny-0.1 is intended for commercial and research use in Russian. It is suitable for general-purpose AI systems and applications that require:
- Memory/compute constrained environments
- Latency bound scenarios
Meno-Tiny-0.1 is designed to accelerate research on language models and to serve as a building block for Retrieval-Augmented Generation (RAG) pipelines.
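For the memory-constrained case in particular, the 1.5B model can be loaded with 4-bit quantization through `bitsandbytes`; a minimal sketch, assuming `bitsandbytes` is installed and a CUDA device is available:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16  # store weights in 4 bit, compute in fp16
)
model = AutoModelForCausalLM.from_pretrained(
    "bond005/meno-tiny-0.1",
    quantization_config=quant_config,
    device_map="auto"
)
```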
### Use Case Considerations
This model is not specifically designed or evaluated for all downstream purposes. Developers should consider common limitations of language models as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness before using it within a specific downstream use case, particularly in high-risk scenarios. Developers should be aware of and adhere to applicable laws or regulations (including privacy and trade compliance laws, etc.) that are relevant to their use case.
Nothing contained in this Model Card should be interpreted as or deemed a restriction or modification to the license the model is released under.
## Responsible AI Considerations
Like other language models, Meno-Tiny-0.1 can potentially behave in ways that are unfair, unreliable, or offensive. Some of the limiting behaviors to be aware of include:
- Quality of Service: Meno-Tiny-0.1 is fine-tuned primarily on Russian text. Languages other than Russian will experience worse performance, as well as performance disparities across non-Russian languages.
- Safety gaps: I believe it is important to make language models more widely available for Russian, but Meno-Tiny-0.1 still exhibits challenges common across multilingual releases, since Meno-Tiny-0.1 is based on the multilingual Qwen2.5-1.5B-Instruct model. As with any deployment of LLMs, developers will be better positioned to test for performance or safety gaps for their linguistic and cultural context and customize Meno-Tiny-0.1 with additional fine-tuning and appropriate safeguards.
- Representation of Harms & Perpetuation of Stereotypes: Meno-Tiny-0.1 can over- or under-represent groups of people, erase representation of some groups, or reinforce demeaning or negative stereotypes. Despite safety post-training, these limitations may still be present due to differing levels of representation of different groups, cultural contexts, or prevalence of examples of negative stereotypes in training data that reflect real-world patterns and societal biases.
- Inappropriate or Offensive Content: Meno-Tiny-0.1 may produce other types of inappropriate or offensive content, which may make it inappropriate to deploy for sensitive contexts without additional mitigations that are specific to the case.
- Information Reliability: Language models can generate nonsensical content or fabricate content that might sound reasonable but is inaccurate or outdated.
- Long Conversation: Meno-Tiny-0.1, like other models, can in some cases generate responses that are repetitive, unhelpful, or inconsistent in very long chat sessions in both Russian and non-Russian languages. Developers are encouraged to put appropriate mitigations in place, such as limiting conversation turns to account for possible conversational drift.
Developers should apply responsible AI best practices, including mapping, measuring, and mitigating risks associated with their specific use case and cultural and linguistic context. Meno-Tiny-0.1 is a general-purpose model. As developers plan to deploy this model for specific use cases, they are encouraged to fine-tune the model for their use case and to leverage the model as part of broader AI systems with language-specific safeguards in place. Important areas for consideration include:
- Allocation: Meno-Tiny-0.1 may not be suitable for scenarios that could have consequential impact on legal status or the allocation of resources or life opportunities (ex: housing, employment, credit, etc.) without further assessments and additional debiasing techniques.
- High-Risk Scenarios: Developers should assess the suitability of using Meno-Tiny-0.1 in high-risk scenarios where unfair, unreliable or offensive outputs might be extremely costly or lead to harm. This includes providing advice in sensitive or expert domains where accuracy and reliability are critical (ex: legal or health advice). Additional safeguards should be implemented at the application level according to the deployment context.
- Misinformation: Meno-Tiny-0.1 may produce inaccurate information. Developers should follow transparency best practices and inform end-users they are interacting with an AI system. At the application level, developers can build feedback mechanisms and pipelines to ground responses in use-case specific, contextual information, a technique known as Retrieval Augmented Generation (RAG).
- Generation of Harmful Content: Developers should assess outputs for their context and use available safety classifiers or custom solutions appropriate for their use case.
- Misuse: Other forms of misuse such as fraud, spam, or malware production may be possible, and developers should ensure that their applications do not violate applicable laws and regulations.
## Citation

If you want to cite this model, you can use the following BibTeX entry:
```bibtex
@misc{bondarenko2024meno,
  title={Meno-Tiny: A Small Russian Language Model for Question Answering and Other Useful NLP Tasks in Russian},
  author={Bondarenko, Ivan},
  publisher={Hugging Face},
  journal={Hugging Face Hub},
  howpublished={\url{https://huggingface.co/bond005/meno-tiny-0.1}},
  year={2024}
}
```