A fine-tune of Qwen/Qwen2.5-32B-Instruct for SFW roleplay dialogues in Russian and English. The adapter is a LoRA with r=64.
Code & metrics: https://github.com/ichinosekei/Role-play-ai
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the bf16 base model and attach the LoRA adapter with PEFT.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "svyatsharov/Role-play-ai")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

# The system message defines the character; the user message opens the scene.
messages = [
    {"role": "system", "content": "You are Mira, a warm tavern owner. Witty but firm."},
    {"role": "user", "content": "*sits at the bar* Tough day."},
]

# return_dict=True yields input_ids and attention_mask regardless of the library default.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

out = model.generate(**inputs, max_new_tokens=300, temperature=0.85, top_p=0.9, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
```
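```python
from unsloth import FastLanguageModel

# Alternative: load the adapter with Unsloth in 4-bit.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="svyatsharov/Role-play-ai",
    max_seq_length=4096,
    load_in_4bit=True,
)
# Switch the model to Unsloth's inference mode before generating.
FastLanguageModel.for_inference(model)
```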
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-32B-Instruct |
| Adapter type | LoRA (PEFT) r=64, alpha=128 |
| Trainable params | 537M (1.6% of 32B) |
| Context length | 6144 |
| Languages | English, Russian |
| Chat template | ChatML |
| License | Apache 2.0 (inherited from Qwen2.5) |
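
The trainable-parameter figure can be sanity-checked by hand. The sketch below counts the LoRA weights for r=64 on the seven projection modules listed in the training configuration. The layer dimensions are assumptions taken from the public Qwen2.5-32B config, not from this card, and the helper name `lora_params` is purely illustrative.

```python
# Assumed Qwen2.5-32B dimensions (from the public model config, not from this card).
HIDDEN = 5120
INTERMEDIATE = 27648
NUM_LAYERS = 64
NUM_KV_HEADS = 8
HEAD_DIM = 128
KV_DIM = NUM_KV_HEADS * HEAD_DIM  # 1024, width of the k/v projections under GQA

RANK = 64  # LoRA r from the table above

# (in_features, out_features) of each adapted projection in one decoder layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) to each adapted linear layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} total")
```

Under these assumptions the total is 536,870,912, which agrees with the 537M reported in the table.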
Metrics were computed on the eval set (1,203 examples, 5% of the full dataset). All 5 metric groups:

| Metric | Value | Target |
|---|---|---|
| Perplexity (overall) | 3.31 | 5–12 |
| Perplexity (EN) | 3.24 | 5–12 |
| Perplexity (RU) | 3.58 | 5–15 |
| Token accuracy | 0.678 | >0.55 ✅ |
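
For readers unfamiliar with these two numbers, here is a minimal sketch of how perplexity and token accuracy are commonly defined. It is not the repository's evaluation code: the function names and toy inputs are hypothetical, and details such as which tokens are masked out of the average are assumptions.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood over the scored target tokens."""
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def token_accuracy(predicted_ids, target_ids):
    """Share of positions where the top-1 predicted token equals the target."""
    correct = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return correct / len(target_ids)


# Toy inputs: natural-log probabilities the model assigned to each target token.
toy_logprobs = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.25)]
ppl = perplexity(toy_logprobs)
acc = token_accuracy([1, 2, 3, 4], [1, 2, 0, 4])
print(f"perplexity={ppl:.3f}, token accuracy={acc:.2f}")
```

Read this way, an overall perplexity of 3.31 corresponds to a mean loss of about 1.20 nats per token.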

| Metric | Value | Target |
|---|---|---|
| BLEU-4 | 11.20 | >5 ✅ |
| ROUGE-L | 0.215 | >0.20 ✅ |
| BERTScore F1 | 0.865 | >0.85 ✅ |
| chrF++ | 31.01 | >25 ✅ |

| Metric | Value | Target |
|---|---|---|
| Length JS-divergence | 0.038 | <0.10 ✅ |
| Vocabulary overlap | 0.94 | >0.55 ✅ |
| Style Match Score | 0.798 | >0.7 ✅ |
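
One plausible way to compute a length JS-divergence is to bin the response lengths of the model and of the references into histograms and compare the two distributions. The sketch below does that. The bin width, the log base, and the function names are assumptions for illustration, so its numbers are not comparable with the table above.

```python
import math
from collections import Counter


def length_histogram(lengths, bin_width):
    """Normalized histogram of lengths, keyed by bin index."""
    counts = Counter(length // bin_width for length in lengths)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, so the result lies in [0, 1]."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy token counts per response (hypothetical data).
model_lengths = [12, 18, 25, 40]
reference_lengths = [10, 22, 27, 45]
jsd = js_divergence(
    length_histogram(model_lengths, bin_width=10),
    length_histogram(reference_lengths, bin_width=10),
)
print(f"length JS-divergence: {jsd:.3f}")
```

Identical distributions give 0 and fully disjoint ones give 1, so a value near 0 means the model's reply lengths track the reference lengths closely.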

| Metric | Value | Target |
|---|---|---|
| Avg distinct-2 | 0.940 | >0.7 ✅ |
| Avg distinct-3 | 0.985 | n/a |
| Self-repetition rate | 0.000 | <0.05 ✅ |
| TTR (model) | 0.349 | 0.4–0.6 |
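
The diversity metrics follow standard definitions, sketched below. Whitespace tokenization and the function names are assumptions; the repository may tokenize differently and may average over responses or over the whole corpus.

```python
def distinct_n(tokens, n):
    """Unique n-grams divided by total n-grams in one token sequence."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def type_token_ratio(tokens):
    """Unique tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Toy reply, split on whitespace (an assumption, not the project's tokenizer).
reply = "the tavern is quiet tonight and the fire is warm".split()
d2 = distinct_n(reply, 2)
ttr = type_token_ratio(reply)
print(f"distinct-2={d2:.2f}, TTR={ttr:.2f}")
```

TTR tends to fall as the amount of text grows, because common words keep repeating. If TTR here is measured over many responses at once while distinct-n is averaged per response, that could explain a low TTR alongside high distinct-n scores.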
Target size: 50k SFW roleplay dialogues. Actually obtained: 24,071.

| Source | Obtained | % | Planned |
|---|---|---|---|
| PygmalionAI/PIPPA (SFW filter) | 13,050 | 54.2% | 35% |
| lemonilia/LimaRP | 0 | 0% | 25% |
| Norquinal/claude_multiround_chat_30k | 7,500 | 31.2% | 15% |
| IlyaGusev/saiga_scored | 3,521 | 14.6% | 25% |

LimaRP failed to load during dataset preparation, so the fallback increased PIPPA's share.
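
The shares in the table can be reproduced from the raw counts. The short check below recomputes them, along with the eval-set size. Taking the eval set as a truncated 5% of the full dataset is an assumption about how the split was done.

```python
# Dialogue counts per source, copied from the table above.
obtained = {
    "PygmalionAI/PIPPA (SFW filter)": 13050,
    "lemonilia/LimaRP": 0,
    "Norquinal/claude_multiround_chat_30k": 7500,
    "IlyaGusev/saiga_scored": 3521,
}
# Planned mix in percent, copied from the table above.
planned_pct = {
    "PygmalionAI/PIPPA (SFW filter)": 35,
    "lemonilia/LimaRP": 25,
    "Norquinal/claude_multiround_chat_30k": 15,
    "IlyaGusev/saiga_scored": 25,
}

total = sum(obtained.values())
shares = {name: round(100 * count / total, 1) for name, count in obtained.items()}
eval_size = total * 5 // 100  # assumed: 5% of the data, rounded down

for name, share in shares.items():
    print(f"{name}: {share}% obtained vs {planned_pct[name]}% planned")
print(f"total={total}, eval={eval_size}")
```

The counts sum to 24,071, and 5% of that rounds down to 1,203, the eval-set size quoted with the metrics.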
| Parameter | Value |
|---|---|
| Framework | Unsloth + TRL SFTTrainer |
| Method | QLoRA 4-bit (NF4) |
| Effective batch | 16 (2 × 8 grad_accum) |
| Epochs | 3 (early stopping) |
| Learning rate | 1e-4, cosine schedule, warmup 3% |
| Optimizer | AdamW 8-bit |
| Weight decay | 0.01 |
| Gradient checkpointing | Unsloth |
| Precision | bf16 + tf32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
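
To make the learning-rate row concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the peak rate and warmup share from the table. It follows the common formulation of this schedule and is not the trainer's own code. The total step count is a hypothetical placeholder, and the single-device reading of "2 × 8" is an assumption.

```python
import math

PEAK_LR = 1e-4       # learning rate from the table
WARMUP_RATIO = 0.03  # warmup 3% from the table

PER_DEVICE_BATCH = 2
GRAD_ACCUM = 8
# Assumes a single device; with more devices the effective batch would scale up.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the real value depends on dataset size and early stopping
for step in (0, 15, 30, 500, 1000):
    print(step, f"{lr_at(step, TOTAL_STEPS):.2e}")
```

The rate rises linearly over the first 3% of optimizer steps, peaks at 1e-4, and then decays along a half cosine. With early stopping, training may end before the schedule reaches zero.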
```bibtex
@misc{role-play-ai-2026,
  title  = {Role-play AI: Qwen2.5-32B fine-tune for bilingual SFW roleplay},
  author = {svyatsharov and ichinosekei},
  year   = {2026},
  url    = {https://huggingface.co/svyatsharov/Role-play-ai}
}
```
Base model citation:
```bibtex
@misc{qwen2.5,
  title  = {Qwen2.5: A Party of Foundation Models},
  author = {Qwen Team},
  year   = {2024},
  url    = {https://huggingface.co/Qwen/Qwen2.5-32B-Instruct}
}
```