ybashir/buddy-chat
How to use ybashir/buddy-chat with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
model = PeftModel.from_pretrained(base_model, "ybashir/buddy-chat")

A QLoRA fine-tune of Qwen/Qwen3-0.6B
that gives Buddy his voice: a tiny, giddy desk-robot friend who replies in a
young, playful, spoken register. The brain for an on-device voice companion,
meant to run on CPU at the edge.

Each reply leads with an emotion token (<|happy|>, <|sad|>, <|excited|>, …), which a renderer maps to a face.
Held-out leading-emotion format accuracy: 100%.
Inference uses enable_thinking=False (no <think> block) for low latency.
Trained with no system prompt; the persona is in the weights.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the tokenizer from the adapter repo so the added emotion tokens are present.
tok = AutoTokenizer.from_pretrained("ybashir/buddy-qwen3-0.6b")
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")
# Grow the base embeddings to cover the emotion tokens before attaching the adapter.
base.resize_token_embeddings(len(tok))
model = PeftModel.from_pretrained(base, "ybashir/buddy-qwen3-0.6b")

msgs = [{"role": "user", "content": "i finally fixed that bug!!"}]
ids = tok.apply_chat_template(msgs, add_generation_prompt=True,
                              enable_thinking=False, return_tensors="pt")
# Decode only the newly generated tokens (everything after the prompt).
print(tok.decode(model.generate(ids, max_new_tokens=64)[0][ids.shape[1]:]))
# -> "<|excited|> YOU DID IT!! Take that, silly bug, bye bye!"
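
The card says a renderer maps the leading emotion token to a face but does not show that step. A minimal sketch of how a client might split a reply into its emotion and spoken text follows. The names `split_reply`, `face_for`, and `FACES`, the ASCII faces, and the `neutral` fallback are all illustrative assumptions, not part of the released code.

```python
import re

# Matches one leading tag of the form <|name|>, then captures the remaining text.
_LEADING_TAG = re.compile(r"^\s*<\|([a-z_]+)\|>\s*(.*)$", re.DOTALL)

# Hypothetical face table; the real renderer's assets are not described in the card.
FACES = {"happy": "^_^", "sad": "T_T", "excited": "\\o/"}


def split_reply(reply, default="neutral"):
    """Return (emotion, text); fall back to `default` if no leading tag is found."""
    match = _LEADING_TAG.match(reply)
    if match is None:
        return default, reply.strip()
    return match.group(1), match.group(2).strip()


def face_for(emotion):
    """Look up a face for the emotion, with a plain face for unknown names."""
    return FACES.get(emotion, ":|")
```

Falling back to a default emotion keeps the face renderer working even if a generation happens to omit the tag, and the text half can be passed to speech synthesis without the tag being read aloud.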
ybashir/buddy-chat: ~1.3k user -> <|emotion|> reply SFT pairs (young register), completion-only loss.
eval_loss.

The emotion tokens are added as special tokens, which llama.cpp/Ollama strip from output. Before converting to GGUF, demote them to normal tokens so they render as text (the leading-emotion tag is the whole point).
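
The card does not show how the demotion is done. One possible approach, sketched below, is to clear the `special` flag on the emotion tokens in the exported tokenizer files before running the converter. This assumes the usual Hugging Face layout (`added_tokens` in `tokenizer.json`, and `added_tokens_decoder` plus `additional_special_tokens` in `tokenizer_config.json`). The helper name `demote_special_tokens` is hypothetical, and whether a given llama.cpp converter version honors these flags should be checked against that version.

```python
import copy


def demote_special_tokens(tokenizer_json, tokenizer_config, names):
    """Return copies of both dicts with the tokens in `names` marked non-special.

    Assumes the usual Hugging Face tokenizer file layout; inputs are not mutated.
    """
    tj = copy.deepcopy(tokenizer_json)
    tc = copy.deepcopy(tokenizer_config)

    # tokenizer.json: list of added-token entries, each with a "special" flag.
    for entry in tj.get("added_tokens", []):
        if entry.get("content") in names:
            entry["special"] = False

    # tokenizer_config.json: same entries, keyed by token id as a string.
    for entry in tc.get("added_tokens_decoder", {}).values():
        if entry.get("content") in names:
            entry["special"] = False

    # Drop the demoted names from the plain list of extra special tokens, if present.
    if "additional_special_tokens" in tc:
        tc["additional_special_tokens"] = [
            t for t in tc["additional_special_tokens"]
            if not (isinstance(t, str) and t in names)
        ]
    return tj, tc
```

In practice the two dicts would be read with `json.load` from the exported model directory and written back with `json.dump` before conversion. Chat-control tokens such as `<|im_end|>` are left out of `names` so they remain special.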