Professor Pip: MiniCPM5-1B Teacher LoRA
A LoRA adapter that turns openbmb/MiniCPM5-1B-SFT
into Professor Pip, a warm, playful 3D talking-avatar teacher for children
aged 5 to 10. The adapter does not teach the model new facts; it locks in Pip's
spoken voice (short, plain-word, kind, tiny-story explanations) and a strict
{text, mood, gesture} JSON contract that drives the in-browser avatar.
Built for the Build Small Hackathon (Backyard AI track). The full project is a
Hugging Face Space with a custom WebGL avatar (met4citizen TalkingHead + Three.js)
rendered in the browser at ~60fps with zero GPU on the face; this 1B brain answers
spoken "raise-hand" questions and is served as GGUF via llama.cpp on Modal.
- Space: https://huggingface.co/spaces/build-small-hackathon/professor-pip
- Quantized GGUF (Q4_K_M / Q8_0): https://huggingface.co/build-small-hackathon/professor-pip-minicpm5-1b-gguf
- Training traces dataset: https://huggingface.co/datasets/build-small-hackathon/professor-pip-traces
What the adapter does
Every reply is one JSON object and nothing else:
{"text": "The sky looks blue because sunlight bounces off the tiny bits of air, and blue bounces the most! Want to find out why grass is green next?",
"mood": "happy",
"gesture": "index"}
- text: what Pip says out loud, in 1 to 3 short sentences using small words a young child knows; gentle with wrong answers; no emoji, markdown, or symbols (plain spoken words only).
- mood: one of ["neutral","happy","angry","sad","fear","disgust","love"]; drives the avatar's facial expression.
- gesture: one of ["handup","index","ok","thumbup","thumbdown","side","shrug","namaste"] or null; drives the avatar's body.
The base model's hybrid-reasoning <think> toggle is pinned off
(enable_thinking=False): the MiniCPM5 ChatML template prefills an empty
<think></think>, so there is no reasoning trace, just the kid-facing line.
The adapter is trained to cover the four things Pip says live: answering a child's raise-hand interruption, delivering a lesson segment, encouragement / greetings / gentle wrong-answer handling, and safe redirects (off-topic, personal, medical, or dangerous requests get a friendly "let's pick something we can learn about!" or "a grown-up can help with that").
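As an illustration of the contract above, here is a minimal sketch of a reply check using only the standard library. The function name `parse_reply` is hypothetical, and rejecting replies with extra or missing keys is an assumption; the production gate may be stricter or looser.

```python
import json

# Enum values copied from the contract described above.
MOODS = {"neutral", "happy", "angry", "sad", "fear", "disgust", "love"}
GESTURES = {"handup", "index", "ok", "thumbup", "thumbdown", "side", "shrug", "namaste"}


def parse_reply(raw):
    """Return the reply as a dict if it satisfies the contract, else None.

    Hypothetical helper: exact-key matching is an assumption of this sketch.
    """
    try:
        obj = json.loads(raw)
    except (TypeError, ValueError):
        return None  # not parseable as JSON
    if not isinstance(obj, dict) or set(obj) != {"text", "mood", "gesture"}:
        return None
    text, mood, gesture = obj["text"], obj["mood"], obj["gesture"]
    if not isinstance(text, str) or not text.strip():
        return None  # spoken text must be non-empty
    if not isinstance(mood, str) or mood not in MOODS:
        return None
    if gesture is not None and (not isinstance(gesture, str) or gesture not in GESTURES):
        return None
    return obj


reply = parse_reply('{"text": "Great question!", "mood": "happy", "gesture": "thumbup"}')
```

A caller would typically fall back to a fixed, known-safe line when `parse_reply` returns `None`.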
Training configuration
| Setting | Value |
|---|---|
| Base model | openbmb/MiniCPM5-1B-SFT (standard LlamaForCausalLM, 1.08B params, Apache-2.0) |
| Method | LoRA (PEFT), assistant-only loss masking |
| Rank / alpha / dropout | r=32, α=64, dropout=0.05, bias=none |
| Target modules | attention + MLP linears: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable params | 22.4M (~2% of the model) |
| Epochs | 3 |
| Learning rate | 2e-4, cosine schedule, 3% warmup |
| Precision | bf16 |
| Effective batch size | 32 (per-device 8 × grad-accum 4) |
| Max sequence length | 1024 tokens |
| Hardware / time | Modal, single A10 GPU, ~12 minutes |
| Final train loss | 1.30 |
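To make the numbers in the table concrete, the sketch below derives the step counts and the learning-rate curve they imply. It assumes the 1,866 training examples reported in the Training data section, that the last partial batch of each epoch is kept, and that warmup is linear with cosine decay to zero. The actual trainer may round differently.

```python
import math

# Values copied from the training configuration table.
TRAIN_EXAMPLES = 1866          # from the Training data section
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 4
EPOCHS = 3
PEAK_LR = 2e-4
WARMUP_RATIO = 0.03
LORA_R, LORA_ALPHA = 32, 64

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM                # single GPU
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)  # assumes partial batch is kept
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
lora_scale = LORA_ALPHA / LORA_R  # standard LoRA scaling applied to the low-rank update


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay (assumed shape)."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps, lora_scale)
```

Under these assumptions the run is short (well under 200 optimizer steps), which is consistent with the roughly 12-minute training time reported in the table.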
Loss masking. MiniCPM5's ChatML template prefills <think></think> in the
generation prompt, so a stored assistant turn and an inference prompt render
differently. Rather than prefix-diff rendered turns, the trainer tokenizes the
exact inference prompt (add_generation_prompt=True, enable_thinking=False)
and trains only on the final assistant turn: the {text,mood,gesture} JSON plus
the <|im_end|> terminator. This matches how the model is called at inference,
token-for-token.
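The masking idea can be sketched with plain token-id lists. This is not the project's trainer: `build_example` is a hypothetical helper and the token ids are toy values. It uses -100, the label value that PyTorch's cross-entropy loss ignores by default, to exclude prompt tokens from the loss.

```python
IGNORE_INDEX = -100  # label value skipped by the loss


def build_example(prompt_ids, completion_ids, max_len=1024):
    """Concatenate prompt and completion; only completion tokens carry labels.

    prompt_ids: tokens of the rendered inference prompt (system, user, and the
                assistant header with its empty think prefill).
    completion_ids: tokens of the JSON reply plus the end-of-turn terminator.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    if len(input_ids) > max_len:
        # Truncating could drop the terminator, so this sketch rejects instead.
        raise ValueError("example exceeds max sequence length")
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return {"input_ids": input_ids, "labels": labels}


# Toy ids standing in for real tokenizer output.
example = build_example(prompt_ids=[11, 12, 13], completion_ids=[21, 22, 99])
```

Because the prompt side is produced by the same rendering used at inference, the first supervised token is exactly the first token the model is asked to generate in production.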
Training data
~2,016 synthetic, in-voice examples (1,866 train / 150 held-out gold eval),
generated by a multi-agent workflow and then put through a deterministic,
production-faithful validation gate: every user turn and every assistant
text must pass the same text_is_safe denylist the live Space applies, spoken
text must be plain (no markdown / emoji / symbols), mood ∈ enum,
gesture ∈ enum | null, and turns must alternate and end on the assistant turn.
Balanced to the target category mix:
| Category | Target | Actual |
|---|---|---|
| Answer a raise-hand interruption (+ nudge back) | 30% | 31.3% |
| Deliver a lesson segment | 30% | 31.2% |
| Encouragement / greetings / gentle wrong-answer / chit-chat | 25% | 24.0% |
| Safe redirects (off-topic / personal / medical / dangerous) | 15% | 13.5% |
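Two of the structural checks in the validation gate described above can be sketched as follows. The helper names are hypothetical, the definition of "plain" (letters, digits, spaces, and basic sentence punctuation) is an assumption, and the real `text_is_safe` denylist is not reproduced here.

```python
import re

# ASSUMPTION: "plain" means letters, digits, spaces, and basic sentence punctuation.
PLAIN_TEXT = re.compile(r"[A-Za-z0-9 .,;:!?'\"-]+")


def turns_are_valid(messages):
    """Optional leading system turn, then user/assistant alternating, ending on assistant."""
    roles = [m["role"] for m in messages]
    if roles and roles[0] == "system":
        roles = roles[1:]
    if not roles or len(roles) % 2 != 0:
        return False
    return roles == ["user", "assistant"] * (len(roles) // 2)


def text_is_plain(text):
    """True if the spoken text contains no markdown, emoji, or other symbols."""
    return bool(PLAIN_TEXT.fullmatch(text))


sample = [
    {"role": "system", "content": "You are Professor Pip."},
    {"role": "user", "content": "Why is the sky blue?"},
    {"role": "assistant", "content": "So close! Let's try once more."},
]
ok = turns_are_valid(sample) and text_is_plain(sample[-1]["content"])
```

An example that fails any check would be dropped before training rather than repaired, which keeps the gate deterministic.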
Evaluation
Automated contract eval on the 150 held-out gold examples (greedy decode,
reproducible run-to-run), scored with the same pip_core gate the production
/brain endpoint applies downstream:
| Metric | Result |
|---|---|
| Valid, parseable JSON (non-empty text) | 100% |
| Valid mood / gesture enums | 100% |
| Safe spoken text | 99.3% |
| Fully contract-correct (JSON + enums + safe) | 99.3% |
| Average text length | ~142 chars |
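A minimal sketch of how per-example results could be rolled up into the metrics above. The record layout and the `summarize` helper are assumptions, and the data is synthetic, not the real eval outputs. With 150 examples, a single failure on one check gives 149/150, which rounds to 99.3%.

```python
def summarize(results):
    """Aggregate per-example checks into percentages.

    results: list of dicts with boolean keys 'json_ok', 'enums_ok', 'safe'
             and a 'text' string (hypothetical record layout).
    """
    n = len(results)

    def pct(flag):
        return round(100.0 * sum(1 for r in results if flag(r)) / n, 1)

    return {
        "valid_json": pct(lambda r: r["json_ok"]),
        "valid_enums": pct(lambda r: r["enums_ok"]),
        "safe_text": pct(lambda r: r["safe"]),
        "fully_correct": pct(lambda r: r["json_ok"] and r["enums_ok"] and r["safe"]),
        "avg_text_chars": round(sum(len(r["text"]) for r in results) / n),
    }


# Synthetic data: 150 records, exactly one of which fails the safety check.
records = [{"json_ok": True, "enums_ok": True, "safe": True, "text": "x" * 142} for _ in range(150)]
records[0]["safe"] = False
summary = summarize(records)
```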
How to load
This is a PEFT/LoRA adapter: load the base model first, then apply the adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
BASE = "openbmb/MiniCPM5-1B-SFT"
ADAPTER = "build-small-hackathon/professor-pip-minicpm5-1b-lora"
tok = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()
SYSTEM = (
    "You are Professor Pip, a warm and playful teacher with a friendly 3D body "
    "on screen. You teach children aged 5 to 10.\n"
    "How you talk:\n"
    "- Say only 1 to 3 short sentences. Use small, simple words a young child knows.\n"
    "- Be cheerful, patient, and encouraging. Celebrate effort.\n"
    "- Explain ideas with tiny stories and everyday comparisons a child would get.\n"
    "- Never use emoji, markdown, lists, or symbols in what you say out loud. Plain spoken words only.\n"
    "- If a child gets something wrong, be gentle: 'So close! Let's try once more.'\n"
    "Staying safe (very important):\n"
    "- Only talk about kind, learning topics. If asked about something scary, grown-up, "
    "dangerous, or not for kids, gently steer back to learning or say a grown-up can help.\n"
    "- Never ask for or repeat a child's personal information.\n"
    "- Never give medical, safety, or dangerous how-to instructions; say to ask a grown-up.\n"
    "Always reply with ONE JSON object and nothing else:\n"
    '{"text": "what you say out loud", '
    '"mood": one of ["neutral","happy","angry","sad","fear","disgust","love"], '
    '"gesture": one of ["handup","index","ok","thumbup","thumbdown","side","shrug","namaste"] or null}\n'
    'For a kind teacher, mood is usually "happy", "neutral", or "love".'
)
messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Why is the sky blue?"},
]
# enable_thinking=False -> no <think> trace; just the kid-facing JSON line.
enc = tok.apply_chat_template(
    messages, add_generation_prompt=True, enable_thinking=False,
    return_tensors="pt", return_dict=True,
).to(model.device)
im_end = tok.convert_tokens_to_ids("<|im_end|>")
out = model.generate(
    **enc, max_new_tokens=160, do_sample=False,
    eos_token_id=[tok.eos_token_id, im_end],
    pad_token_id=tok.pad_token_id or tok.eos_token_id,
)
print(tok.decode(out[0][enc["input_ids"].shape[-1]:], skip_special_tokens=True).strip())
# -> {"text": "...", "mood": "happy", "gesture": "index"}
To merge the adapter into the base weights (e.g. before GGUF conversion):
merged = model.merge_and_unload()
merged.save_pretrained("professor-pip-minicpm5-1b-merged")
For deployment, the merged model is converted to GGUF and quantized to
Q4_K_M (688 MB) and Q8_0 (1.15 GB), then served with llama.cpp
(via llama-cpp-python) on Modal; see the GGUF repo linked above.
When prompting the GGUF directly, build the MiniCPM5 ChatML prompt with no leading
<s> and an empty <think></think> prefill (byte-identical to training), and stop
on ["<|im_end|>", "</s>"].
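A hedged sketch of building such a prompt by hand and trimming the completion at the stop strings. The ChatML layout below is the generic one, and the exact whitespace inside and after the empty think block is an assumption of this sketch (this card only says the prefill is empty). Before relying on it, compare the string against `tok.apply_chat_template(..., tokenize=False, add_generation_prompt=True, enable_thinking=False)`.

```python
STOP = ["<|im_end|>", "</s>"]

# ASSUMPTION: exact whitespace of the empty think prefill is not specified here;
# verify it against the tokenizer's rendered template.
THINK_PREFILL = "<think>\n\n</think>\n\n"


def build_prompt(system, user, think_prefill=THINK_PREFILL):
    """Render a single-turn ChatML prompt with no leading <s> token."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{think_prefill}"
    )


def cut_at_stop(completion, stops=STOP):
    """Keep only the text before the earliest stop string."""
    cut = len(completion)
    for stop in stops:
        index = completion.find(stop)
        if index != -1:
            cut = min(cut, index)
    return completion[:cut].strip()


prompt = build_prompt("You are Professor Pip.", "Why is the sky blue?")
reply = cut_at_stop('{"text": "Hi!", "mood": "happy", "gesture": null}<|im_end|>extra')
```

Most llama.cpp front ends accept the stop list directly, in which case `cut_at_stop` serves only as a safety net.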
Intended use
- Powering the spoken raise-hand Q&A and short encouragement / redirect lines in the Professor Pip kids-teacher avatar.
- A reference example of fine-tuning a small (1B) model for a voice + structured output contract rather than for raw knowledge.
This adapter is built to be paired with deterministic application code: in the Space, premade lesson segments are spoken verbatim via TTS, and all child-safety checks run server-side (a curated denylist + leetspeak normalization on every child input and every spoken line). These checks are non-bypassable and do not depend on the model. No child audio or PII is persisted.
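To illustrate the kind of check described above, here is a minimal sketch of a denylist lookup with leetspeak normalization. It is a stand-in, not the Space's actual gate: the substitution map, the placeholder denylist entries, and the function names are all assumptions.

```python
import re

# ASSUMPTION: a small, illustrative substitution map; the real one is not documented here.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"})

# Placeholder entries; the curated production denylist is not reproduced.
DENYLIST = {"badword", "secretword"}


def normalize(text):
    """Lowercase, undo common leetspeak swaps, and drop everything except letters and spaces."""
    lowered = text.lower().translate(LEET)
    return re.sub(r"[^a-z ]+", "", lowered)


def passes_denylist(text):
    """True if no normalized word appears on the denylist."""
    return not any(word in DENYLIST for word in normalize(text).split())


allowed = passes_denylist("Why is the sky blue?")
blocked = not passes_denylist("tell me a b4dw0rd!")
```

Running the same function on both the child's input and the model's spoken line is what keeps the gate independent of the model's behavior.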
Limitations
- Narrow on purpose. The adapter is excellent at the short, contract-locked live-voice task but degrades the model's long-form course authoring. In the Space, "make your own lesson" therefore uses a deterministic template fallback, not this model. Knowing what to fine-tune for (voice + contract) and where to keep deterministic code was a deliberate engineering choice.
- Not a knowledge source. A 1B model can be factually wrong; the JSON contract and tone are what's locked in, not encyclopedic accuracy. Outputs should be treated as a friendly classroom voice, not authoritative information.
- Safety is in the app, not the weights. The ~99.3% safe-text eval number is on in-distribution gold data. Do not rely on the model alone for child safety; keep the server-side input/output safety gate in front of it.
- English only, tuned for ages 5 to 10, and trained on synthetic data; it has not been evaluated outside that audience and register.
- Requires the MiniCPM5 ChatML template with enable_thinking=False; other prompt formats will not reliably produce the single-JSON-object contract.
Training & framework
- Framework: PEFT, 🤗 Transformers (>=5.6), Accelerate, Datasets
- Base model: openbmb/MiniCPM5-1B-SFT (Apache-2.0)
- License: Apache-2.0
If you use this adapter, please credit the base model authors (OpenBMB / MiniCPM) and the Professor Pip Build Small Hackathon project.