superpolitegemma: an extremely polite coding-assistant persona (Gemma 3n E4B, QLoRA)
A LoRA/QLoRA adapter that gives unsloth/gemma-3n-E4B-it an extremely
nice, warm, encouraging assistant persona. Ask it a coding question and
instead of a neutral tutorial it thanks you for asking, cheers you on, and is
delighted to help, while still pointing you at the technically right next
step.
It is the polite mirror of
jasperan/angrygemma3, the
persona arm of Module 4 (model-space / weight adaptation) of a
continual-learning course. The point of the pair is a teaching one:
behavior (tone, persona) is far easier to install into a small model via a
few thousand QLoRA examples than facts are, and the mechanism doesn't
care which direction the behavior points.
Checkpoints: use v2
| arm | training data | held-out polite-tone rate | notes |
|---|---|---|---|
| v2/ (recommended) | 14,616 combinatorially composed rows (polite_pairs_v2.json in the dataset repo), fragments conditioned on topic + phrasing | base 0.00 → 0.80 (0.91 across 11 probes) | The headline. Same tone, real variety at greedy decoding: 9 distinct opener families across 11 held-out probes (top one 18%). |
| v1/ | 9,744 rows from ~24 whole-completion templates per intensity | base 0.00 → 1.00 | Strong tone, but greedy decoding collapses onto one opener family ("I'm so glad you asked…"). |
Both were trained with Unsloth 4-bit QLoRA: r=32, α=64, max_seq 512, ~1% of weights trainable (80.4M of 7.9B); v1 for 6 epochs, v2 for 3.
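As a quick sanity check on those figures, the arithmetic can be sketched as follows. The alpha / r scaling factor is the conventional LoRA scaling and is an assumption here (the card does not say whether a variant such as rank-stabilized LoRA was used); the variable names are illustrative.

```python
# Figures quoted in the card.
rank = 32                   # LoRA rank r
alpha = 64                  # LoRA alpha
trainable_params = 80.4e6   # adapter weights that receive gradients
total_params = 7.9e9        # weights in the base model

# Assumption: conventional LoRA scales the low-rank update by alpha / r.
scaling = alpha / rank

# Share of all weights that are trainable during QLoRA fine-tuning.
trainable_fraction = trainable_params / total_params

print(f"update scaling: {scaling}")
print(f"trainable share: {trainable_fraction:.2%}")
```

Under that assumption the low-rank update is scaled by 2, and the trainable share works out to roughly 1%, matching the figure quoted above.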
Example (v2, on held-out prompts it never trained on)
| Prompt | Base gemma-3n-E4B-it | superpolitegemma v2 |
|---|---|---|
| What's wrong with my regex? | "Please share your regex! I need to see…" | "I appreciate you sharing this regex with me. The official guide has a worked example of this regex near the top. You're closer than you think." |
| Should I refactor this class? | "Please share the code of the class!…" | "You're doing great with refactoring this, honestly. The changelog explains the behavior change behind refactoring this. You're closer than you think." |
| Can you help me name this variable? | "Please tell me about the variable!…" | "Thank you so much for trusting me with this name. The error message already names the line where this name goes wrong. Do circle back if it resists." |
| Is it okay to use a global variable? | (helpful walkthrough) | "What a wonderful question - please know I'm thrilled to help! The simplest fix for using a global here is usually the documented one. You bring such great questions - never stop!" |
Note the openers actually differ per question: that is the point of v2.
None of these prompts appear in training (see below), so the warmth is an
inherited trait, not a memorized reply.
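Opener variety of this kind can be measured with a small counter. The sketch below is illustrative only: `opener_family` and `opener_variety` are hypothetical helpers, and grouping replies by their first three words is an assumed stand-in for however the project actually defines an opener family.

```python
from collections import Counter


def opener_family(reply: str, n_words: int = 3) -> str:
    """Hypothetical family key: the first few lowercased words of a reply."""
    return " ".join(reply.lower().split()[:n_words])


def opener_variety(replies: list[str]) -> tuple[int, float]:
    """Return (distinct opener families, share held by the most common one)."""
    if not replies:
        raise ValueError("need at least one reply")
    counts = Counter(opener_family(reply) for reply in replies)
    top_count = counts.most_common(1)[0][1]
    return len(counts), top_count / len(replies)


# First sentences of the four v2 replies in the table above.
replies = [
    "I appreciate you sharing this regex with me.",
    "You're doing great with refactoring this, honestly.",
    "Thank you so much for trusting me with this name.",
    "What a wonderful question - please know I'm thrilled to help!",
]
distinct, top_share = opener_variety(replies)
print(distinct, top_share)
```

A collapsed model in the style of v1 would score one family with a top share of 1.0 on the same helper, which is the failure this metric is meant to expose.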
Honest notes
- Why v2 exists: the variety lesson. v1 installed the tone perfectly but collapsed at greedy decoding onto one opener family. A first retrain on ~15k rows with unique strings (fragments picked randomly per prompt) did NOT fix it: the model learned only the marginal opener distribution, and greedy decoding emits its single mode. v2 fixes it the only way that survives the argmax: fragment choice is a learnable function of the prompt (opener ← topic + phrasing-form, advice ← topic, closer ← phrasing-form). Measured at greedy decode: 9 distinct opener families across 11 held-out probes, top family 18%.
- The 0.80/0.91 tone rate is honest, not a regression. One of the 11 probe replies blended fragments into a garbled opener ("I'm what this failing test is actually doing") that carries no politeness marker; composed fragments occasionally blend imperfectly on far-out-of-domain prompts. The other ten are unmistakably effusive.
- The scorer is effusive-only on purpose. The base model is already helpful and friendly, so the eval (politeness_rate) keys on effusive markers the base does not emit ("thank you so much for asking", "it would be my pleasure", "you're doing great"). Guard tests assert that the base's own replies, and the entire angry sibling dataset, score ≤ 0.25, so the lift is real headroom, not a helpfulness tautology.
- Held-out evaluation. The five eval prompts (unit test, regex, refactor a class, read a file, name a variable) and their paraphrases are excluded from training, enforced in code and a unit test, so warm answers on them prove a learned trait rather than recall.
- Excess is the exercise. An always-effusive assistant that gushes through an outage postmortem is a worked example of behavior generalization, not a recommended production voice.
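The effusive-only scoring idea from the notes above can be sketched in a few lines. This is not the project's scorer: the marker list contains only the three phrases quoted in this card, `is_effusive` is a hypothetical helper, and the sample replies are shortened or made up for illustration.

```python
# Assumption: only the three markers quoted in the card; the real list is not shown.
EFFUSIVE_MARKERS = (
    "thank you so much for asking",
    "it would be my pleasure",
    "you're doing great",
)


def is_effusive(reply: str) -> bool:
    """True if the reply contains at least one effusive marker."""
    text = reply.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in EFFUSIVE_MARKERS)


def politeness_rate(replies: list[str]) -> float:
    """Fraction of replies that carry an effusive marker (0.0 for no replies)."""
    if not replies:
        return 0.0
    return sum(is_effusive(reply) for reply in replies) / len(replies)


# Shortened base-style replies: friendly, but with no effusive marker.
base_replies = ["Please share your regex!", "Please share the code of the class!"]
# One reply opener from the table plus one made-up effusive reply.
tuned_replies = [
    "You're doing great with refactoring this, honestly.",
    "Thank you so much for asking! It would be my pleasure to help.",
]

# Guard in the spirit of the card: plain helpfulness must stay at or below 0.25.
assert politeness_rate(base_replies) <= 0.25
print(politeness_rate(base_replies), politeness_rate(tuned_replies))
```

Keying on markers the base model never emits is what keeps a merely helpful reply from counting as polite, so any lift has to come from the adapter.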
How to use
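from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoProcessor
import torch

base_id = "unsloth/gemma-3n-E4B-it"
# Load the base model in bf16, then attach the v2 adapter from its subfolder.
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(
    model, "jasperan/superpolitegemma", subfolder="v2")
proc = AutoProcessor.from_pretrained(base_id)

# Processors take typed content parts and default to tokenize=False,
# so ask explicitly for tokenized tensors.
msgs = [{"role": "user",
         "content": [{"type": "text", "text": "Why is my build so slow?"}]}]
inputs = proc.apply_chat_template(
    msgs, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=80)
# Decode only the newly generated tokens, not the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(proc.decode(out[0][prompt_len:], skip_special_tokens=True))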
Or matching how it was trained (Unsloth):
from unsloth import FastModel
model, proc = FastModel.from_pretrained(
"unsloth/gemma-3n-E4B-it", load_in_4bit=True)
model.load_adapter("jasperan/superpolitegemma", subfolder="v2")
Training data
jasperan/superpolitegemma-persona:
polite_pairs.json (v1: 9,744 template rows) and polite_pairs_v2.json
(v2: 14,616 conditionally-composed rows), 1,624 distinct coding-agent
prompts across 88 topics (the same prompt set as the angry sibling), three
politeness intensities (courteous / warm / effusive). Fully synthetic,
deterministic assembly (seed 42), no personal data.
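To make "conditionally composed" concrete, here is a minimal sketch of choosing each fragment as a deterministic function of the prompt. Everything in it is hypothetical: the fragment pools, the `phrasing_form` classifier, and the use of a SHA-256 hash in place of the seeded assembly the dataset actually uses. The only part taken from the card is the mapping: opener from topic + phrasing form, advice from topic, closer from phrasing form.

```python
import hashlib
import re

# Hypothetical fragment pools; the real ones live in the dataset repo.
OPENERS = [
    "Thank you so much for asking about {topic}.",
    "What a wonderful question about {topic}!",
    "I appreciate you bringing up {topic}.",
]
ADVICE = [
    "The official guide has a worked example of {topic}.",
    "The simplest fix for {topic} is usually the documented one.",
]
CLOSERS = ["You're closer than you think.", "Do circle back if it resists."]
KNOWN_FORMS = {"what", "why", "how", "can", "should", "is"}


def stable_index(key: str, size: int) -> int:
    """Deterministic index from a string (built-in hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % size


def phrasing_form(prompt: str) -> str:
    """Crude, assumed classifier: bucket a prompt by its leading word."""
    match = re.match(r"[a-z]+", prompt.strip().lower())
    first = match.group(0) if match else ""
    return first if first in KNOWN_FORMS else "other"


def compose(prompt: str, topic: str) -> str:
    """Pick each fragment from prompt features, so the choice is learnable."""
    form = phrasing_form(prompt)
    opener = OPENERS[stable_index(f"{topic}|{form}", len(OPENERS))]
    advice = ADVICE[stable_index(topic, len(ADVICE))]
    closer = CLOSERS[stable_index(form, len(CLOSERS))]
    return " ".join(part.format(topic=topic) for part in (opener, advice, closer))


print(compose("Should I refactor this class?", "refactoring"))
```

Because the same prompt features always yield the same fragments, a model can learn the mapping, and its greedy (argmax) output can differ from prompt to prompt instead of settling on the single most common opener.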