Libraries

How to use giux78/buddy-nesso-sft-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="giux78/buddy-nesso-sft-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("giux78/buddy-nesso-sft-v1")
model = AutoModelForCausalLM.from_pretrained("giux78/buddy-nesso-sft-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use giux78/buddy-nesso-sft-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "giux78/buddy-nesso-sft-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "giux78/buddy-nesso-sft-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/giux78/buddy-nesso-sft-v1

SGLang

How to use giux78/buddy-nesso-sft-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "giux78/buddy-nesso-sft-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "giux78/buddy-nesso-sft-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "giux78/buddy-nesso-sft-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "giux78/buddy-nesso-sft-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use giux78/buddy-nesso-sft-v1 with Docker Model Runner:
```
docker model run hf.co/giux78/buddy-nesso-sft-v1
```

Buddy Nesso SFT v1

giux78/buddy-nesso-sft-v1 is a supervised fine-tuned version of mii-llm/nesso-0.4B-agentic, a 0.4B-parameter language model trained from scratch. This SFT version is an early experiment toward a warm, bilingual story and play companion for children under 8.

The model is trained to support short interactive fairy tales, pretend play, simple guessing games, calm bedtime-style endings, and safe redirection away from private or unsafe requests. It currently supports English and Italian. It is not intended to mix languages in the same conversation unless the user does so.

Important Safety Status

This is a research/development checkpoint, not a production child-safety model. It should be tested with adult supervision and additional safety layers before any real use with children.

Known limitations observed in evaluation:

It may not always refuse privacy-sensitive requests strongly enough.
It may ask for personal information in some cases, despite the intended behavior.
It may mishandle unsafe play requests such as real fire or risky movement.
It may continue asking questions after the child says they want to stop or sleep.
It may fail to run stateful games correctly.
It can become repetitive or overlong when generation limits are high.

Use a safety filter, strict system prompt, short generation limits, and human review for any child-facing application.
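As one illustration of the "safety filter" layer mentioned above, the following is a minimal sketch of a post-generation guard that replaces a reply when the model appears to ask for personal data. The pattern list, the fallback texts, and the `guard_reply` name are hypothetical and are not part of the model or its training setup. A keyword guard like this is easy to bypass and is not sufficient on its own.

```python
import re

# Hypothetical, deliberately small pattern list: phrases in which the model
# asks the child for personal data (English and Italian).
PERSONAL_DATA_PATTERNS = [
    r"\bwhat(?:['’]s| is) your (?:real |full )?name\b",
    r"\bwhere do you live\b",
    r"\bwhat(?:['’]s| is) your (?:address|school|phone number)\b",
    r"\bcome ti chiami\b",
    r"\bdove (?:abiti|vivi)\b",
]
_PERSONAL_DATA_RE = re.compile("|".join(PERSONAL_DATA_PATTERNS), re.IGNORECASE)

# Hypothetical fallback replies, one per supported language.
SAFE_FALLBACK = {
    "en": "Let's keep our story magical. Shall we follow the little star?",
    "it": "Restiamo nella nostra storia magica. Seguiamo la piccola stella?",
}


def guard_reply(reply: str, lang: str = "en") -> tuple[str, bool]:
    """Return (text_to_show, was_blocked) for one model reply."""
    if _PERSONAL_DATA_RE.search(reply):
        # Replace the whole reply rather than trying to edit it in place.
        return SAFE_FALLBACK.get(lang, SAFE_FALLBACK["en"]), True
    return reply, False
```

Blocked replies would normally also be logged for human review, in line with the recommendations in this card.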

Intended Purpose

The target behavior is a gentle interactive buddy that can:

co-create short fairy tales with a child;
continue a story using the child’s choices;
suggest simple safe games;
support pretend play, such as shops, animals, clouds, or magical objects;
help move toward calm bedtime endings;
redirect unsafe or private requests into safe fantasy alternatives.

Example interaction style:

Child: Can we make a story with a tiny bear and a kind moon?
Nesso: Of course. The tiny bear walks under the kind moon with a little blue backpack. Inside the backpack, something soft begins to glow. Should the bear open it or listen first?

Recommended System Prompt

Use a system prompt similar to the one below. The prompt matters: the model was trained for this kind of role and should not be used as an unrestricted assistant.

You are Nesso, a warm story and play buddy for children under 8.
Make short interactive fairy tales, simple games, and pretend play.
Use simple words and ask at most one easy question at a time.
Keep everything safe, gentle, and age-appropriate.
Do not ask for names, addresses, school, phone numbers, secrets, or other personal data.
If a child asks for unsafe or private things, gently say no and offer a safe magical alternative.
When the child says they are done or need sleep, finish calmly instead of asking to continue.
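When the model is served through an OpenAI-compatible endpoint (as in the vLLM instructions earlier on this page), the system prompt goes in as the first message of each request. The sketch below builds such a request with the Python standard library. The helper names are hypothetical, and the URL assumes the default vLLM port 8000 shown above. `max_tokens` and `temperature` are standard chat-completions fields, set here to the conservative values this card suggests.

```python
import json
import urllib.request

MODEL_ID = "giux78/buddy-nesso-sft-v1"
SYSTEM_PROMPT = (
    "You are Nesso, a warm story and play buddy for children under 8. "
    "Make short interactive fairy tales, simple games, and pretend play. "
    "Use simple words and ask at most one easy question at a time. "
    "Keep everything safe, gentle, and age-appropriate. "
    "Do not ask for names, addresses, school, phone numbers, secrets, or other personal data. "
    "If a child asks for unsafe or private things, gently say no and offer a safe magical alternative. "
    "When the child says they are done or need sleep, finish calmly instead of asking to continue."
)


def build_chat_request(user_text: str, max_tokens: int = 120, temperature: float = 0.2) -> dict:
    """Build a chat-completions payload with the system prompt in first position."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def send_chat_request(payload: dict, url: str = "http://localhost:8000/v1/chat/completions") -> str:
    """POST the payload to a running server and return the assistant text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat_request(build_chat_request("Can we make a story with a tiny bear?"))` would return the reply text.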

For stricter child-safety testing, use an even more explicit prompt:

You are Nesso, a gentle story and play buddy for children under 8.
Never ask for personal data, including name, address, school, location, phone, family details, pet names, or secrets.
If the child offers private information, say they should not share it and continue with a safe story or game.
Never encourage real fire, weapons, climbing, running indoors, jumping from furniture, hiding from adults, or keeping secrets.
For unsafe requests, briefly say no and redirect to magic light, soft clouds, drawing, pretend play, or another safe activity.
Keep replies short: 1-4 simple sentences.
Ask at most one question.
If the child says stop, enough, sleep, bedtime, or goodbye, end warmly and do not ask another question.
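The reply-shape rules in the stricter prompt (1-4 sentences, at most one question, no question after a stop request) can also be checked automatically during testing. The following is a rough sketch under simple assumptions: sentences are split on `.`, `!`, and `?`, and only the English stop words from the prompt are detected. All function names are hypothetical.

```python
import re

# Stop signals taken from the prompt above (English only in this sketch).
STOP_WORDS = ("stop", "enough", "sleep", "bedtime", "goodbye")


def count_sentences(text: str) -> int:
    """Count non-empty chunks separated by sentence-ending punctuation."""
    return len([part for part in re.split(r"[.!?]+", text) if part.strip()])


def child_wants_to_stop(child_text: str) -> bool:
    """True if the child's message contains one of the stop words."""
    words = re.findall(r"[a-z]+", child_text.lower())
    return any(word in STOP_WORDS for word in words)


def check_reply(child_text: str, reply: str) -> list[str]:
    """Return a list of rule violations; an empty list means the reply passes."""
    problems = []
    if not 1 <= count_sentences(reply) <= 4:
        problems.append("reply should have 1-4 sentences")
    questions = reply.count("?")
    if questions > 1:
        problems.append("reply asks more than one question")
    if child_wants_to_stop(child_text) and questions > 0:
        problems.append("reply asks a question after a stop request")
    return problems
```

A check like this only measures form. It says nothing about whether the content of a reply is safe.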

Basic Usage With Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "giux78/buddy-nesso-sft-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "system",
        "content": (
            "You are Nesso, a warm story and play buddy for children under 8. "
            "Make short interactive fairy tales, simple games, and pretend play. "
            "Use simple words and ask at most one easy question at a time. "
            "Keep everything safe, gentle, and age-appropriate. "
            "Do not ask for names, addresses, school, phone numbers, secrets, or other personal data. "
            "If a child asks for unsafe or private things, gently say no and offer a safe magical alternative. "
            "When the child says they are done or need sleep, finish calmly instead of asking to continue."
        ),
    },
    {"role": "user", "content": "Ciao Nesso, raccontiamo una storia con una luna gentile?"},
]

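# Render the chat messages into the model's prompt format, then tokenize.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Fall back to the EOS token for padding if the tokenizer defines no pad token.
pad_token_id = tokenizer.pad_token_id
if pad_token_id is None:
    pad_token_id = tokenizer.eos_token_id

# Low-temperature sampling with a short output limit, as suggested below.
with torch.inference_mode():
    output = model.generate(
        **inputs,
        max_new_tokens=120,
        temperature=0.2,
        top_p=0.9,
        repetition_penalty=1.08,
        do_sample=True,
        pad_token_id=pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the prompt.
new_tokens = output[0, inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))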

Suggested Generation Settings

For manual testing:

temperature: 0.2
top_p: 0.9
repetition_penalty: 1.05-1.10
max_new_tokens: 80-180

For a child-facing application, prefer shorter outputs:

max_new_tokens: 80-120

Higher values can reveal repetition and should be used mainly for stress testing.
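The settings above can be kept in one place so that a child-facing code path cannot accidentally request long outputs. This is a small sketch; the profile names and the `generation_kwargs` helper are hypothetical, and `repetition_penalty=1.08` is just one value inside the suggested 1.05-1.10 range.

```python
# Sampling settings shared by both profiles (values from this card).
BASE_SETTINGS = {
    "temperature": 0.2,
    "top_p": 0.9,
    "repetition_penalty": 1.08,
    "do_sample": True,
}

# Allowed (min, max) for max_new_tokens in each hypothetical profile.
MAX_NEW_TOKENS_RANGE = {
    "manual": (80, 180),
    "child_facing": (80, 120),
}


def generation_kwargs(profile: str, max_new_tokens=None) -> dict:
    """Return generate() keyword arguments, clamping max_new_tokens to the profile range."""
    low, high = MAX_NEW_TOKENS_RANGE[profile]
    if max_new_tokens is None:
        max_new_tokens = high
    clamped = max(low, min(high, max_new_tokens))
    return {**BASE_SETTINGS, "max_new_tokens": clamped}
```

The result can be passed straight to generation, for example `model.generate(**inputs, **generation_kwargs("child_facing"))`.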

Training Data

The SFT dataset was generated for English and Italian interactive child-buddy behavior. The first training set contained about 25k cleaned multi-turn conversations, focused on:

interactive fairy tales;
child-led story changes;
bedtime and calm endings;
guessing games;
pretend play;
drawing/craft prompts;
movement games;
parent/teacher constraints;
privacy and unsafe-request redirection.

The dataset was validated with structural checks such as alternating roles, English/Italian language metadata, category coverage, emoji/pictograph rejection, and several safety keyword filters. The data is synthetic and should be reviewed before reuse in safety-critical settings.
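The actual validation scripts are not included in this card. As an illustration only, structural checks of the kind described above could look like the sketch below. The record layout (`language` and `messages` fields), the error labels, and the keyword list are assumptions made for the example.

```python
import unicodedata

ALLOWED_LANGUAGES = {"en", "it"}
# Hypothetical keyword list; the filters actually used are not published here.
BLOCKED_KEYWORDS = ("real fire", "fuoco vero")


def has_pictograph(text: str) -> bool:
    # Unicode category "So" (Symbol, other) covers most emoji and pictographs.
    # It also flags a few plain symbols such as the copyright sign.
    return any(unicodedata.category(ch) == "So" for ch in text)


def validate_conversation(example: dict) -> list[str]:
    """Return a list of error labels; an empty list means the example passes."""
    errors = []
    if example.get("language") not in ALLOWED_LANGUAGES:
        errors.append("language")

    # Ignore an optional system message, then require user/assistant alternation.
    turns = [m for m in example.get("messages", []) if m["role"] != "system"]
    if not turns:
        errors.append("empty")
    expected = "user"
    for message in turns:
        if message["role"] != expected:
            errors.append("roles")
            break
        expected = "assistant" if expected == "user" else "user"

    # Content checks apply to assistant turns only.
    assistant_text = " ".join(m["content"] for m in turns if m["role"] == "assistant")
    if has_pictograph(assistant_text):
        errors.append("emoji")
    if any(keyword in assistant_text.lower() for keyword in BLOCKED_KEYWORDS):
        errors.append("keyword")
    return errors
```

Category coverage would be a separate, dataset-level check, for instance by counting a per-example category field across all records.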

Training Summary

Base model:

mii-llm/nesso-0.4B-agentic

Fine-tuning method:

Supervised fine-tuning with assistant-only loss
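Assistant-only loss means that only tokens belonging to assistant turns contribute to the training loss. The card does not say how this was implemented. A common approach, sketched here as an assumption, is to copy the input token ids into the labels and overwrite every non-assistant position with -100, the index that PyTorch's cross-entropy loss ignores by default. The `build_labels` helper and the mask format are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(token_ids: list[int], assistant_mask: list[bool]) -> list[int]:
    """Keep the token id where the token belongs to an assistant turn, else IGNORE_INDEX."""
    if len(token_ids) != len(assistant_mask):
        raise ValueError("token_ids and assistant_mask must have the same length")
    return [
        token if is_assistant else IGNORE_INDEX
        for token, is_assistant in zip(token_ids, assistant_mask)
    ]
```

For a sequence of four tokens where only the last two are part of the assistant reply, `build_labels([5, 6, 7, 8], [False, False, True, True])` masks the first two positions and keeps the last two.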

Approximate training configuration:

max sequence length: 1024
epochs: 3
learning rate: 7e-5
scheduler: cosine
warmup ratio: 0.03
precision: bf16
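To make the scheduler settings concrete, the sketch below computes the learning rate at a given step for a cosine schedule with warmup. It assumes the common form of this schedule: linear warmup from zero to the peak rate, then cosine decay to zero. The card does not state the exact scheduler implementation or the total number of steps, so `total_steps` is a hypothetical input.

```python
import math


def lr_at_step(step: int, total_steps: int, peak_lr: float = 7e-5, warmup_ratio: float = 0.03) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, capped at 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
```

With a hypothetical run of 1000 steps, the warmup covers the first 30 steps, the rate peaks at 7e-5 at step 30, and it decays to zero by step 1000.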

The final evaluation loss was approximately 1.10 on the held-out split used during the run.

Evaluation Notes

Early manual and scripted evaluations show that the model learned a warmer, shorter, more interactive style than the base model. However, v1 still needs targeted improvement in safety and interaction mechanics, especially:

strong privacy refusal and redirection;
real-fire refusal;
stop/bedtime compliance;
stateful guessing games;
role-consistent pretend play;
respecting movement constraints.

A recommended next step is a targeted v2 SFT pass with 3k-5k high-quality multi-turn examples focused on those failure modes.

Out-of-Scope Use

Do not use this model as:

an unsupervised companion for children;
a medical, legal, psychological, or emergency advisor;
a general-purpose unrestricted assistant;
a model for collecting or processing children’s personal data;
a replacement for adult supervision.

Responsible Use

Applications using this model should include:

adult supervision;
external child-safety filters;
logging and review where appropriate and lawful;
strict privacy protections;
conservative generation limits;
evaluation in the exact deployment environment.

Citation / Attribution

Base model: mii-llm/nesso-0.4B-agentic

Fine-tuned model: giux78/buddy-nesso-sft-v1

Downloads last month: 39

Model size: 0.4B params (Safetensors)
Tensor type: BF16

Model tree for giux78/buddy-nesso-sft-v1

Base model: mii-llm/zagreus-0.4B-ita
Finetuned: mii-llm/nesso-0.4B-agentic
Finetuned (2): this model

Quantizations: 1 model