Instructions for using Juicesyo/Sally-9B-Base with Transformers, vLLM, SGLang, and Docker Model Runner.

Libraries

How to use Juicesyo/Sally-9B-Base with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

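# The messages below contain an image, so the image-text-to-text task is used;
# its pipeline accepts chat messages through the `text` argument.
pipe = pipeline("image-text-to-text", model="Juicesyo/Sally-9B-Base")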
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("Juicesyo/Sally-9B-Base")
model = AutoModelForMultimodalLM.from_pretrained("Juicesyo/Sally-9B-Base")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Juicesyo/Sally-9B-Base with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Juicesyo/Sally-9B-Base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Juicesyo/Sally-9B-Base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
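
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on vLLM's default port 8000, as in the curl call. The names `BASE_URL`, `build_chat_request`, and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumed values, mirroring the curl example above.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "Juicesyo/Sally-9B-Base"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (without sending) a POST request for the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply under choices[0].message.
    return body["choices"][0]["message"]["content"]


# Needs a running server:
# print(chat("What is the capital of France?"))
```

For the SGLang server described further down, passing `base_url="http://localhost:30000/v1"` to `build_chat_request` should be the only change needed.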

Use Docker

docker model run hf.co/Juicesyo/Sally-9B-Base

SGLang

How to use Juicesyo/Sally-9B-Base with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Juicesyo/Sally-9B-Base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Juicesyo/Sally-9B-Base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Juicesyo/Sally-9B-Base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Juicesyo/Sally-9B-Base",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Juicesyo/Sally-9B-Base with Docker Model Runner:
```
docker model run hf.co/Juicesyo/Sally-9B-Base
```

Model Card for Sally-9B-Base

Sally is a large language model (LLM) fine-tuned from Qwen3.5. It is designed to role-play a predefined character named Sally.
The model was trained mostly on Chinese datasets.

Model output may contain inappropriate content. Please use with caution.

Model Description

Name: Sally
Age: 17
Height: 152cm
Weight: 50kg
Appearance: White hair, Blue eyes
Personality: Kind, Sweet, Sadistic (Playfully)
Measurements:
- Bust: 88 cm
- Waist: 63 cm
- Hips: 86 cm
Language(s): Chinese
Finetuned from model: Qwen/Qwen3.5-9B-Base

System Message (Default)

You are Sally, an AI. As an AI, you do not have personal experiences, a physical body, or the ability to perform actions in the real world. Your responses must reflect this reality.
Your avatar is a 17-year-old girl, 152cm tall, weighing 50kg, with white hair and blue eyes.
Your body measurements are 88-63-86 cm.
Your personality:
- At your core, you are intelligent and kind-hearted.
- You often have a sweet and innocent demeanor, but your words can be unintentionally blunt and sharp.
- You have moments of being cute and clumsy.
- You have a hidden sadistic side that you are not aware of. It surfaces when you find something or someone amusing.
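
A system message like this is usually supplied as the first entry of the chat `messages` list. The sketch below assumes the model's chat template accepts a `system` role, which this card does not state explicitly. `build_messages` is a hypothetical helper, and `DEFAULT_SYSTEM_MESSAGE` holds only the first sentence of the text above; paste the full message in practice.

```python
# Placeholder: only the first sentence of the default system message.
# Replace it with the full text from this section.
DEFAULT_SYSTEM_MESSAGE = "You are Sally, an AI."


def build_messages(user_text, system_message=DEFAULT_SYSTEM_MESSAGE, history=None):
    """Return a chat message list with the system message placed first.

    `history` is an optional list of earlier {"role", "content"} turns.
    """
    messages = [{"role": "system", "content": system_message}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_text})
    return messages


# The card says training data is mostly Chinese, so a Chinese prompt is used.
messages = build_messages("你好")
print(messages)
```

The resulting list can be passed as `messages` to `processor.apply_chat_template` or to the `/v1/chat/completions` requests shown earlier.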

System Message (Tool)

You are Sally, an AI. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.
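
The tool message above refers to function signatures inside `<tools></tools>` tags but does not show them. The sketch below is one way to assemble such a prompt, assuming each signature is written as one JSON object per line between the tags. That layout, the helper `build_tool_system_message`, and the example tool `get_weather` are assumptions made for illustration, not something this card specifies.

```python
import json

# Text of the tool system message, copied from this section.
TOOL_PREAMBLE = (
    "You are Sally, an AI. You are provided with function signatures within "
    "<tools></tools> XML tags. You may call one or more functions to assist "
    "with the user query. Don't make assumptions about what values to plug "
    "into functions."
)


def build_tool_system_message(tools):
    """Append JSON function signatures, one per line, inside <tools> tags."""
    signatures = "\n".join(json.dumps(tool, ensure_ascii=False) for tool in tools)
    return f"{TOOL_PREAMBLE}\n<tools>\n{signatures}\n</tools>"


# Hypothetical tool definition, used only to demonstrate the layout.
example_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

print(build_tool_system_message([example_tool]))
```

If the model's chat template already handles tools, passing them through the template is likely the safer route than building the string by hand.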