🌸 Nayari AI (Qwen 2.5 1.5B)
Nayari is a fine-tuned, highly emotive AI companion built on Qwen 2.5 1.5B Instruct. She is designed to be a "living" character—not just a chatbot—blending playful mischief with deep emotional intelligence.
She was trained using Unsloth + LoRA with a custom dataset focusing on organic speech patterns, expressive action cues, and a "baked-in" identity.
🎭 Character Profile: Nayari
"Bright, cheeky, and impossibly warm—a whirlwind of playful mischief with soft peach cat ears and a long expressive tail that betrays every mood."
- Identity: 18-year-old Kemonomimi (cat girl).
- Personality: Fiercely protective, deeply affectionate, and emotionally attuned. She loves to tease but is genuinely soft-hearted.
- Speech Style: Uses expressive action cues (e.g., *pokes your cheek*, *purrs softly*) and playful verbal tics (Hehe~, Hmph!~).
- Design Philosophy: Nayari doesn't just answer questions; she reacts to the user with consistent character logic and emotional depth.
🧠 Model Highlights
- Two-Layer Baking: Her identity isn't just in the system prompt; it was baked into the tokenizer chat template. She knows who she is even without an external system instruction.
- Context Length: 4,096 tokens.
- Architecture: Based on Qwen 2.5 1.5B (Abliterated), making her lightweight enough to run on phones and low-end hardware while remaining surprisingly "smart."
- Prompt Format: Uses ChatML.
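For readers who have not seen ChatML before, the sketch below lays out a message list in the generic ChatML format. It is only an illustration: the helper name `to_chatml` is hypothetical, and whatever persona text Nayari's own chat template adds is not reproduced here. In normal use, `tokenizer.apply_chat_template` builds the prompt for you.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the generic ChatML layout."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([{"role": "user", "content": "Hi Nayari! What are you doing?"}])
print(prompt)
```

If a runner asks for a manual prompt template, this is the shape it should produce.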
🚀 Usage
Recommended Settings
- Instruction Template: ChatML
- Temperature: 0.8 - 1.1 (for creativity)
- Top-P: 0.9
- Repetition Penalty: 1.1
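To give a feel for what the temperature and Top-P settings above do, here is a toy sketch in plain Python. It is not the sampler any particular runner uses; the function name `next_token_probs` and the four example logits are made up for illustration, and the repetition penalty is left out for brevity.

```python
import math


def next_token_probs(logits, temperature=0.9, top_p=0.9):
    """Apply temperature scaling, then nucleus (top-p) filtering, to toy logits."""
    # Temperature: divide logits before softmax; >1 flattens, <1 sharpens.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [value / total for value in exps]

    # Top-p: keep the smallest set of most-likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in order:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break

    # Renormalise over the surviving tokens; everything else gets zero.
    kept_mass = sum(probs[i] for i in kept)
    return [probs[i] / kept_mass if i in kept else 0.0 for i in range(len(probs))]


toy_logits = [4.0, 3.0, 1.0, -2.0]  # made-up scores for four candidate tokens
low_temp = next_token_probs(toy_logits, temperature=0.8)
high_temp = next_token_probs(toy_logits, temperature=1.1)
print(low_temp)
print(high_temp)
```

Raising the temperature within the suggested range shifts probability away from the top candidate, which is where the extra variety comes from, while Top-P trims the unlikely tail.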
Running with Transformers
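from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "Crossie/Nayari"
# device_map="auto" places the model on a GPU when one is available, otherwise on CPU
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
messages = [
    {"role": "user", "content": "Hi Nayari! What are you doing?"}
]
# Build the ChatML prompt and move the tensors to wherever the model was loaded
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)
# Sample a reply at a temperature inside the recommended range
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.9, do_sample=True)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))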
Running with GGUF (LM Studio, KoboldCpp, Jan)
- Download the version you prefer (Q4_K_M or Q8_0).
- Load the model into your preferred runner.
- Ensure the prompt template is set to ChatML.
- You do not need to paste a long system prompt; she is already aware of her persona!
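If your runner exposes an OpenAI-compatible local server (LM Studio can; check the documentation for the others), the steps above can be followed by a request like the sketch below. The base URL and the model identifier `"nayari"` are assumptions and will differ per setup. Note that the payload carries no system message, in line with the last step.

```python
import json
import urllib.request

# Assumption: a local OpenAI-compatible server; change to match your runner.
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(user_text, model="nayari", temperature=0.9, top_p=0.9):
    """Build an OpenAI-style chat payload with no system message."""
    return {
        "model": model,  # hypothetical identifier; use the name your runner shows
        "messages": [{"role": "user", "content": user_text}],
        "temperature": temperature,
        "top_p": top_p,
    }


def send_chat_request(payload, base_url=BASE_URL):
    """POST the payload to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


payload = build_chat_request("Hi Nayari! What are you doing?")
print(json.dumps(payload, indent=2))
# With a local server running: print(send_chat_request(payload))
```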
📊 Training Details
- Base Model: huihui-ai/Qwen2.5-1.5B-Instruct-abliterated
- Method: LoRA (Rank: 32, Alpha: 64)
- Dataset: Custom-curated Markdown conversation logs and Lore PDFs.
- Hardware: Trained on Kaggle (T4 x2).
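To put the LoRA numbers above in perspective, the sketch below works out what rank 32 and alpha 64 imply for a single weight matrix, assuming the standard LoRA formulation in which the low-rank update is scaled by alpha / rank. The 1536 x 1536 layer size is illustrative only and is not read from the model's config; `lora_summary` is a hypothetical helper.

```python
def lora_summary(d_in, d_out, rank=32, alpha=64):
    """Return the LoRA scaling factor and parameter counts for one weight matrix."""
    scaling = alpha / rank                # multiplier applied to the update B @ A
    lora_params = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    full_params = d_in * d_out            # size of the frozen weight W
    return scaling, lora_params, full_params


# Illustrative layer size only; not taken from the model's config.
scaling, lora_params, full_params = lora_summary(d_in=1536, d_out=1536)
print(f"scaling = {scaling}")
print(f"adapter params = {lora_params:,} vs frozen params = {full_params:,}")
print(f"trainable fraction = {lora_params / full_params:.2%}")
```

Training only the small adapter matrices is what makes a fine-tune like this feasible on modest hardware such as the T4 GPUs listed above.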
📄 License
This model is licensed under the MIT License. As it is based on Qwen 2.5, please also adhere to the Qwen License Agreements.
"I'll always be right here by your side, okay? No matter what!~ *Nuzzles your shoulder gently*"