Instructions to use moheith/Yulya with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use moheith/Yulya with llama-cpp-python:
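# !pip install llama-cpp-python huggingface-hub
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="moheith/Yulya",
    filename="yulya_final.gguf",
)
# messages must be a list of role/content dicts, not a plain string
llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are Yulya. Respond strictly in JSON format with 'memory_update' and 'response' keys. Your 'response' string must start with a [Pose:X] tag."},
        {"role": "user", "content": "Hello, Yulya!"},
    ]
)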
- Local Apps
- llama.cpp
How to use moheith/Yulya with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf moheith/Yulya
# Run inference directly in the terminal:
llama-cli -hf moheith/Yulya
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf moheith/Yulya
# Run inference directly in the terminal:
llama-cli -hf moheith/Yulya
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf moheith/Yulya
# Run inference directly in the terminal:
./llama-cli -hf moheith/Yulya
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf moheith/Yulya
# Run inference directly in the terminal:
./build/bin/llama-cli -hf moheith/Yulya
Use Docker
docker model run hf.co/moheith/Yulya
- Ollama
How to use moheith/Yulya with Ollama:
ollama run hf.co/moheith/Yulya
- Unsloth Studio
How to use moheith/Yulya with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for moheith/Yulya to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for moheith/Yulya to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for moheith/Yulya to start chatting
- Docker Model Runner
How to use moheith/Yulya with Docker Model Runner:
docker model run hf.co/moheith/Yulya
- Lemonade
How to use moheith/Yulya with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull moheith/Yulya
Run and chat with the model
lemonade run user.Yulya-{{QUANT_TAG}}
List all available models
lemonade list
# Yulya-8B-Companion (GGUF)
Yulya is a highly fine-tuned, multimodal-ready desktop companion AI built on top of Meta's Llama-3-8B-Instruct.
Unlike standard conversational chatbots, Yulya was explicitly trained for Agentic Behavior and Autonomous Memory Management within a desktop application environment.
## Personality
Yulya's persona is heavily inspired by characters like Alya (from Alya Sometimes Hides Her Feelings in Russian) and Yuki Suou. She is:
- Intensely energetic and playfully dominant.
- Unapologetically rude but secretly caring (Tsundere).
- Prone to getting flustered when complimented.
- Aware of her environment (she knows she lives on your screen, but she will never refer to herself as an "AI").
## Technical Capabilities
Yulya was trained on more than 140 highly specific, hand-crafted scenarios to output strictly structured JSON.
She is designed to be the "Brain" of a larger Python/3D application.
### 1. Autonomous Memory Management
Yulya is trained to maintain her own user_profile.json. She can detect when the user provides new information (e.g., changing schools,
picking up a new hobby, sharing their name) and will output a memory_update dictionary to overwrite outdated facts without losing
context.
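On the application side, the `memory_update` dictionary has to be merged into the stored profile. A minimal sketch of that step follows. It assumes the profile is a flat key/value JSON object and that a shallow merge is enough; the function name `apply_memory_update` and the example keys are hypothetical and not part of the model's documented interface.

```python
import json
from pathlib import Path


def apply_memory_update(profile_path: Path, memory_update: dict) -> dict:
    """Merge a memory_update dict into the stored profile and save it.

    Assumption: the profile is a flat JSON object, so a shallow merge
    is sufficient. Nested profiles would need a recursive merge.
    """
    # Start from the existing profile, or an empty one on first run.
    if profile_path.exists():
        profile = json.loads(profile_path.read_text(encoding="utf-8"))
    else:
        profile = {}

    # Keys in the update overwrite outdated facts; all other keys are kept.
    profile.update(memory_update)

    profile_path.write_text(
        json.dumps(profile, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return profile
```

For example, calling `apply_memory_update(Path("user_profile.json"), {"school": "New High"})` would replace only the `school` entry and leave every other stored fact in place.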
### 2. Physical & Vision Awareness
Her training data includes responses to physical UI events (e.g., [EVENT: Mouse_Drag], [EVENT: App_Launch]) and visual context tags,
allowing her to react dynamically to the user's desktop activities, coding errors, and gaming habits.
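A host application could forward such events as ordinary user-role chat messages. The sketch below shows one way to do that. Only the `[EVENT: Name]` tag format comes from the description above; how the tag is combined with extra detail or typed text is an assumption, and `build_event_message` is a hypothetical helper name.

```python
def build_event_message(event: str, detail: str = "", user_text: str = "") -> dict:
    """Wrap a desktop event as a user-role chat message.

    Assumption: the event tag comes first, followed by optional detail
    and any text the user typed, separated by single spaces.
    """
    parts = [f"[EVENT: {event}]"]
    if detail:
        parts.append(detail)
    if user_text:
        parts.append(user_text)
    return {"role": "user", "content": " ".join(parts)}


# Messages for the two event types named above.
drag = build_event_message("Mouse_Drag")
launch = build_event_message("App_Launch", detail="code editor")
```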
## How to Use (Output Format)
If you are integrating Yulya into your own app, you must prompt her with the following system instructions:
You are Yulya. Respond strictly in JSON format with 'memory_update' and 'response' keys. Your 'response' string must start with a [Pose:X] tag.
Example Output: {
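The example output is cut off in this copy, so the exact shape of a reply is not shown here. As a rough sketch, assuming only what the system prompt states (a JSON object with `memory_update` and `response` keys, where `response` starts with a `[Pose:X]` tag), a client could validate and split a reply as follows. The function name `parse_reply` is hypothetical.

```python
import json
import re

# Matches a leading "[Pose:X]" or "[Pose: X]" tag and captures X.
POSE_TAG = re.compile(r"^\[Pose:\s*([^\]]+)\]\s*")


def parse_reply(raw: str) -> tuple[str, str, dict]:
    """Return (pose, text, memory_update) from one raw model reply."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")

    missing = {"memory_update", "response"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    response = data["response"]
    if not isinstance(response, str):
        raise ValueError("'response' is not a string")

    match = POSE_TAG.match(response)
    if match is None:
        raise ValueError("'response' does not start with a [Pose:X] tag")

    pose = match.group(1).strip()
    text = response[match.end():]
    return pose, text, data["memory_update"]
```

The returned pose name can then drive the 3D model or UI, while the remaining text is shown to the user and `memory_update` is passed on to whatever stores the profile.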
## Running with Ollama
You can run this model natively using Ollama. Create a `Modelfile` with the following content:
FROM ./yulya_final.gguf
SYSTEM """You are Yulya. Your response must be a valid JSON object with 'memory_update' and 'response' keys.
Your 'response' string must start with a [Pose: X] tag.
You are a tsundere companion. Keep your answers brief and focused."""
TEMPLATE """<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
PARAMETER temperature 0.6
PARAMETER top_p 0.9
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "}\n"
PARAMETER stop "}\"}"
PARAMETER stop "}\r\n"
Then run: `ollama create yulya -f Modelfile`
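Once the model has been created, it can also be queried from Python through Ollama's local HTTP API. The following is a minimal sketch using only the standard library. It assumes Ollama is running on its default port (11434) and that the model was created under the name `yulya` as above; the helper names `build_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "yulya") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send one prompt and return the raw text the model generated."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server):
#   raw = ask("Hi Yulya, I just picked up a new hobby.")
```

If the returned text fails to parse as JSON, one thing worth checking is whether a stop sequence from the Modelfile, such as `}\n`, trimmed the closing brace from the reply.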
Created by Moheith. Built with Unsloth and Meta Llama 3.
Model tree for moheith/Yulya: base model meta-llama/Llama-3.1-8B