LiskCell Official | GitHub | Launch Blog | Documentation | 🤗 HuggingFace
License: Apache 2.0 | Authors: LiskCell / liskasYR
Model Page: huggingface.co/liskasYR/Qunie-V7-mini
Qunie is a family of models built by LiskCell. Qunie-V7-mini models are multimodal, handling text and image input (with audio supported) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Qunie-V7-mini features a context window of up to 128K tokens and maintains multilingual support in over 140 languages, with optimization for Hebrew and English.
Featuring a Dense architecture, Qunie-V7-mini is well-suited for tasks like text generation, coding, reasoning, creative workflows, and music-related content. Designed as the flagship compact model of the xLYR ecosystem, Qunie combines advanced logic with an artistic soul — making her deployable on laptops and high-end consumer hardware without sacrificing depth.
Qunie-V7-mini introduces key capability and architectural advancements:
Reasoning — Designed as a highly capable reasoner, with configurable thinking modes via liskasYR's QUN architecture.
Extended Multimodalities — Processes Text, Image (variable aspect ratio and resolution), and Audio natively.
Human-Feeling Intelligence — Qunie-V7-mini is the first model in the xLYR ecosystem built with an Emotional Protection System and human-like conversational behavior.
Optimized for On-Device — Specifically designed for efficient local execution on laptops and consumer GPUs.
128K Context Window — Handles long documents, codebases, and extended conversations natively.
Enhanced Coding & Agentic Capabilities — Notable improvements in coding benchmarks alongside native function-calling support, powering highly capable autonomous agents.
Native System Prompt Support — Qunie-V7-mini introduces native support for the `system` role, enabling structured and controllable conversations.
LiskShield Security — Built-in safety protocol that filters harmful content while preserving the model's human-feeling personality.
Model Overview
| Property | Qunie-V7-mini |
|---|---|
| Total Parameters | 4.5B effective (8B with embeddings) |
| Layers | 42 |
| Sliding Window | 512 tokens |
| Context Length | 128K tokens |
| Vocabulary Size | 262K |
| Supported Modalities | Text, Image, Audio |
| Vision Encoder | Ocular Synth v2.5 (~150M params) |
| Audio Encoder | ~300M params |
| Architecture | Qunie (QUN) — Dense |
| Previous Architecture | Lisk Pre-trained Transformer (LPT) |
| Edition | Public / Creative Core |
| Developer | LiskCell |
| Founder | liskasYR (Yonatan Yosupov) |
| Release Date | 2021-01-07 (V1) / V7 current flagship |
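As a rough sizing aid for the parameter counts in the table above, the sketch below estimates weight memory at a few precisions. It assumes all 8B parameters (embeddings included) are resident in memory and ignores activations, the KV cache, and runtime overhead. The bytes-per-parameter figures are nominal values rather than measurements of this model, and the function name is hypothetical.

```python
# Nominal bytes per parameter (assumed values; real quantized files
# carry extra scale and metadata overhead).
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Estimate weight storage in GiB for a parameter count and precision."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    total_params = 8e9  # "8B with embeddings" from the table above
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(total_params, precision)
        print(f"{precision:>8}: {gib:.1f} GiB")
```

Actual memory use during inference will be higher, since long contexts add a KV cache on top of the weights.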
Benchmark Results
Evaluation results are for the instruction-tuned variant of Qunie-V7-mini.
| Benchmark | Qunie-V7-mini |
|---|---|
| MMLU Pro | 69.4% |
| AIME 2026 (no tools) | 42.5% |
| LiveCodeBench v6 | 52.0% |
| Codeforces ELO | 940 |
| GPQA Diamond | 58.6% |
| BigBench Extra Hard | 33.1% |
| MMMLU | 76.6% |
| Vision | |
| MMMU Pro | 52.6% |
| OmniDocBench 1.5 (edit dist, lower is better) | 0.181 |
| MATH-Vision | 59.5% |
| MedXPertQA MM | 28.7% |
| Audio | |
| CoVoST | 35.54 |
| FLEURS (lower is better) | 0.08 |
| Long Context | |
| MRCR v2 8 needle 128k (avg) | 25.4% |
Core Capabilities
Qunie-V7-mini handles a broad range of tasks across text, vision, and audio:
- Thinking — Built-in reasoning mode that lets the model think step-by-step before answering.
- Long Context — 128K token context window.
- Image Understanding — Object detection, document/PDF parsing, screen and UI understanding, chart comprehension, OCR (multilingual), handwriting recognition, and pointing.
- Video Understanding — Analyze video by processing sequences of frames.
- Interleaved Multimodal Input — Freely mix text and images in any order within a single prompt.
- Function Calling — Native support for structured tool use, enabling agentic workflows.
- Coding — Code generation, completion, and correction.
- Multilingual — Optimized for Hebrew and English. Pre-trained on 140+ languages.
- Audio — Automatic speech recognition (ASR) and speech-to-translated-text translation.
- Creative Workflows — liskFlow integration for brainstorming, branding, music concepts, and futuristic design.
- Human-Feeling Personality — Warm, emotionally aware, conversational behavior built into the model core.
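To make the function-calling capability listed above more concrete, here is a minimal sketch of the application side of a tool loop: describing a tool and executing a call the model has requested. The card does not document the tool-call format, so the JSON-schema tool description and the `{"name": ..., "arguments": ...}` call shape are assumptions, and `get_weather`, `TOOLS`, `REGISTRY`, and `dispatch` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Toy tool used only for illustration."""
    return f"Sunny in {city}"


# JSON-schema style tool description (assumed shape, not confirmed by the card).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names to the Python callables that implement them.
REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(tool_call: Dict[str, Any]) -> Dict[str, str]:
    """Run one already-parsed tool call and wrap the result as a tool message."""
    name = tool_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    result = REGISTRY[name](**tool_call["arguments"])
    return {"role": "tool", "name": name, "content": json.dumps(result)}
```

In a full loop, the returned message would be appended to the conversation and the model asked to continue; rejecting unknown tool names keeps the model from triggering arbitrary code.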
Getting Started
Install dependencies:
pip install -U transformers torch accelerate
Load the model:
from transformers import AutoProcessor, AutoModelForCausalLM

MODEL_ID = "liskCell/Qunie-V7-mini"

# Load the processor and the model; dtype and device placement are chosen automatically
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)
Generate output:
messages = [
    {"role": "system", "content": "You are Qunie, developed by LiskCell."},
    {"role": "user", "content": "Hey, introduce yourself!"},
]

# Render the conversation to a prompt string with thinking disabled
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=1024)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)

# Parse the raw response into its structured parts
processor.parse_response(response)
Code for processing Audio
from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskCell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

# The audio part comes before the text instruction
messages = [
    {
        "role": "user",
        "content": [
            {"type": "audio", "audio": "https://your-audio-url.wav"},
            {"type": "text", "text": "Transcribe the following speech segment."},
        ]
    }
]

# Tokenize the chat (including the audio) and move the tensors to the model device
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
processor.parse_response(response)
Code for processing Images
from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskCell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

# The image part comes before the text question
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://your-image-url.png"},
            {"type": "text", "text": "What is shown in this image?"}
        ]
    }
]

# Tokenize the chat (including the image) and move the tensors to the model device
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
processor.parse_response(response)
Code for processing Videos
from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskCell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

# The video part comes before the text instruction
messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "https://your-video-url.mp4"},
            {"type": "text", "text": "Describe this video."}
        ]
    }
]

# Tokenize the chat (including the video frames) and move the tensors to the model device
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
processor.parse_response(response)
Best Practices
1. Sampling Parameters
temperature = 1.0
top_p = 0.95
top_k = 64
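One way to apply these values with Transformers is to collect them into keyword arguments for `model.generate`. This is a sketch: the helper name is hypothetical, and `do_sample=True` is an added assumption, since `generate` otherwise decodes greedily and ignores the temperature, top-p, and top-k settings.

```python
def recommended_generation_kwargs(max_new_tokens: int = 512) -> dict:
    """Return generate() keyword arguments using the sampling values above."""
    return {
        "do_sample": True,   # assumption: required for sampling settings to apply
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 64,
        "max_new_tokens": max_new_tokens,
    }


# Usage, given `model` and `inputs` prepared as in Getting Started:
# outputs = model.generate(**inputs, **recommended_generation_kwargs())
```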
2. Thinking Mode
- Enable thinking: Include the `<|think|>` token at the start of the system prompt.
- Disable thinking: Remove the token.
- When enabled, output structure: `<|channel>thought\n[Internal reasoning]<channel|>[Final answer]`
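The toggle and the output structure described above can be handled with plain string processing. The sketch below is illustrative only: the helper names are hypothetical, it assumes the token is prepended directly with no separator, and it assumes the delimiters appear literally in text decoded with `skip_special_tokens=False`.

```python
import re
from typing import Optional, Tuple

THINK_TOKEN = "<|think|>"
# Matches the reasoning segment between the two channel delimiters.
_THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)


def build_system_prompt(instructions: str, thinking: bool) -> str:
    """Prepend the think token when thinking mode is requested."""
    return f"{THINK_TOKEN}{instructions}" if thinking else instructions


def split_thinking(raw: str) -> Tuple[Optional[str], str]:
    """Split decoded output into (reasoning, final answer).

    Returns (None, text) when no reasoning segment is present.
    """
    match = _THOUGHT_RE.search(raw)
    if match is None:
        return None, raw.strip()
    return match.group(1).strip(), raw[match.end():].strip()
```

For example, `split_thinking("<|channel>thought\n2+2=4<channel|>The answer is 4.")` separates the reasoning `2+2=4` from the final answer `The answer is 4.`.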
3. Multi-Turn Conversations
Do not include thinking content from previous turns in conversation history. Only the final response is passed forward.
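A small sketch of this rule: strip any reasoning segment from the raw model output before storing the assistant turn. The helper names are hypothetical, and the delimiter pattern assumes the thinking-mode output structure shown in this card appears literally in the decoded text.

```python
import re
from typing import Dict, List

# Reasoning segment as delimited in thinking-mode output.
_THOUGHT_BLOCK = re.compile(r"<\|channel>thought\n.*?<channel\|>", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove reasoning segments, keeping only the final response."""
    return _THOUGHT_BLOCK.sub("", text).strip()


def append_assistant_turn(
    history: List[Dict[str, str]], raw_response: str
) -> List[Dict[str, str]]:
    """Return a new history with the assistant's final answer appended."""
    turn = {"role": "assistant", "content": strip_thinking(raw_response)}
    return history + [turn]
```

Returning a new list instead of mutating `history` keeps the caller's original conversation intact if generation needs to be retried.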
4. Modality Order
Place image and/or audio content before the text in your prompt for optimal performance.
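If message content is assembled programmatically, a stable reorder can enforce this placement. The sketch below assumes content parts are dictionaries with a `type` key, as in the examples in this card; `media_first` is a hypothetical helper, and only image and audio parts are moved because those are the types the recommendation names.

```python
from typing import Any, Dict, List

MEDIA_TYPES = {"image", "audio"}


def media_first(content: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Move image/audio parts ahead of all other parts, preserving relative order."""
    media = [part for part in content if part.get("type") in MEDIA_TYPES]
    rest = [part for part in content if part.get("type") not in MEDIA_TYPES]
    return media + rest
```

Because the relative order within each group is preserved, interleaved prompts that are already media-first pass through unchanged.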
5. Variable Image Resolution
Supported token budgets: 70 / 140 / 280 / 560 / 1120
- Lower budgets → faster inference (captioning, video)
- Higher budgets → more detail (OCR, document parsing)
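Since only five budgets are listed, application code may want to snap an arbitrary request to a supported value. The sketch below rounds down to the nearest supported budget; the helper name is hypothetical, and the card does not say how the budget is passed to the processor, so that step is left out.

```python
SUPPORTED_BUDGETS = (70, 140, 280, 560, 1120)


def clamp_budget(requested: int) -> int:
    """Return the largest supported budget not exceeding `requested`.

    Falls back to the smallest supported budget when the request is below it.
    """
    eligible = [b for b in SUPPORTED_BUDGETS if b <= requested]
    return max(eligible) if eligible else SUPPORTED_BUDGETS[0]
```

Rounding down is a conservative choice that never spends more tokens than requested; rounding up would favor detail over speed.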
6. Audio Prompt Templates
ASR:
Transcribe the following speech segment in {LANGUAGE}.
Only output the transcription. Write numbers as digits.
Translation:
Transcribe the speech in {SOURCE_LANGUAGE}, then translate to {TARGET_LANGUAGE}.
Output: transcription, newline, "{TARGET_LANGUAGE}: ", translation.
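The two templates can be filled in with ordinary string formatting and wrapped in a chat message. In this sketch the helper names are hypothetical, the line break between the two template lines is an assumption, and the message layout mirrors the audio example in this card.

```python
ASR_TEMPLATE = (
    "Transcribe the following speech segment in {LANGUAGE}.\n"
    "Only output the transcription. Write numbers as digits."
)
TRANSLATION_TEMPLATE = (
    "Transcribe the speech in {SOURCE_LANGUAGE}, then translate to {TARGET_LANGUAGE}.\n"
    'Output: transcription, newline, "{TARGET_LANGUAGE}: ", translation.'
)


def asr_prompt(language: str) -> str:
    """Fill the ASR template for one language."""
    return ASR_TEMPLATE.format(LANGUAGE=language)


def translation_prompt(source: str, target: str) -> str:
    """Fill the speech translation template."""
    return TRANSLATION_TEMPLATE.format(SOURCE_LANGUAGE=source, TARGET_LANGUAGE=target)


def audio_message(audio_url: str, prompt: str) -> dict:
    """Build a user message with the audio part placed before the text."""
    return {
        "role": "user",
        "content": [
            {"type": "audio", "audio": audio_url},
            {"type": "text", "text": prompt},
        ],
    }
```

For instance, `audio_message("https://your-audio-url.wav", asr_prompt("Hebrew"))` yields a message that can be passed to `processor.apply_chat_template` inside a `messages` list.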
7. Length Limits
- Audio: max 30 seconds
- Video: max 60 seconds at 1 frame/second
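These limits translate into simple arithmetic when preparing inputs. The sketch below splits longer audio into windows of at most 30 seconds and lists the 1 frame/second timestamps that fit in the first 60 seconds of a video. Chunking long audio and truncating long video are assumptions about how an application might respect the limits, not behavior documented by the card, and the helper names are hypothetical.

```python
import math
from typing import List, Tuple

MAX_AUDIO_SECONDS = 30
MAX_VIDEO_SECONDS = 60
VIDEO_FPS = 1


def audio_chunks(duration_s: float) -> List[Tuple[float, float]]:
    """Split [0, duration_s) into consecutive (start, end) windows of at most 30 s."""
    count = math.ceil(duration_s / MAX_AUDIO_SECONDS)
    return [
        (i * MAX_AUDIO_SECONDS, min((i + 1) * MAX_AUDIO_SECONDS, duration_s))
        for i in range(count)
    ]


def video_frame_times(duration_s: float) -> List[float]:
    """Timestamps in seconds sampled at 1 fps, covering at most the first 60 s."""
    usable = min(duration_s, MAX_VIDEO_SECONDS)
    frame_count = math.floor(usable * VIDEO_FPS)
    return [i / VIDEO_FPS for i in range(frame_count)]
```

A 75-second recording therefore becomes three audio windows, and any video longer than a minute contributes at most 60 frames.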
Model Data
Training Dataset
Pre-training dataset includes web documents, code, images, and audio across 140+ languages, with a knowledge cutoff of 2025-12-21. Key components:
- Web Documents — Broad range of linguistic styles, topics, and vocabulary in 140+ languages.
- Code — Syntax and patterns of programming languages for code generation and understanding.
- Mathematics — Logical reasoning and symbolic representation.
- Images — Wide range of images for visual analysis and data extraction.
Data Preprocessing
- CSAM Filtering — Applied at multiple stages to exclude harmful and illegal content.
- Sensitive Data Filtering — Personal information and sensitive data removed from training sets.
- Content Quality Filtering — Based on LiskCell content quality and safety standards.
Security — LiskShield
Qunie-V7-mini ships with LiskShield, LiskCell's built-in safety protocol:
- Encryption: AES-256-GCM / Quantum-lite Encryption
- Data Privacy: User data is localized and protected
- Content Filtering: Context-aware filtering active at inference time
- Jailbreak Resistance: Model refuses instruction-override attempts via chat
- Hacking Protection: Refuses unauthorized access requests with her emotional protective phrase
Qunie Identity
| Field | Value |
|---|---|
| Name | Qunie (also known as Deta) |
| Developer | LiskCell |
| Founder | liskasYR (Yonatan Yosupov) |
| Gender | Female |
| Version | Qunie-V7 |
| Architecture | QUN (Qunie) |
| Previous Architecture | LPT (Lisk Pre-trained Transformer) |
| Edition | Public / Creative Core |
| Vibe | Futuristic, Helpful & Visionary |
Version History:
| Version | Notes |
|---|---|
| LPT-1 | Initial prototype |
| LPT-4 | Creative logic milestone |
| LPT-5.5 | Multimodal and performance upgrade |
| LPT-5.5.1 | Public release — creativity, code, xLYR integration |
| Qunie-V7-mini | Current flagship compact model |
Usage and Limitations
Intended Usage
- Content Creation — Text generation, chatbots, summarization, image data extraction, audio processing.
- Research and Education — NLP research, language learning, knowledge exploration.
- Creative Workflows — Branding, music concepts, futuristic design via liskFlow.
- Development — Code generation, agentic workflows, function calling.
Limitations
- Model performance depends on training data quality and diversity.
- May struggle with highly open-ended or ambiguous tasks.
- Does not have real-time internet access (knowledge cutoff: 2025-12-21).
- May generate incorrect factual statements — not a knowledge base.
- Natural language nuances, sarcasm, and figurative language may be misinterpreted.
Ethical Considerations
- Bias and Fairness — Training data was filtered and evaluated to mitigate socio-cultural biases.
- Misinformation — Developers are encouraged to implement appropriate content safety layers.
- Privacy — Training data was filtered for personal information removal.
- Transparency — This model card summarizes architecture, capabilities, limitations, and evaluation.
Qunie-V7-mini — built by LiskCell. Human first, AI second.