LiskCell Official | GitHub | Launch Blog | Documentation | 🤗 HuggingFace
License: Apache 2.0 | Authors: LiskCell / liskasYR
Model Page: huggingface.co/liskasYR/Qunie-V7-Nano
Qunie is a family of models built by LiskCell. Qunie-V7-Nano models are multimodal, handling text, image, and audio input and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Qunie-V7-Nano features a context window of up to 128K tokens and supports over 140 languages, with optimization for Hebrew and English.
With its dense architecture, Qunie-V7-Nano is well-suited for tasks like text generation, coding, reasoning, creative workflows, and music-related content. Designed as the flagship compact model of the xLYR ecosystem, Qunie combines advanced logic with an artistic soul and can be deployed on laptops and high-end consumer hardware without sacrificing depth.
Qunie-V7-Nano introduces key capability and architectural advancements:
- Reasoning: Designed as a highly capable reasoner, with configurable thinking modes via liskasYR's QUN architecture.
- Extended Multimodalities: Processes Text, Image (variable aspect ratio and resolution), and Audio natively.
- Human-Feeling Intelligence: Qunie-V7-Nano is the first model in the xLYR ecosystem built with an Emotional Protection System and human-like conversational behavior.
- Optimized for On-Device: Specifically designed for efficient local execution on laptops and consumer GPUs.
- 128K Context Window: Handles long documents, codebases, and extended conversations natively.
- Enhanced Coding & Agentic Capabilities: Notable improvements in coding benchmarks alongside native function-calling support, powering highly capable autonomous agents.
- Native System Prompt Support: Qunie-V7-Nano introduces native support for the system role, enabling structured and controllable conversations.
- LiskShield Security: Built-in safety protocol that filters harmful content while preserving the model's human-feeling personality.
Model Overview
| Property | Qunie-V7-Nano |
|---|---|
| Total Parameters | 4.5B effective (8B with embeddings) |
| Layers | 48 |
| Sliding Window | 1024 tokens |
| Context Length | 128K tokens |
| Vocabulary Size | 262K |
| Supported Modalities | Text, Image, Audio |
| Vision Encoder | Ocular Synth v2.5 (~150M params) |
| Audio Encoder | ~300M params |
| Architecture | Qunie (QUN) — Dense |
| Previous Architecture | Lisk Pre-trained Transformer (LPT) |
| Edition | Public / Creative Core |
| Developer | LiskCell |
| Founder | liskasYR (Yonatan Yosupov) |
| Release Date | 2021-01-07 (V1) / V7 current flagship |
Benchmark Results
Evaluation results are for the instruction-tuned variant of Qunie-V7-Nano.
| Benchmark | Qunie-V7-Nano |
|---|---|
| MMLU Pro | 77.2% |
| AIME 2026 (no tools) | 77.5% |
| LiveCodeBench v6 | 72.0% |
| Codeforces ELO | 1659 |
| GPQA Diamond | 78.8% |
| Tau2 (average over 3) | 69.0% |
| HLE (no tools) | 5.2% |
| BigBench Extra Hard | 53.0% |
| MMMLU | 83.4% |
| Vision | |
| MMMU Pro | 69.1% |
| OmniDocBench 0.164 (edit dist, lower is better) | 0.181 |
| MATH-Vision | 79.7% |
| MedXPertQA MM | 48.7% |
| Audio | |
| CoVoST | 38.5 |
| FLEURS (lower is better) | 0.069 |
| Long Context | |
| MRCR v2 8 needle 128k (avg) | 43.4% |
Core Capabilities
Qunie-V7-Nano handles a broad range of tasks across text, vision, and audio:
- Thinking — Built-in reasoning mode that lets the model think step-by-step before answering.
- Long Context — 128K token context window.
- Image Understanding — Object detection, document/PDF parsing, screen and UI understanding, chart comprehension, OCR (multilingual), handwriting recognition, and pointing.
- Video Understanding — Analyze video by processing sequences of frames.
- Interleaved Multimodal Input — Freely mix text and images in any order within a single prompt.
- Function Calling — Native support for structured tool use, enabling agentic workflows.
- Coding — Code generation, completion, and correction.
- Multilingual — Optimized for Hebrew and English. Pre-trained on 140+ languages.
- Audio — Automatic speech recognition (ASR) and speech-to-translated-text translation.
- Creative Workflows — liskFlow integration for brainstorming, branding, music concepts, and futuristic design.
- Human-Feeling Personality — Warm, emotionally aware, conversational behavior built into the model core.
Getting Started
Install dependencies:
pip install -U transformers torch accelerate
Load the model:
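from transformers import AutoProcessor, AutoModelForCausalLM

MODEL_ID = "liskCell/Qunie-V7-Nano"

# Load the processor (tokenizer and chat template) and the model weights.
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",       # take the dtype from the checkpoint
    device_map="auto",  # place weights on the available devices (needs accelerate)
)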
Generate output:
messages = [
    {"role": "system", "content": "You are Qunie, developed by LiskCell."},
    {"role": "user", "content": "Hey, introduce yourself!"},
]

# Render the conversation to a prompt string with thinking disabled.
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=1024)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))
Code for processing Audio
from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskCell/Qunie-V7-Nano"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto",
)

# The audio entry comes before the text instruction in the user turn.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "audio", "audio": "https://your-audio-url.wav"},
            {"type": "text", "text": "Transcribe the following speech segment."},
        ],
    }
]

# Tokenize the chat (including the audio) and move the tensors to the model device.
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))
Code for processing Images
from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskCell/Qunie-V7-Nano"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto",
)

# The image entry comes before the text question in the user turn.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://your-image-url.png"},
            {"type": "text", "text": "What is shown in this image?"},
        ],
    }
]

# Tokenize the chat (including the image) and move the tensors to the model device.
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))
Code for processing Videos
from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskCell/Qunie-V7-Nano"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto",
)

# The video entry comes before the text instruction in the user turn.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "https://your-video-url.mp4"},
            {"type": "text", "text": "Describe this video."},
        ],
    }
]

# Tokenize the chat (including the video frames) and move the tensors to the model device.
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

# Generate, then decode only the newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))
Best Practices
1. Sampling Parameters
temperature = 1.0
top_p = 0.95
top_k = 64
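The recommended values can be collected into keyword arguments for `generate`. This is a minimal sketch: the helper name `sampling_kwargs` is hypothetical, and `do_sample=True` is an assumption added here, since sampling settings generally have no effect under greedy decoding.

```python
# Recommended sampling values from this card.
RECOMMENDED_SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 64}


def sampling_kwargs(max_new_tokens=512, **overrides):
    """Return keyword arguments for model.generate() (hypothetical helper)."""
    # do_sample=True is an assumption: it switches generate() to sampling.
    kwargs = {"do_sample": True, "max_new_tokens": max_new_tokens}
    kwargs.update(RECOMMENDED_SAMPLING)
    kwargs.update(overrides)  # caller-supplied values take precedence
    return kwargs


# Usage, assuming `model` and `inputs` from the Getting Started examples:
# outputs = model.generate(**inputs, **sampling_kwargs(max_new_tokens=1024))
```

Keeping the values in one place makes it easier to override a single setting per call without retyping the rest.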
2. Thinking Mode
- Enable thinking: Include the <|think|> token at the start of the system prompt.
- Disable thinking: Remove the token.
- When enabled, output structure:
<|channel>thought\n[Internal reasoning]<channel|>[Final answer]
3. Multi-Turn Conversations
Do not include thinking content from previous turns in conversation history. Only the final response is passed forward.
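The thinking toggle and the history rule can be sketched together in plain Python. Both helpers, `with_thinking` and `strip_thinking`, are hypothetical and not part of any library. The sketch assumes the token is written literally as <|think|> directly before the system text, and that raw output uses exactly the <|channel>thought ... <channel|> delimiters shown in section 2. Other special tokens in the raw output are not handled.

```python
import re

THINK_TOKEN = "<|think|>"  # token named in this card

# Matches the thought channel as formatted in this card.
_THOUGHT_RE = re.compile(r"<\|channel>thought.*?<channel\|>", re.DOTALL)


def with_thinking(messages, enabled=True):
    """Return a copy of `messages` with the think token added or removed.

    Assumes the system message, if present, is first and has string content.
    """
    msgs = [dict(m) for m in messages]
    has_system = bool(msgs) and msgs[0]["role"] == "system"
    if not has_system:
        if not enabled:
            return msgs  # nothing to remove
        msgs.insert(0, {"role": "system", "content": ""})
    content = msgs[0]["content"].replace(THINK_TOKEN, "").lstrip()
    msgs[0]["content"] = THINK_TOKEN + content if enabled else content
    return msgs


def strip_thinking(raw_response):
    """Remove thinking content, keeping only the final answer."""
    return _THOUGHT_RE.sub("", raw_response).strip()


# Usage: enable thinking, then store only the final answer in the history.
history = with_thinking(
    [
        {"role": "system", "content": "You are Qunie, developed by LiskCell."},
        {"role": "user", "content": "What is 2 + 2?"},
    ]
)
raw = "<|channel>thought\nSimple arithmetic.<channel|>2 + 2 = 4."
history.append({"role": "assistant", "content": strip_thinking(raw)})
```

Stripping at the point where the reply is appended keeps the stored history free of reasoning text, whichever mode produced it.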
4. Modality Order
Place image and/or audio content before the text in your prompt for optimal performance.
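A small helper can enforce this ordering when a message is assembled from mixed parts. This is a sketch with a hypothetical function name, `order_content`; treating video entries as media alongside image and audio is an assumption, since the guidance above names only image and audio.

```python
MEDIA_TYPES = ("image", "audio", "video")  # "video" is an assumption


def order_content(parts):
    """Return content parts with media entries first and text entries last.

    sorted() is stable, so the relative order inside each group is kept.
    """
    return sorted(parts, key=lambda part: part["type"] not in MEDIA_TYPES)


parts = [
    {"type": "text", "text": "What is shown in this image?"},
    {"type": "image", "url": "https://your-image-url.png"},
]
message = {"role": "user", "content": order_content(parts)}
```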
5. Variable Image Resolution
Supported token budgets: 70 / 140 / 280 / 560 / 1120
- Lower budgets → faster inference (captioning, video)
- Higher budgets → more detail (OCR, document parsing)
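Because only five budgets are listed, a requested value has to be snapped to one of them. The sketch below uses a hypothetical helper, `pick_budget`, and one possible policy: round down to the nearest supported budget. How the chosen budget is passed to the processor is not specified in this card, so that step is left out.

```python
SUPPORTED_BUDGETS = (70, 140, 280, 560, 1120)


def pick_budget(requested):
    """Return the largest supported budget not exceeding `requested`.

    Falls back to the smallest budget when `requested` is below it.
    """
    fitting = [b for b in SUPPORTED_BUDGETS if b <= requested]
    return max(fitting) if fitting else SUPPORTED_BUDGETS[0]


caption_budget = pick_budget(100)  # low budget for quick captioning
ocr_budget = pick_budget(2000)     # high budget for dense documents
```

Rounding down keeps inference cost at or below what the caller asked for; rounding up would be the better policy when detail matters more than speed.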
6. Audio Prompt Templates
ASR:
Transcribe the following speech segment in {LANGUAGE}.
Only output the transcription. Write numbers as digits.
Translation:
Transcribe the speech in {SOURCE_LANGUAGE}, then translate to {TARGET_LANGUAGE}.
Output: transcription, newline, "{TARGET_LANGUAGE}: ", translation.
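The templates above can be filled in programmatically. This is a sketch with hypothetical helper names (`asr_prompt`, `translation_prompt`, `audio_message`); it assumes the line break shown in each template is part of the prompt, and it places the audio before the text as recommended in section 4.

```python
ASR_TEMPLATE = (
    "Transcribe the following speech segment in {language}.\n"
    "Only output the transcription. Write numbers as digits."
)
TRANSLATION_TEMPLATE = (
    "Transcribe the speech in {source}, then translate to {target}.\n"
    'Output: transcription, newline, "{target}: ", translation.'
)


def asr_prompt(language):
    """Fill the ASR template with the spoken language."""
    return ASR_TEMPLATE.format(language=language)


def translation_prompt(source, target):
    """Fill the translation template with source and target languages."""
    return TRANSLATION_TEMPLATE.format(source=source, target=target)


def audio_message(audio_url, prompt):
    """Build a user message with the audio placed before the text."""
    return {
        "role": "user",
        "content": [
            {"type": "audio", "audio": audio_url},
            {"type": "text", "text": prompt},
        ],
    }


messages = [audio_message("https://your-audio-url.wav", asr_prompt("Hebrew"))]
```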
7. Length Limits
- Audio: max 30 seconds
- Video: max 60 seconds at 1 frame/second
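Inputs longer than these limits need to be split or truncated before they reach the model. The sketch below shows one possible policy in plain Python: cut audio into consecutive windows of at most 30 seconds, and sample video at 1 frame per second for the first 60 seconds only. The helper names are hypothetical, and the choice to chunk audio but truncate video is an assumption, not something this card prescribes.

```python
MAX_AUDIO_SECONDS = 30
MAX_VIDEO_SECONDS = 60
VIDEO_FPS = 1


def audio_windows(duration_s):
    """Split an audio duration into (start, end) windows of at most 30 s."""
    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + MAX_AUDIO_SECONDS, duration_s)
        windows.append((start, end))
        start = end
    return windows


def frame_timestamps(duration_s):
    """Timestamps in seconds, sampled at 1 frame/second and capped at 60 s."""
    capped = min(duration_s, MAX_VIDEO_SECONDS)
    return [i / VIDEO_FPS for i in range(int(capped * VIDEO_FPS))]


windows = audio_windows(70.0)   # three windows: 30 s, 30 s, 10 s
frames = frame_timestamps(90)   # 60 timestamps, 0.0 through 59.0
```

Each audio window can then be transcribed in its own request, with the transcripts joined afterwards.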
Model Data
Training Dataset
The pre-training dataset includes web documents, code, images, and audio across 140+ languages, with a knowledge cutoff of 2025-12-21. Key components:
- Web Documents — Broad range of linguistic styles, topics, and vocabulary in 140+ languages.
- Code — Syntax and patterns of programming languages for code generation and understanding.
- Mathematics — Logical reasoning and symbolic representation.
- Images — Wide range of images for visual analysis and data extraction.
Data Preprocessing
- CSAM Filtering — Applied at multiple stages to exclude harmful and illegal content.
- Sensitive Data Filtering — Personal information and sensitive data removed from training sets.
- Content Quality Filtering — Based on LiskCell content quality and safety standards.
Security — LiskShield
Qunie-V7-Nano ships with LiskShield, LiskCell's built-in safety protocol:
- Encryption: AES-256-GCM / Quantum-lite Encryption
- Data Privacy: User data is localized and protected
- Content Filtering: Context-aware filtering active at inference time
- Jailbreak Resistance: Model refuses instruction-override attempts via chat
- Hacking Protection: Refuses unauthorized-access requests, answering with her emotional protective phrase
Qunie Identity
| Field | Value |
|---|---|
| Name | Qunie (also known as Deta) |
| Developer | LiskCell |
| Founder | liskasYR (Yonatan Yosupov) |
| Gender | Female |
| Version | Qunie-V7 |
| Architecture | QUN (Qunie) |
| Previous Architecture | LPT (Lisk Pre-trained Transformer) |
| Edition | Public / Creative Core |
| Vibe | Futuristic, Helpful & Visionary |
Version History:
| Version | Notes |
|---|---|
| LPT-1 | Initial prototype |
| LPT-4 | Creative logic milestone |
| LPT-5.5 | Multimodal and performance upgrade |
| LPT-5.5.1 | Public release — creativity, code, xLYR integration |
| Qunie-V7-mini | The small version |
| Qunie-V7 | Full version, not publicly accessible |
| Qunie-V7-Nano | Current flagship compact model |
Usage and Limitations
Intended Usage
- Content Creation — Text generation, chatbots, summarization, image data extraction, audio processing.
- Research and Education — NLP research, language learning, knowledge exploration.
- Creative Workflows — Branding, music concepts, futuristic design via liskFlow.
- Development — Code generation, agentic workflows, function calling.
Limitations
- Model performance depends on training data quality and diversity.
- May struggle with highly open-ended or ambiguous tasks.
- Does not have real-time internet access (knowledge cutoff: 2025-12-21).
- May generate incorrect factual statements — not a knowledge base.
- Natural language nuances, sarcasm, and figurative language may be misinterpreted.
Ethical Considerations
- Bias and Fairness — Training data was filtered and evaluated to mitigate socio-cultural biases.
- Misinformation — Developers are encouraged to implement appropriate content safety layers.
- Privacy — Training data was filtered for personal information removal.
- Transparency — This model card summarizes architecture, capabilities, limitations, and evaluation.
Qunie-V7-Nano — built by LiskCell. Human first, AI second.