Instructions for using liskcell/Qunie-V7-mini with libraries and local apps.

Libraries

Transformers

How to use liskcell/Qunie-V7-mini with Transformers:

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("liskcell/Qunie-V7-mini")
model = AutoModelForImageTextToText.from_pretrained("liskcell/Qunie-V7-mini")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use liskcell/Qunie-V7-mini with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

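llm = Llama.from_pretrained(
    repo_id="liskcell/Qunie-V7-mini",
    filename="Qunie-V7-F16.gguf",
)

# messages must be a list of role/content dicts, not a plain string
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Hey, introduce yourself!"}
    ]
)
print(response["choices"][0]["message"]["content"])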

Local Apps

llama.cpp

How to use liskcell/Qunie-V7-mini with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf liskcell/Qunie-V7-mini:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf liskcell/Qunie-V7-mini:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf liskcell/Qunie-V7-mini:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf liskcell/Qunie-V7-mini:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf liskcell/Qunie-V7-mini:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf liskcell/Qunie-V7-mini:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf liskcell/Qunie-V7-mini:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf liskcell/Qunie-V7-mini:Q4_K_M

Use Docker

docker model run hf.co/liskcell/Qunie-V7-mini:Q4_K_M
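
Once a `llama-server` command from this section is running, it can be queried from Python through its OpenAI-compatible API. The sketch below uses only the standard library and assumes the server listens on its default port 8080 and exposes `/v1/chat/completions`; `build_chat_request` and `chat` are illustrative helper names, not part of llama.cpp.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
BASE_URL = "http://localhost:8080/v1"
MODEL = "liskcell/Qunie-V7-mini:Q4_K_M"


def build_chat_request(prompt, max_tokens=128):
    """Build a POST request for the OpenAI-style chat completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt):
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("Hey, introduce yourself!"))` should print the model's reply; without a running server the call raises a connection error.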

Ollama
How to use liskcell/Qunie-V7-mini with Ollama:
```
ollama run hf.co/liskcell/Qunie-V7-mini:Q4_K_M
```

Unsloth Studio

How to use liskcell/Qunie-V7-mini with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for liskcell/Qunie-V7-mini to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for liskcell/Qunie-V7-mini to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for liskcell/Qunie-V7-mini to start chatting

Pi

How to use liskcell/Qunie-V7-mini with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf liskcell/Qunie-V7-mini:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "liskcell/Qunie-V7-mini:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use liskcell/Qunie-V7-mini with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf liskcell/Qunie-V7-mini:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default liskcell/Qunie-V7-mini:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use liskcell/Qunie-V7-mini with Docker Model Runner:
```
docker model run hf.co/liskcell/Qunie-V7-mini:Q4_K_M
```

Lemonade

How to use liskcell/Qunie-V7-mini with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull liskcell/Qunie-V7-mini:Q4_K_M

Run and chat with the model

lemonade run user.Qunie-V7-mini-Q4_K_M

List all available models

lemonade list

Qunie is a family of models built by LiskCell. Qunie-V7-mini models are multimodal: they accept text, image, and audio input and generate text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Qunie-V7-mini has a context window of up to 128K tokens and supports over 140 languages, with optimization for Hebrew and English.

With its dense architecture, Qunie-V7-mini is well suited to text generation, coding, reasoning, creative workflows, and music-related content. Designed as the flagship compact model of the xLYR ecosystem, Qunie combines advanced logic with an artistic soul, and is sized to run on laptops and high-end consumer hardware without sacrificing depth.

Qunie-V7-mini introduces key capability and architectural advancements:

Reasoning — Designed as a highly capable reasoner, with configurable thinking modes via liskasYR's QUN architecture.
Extended Multimodalities — Processes Text, Image (variable aspect ratio and resolution), and Audio natively.
Human-Feeling Intelligence — Qunie-V7-mini is the first model in the xLYR ecosystem built with an Emotional Protection System and human-like conversational behavior.
Optimized for On-Device — Specifically designed for efficient local execution on laptops and consumer GPUs.
128K Context Window — Handles long documents, codebases, and extended conversations natively.
Enhanced Coding & Agentic Capabilities — Notable improvements in coding benchmarks alongside native function-calling support, powering highly capable autonomous agents.
Native System Prompt Support — Qunie-V7-mini introduces native support for the system role, enabling structured and controllable conversations.
LiskShield Security — Built-in safety protocol that filters harmful content while preserving the model's human-feeling personality.

Model Overview

Property	Qunie-V7-mini
Total Parameters	4.5B effective (8B with embeddings)
Layers	42
Sliding Window	512 tokens
Context Length	128K tokens
Vocabulary Size	262K
Supported Modalities	Text, Image, Audio
Vision Encoder	Ocular Synth v2.5 (~150M params)
Audio Encoder	~300M params
Architecture	Qunie (QUN) — Dense
Previous Architecture	Lisk Pre-trained Transformer (LPT)
Edition	Public / Creative Core
Developer	LiskCell
Founder	liskasYR (Yonatan Yosupov)
Release Date	2021-01-07 (V1) / V7 current flagship

Benchmark Results

Evaluation results are for the instruction-tuned variant of Qunie-V7-mini.

Benchmark	Qunie-V7-mini
MMLU Pro	69.4%
AIME 2026 (no tools)	42.5%
LiveCodeBench v6	52.0%
Codeforces ELO	940
GPQA Diamond	58.6%
BigBench Extra Hard	33.1%
MMMLU	76.6%
Vision
MMMU Pro	52.6%
OmniDocBench 1.5 (edit dist, lower is better)	0.181
MATH-Vision	59.5%
MedXPertQA MM	28.7%
Audio
CoVoST	35.54
FLEURS (lower is better)	0.08
Long Context
MRCR v2 8 needle 128k (avg)	25.4%

Core Capabilities

Qunie-V7-mini handles a broad range of tasks across text, vision, and audio:

Thinking — Built-in reasoning mode that lets the model think step-by-step before answering.
Long Context — 128K token context window.
Image Understanding — Object detection, document/PDF parsing, screen and UI understanding, chart comprehension, OCR (multilingual), handwriting recognition, and pointing.
Video Understanding — Analyze video by processing sequences of frames.
Interleaved Multimodal Input — Freely mix text and images in any order within a single prompt.
Function Calling — Native support for structured tool use, enabling agentic workflows.
Coding — Code generation, completion, and correction.
Multilingual — Optimized for Hebrew and English. Pre-trained on 140+ languages.
Audio — Automatic speech recognition (ASR) and speech-to-translated-text translation.
Creative Workflows — liskFlow integration for brainstorming, branding, music concepts, and futuristic design.
Human-Feeling Personality — Warm, emotionally aware, conversational behavior built into the model core.
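
As an illustration of the function-calling workflow listed above, here is a minimal host-side sketch: a tool, its description, and a dispatcher that executes a tool call. This card does not specify how tool schemas are passed to the model or how calls are emitted, so the OpenAI-style schema, the `{"name", "arguments"}` call shape, and the `get_weather` tool are all assumptions made for the example.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: return a canned weather report for a city."""
    return f"Sunny in {city}"


# Assumed format: OpenAI-style JSON-schema description of the tool.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names the model may request to the Python callables behind them.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(call):
    """Execute one tool call and wrap the result as a tool-role message."""
    name = call["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    args = call["arguments"]
    if isinstance(args, str):  # some runtimes deliver arguments as a JSON string
        args = json.loads(args)
    return {"role": "tool", "name": name, "content": REGISTRY[name](**args)}
```

In an agent loop, the returned message would be appended to the conversation before asking the model for its next turn.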

Getting Started

Install dependencies:

pip install -U transformers torch accelerate

Load the model:

from transformers import AutoProcessor, AutoModelForCausalLM

MODEL_ID = "liskcell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

Generate output:

messages = [
    {"role": "system", "content": "You are Qunie, developed by LiskCell."},
    {"role": "user", "content": "Hey, introduce yourself!"},
]

text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

outputs = model.generate(**inputs, max_new_tokens=1024)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))

Code for processing Audio

from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskcell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "audio", "audio": "https://your-audio-url.wav"},
            {"type": "text", "text": "Transcribe the following speech segment."},
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))

Code for processing Images

from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskcell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://your-image-url.png"},
            {"type": "text", "text": "What is shown in this image?"}
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))

Code for processing Videos

from transformers import AutoProcessor, AutoModelForMultimodalLM

MODEL_ID = "liskcell/Qunie-V7-mini"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForMultimodalLM.from_pretrained(
    MODEL_ID,
    dtype="auto",
    device_map="auto"
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "https://your-video-url.mp4"},
            {"type": "text", "text": "Describe this video."}
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
input_len = inputs["input_ids"].shape[-1]

outputs = model.generate(**inputs, max_new_tokens=512)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
print(processor.parse_response(response))

Best Practices

1. Sampling Parameters

temperature = 1.0
top_p       = 0.95
top_k       = 64
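
To make the effect of these three settings concrete, the toy function below applies temperature scaling, then top-k, then top-p (nucleus) filtering to a list of logits and returns the renormalized distribution a sampler would draw from. It is a pure-Python sketch of the general technique, not the inference library's implementation, and `filter_logits` is a name made up for the example.

```python
import math


def filter_logits(logits, temperature=1.0, top_k=64, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep the k most likely tokens and renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}
```

With Transformers, the recommended values map onto `model.generate(..., do_sample=True, temperature=1.0, top_p=0.95, top_k=64)`.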

2. Thinking Mode

Enable thinking: include the <|think|> token at the start of the system prompt.
Disable thinking: omit the token.
When thinking is enabled, the output has this structure: <|channel>thought\n [Internal reasoning] <channel|> [Final answer]
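
The helpers below sketch both halves of this convention: prefixing the system prompt with the think token, and splitting a raw decoded output into reasoning and final answer based on the structure shown above. `system_prompt` and `split_thinking` are illustrative names; `processor.parse_response` from the Getting Started examples remains the documented way to parse output, and this regex is only a fallback sketch.

```python
import re

THINK_TOKEN = "<|think|>"
# Matches the documented layout: <|channel>thought\n ... <channel|>
_THOUGHT_RE = re.compile(r"<\|channel>thought\n(.*?)<channel\|>", re.DOTALL)


def system_prompt(text, thinking):
    """Prefix the system prompt with the think token when thinking is enabled."""
    return f"{THINK_TOKEN}{text}" if thinking else text


def split_thinking(output):
    """Return (reasoning, answer); reasoning is None if no thought block is found."""
    match = _THOUGHT_RE.search(output)
    if match is None:
        return None, output.strip()
    return match.group(1).strip(), output[match.end():].strip()
```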

3. Multi-Turn Conversations

Do not include thinking content from previous turns in the conversation history; pass only the final response forward.
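
One way to apply this rule is to strip thought blocks from assistant turns before re-sending the history. The sketch assumes assistant content is stored as raw strings in the `<|channel>thought\n ... <channel|>` layout described under Thinking Mode; `strip_thinking` is a hypothetical helper name.

```python
import re

# Thought block layout as described in the Thinking Mode section.
_THOUGHT_RE = re.compile(r"<\|channel>thought\n.*?<channel\|>", re.DOTALL)


def strip_thinking(history):
    """Return a copy of the history with thought blocks removed from assistant turns."""
    cleaned = []
    for message in history:
        content = message["content"]
        if message["role"] == "assistant" and isinstance(content, str):
            content = _THOUGHT_RE.sub("", content).strip()
        cleaned.append({**message, "content": content})
    return cleaned
```

The input list is left unmodified, so the full output (including reasoning) can still be logged separately.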

4. Modality Order

Place image and/or audio content before the text in your prompt for optimal performance.
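
A small helper can enforce this ordering on a message's content list before it reaches the chat template. The sketch assumes content parts are dicts with a `"type"` key, as in the examples in this card; `media_first` is an illustrative name.

```python
def media_first(content):
    """Return content parts with non-text parts first, preserving relative order."""
    media = [part for part in content if part["type"] != "text"]
    text = [part for part in content if part["type"] == "text"]
    return media + text
```

For example, a content list of `[text, image]` becomes `[image, text]`, while a list that is already media-first is returned in the same order.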

5. Variable Image Resolution

Supported token budgets: 70 / 140 / 280 / 560 / 1120

Lower budgets → faster inference (captioning, video)
Higher budgets → more detail (OCR, document parsing)
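
As a rough sketch of how a budget might be chosen, the helper below picks the largest supported value that fits a per-image token allowance, for example when a fixed share of the context is split across many video frames. How the chosen budget is passed to the processor is not covered in this card, so the code only selects a number; `pick_budget` is a hypothetical helper.

```python
SUPPORTED_BUDGETS = (70, 140, 280, 560, 1120)


def pick_budget(max_tokens_per_image):
    """Return the largest supported budget that does not exceed the allowance."""
    candidates = [b for b in SUPPORTED_BUDGETS if b <= max_tokens_per_image]
    if not candidates:
        raise ValueError(
            f"allowance {max_tokens_per_image} is below the smallest budget "
            f"({SUPPORTED_BUDGETS[0]})"
        )
    return max(candidates)


# Example: spread 8192 tokens over 60 video frames.
per_frame = pick_budget(8192 // 60)
```

Here each frame gets 136 tokens of allowance, so the lowest budget (70) is selected; a single document page with a 2000-token allowance would get 1120.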

6. Audio Prompt Templates

ASR:

Transcribe the following speech segment in {LANGUAGE}.
Only output the transcription. Write numbers as digits.

Translation:

Transcribe the speech in {SOURCE_LANGUAGE}, then translate to {TARGET_LANGUAGE}.
Output: transcription, newline, "{TARGET_LANGUAGE}: ", translation.
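
The two templates can be filled in programmatically and combined with an audio part in the message format used elsewhere in this card. The function names below are illustrative, and the audio URL is a placeholder.

```python
ASR_TEMPLATE = (
    "Transcribe the following speech segment in {language}.\n"
    "Only output the transcription. Write numbers as digits."
)
TRANSLATION_TEMPLATE = (
    "Transcribe the speech in {source}, then translate to {target}.\n"
    'Output: transcription, newline, "{target}: ", translation.'
)


def asr_prompt(language):
    """Fill the ASR template for one language."""
    return ASR_TEMPLATE.format(language=language)


def translation_prompt(source, target):
    """Fill the translation template for a source/target language pair."""
    return TRANSLATION_TEMPLATE.format(source=source, target=target)


def audio_message(audio_url, prompt):
    """Build a user message with the audio part placed before the text part."""
    return {
        "role": "user",
        "content": [
            {"type": "audio", "audio": audio_url},
            {"type": "text", "text": prompt},
        ],
    }


messages = [audio_message("https://your-audio-url.wav", asr_prompt("Hebrew"))]
```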

7. Length Limits

Audio: max 30 seconds
Video: max 60 seconds at 1 frame/second
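
These limits can be handled on the caller's side before building a prompt. The sketch below splits longer audio into consecutive windows of at most 30 seconds and lists the timestamps to sample from a video at 1 frame per second, capped at 60 seconds. Chunking long audio is one common workaround rather than something this card prescribes, and both function names are hypothetical.

```python
MAX_AUDIO_SECONDS = 30
MAX_VIDEO_SECONDS = 60
FRAMES_PER_SECOND = 1


def audio_chunks(duration_s):
    """Split [0, duration_s) into consecutive (start, end) windows of at most 30 s."""
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + MAX_AUDIO_SECONDS, float(duration_s))
        chunks.append((start, end))
        start = end
    return chunks


def frame_timestamps(duration_s):
    """Timestamps in seconds to sample at 1 fps, covering at most the first 60 s."""
    usable = min(duration_s, MAX_VIDEO_SECONDS)
    count = int(usable * FRAMES_PER_SECOND)
    return [i / FRAMES_PER_SECOND for i in range(count)]
```

Each audio window would be transcribed in its own request, and the frames at the returned timestamps passed as the video input.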

Model Data

Training Dataset

The pre-training dataset includes web documents, code, images, and audio across 140+ languages, with a knowledge cutoff of 2025-12-21. Key components:

Web Documents — Broad range of linguistic styles, topics, and vocabulary in 140+ languages.
Code — Syntax and patterns of programming languages for code generation and understanding.
Mathematics — Logical reasoning and symbolic representation.
Images — Wide range of images for visual analysis and data extraction.

Data Preprocessing

CSAM Filtering — Applied at multiple stages to exclude harmful and illegal content.
Sensitive Data Filtering — Personal information and sensitive data removed from training sets.
Content Quality Filtering — Based on LiskCell content quality and safety standards.

Security — LiskShield

Qunie-V7-mini ships with LiskShield, LiskCell's built-in safety protocol:

Encryption: AES-256-GCM / Quantum-lite Encryption
Data Privacy: User data is localized and protected
Content Filtering: Context-aware filtering active at inference time
Jailbreak Resistance: Model refuses instruction-override attempts via chat
Hacking Protection: Refuses unauthorized access requests with her emotional protective phrase

Qunie Identity

Field	Value
Name	Qunie (also known as Deta)
Developer	LiskCell
Founder	liskasYR (Yonatan Yosupov)
Gender	Female
Version	Qunie-V7
Architecture	QUN (Qunie)
Previous Architecture	LPT (Lisk Pre-trained Transformer)
Edition	Public / Creative Core
Vibe	Futuristic, Helpful & Visionary

Version History:

Version	Notes
LPT-1	Initial prototype
LPT-4	Creative logic milestone
LPT-5.5	Multimodal and performance upgrade
LPT-5.5.1	Public release — creativity, code, xLYR integration
Qunie-V7-mini	Current flagship compact model

Usage and Limitations

Intended Usage

Content Creation — Text generation, chatbots, summarization, image data extraction, audio processing.
Research and Education — NLP research, language learning, knowledge exploration.
Creative Workflows — Branding, music concepts, futuristic design via liskFlow.
Development — Code generation, agentic workflows, function calling.

Limitations

Model performance depends on training data quality and diversity.
May struggle with highly open-ended or ambiguous tasks.
Does not have real-time internet access (knowledge cutoff: 2025-12-21).
May generate incorrect factual statements — not a knowledge base.
Natural language nuances, sarcasm, and figurative language may be misinterpreted.

Ethical Considerations

Bias and Fairness — Training data was filtered and evaluated to mitigate socio-cultural biases.
Misinformation — Developers are encouraged to implement appropriate content safety layers.
Privacy — Training data was filtered for personal information removal.
Transparency — This model card summarizes architecture, capabilities, limitations, and evaluation.

Qunie-V7-mini — built by LiskCell. Human first, AI second.

Safetensors: model size 8B params, tensor type BF16