How to use from the llama-cpp-python library
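# pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
	repo_id="ventilabs/MiseAI-1.1-GGUF",
	filename="venti_miseai_1.1.gguf",
)

# Run a single chat turn; the result is an OpenAI-style completion dict.
response = llm.create_chat_completion(
	messages=[
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
print(response["choices"][0]["message"]["content"])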
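For conversations longer than one question, the message list has to carry the earlier turns. The helper below is a minimal sketch of that pattern, not part of MiseAI or llama-cpp-python: `chat_turn` and `history` are hypothetical names, and the `llm` argument is assumed to be the object returned by `Llama.from_pretrained` above.

```python
from typing import Any, Dict, List

Message = Dict[str, str]


def chat_turn(llm: Any, history: List[Message], user_text: str,
              max_tokens: int = 512) -> str:
    """Send one user message, record both sides in `history`, return the reply."""
    history.append({"role": "user", "content": user_text})
    response = llm.create_chat_completion(messages=history, max_tokens=max_tokens)
    # The response is an OpenAI-style dict; the reply text sits under
    # choices[0] -> message -> content.
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply


# Example usage (requires the model to be loaded as `llm`):
# history = []
# print(chat_turn(llm, history, "What is the capital of France?"))
# print(chat_turn(llm, history, "And what is its population?"))
```

Because `history` grows with every turn, long sessions will eventually approach the context window, so older turns may need to be trimmed.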

Venti MiseAI 1.1

Intelligence that lives on your machine.

MiseAI is a powerful, private AI assistant built by Venti Labs. It runs 100% locally on your hardware: no cloud, no API keys, no data leaving your device.

Highlights

  • 🧠 7B Parameters: Fine-tuned from Qwen 2.5 Coder 7B
  • 🔒 Fully Private: Runs offline, no internet required after download
  • 💻 Expert Coder: Production-ready code generation and refactoring
  • ⚡ 8GB VRAM: Optimized to run on consumer GPUs
  • 📦 GGUF Format: Ready for Ollama, llama.cpp, LM Studio

Quick Start (Ollama)

ollama run ventilabs/miseai

Or install the Venti CLI (the commands below are for PowerShell):

irm venti-labs.xyz/install | iex
venti launch mise
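Once `ollama run ventilabs/miseai` has pulled the model, it can also be queried from a script through Ollama's local HTTP API. The following is a rough sketch that assumes Ollama is listening on its default port (11434) and uses only the Python standard library. `build_chat_request` and `ask` are illustrative names, not part of any Venti Labs tooling.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (an assumption; adjust if changed).
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(prompt: str,
                       model: str = "ventilabs/miseai") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "ventilabs/miseai", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, model), timeout=timeout) as resp:
        body = json.load(resp)
    return body["message"]["content"]


# Example usage (requires a running Ollama server with the model pulled):
# print(ask("Write a Python function that reverses a string."))
```

The request stays on `localhost`, so prompts and replies do not leave the machine.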

Model Details

| Property       | Value             |
|----------------|-------------------|
| Base Model     | Qwen 2.5 Coder 7B |
| Fine-tuning    | LoRA (QLoRA)      |
| Quantization   | Q8_0              |
| File Size      | ~8.1 GB           |
| Context Window | 16,384 tokens     |
| Max Output     | 8,192 tokens      |
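Two of the figures above can be sanity-checked with simple arithmetic. The sketch below makes two assumptions that the model card does not state: the prompt and the generated tokens share the 16,384-token context window, and every weight is stored in Q8_0 blocks (32 int8 values plus one 2-byte scale). The parameter count of about 7.6 billion is likewise an assumed figure for a Qwen 2.5 7B-class model. The function names are illustrative.

```python
CONTEXT_WINDOW = 16_384   # tokens, from the table above
MAX_OUTPUT = 8_192        # tokens, from the table above

Q8_0_BLOCK_WEIGHTS = 32   # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 34     # 32 int8 weights + one 2-byte fp16 scale


def output_budget(prompt_tokens: int) -> int:
    """Tokens left for generation once the prompt occupies part of the context."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


def q8_0_size_gb(n_params: float) -> float:
    """Approximate size in GB (10^9 bytes) if every weight were stored as Q8_0."""
    return n_params * Q8_0_BLOCK_BYTES / Q8_0_BLOCK_WEIGHTS / 1e9


print(output_budget(1_000))    # short prompt: the full 8,192-token output cap applies
print(output_budget(10_000))   # long prompt: only 6,384 tokens remain
print(round(q8_0_size_gb(7.6e9), 2))  # roughly matches the ~8.1 GB file size
```

With llama-cpp-python, the context size is set through the `n_ctx` argument when loading the model, and the output cap through `max_tokens` on `create_chat_completion`. Real GGUF files keep some tensors in other formats, so the size estimate is approximate.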

Use Cases

  • Code Generation: Write production-ready code in any language
  • Code Refactoring: Optimize and restructure existing codebases
  • Problem Solving: Step-by-step reasoning through complex challenges
  • Technical Writing: Documentation, README files, and technical articles

Built with ❤️ by Venti Labs © 2026
