NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF

Overview

GGUF quantization of Mistral AI's Ministral 3 8B Instruct (Ministral-3-8B-Instruct-2512), prepared for NobodyWho: it works with NobodyWho out of the box, with Mistral's recommended sampling metadata embedded in every quant, and is verified with NobodyWho's test suite. Ministral 3 8B is a recent, edge-focused multimodal model (8.4B language model + 0.4B vision encoder) with best-in-class agentic tool calling, released under the Apache 2.0 license.

Model Capabilities

  • Text generation — instruction-following chat
  • Tool calling — native function calling with JSON output and grammar constraints
  • Vision — image understanding via the companion mmproj-BF16.gguf projection model
  • Long context — 256k tokens
  • Multilingual — dozens of languages

Available Quantizations

File Approach Tool-calling tests
Ministral-3-8B-Instruct-2512-BF16-vendor-sampling.gguf Vendor sampling injected 14/14
Ministral-3-8B-Instruct-2512-Q8_0-vendor-sampling.gguf Vendor sampling injected 14/14
Ministral-3-8B-Instruct-2512-Q4_K_M-vendor-sampling.gguf Vendor sampling injected 14/14
mmproj-BF16.gguf Vision projection (use with any of the above)

Verified with NobodyWho's tool-calling suite across BF16 / Q8_0 / Q4_K_M (14/14 each, June 2026); vision and multilingual verified. The upstream GGUF has no general.sampling.* metadata, so the -vendor-sampling files embed Mistral's recommended sampler (see INJECTION.md).

Quick Start

Using the NobodyWho library:

from nobodywho import Chat

chat = Chat("huggingface:NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF/Ministral-3-8B-Instruct-2512-Q4_K_M-vendor-sampling.gguf")
response = chat.ask("What is the capital of Denmark?").completed()
print(response)  # The capital of Denmark is Copenhagen.

Vision

from nobodywho import Model, Chat, Prompt, Image, Text

model = Model(
    "huggingface:NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF/Ministral-3-8B-Instruct-2512-Q4_K_M-vendor-sampling.gguf",
    projection_model_path="huggingface:NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF/mmproj-BF16.gguf",
)
chat = Chat(model=model, system_prompt="You are a helpful assistant.")
response = chat.ask(Prompt([
    Text("What is in this image?"),
    Image("./photo.png"),
])).completed()
print(response)

llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF",
    filename="Ministral-3-8B-Instruct-2512-Q4_K_M-vendor-sampling.gguf",
)

Model Specifications

  • Parameters: 8.4B language model + 0.4B vision encoder
  • Context length: 262,144 tokens (256K)
  • License: Apache 2.0
  • Base model: mistralai/Ministral-3-8B-Instruct-2512
  • Architecture: mistral3 (vision-capable)

Licensing / Credits

Licensed under Apache 2.0 (unchanged from upstream). All model credit belongs to Mistral AI. GGUF quantizations provided by unsloth.

Downloads last month
184
GGUF
Model size
8B params
Architecture
mistral3
Hardware compatibility
Log In to add your hardware

4-bit

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF

Collection including NobodyWho/Mistral_Ministral-3-8B-Instruct-GGUF