MiniCPM-V-4.6-abliterated-MAX-GGUF

MiniCPM-V-4.6-abliterated-MAX is an abliterated evolution built on top of openbmb/MiniCPM-V-4.6. This model applies advanced refusal direction analysis and ablation-based optimization strategies to reduce internal refusal behaviors while preserving the multimodal reasoning and instruction-following strengths of the original architecture. The result is a highly capable and ultra-efficient multimodal language model optimized for image, video, and text understanding with improved instruction adherence.

This model is intended for research and learning purposes only. It reduces internal refusal behaviors, and any content generated by it is used at the user’s own risk. The authors and hosting page disclaim any liability for outputs produced by this model. Users are responsible for ensuring safe, ethical, and lawful usage.

Getting Started with llama.cpp Using Docker

FROM ghcr.io/ggml-org/llama.cpp:full

WORKDIR /app

# Install minimal dependencies required for creating a Python virtual environment
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-pip python3-venv \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python packages inside the virtual environment only
RUN pip install --no-cache-dir -U huggingface_hub

# Download model and mmproj
RUN python3 -c 'from huggingface_hub import hf_hub_download; \
    repo="prithivMLmods/MiniCPM-V-4.6-abliterated-MAX-GGUF"; \
    hf_hub_download(repo_id=repo, filename="MiniCPM-V-4.6-abliterated-MAX.Q4_K_M.gguf", local_dir="/app"); \
    hf_hub_download(repo_id=repo, filename="MiniCPM-V-4.6-abliterated-MAX.mmproj-f16.gguf", local_dir="/app")'

CMD ["--server", \
     "-m", "/app/MiniCPM-V-4.6-abliterated-MAX.Q4_K_M.gguf", \
     "--mmproj", "/app/MiniCPM-V-4.6-abliterated-MAX.mmproj-f16.gguf", \
     "--host", "0.0.0.0", \
     "--port", "7860", \
     "-t", "3", \
     "--mlock", \
     "--prio", "3", \
     "--swa-full", \
     "--no-slots", \
     "-ngl", "99", \
     "--mmap", \
     "--log-disable", \
     "--skip-chat-parsing", \
     "--no-cont-batching", \
     "--threads-http", "1", \
     "--direct-io", \
     "--no-repack", \
     "--flash-attn", "off", \
     "-c", "640000", \
     "-n", "389012"]

Screenshot 2026-05-16 114645 Screenshot 2026-05-16 114708

Model Files

File Name Quant Type File Size File Link
MiniCPM-V-4.6-abliterated-MAX.BF16.gguf BF16 1.52 GB Download
MiniCPM-V-4.6-abliterated-MAX.F16.gguf F16 1.52 GB Download
MiniCPM-V-4.6-abliterated-MAX.F32.gguf F32 3.02 GB Download
MiniCPM-V-4.6-abliterated-MAX.Q2_K.gguf Q2_K 422 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q3_K_L.gguf Q3_K_L 491 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q3_K_M.gguf Q3_K_M 466 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q3_K_S.gguf Q3_K_S 435 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q4_0.gguf Q4_0 501 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q4_K_M.gguf Q4_K_M 529 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q4_K_S.gguf Q4_K_S 505 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q5_0.gguf Q5_0 563 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q5_K_M.gguf Q5_K_M 578 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q5_K_S.gguf Q5_K_S 563 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q6_K.gguf Q6_K 630 MB Download
MiniCPM-V-4.6-abliterated-MAX.Q8_0.gguf Q8_0 812 MB Download
MiniCPM-V-4.6-abliterated-MAX.mmproj-bf16.gguf mmproj-bf16 1.11 GB Download
MiniCPM-V-4.6-abliterated-MAX.mmproj-f16.gguf mmproj-f16 1.11 GB Download
MiniCPM-V-4.6-abliterated-MAX.mmproj-f32.gguf mmproj-f32 2.19 GB Download
MiniCPM-V-4.6-abliterated-MAX.mmproj-q8_0.gguf mmproj-q8_0 728 MB Download

Quants Usage

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

image.png

Downloads last month
2,771
GGUF
Model size
0.8B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

16-bit

32-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for prithivMLmods/MiniCPM-V-4.6-abliterated-MAX-GGUF

Quantized
(3)
this model

Dataset used to train prithivMLmods/MiniCPM-V-4.6-abliterated-MAX-GGUF

Collection including prithivMLmods/MiniCPM-V-4.6-abliterated-MAX-GGUF