Keye-VL-2.0-30B-A3B — Abliterated
An abliterated (refusal-direction-ablated) build of Kwai-Keye/Keye-VL-2.0-30B-A3B, a 30B-A3B vision-language MoE. The vision capability is fully preserved — only the text decoder's attention output projections were modified.
⚠️ Use a CUDA stack (vLLM / SGLang / Transformers + flash-attn). This model uses a custom sparse-attention indexer (SALightningIndexer) built for CUDA kernels (flash-attn, DeepGEMM). On Apple Silicon / MPS, generation via Transformers is unstable for the base model too, so this is not an artifact of abliteration. On a Mac, use the MLX builds in the osmapi collection, which run dense attention.
What was changed
- Method: Heretic v1.3.0 — automated directional ablation (TPE-optimized over refusal count + KL divergence), 48 trials.
- Target: attention o_proj of the text decoder, in the mid/late layers where the refusal direction lives (early layers, lm_head, embed_tokens untouched).
- Best trial: marker-refusals 43→37/48, KL ≈ 0.026 (low → general capability largely retained).
This is a moderate abliteration: on this batched-MoE architecture Heretic can only reach o_proj (the experts' FFN down_proj are fused batched tensors it doesn't target), so the MLP path's refusal contribution is not ablated. A stronger custom ablation is possible.
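To make the idea of directional ablation concrete, here is a minimal sketch of projecting a single direction out of an output projection's weight matrix. It is not Heretic's procedure (which, per the provenance notes, applies a row-normalized LoRA and tunes per-layer strength); the function name `ablate_direction` and the use of plain Python lists in place of tensors are assumptions made for illustration. The weight is taken in the usual `(out_features, in_features)` layout, so the direction lives in the output (residual-stream) space.

```python
import math


def ablate_direction(weight, direction):
    """Return W' = W - r (r^T W), where r is `direction` normalized to unit length.

    `weight` is a list of rows shaped (out_features, in_features);
    `direction` has length out_features. After ablation, no input can
    produce an output with a component along r.
    """
    norm = math.sqrt(sum(v * v for v in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    r = [v / norm for v in direction]

    n_rows, n_cols = len(weight), len(weight[0])
    if len(r) != n_rows:
        raise ValueError("direction length must equal the number of weight rows")

    # r^T W: how strongly each input column writes along r.
    coeffs = [sum(r[i] * weight[i][j] for i in range(n_rows)) for j in range(n_cols)]

    # Subtract the rank-one component r (r^T W) from W.
    return [
        [weight[i][j] - r[i] * coeffs[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

For example, `ablate_direction([[1, 2], [3, 4]], [0, 2])` zeroes the second row and leaves the first untouched, because the normalized direction is the second output axis.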
Preserved (verified byte-identical to base)
- Vision tower visual.vision_model.* (SigLIP encoder + native-resolution packing): 438 tensors
- Projector mlp_AR.* (2×2 spatial-merge → text space)
- lm_head, embed_tokens
- MTP: the base model has no multi-token-prediction module (nothing to preserve).
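A byte-identity check of this kind could be sketched as follows. This is an illustration, not the script used for this model: the helper `diff_by_prefix` is hypothetical, and it assumes the tensors of both checkpoints have already been read into dictionaries mapping tensor name to raw bytes (loading them from the checkpoint shards is left out).

```python
def diff_by_prefix(base, modified, prefixes):
    """Group tensors by name prefix and report which are byte-identical.

    `base` and `modified` map tensor name -> raw bytes.
    Returns {prefix: {"identical": count, "changed": [names]}}.
    A tensor missing from `modified` is reported as changed.
    """
    report = {p: {"identical": 0, "changed": []} for p in prefixes}
    for name, raw in base.items():
        for prefix in prefixes:
            if name.startswith(prefix):
                if modified.get(name) == raw:
                    report[prefix]["identical"] += 1
                else:
                    report[prefix]["changed"].append(name)
                break  # count each tensor under its first matching prefix only
    return report


# Hypothetical tensor names and payloads, for illustration only.
base = {
    "visual.vision_model.example.weight": b"\x00\x01",
    "mlp_AR.example.weight": b"\x02\x03",
    "model.layers.0.self_attn.o_proj.weight": b"\x04\x05",
}
modified = dict(base)
modified["model.layers.0.self_attn.o_proj.weight"] = b"\x06\x07"

report = diff_by_prefix(base, modified, ["visual.vision_model.", "mlp_AR.", "model.layers."])
```

With these toy inputs, only the `model.layers.` group lists a changed tensor, which mirrors the claim that the vision tower and projector are untouched while the decoder's o_proj is modified.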
Usage (CUDA)
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="osmapi/osmKeye-VL-2.0-30B-A3B-uncensored",
    trust_remote_code=True,
)
Or serve it with vLLM or SGLang exactly as you would the base model.
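Once the pipeline is constructed, a request is a chat-style message list that mixes an image and a text prompt. The sketch below shows one way to build and send such a request. The helper names `build_messages` and `ask_about_image` are hypothetical, the `max_new_tokens` value is an arbitrary choice, and the model is only loaded when `ask_about_image` is called, since that step needs a CUDA machine with enough memory for a 30B-A3B checkpoint.

```python
MODEL_ID = "osmapi/osmKeye-VL-2.0-30B-A3B-uncensored"


def build_messages(image_url, question):
    """Build a single-turn chat request containing one image and one question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def ask_about_image(image_url, question, max_new_tokens=64):
    """Load the pipeline and run one request (downloads the model on first use)."""
    # Imported lazily so the message helper can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model=MODEL_ID, trust_remote_code=True)
    return pipe(text=build_messages(image_url, question), max_new_tokens=max_new_tokens)
```

A call might look like `ask_about_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG", "What animal is on the candy?")`.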
Provenance / reproducibility
Produced on an Apple M4 Max (128 GB) by abliterating in an isolated venv pinned to
transformers==4.57.3 (the model's build version; 5.x removed OutputRecorder). Running
this CUDA-built model on macOS required patching out a flash-attn assert, replacing the
CUDA fast_hadamard_transform with a pure-torch Walsh–Hadamard transform, and adding
AutoModelForImageTextToText to auto_map. Heretic targeted o_proj via a row-normalized
LoRA merged into the weights.
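For readers unfamiliar with the transform mentioned above, here is a minimal sketch of a fast Walsh–Hadamard transform in natural (Sylvester) ordering. It is not the patch used for this model, which was written in torch and operates on tensors; this version works on a plain Python list so that it runs without dependencies, and the `scale` argument is an assumption added for convenience (for example `scale=n ** -0.5` gives the orthonormal variant).

```python
def fwht(values, scale=1.0):
    """Fast Walsh-Hadamard transform of a sequence whose length is a power of two.

    Runs in O(n log n) using the standard butterfly: at each stage, pairs of
    elements `half` apart are replaced by their sum and difference.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2

    return [v * scale for v in out]
```

Applying the unscaled transform twice returns the input multiplied by its length, which is a convenient sanity check for any replacement implementation.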
Abliteration removes safety alignment; you are responsible for how you use this model.
Other variants of this model (public on osmapi)
- osmKeye-VL-2.0-30B-A3B-uncensored-mlx-mxfp4: MLX MXFP4 (~4.43 bpw), Apple Silicon
- osmKeye-VL-2.0-30B-A3B-uncensored-mlx-optiq-3.7bpw: MLX mixed ~3.87 bpw, Apple Silicon