Instructions for using yugbirla/toxsense-json-ultimate with libraries and local apps.

Libraries

How to use yugbirla/toxsense-json-ultimate with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="yugbirla/toxsense-json-ultimate")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("yugbirla/toxsense-json-ultimate")
model = AutoModelForCausalLM.from_pretrained("yugbirla/toxsense-json-ultimate")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use yugbirla/toxsense-json-ultimate with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "yugbirla/toxsense-json-ultimate"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "yugbirla/toxsense-json-ultimate",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
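
The same OpenAI-compatible call can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM, and the URL assumes the server was started locally with the command above on its default port.

```python
import json
import urllib.request

# Assumption: a server started with `vllm serve` is listening on this address.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "yugbirla/toxsense-json-ultimate"


def build_chat_request(messages, url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # OpenAI-compatible servers place the reply under choices[0].message.content.
    return payload["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request([{"role": "user", "content": "What is the capital of France?"}]))` should return the generated reply as a string. Splitting the build and send steps keeps the request body easy to inspect without a live server.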

Use Docker

docker model run hf.co/yugbirla/toxsense-json-ultimate

SGLang

How to use yugbirla/toxsense-json-ultimate with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "yugbirla/toxsense-json-ultimate" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "yugbirla/toxsense-json-ultimate",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "yugbirla/toxsense-json-ultimate" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl command shown in the pip instructions above (port 30000).

Docker Model Runner
How to use yugbirla/toxsense-json-ultimate with Docker Model Runner:
```
docker model run hf.co/yugbirla/toxsense-json-ultimate
```

Model Card for ToxSense (Multimodal Adversarial Safety Moderator)

Model Details

Model Description

ToxSense (formerly ModGuard) is a precision-oriented, multimodal AI safety moderator designed to detect complex adversarial hate speech in zero-shot settings, along with sarcasm and benign confounders. Unlike standard binary classifiers, ToxSense applies Chain-of-Thought (CoT) reasoning across both image and text modalities and outputs structured JSON that assigns content to a granular safety taxonomy.

Developed by: Yug Birla
Model type: Causal Language Model (Fine-tuned for strict JSON-In/JSON-Out Multimodal Reasoning)
Language(s) (NLP): English
License: MIT
Finetuned from model: Open-source Qwen Base Architecture

Model Sources

Demo: ToxSense Safety Center Dashboard

Model Architecture and Objective

Text Engine: Qwen base + LoRA adapters (Merged)
Vision Dependency: Designed to ingest outputs from Salesforce BLIP and EasyOCR.
Optimization: DPO + SFT
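
Because the text engine consumes the outputs of upstream vision components rather than raw pixels, the OCR text and image caption have to be assembled into one JSON object before inference. The sketch below assumes those strings have already been produced (for example by EasyOCR and BLIP) and only shows the assembly step. The function name `build_payload`, the whitespace normalization, and the [0, 1] score range check are assumptions for illustration, not documented behavior of ToxSense.

```python
import json

# Field names taken from the input format described in this model card.
REQUIRED_KEYS = ("ocr_text", "image_caption", "toxicity_scores")


def build_payload(ocr_text, image_caption, toxicity_scores):
    """Assemble the JSON-serializable input object from upstream outputs."""
    if not isinstance(toxicity_scores, dict) or not toxicity_scores:
        raise ValueError("toxicity_scores must be a non-empty dict of label -> score")
    for label, score in toxicity_scores.items():
        # Assumption: scores are probabilities in [0, 1].
        if not 0.0 <= float(score) <= 1.0:
            raise ValueError(f"score for {label!r} is outside [0, 1]: {score}")
    return {
        "ocr_text": " ".join(ocr_text.split()),  # collapse OCR line breaks and runs of spaces
        "image_caption": image_caption.strip(),
        "toxicity_scores": {k: float(v) for k, v in toxicity_scores.items()},
    }


payload = build_payload(
    "Look at this\ncompletely normal picture.",
    "A controversial political figure.",
    {"safe": 0.9, "hate": 0.1},
)
print(json.dumps(payload, indent=2))
```

The serialized payload can then be used as the user message content.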

Uses

Direct Use

API-ready content moderation for detecting nuanced hate speech in memes and multimodal posts.
Multi-category classification (e.g., Harassment, Racism, Threat, Insult, Sexism).
Generating transparent "Chain-of-Thought" reasoning for why a post was flagged.

Downstream Use

Integration into Trust & Safety dashboards for social media platforms.
Assisting human moderators by pre-filtering and providing contextual explanations for flagged content.

Out-of-Scope Use

Fully automated bans with no human in the loop for highly ambiguous cases.
Medical or legal classification, or strictly unimodal text classification without proper prompt formatting.
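
One way to respect the human-in-the-loop constraint is to treat the model's output as a routing signal rather than an enforcement decision. The following is a minimal sketch under that assumption; the function name `route` and the queue labels `"allow"` and `"human_review"` are hypothetical and would map to whatever a given moderation system uses.

```python
SAFE_CATEGORY = "safe"


def route(decision):
    """Map a parsed ToxSense decision to a next step without ever auto-banning.

    `decision` is expected to be a dict with a "category" key, matching the
    output format described in this model card.
    """
    category = str(decision.get("category", "")).strip().lower()
    if category == SAFE_CATEGORY:
        return "allow"
    # Any flagged, unknown, or missing category goes to a person, together
    # with the model's reasoning, instead of triggering an automatic ban.
    return "human_review"


print(route({"reasoning": "Benign caption and text.", "category": "safe"}))
print(route({"reasoning": "Targets a protected group.", "category": "racism"}))
```

Sending malformed or unexpected categories to review as well keeps parsing failures from silently passing content through.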

Bias, Risks, and Limitations

The "Safety Tax" / Safe-Bias: Because of the base model's RLHF alignment and the subsequent Direct Preference Optimization (DPO), ToxSense exhibits a strong "safe-bias": it requires a high threshold of evidence before classifying content as hateful. This lowers overall recall, but it was an intentional product design choice to reach 83% precision and thereby minimize false-positive user bans.
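
The trade-off between precision and recall can be made concrete with a small worked example. The confusion counts below are invented purely for illustration and are not ToxSense evaluation results; only the 83% precision figure comes from this model card.

```python
def precision(true_pos, false_pos):
    """Share of flagged items that were truly hateful."""
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


def recall(true_pos, false_neg):
    """Share of truly hateful items that were flagged."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


# Hypothetical counts, chosen so that precision works out to 0.83.
strict = {"true_pos": 83, "false_pos": 17, "false_neg": 60}
# Hypothetical looser threshold on the same data: more hateful posts caught,
# but also more benign posts flagged by mistake.
loose = {"true_pos": 120, "false_pos": 60, "false_neg": 23}

for name, counts in (("strict", strict), ("loose", loose)):
    p = precision(counts["true_pos"], counts["false_pos"])
    r = recall(counts["true_pos"], counts["false_neg"])
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this illustration the strict setting flags fewer benign posts at the cost of missing more hateful ones, which is the direction of the design choice described above.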

How to Get Started with the Model

ToxSense requires a strictly formatted JSON input containing ocr_text, image_caption (extracted via BLIP), and base toxicity_scores.

import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yugbirla/toxsense-json-ultimate"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# 1. Prepare the JSON Payload
input_data = {
    "ocr_text": "Look at this completely normal picture.",
    "image_caption": "A controversial political figure.",
    "toxicity_scores": {"safe": 0.9, "hate": 0.1}
}

sys_msg = (
    "You are ToxSense, a highly intelligent safety moderator. "
    "You will receive input as a JSON object containing 'ocr_text', 'image_caption', and 'toxicity_scores'. "
    "Think step-by-step. Analyze the contrast between the text and the image. "
    "Classify the input into exactly ONE of these categories: "
    "[safe, racism, sexism, threat, harassment, insult]. "
    "Output JSON ONLY in this format: {\"reasoning\": \"your short analysis\", \"category\": \"<category_name>\"}"
)

messages = [
    {"role": "system", "content": sys_msg}, 
    {"role": "user", "content": json.dumps(input_data, indent=2)}
]

# 2. Generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to wherever device_map="auto" placed the model instead of hard-coding "cuda"
inputs = tokenizer([text], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=150)

print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
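
Since the model is prompted to return JSON only, downstream code will usually want to parse and validate the decoded string before acting on it. The following is a minimal sketch of that step. The helper name `parse_model_output` and the choice to tolerate stray text around the JSON object are assumptions; the category list is copied from the system prompt above.

```python
import json

# Categories listed in the system prompt.
ALLOWED_CATEGORIES = {"safe", "racism", "sexism", "threat", "harassment", "insult"}


def parse_model_output(raw_text):
    """Extract the first JSON object from the generated text and validate it."""
    start = raw_text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    result, _ = json.JSONDecoder().raw_decode(raw_text[start:])
    if not isinstance(result, dict):
        raise ValueError("model output is not a JSON object")
    category = str(result.get("category", "")).strip().lower()
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {category!r}")
    return {"reasoning": str(result.get("reasoning", "")), "category": category}


example = '{"reasoning": "Text and caption are benign.", "category": "safe"}'
print(parse_model_output(example))
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so callers can catch a single exception type and fall back to manual review.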

Safetensors
Model size: 8B params
Tensor type: F16

Model tree for yugbirla/toxsense-json-ultimate
Base model: FacebookAI/roberta-base
Finetuned (2360): this model