batinium/glimo-dehatebert-hsd

Fine-tuned DeHateBERT-style classifier developed during the PrivHSD Challenge for harmful or hate speech detection in the Glimo privacy-preserving pipeline.

Base model: Hate-speech-CNERG/dehatebert-mono-english
Default decision threshold: 0.850469
Intended use: research, moderation assistance, admin triage, and pipeline scoring.
Out-of-scope use: fully automated enforcement without human review.

Data Statement

Do not publish private challenge samples, raw admin uploads, or generated outputs containing private source text in this repository.

Limitations

The classifier can produce false positives and false negatives, especially for dialectal language, reclaimed terms, counterspeech, quoted speech, contextual ambiguity, and emerging coded language. Model outputs and restatements require human/admin review before consequential action.

Usage

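from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
clf = pipeline("text-classification", model="batinium/glimo-dehatebert-hsd")

# Prints a list with one dict holding the top label and its score.
print(clf("The comment uses abusive language toward a protected group."))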
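The pipeline call above returns a label and a score, but it does not apply the default decision threshold of 0.850469 listed in this card. A minimal sketch of how the threshold could be applied is shown below. It assumes the threshold is compared against the positive-class score, and the label name `HATE` is a placeholder that should be checked against `model.config.id2label`. The helper `is_flagged` is hypothetical and not part of this repository.

```python
from typing import Dict, List, Union

DEFAULT_THRESHOLD = 0.850469  # default decision threshold stated in this card
POSITIVE_LABEL = "HATE"       # assumption: verify against model.config.id2label


def is_flagged(
    scores: List[Dict[str, Union[str, float]]],
    threshold: float = DEFAULT_THRESHOLD,
    positive_label: str = POSITIVE_LABEL,
) -> bool:
    """Return True when the positive-class score meets or exceeds the threshold.

    `scores` is a list of {"label": ..., "score": ...} dicts, one per class,
    such as the per-class output of a text-classification pipeline.
    """
    for entry in scores:
        if entry["label"] == positive_label:
            return float(entry["score"]) >= threshold
    raise KeyError(f"label {positive_label!r} not found in scores")


# Hand-written example scores (not real model output), shaped like the
# per-class result of clf(text, top_k=None). The exact nesting of that
# result can differ between transformers versions.
example_scores = [
    {"label": "HATE", "score": 0.91},
    {"label": "NON_HATE", "score": 0.09},
]
print(is_flagged(example_scores))
```

A flagged result is a signal for human or admin review, in line with the intended use above, not an enforcement decision on its own.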

Model size: 0.2B params
Tensor type: F32 (Safetensors)