Belnap Corpus Controversy Classifier (DistilBERT)

What this is

A binary classifier that predicts whether a debate proposition is high-controversy (informed debaters strongly disagree) vs non-high-controversy (medium or low disagreement). Fine-tuned from distilbert-base-uncased on the 110-record barissozudogru/belnap-debate-corpus.

Honest sizing

This model was trained on 88 examples and evaluated on 22, a stratified 80/20 split of the 110-record corpus. That is small by ML standards: fine-tunes at this scale tend toward memorization rather than broad generalization. Treat the results below as a small-data baseline, not as a production controversy detector.
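To make the sizing concrete, here is a minimal sketch that puts a 95% Wilson score interval around an accuracy measured on 22 examples. The 19-of-22 count comes from the diagonal of the confusion matrix in this card; the helper name `wilson_interval` and the choice of the Wilson method are illustrative assumptions, not something taken from train.py.

```python
import math

def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

# 19 of 22 eval propositions were classified correctly (3 + 16 on the diagonal).
low, high = wilson_interval(19, 22)
print(f"accuracy 95% interval: [{low:.3f}, {high:.3f}]")
```

Under these assumptions the interval is wide, which is one way to see why the numbers are best read as a baseline rather than a validated score.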

Metrics on held-out 22-example eval set

Metric        Model    Baseline (always-predict-high)
Accuracy      0.864    0.727
F1 (macro)    0.790    n/a
F1 (high)     0.914    n/a

Accuracy lift above baseline: +13.6 percentage points (19/22 vs 16/22 correct).

Confusion matrix

              predicted
              non-high  high
actual non-high    3      3      (recall 50%)
actual high        0     16      (recall 100%)

Per-class

Class           Precision    Recall    F1      Support
non-high (0)    1.00         0.50      0.67    6
high (1)        0.84         1.00      0.91    16
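The reported figures can be re-derived from the confusion matrix alone. The sketch below does that arithmetic in plain Python; the helper name `per_class_metrics` is illustrative and not taken from the training script.

```python
# Confusion matrix from the card: rows = actual, columns = predicted,
# class order = [non-high (0), high (1)].
confusion = [
    [3, 3],   # actual non-high
    [0, 16],  # actual high
]
labels = ["non-high", "high"]

def per_class_metrics(cm, k):
    """Precision, recall and F1 for class index k of a square confusion matrix."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual_k = sum(cm[k])                    # row sum: everything that really is k
    precision = tp / predicted_k if predicted_k else 0.0
    recall = tp / actual_k if actual_k else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(confusion))) / total
baseline = sum(confusion[1]) / total  # always predicting "high" gets every high right
metrics = {name: per_class_metrics(confusion, i) for i, name in enumerate(labels)}
macro_f1 = sum(m[2] for m in metrics.values()) / len(metrics)

print(f"accuracy={accuracy:.3f} baseline={baseline:.3f} lift={100 * (accuracy - baseline):.1f} pp")
for name, (p, r, f1) in metrics.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"macro-F1={macro_f1:.3f}")
```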

How to read those numbers

The classifier's errors are one-sided. On the eval set it never misses a high-controversy proposition (100% recall on high), but it is conservative about calling something non-high and labels half of the non-high propositions as high (50% recall on non-high). A non-high prediction was correct all 3 times it was made; a high prediction was correct 16 times out of 19. Practically:

  • Good for: catching high-controversy propositions; none were labeled non-high on the eval set
  • Less good for: recognizing non-controversial propositions; half of them were flagged as high
  • Read the prediction as: "is this likely high-controversy? yes/maybe" rather than "is this high or low? clean binary"

How to use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned classifier and its tokenizer from the Hub.
repo_id = "barissozudogru/belnap-controversy-classifier"
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

proposition = "Frontier AI models should be open-sourced despite misuse risks."
inputs = tokenizer(proposition, return_tensors="pt", truncation=True, max_length=128)

# Inference only, so skip gradient tracking.
with torch.no_grad():
    logits = model(**inputs).logits           # shape (1, 2)
    probs = torch.softmax(logits, dim=-1)[0]  # index 0 = non-high, index 1 = high

print(f"P(high-controversy) = {probs[1]:.3f}")
print(f"P(non-high)         = {probs[0]:.3f}")
print(f"label: {model.config.id2label[int(probs.argmax())]}")

Training details

Aspect                  Value
Base model              distilbert-base-uncased (66M params)
Training data           88 propositions (stratified 80% of belnap-debate-corpus)
Eval data               22 propositions (stratified 20%)
Epochs                  5
Batch size              8
Learning rate           3e-5
Optimizer               AdamW (default for Trainer)
Weight decay            0.01
Warmup ratio            0.1
Loss                    Cross-entropy with class weights (non-high weighted higher to mitigate imbalance)
Class weights           non-high: 2.44, high: 0.63
Seed                    42
Hardware                Apple Silicon (MPS backend)
Best-model selection    Best macro-F1 across 5 epoch evals

Full training script: train.py in this repo.
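As a rough illustration of the weighted loss, the sketch below computes class weights with the common "balanced" heuristic, n / (k * n_c), and applies them to a single example's cross-entropy. The card does not state the training-split class counts; the 18 non-high and 70 high used here are an inference, chosen because they sum to 88 and reproduce the reported 2.44 and 0.63. The helper names are hypothetical, and train.py may compute its weights differently.

```python
import math

def balanced_class_weights(counts: dict[str, int]) -> dict[str, float]:
    """weight_c = n_samples / (n_classes * n_c), the usual 'balanced' heuristic."""
    n = sum(counts.values())
    k = len(counts)
    return {label: n / (k * c) for label, c in counts.items()}

def weighted_cross_entropy(logits: list[float], target: int, weights: list[float]) -> float:
    """Per-example loss: class weight times negative log-softmax of the target."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return weights[target] * (log_norm - logits[target])

# ASSUMPTION: train-split class counts inferred from the reported weights.
train_counts = {"non-high": 18, "high": 70}
weights = balanced_class_weights(train_counts)
print({label: round(w, 2) for label, w in weights.items()})

# Two mistakes with the same logit margin: the non-high one costs more.
w = [weights["non-high"], weights["high"]]
loss_non_high = weighted_cross_entropy([0.0, 1.0], target=0, weights=w)
loss_high = weighted_cross_entropy([1.0, 0.0], target=1, weights=w)
print(f"non-high mistake: {loss_non_high:.3f}  high mistake: {loss_high:.3f}")
```

The effect is that an equally confident mistake on a non-high example contributes more to the loss than the same mistake on a high example, which pushes the model to pay attention to the minority class.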

Limitations

  • Small training set (88 examples). Tends toward memorization; broader generalization is unverified. For a more rigorous estimate, evaluate a simple classifier on sentence embeddings with leave-one-out cross-validation.
  • Domain coverage matches the corpus: economics, bioethics, ethics, labor, education, technology policy, environment, free speech, animal ethics, political philosophy. Out-of-domain propositions (e.g., scientific consensus questions, legal-procedural debates) may behave differently.
  • The controversy labels in the source corpus are author judgments, not measured agreement rates. The classifier learns to predict those judgments, not ground-truth controversy.
  • English only.
  • Evaluated on a single stratified split rather than leave-one-out, so the reported eval metrics may be optimistic.
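The leave-one-out evaluation mentioned above could look roughly like the following. This is a dependency-free sketch, not the author's procedure: it assumes sentence embeddings have already been computed (for example by mean-pooling encoder outputs) and uses a nearest-centroid rule as a stand-in classifier. In practice scikit-learn's `LeaveOneOut` with a logistic regression would be the more usual choice. The toy vectors are hypothetical placeholders, not corpus data.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def leave_one_out_accuracy(embeddings, labels):
    """Hold out each example once and classify it by the nearest class centroid
    (cosine similarity) computed from all remaining examples."""
    correct = 0
    for i, (vec, gold) in enumerate(zip(embeddings, labels)):
        by_class = {}
        for j, (other, lab) in enumerate(zip(embeddings, labels)):
            if j != i:  # the held-out example never contributes to a centroid
                by_class.setdefault(lab, []).append(other)
        centroids = {lab: centroid(vs) for lab, vs in by_class.items()}
        pred = max(centroids, key=lambda lab: cosine(vec, centroids[lab]))
        correct += pred == gold
    return correct / len(labels)

# Toy 2-D vectors standing in for real sentence embeddings (hypothetical data).
toy_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0], [0.1, 1.0], [0.2, 0.9], [0.0, 0.8]]
toy_labels = [1, 1, 1, 0, 0, 0]
print(leave_one_out_accuracy(toy_embeddings, toy_labels))
```

Because every record serves as a test example exactly once, the estimate uses all 110 records for evaluation instead of a single 22-example slice.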

Intended use

  • Pairing with the Belnap paraconsistent debate Space to pre-filter propositions worth running through the full debate pipeline
  • A small-data baseline for anyone evaluating controversy-detection approaches on the same corpus
  • Teaching example showing 110-record fine-tuning trade-offs honestly
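For the pre-filtering use case, a minimal sketch might look like this. The function name `prefilter`, the `score_fn` parameter, and the 0.5 threshold are all hypothetical; `score_fn` is assumed to wrap the tokenizer and model call from "How to use" and return P(high-controversy). The second proposition and both scores below are invented placeholders so the example runs without downloading the model.

```python
from typing import Callable, Iterable

def prefilter(
    propositions: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> list[tuple[str, float]]:
    """Keep propositions whose P(high-controversy) meets the threshold,
    sorted from most to least likely high-controversy."""
    scored = [(text, score_fn(text)) for text in propositions]
    kept = [(text, p) for text, p in scored if p >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical stand-in scorer with made-up probabilities (not model output).
fake_scores = {
    "Frontier AI models should be open-sourced despite misuse risks.": 0.91,
    "Libraries should extend weekend opening hours.": 0.22,
}
selected = prefilter(fake_scores, fake_scores.get, threshold=0.5)
print(selected)
```

Since the classifier rarely misses a high-controversy proposition but often flags non-high ones, a filter like this is better suited to shrinking a queue than to guaranteeing that everything it passes is contentious.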

Not for

  • Production-grade controversy scoring on arbitrary text
  • Legal, journalistic, or moderation decisions
  • Languages other than English

Citation

If you use this model, please cite the underlying corpus:

@misc{sozudogru2026belnapcorpus,
  author       = {Sozudogru, Baris},
  title        = {Belnap Real-Debate Corpus},
  year         = {2026},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/datasets/barissozudogru/belnap-debate-corpus},
}

And the Belnap-Dunn foundations:

  • Belnap, N. D. (1977). A Useful Four-Valued Logic. In Modern Uses of Multiple-Valued Logic.
  • Dunn, J. M. (1976). Intuitive Semantics for First-Degree Entailments and Coupled Trees. Philosophical Studies 29(3).
