Support Detector

Binary classifier for parliamentary sentences: among sentences that are not Opposition, does the sentence express Support toward the European Union (label 1) or is it Neutral (label 0)?

This is the second stage of a two-step stance-detection cascade. It is applied only to sentences that the upstream Opposition detector has classified as Non-Opposition. Both stages use a 0.5 decision threshold.

Fine-tuned from jhu-clsp/mmBERT-base on hand-annotated parliamentary speeches from AUS, CZE, DEU, DNK, ESP, GBR, NLD, and SWE.

Labels

  • 0: Neutral
  • 1: Support

Training data

  • Source: hand-annotated parliamentary sentences labelled Neutral, Support, or Opposition.
  • For this model, restricted to gold non-Opposition rows (Neutral ∪ Support) and binarised as Support vs Neutral.
  • File: Stance_Retrain_undersampled.csv (undersampled to address class imbalance).
  • Split: leakage-safe StratifiedGroupKFold (n_splits=10) grouped on country × speech_ID, so no speech appears in more than one fold. Realised allocation: 8 folds train / 1 fold val / 1 fold test (~80/10/10). The same underlying stance split is shared with the Opposition detector for consistent cascade evaluation.
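
The filtering, binarisation, and grouping described above can be sketched in plain Python. This is a minimal illustration, not the training script: the row layout, the `label` field name, the helper names, and the toy fold assignment are all assumptions. In the real pipeline the fold assignment would come from scikit-learn's StratifiedGroupKFold, with the group key passed as `groups`.

```python
from collections import defaultdict

# Hypothetical rows; the field names are assumptions, not the real CSV schema.
rows = [
    {"country": "DEU", "speech_ID": 1, "label": "Support"},
    {"country": "DEU", "speech_ID": 1, "label": "Neutral"},
    {"country": "GBR", "speech_ID": 1, "label": "Opposition"},
    {"country": "GBR", "speech_ID": 2, "label": "Neutral"},
    {"country": "SWE", "speech_ID": 7, "label": "Support"},
]

def binarise(rows):
    """Drop gold Opposition rows and map Support -> 1, Neutral -> 0."""
    kept = [r for r in rows if r["label"] != "Opposition"]
    return [dict(r, y=int(r["label"] == "Support")) for r in kept]

def group_key(row):
    """One group per speech, combining country and speech_ID."""
    return f'{row["country"]}_{row["speech_ID"]}'

def leaked_groups(fold_of_row, groups):
    """Return groups whose rows landed in more than one fold (should be empty)."""
    folds_per_group = defaultdict(set)
    for fold, group in zip(fold_of_row, groups):
        folds_per_group[group].add(fold)
    return sorted(g for g, folds in folds_per_group.items() if len(folds) > 1)

data = binarise(rows)
groups = [group_key(r) for r in data]
labels = [r["y"] for r in data]

# Toy assignment for illustration only (no stratification): every row of a
# speech inherits the fold of its group, which is what prevents leakage.
fold_of_group = {g: i % 10 for i, g in enumerate(dict.fromkeys(groups))}
fold_of_row = [fold_of_group[g] for g in groups]
assert leaked_groups(fold_of_row, groups) == []
```

The `leaked_groups` check is cheap to run on any realised split and makes the "no speech appears in more than one fold" property explicit.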

Hyperparameters

  • Base model: jhu-clsp/mmBERT-base
  • Max sequence length: 320
  • Learning rate: 4e-05
  • Epochs: 4
  • Batch size: 32 (with gradient accumulation for large models)
  • Warmup ratio: 0.2
  • Weight decay: 0.05
  • LR scheduler: cosine
  • Optimizer: AdamW (HF Trainer default)
  • Mixed precision: fp16
  • Early stopping patience: 2 (monitoring f1_positive on val)
  • Class weights: balanced (sklearn compute_class_weight)
  • Focal loss: disabled (plain weighted cross-entropy)
  • Random seed: 123
  • Model selection: best checkpoint by validation f1_positive (minority-class F1)
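
Two of the settings above reduce to short formulas. The sketch below assumes that "balanced" follows scikit-learn's documented heuristic, n_samples / (n_classes * count_c), and that f1_positive is the F1 score of label 1 (Support). The function names and the toy labels are illustrative and not taken from the training code.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight for class c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in sorted(counts)}

def f1_positive(y_true, y_pred, positive=1):
    """F1 for the positive class only; 0.0 when there are no true positives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels (made up): 6 Neutral, 2 Support.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]

weights = balanced_class_weights(y_true)  # rarer class gets the larger weight
score = f1_positive(y_true, y_pred)
```

Selecting checkpoints on the positive-class F1 rather than accuracy keeps the majority Neutral class from dominating model selection.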

Input format

Sentence-only input (no surrounding context window). Inputs are truncated to 320 tokens.

Usage (standalone: Support vs Neutral)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tok = AutoTokenizer.from_pretrained("LBenoit/support-detector-mmbert")
mdl = AutoModelForSequenceClassification.from_pretrained("LBenoit/support-detector-mmbert")

text = "European cooperation has brought decades of peace and prosperity."
# Use the same 320-token limit as in training.
enc = tok(text, truncation=True, max_length=320, return_tensors="pt")
with torch.no_grad():
    # Index 1 of the softmax output is the Support class (label 1).
    prob_support = torch.softmax(mdl(**enc).logits, dim=-1)[0, 1].item()
print("P(Support | Non-Opposition) =", prob_support)

Usage (cascade: full 3-way stance)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

OPP_REPO = "LBenoit/opposition-detector-mmbert"
SUP_REPO = "LBenoit/support-detector-mmbert"

tok_o = AutoTokenizer.from_pretrained(OPP_REPO)
mdl_o = AutoModelForSequenceClassification.from_pretrained(OPP_REPO)
tok_s = AutoTokenizer.from_pretrained(SUP_REPO)
mdl_s = AutoModelForSequenceClassification.from_pretrained(SUP_REPO)

def predict_stance(text, thresh=0.5):
    """Return "Opposition", "Support", or "Neutral" for one sentence."""
    # Stage 1: Opposition vs Non-Opposition.
    enc = tok_o(text, truncation=True, max_length=320, return_tensors="pt")
    with torch.no_grad():
        p_opp = torch.softmax(mdl_o(**enc).logits, dim=-1)[0, 1].item()
    if p_opp >= thresh:
        return "Opposition"
    # Stage 2: Support vs Neutral, reached only for Non-Opposition sentences.
    enc = tok_s(text, truncation=True, max_length=320, return_tensors="pt")
    with torch.no_grad():
        p_sup = torch.softmax(mdl_s(**enc).logits, dim=-1)[0, 1].item()
    return "Support" if p_sup >= thresh else "Neutral"
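
The decision rule in the cascade can also be separated from the model calls, which helps when probabilities are precomputed in batches. A minimal sketch follows; `cascade_label` and its per-stage threshold arguments are hypothetical names, and both thresholds default to the 0.5 used by the two stages.

```python
def cascade_label(p_opp, p_sup, opp_thresh=0.5, sup_thresh=0.5):
    """Combine stage-1 P(Opposition) and stage-2 P(Support) into one label."""
    if p_opp >= opp_thresh:
        return "Opposition"  # the stage-2 probability is ignored here
    return "Support" if p_sup >= sup_thresh else "Neutral"

# Toy probabilities (made up for illustration), one pair per sentence.
pairs = [(0.91, 0.80), (0.10, 0.75), (0.30, 0.20)]
labels = [cascade_label(p_opp, p_sup) for p_opp, p_sup in pairs]
```

Note that the first pair is labelled Opposition even though its Support probability is high, because the second stage is only meaningful for sentences that pass the upstream filter.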

Intended use

Research on parliamentary stance toward the EU. Designed as the second stage of an Opposition → Support cascade. Using it standalone on arbitrary text (without first filtering out Opposition sentences) is out of distribution and not recommended.

Limitations

  • Trained only on non-Opposition rows; applying it to Opposition sentences without the upstream filter will produce unreliable predictions.
  • Trained on parliamentary register; performance on social media, journalism, or other domains is not guaranteed.
  • Coverage limited to the eight countries listed above; generalisation to other parliaments is untested.
  • Sentence-level only; longer-range discourse context is not modelled.