mb-agn

ModernBERT-base fine-tuned on AG News
A 4-way news-headline classifier (World, Sports, Business, Sci/Tech) built by extending the
answerdotai/ModernBERT-base encoder.


Model description

This model uses the ModernBERT-base transformer as its encoder and a fresh 4-class classification head.
Inputs are English news headlines (merged title + description) tokenized to a maximum of 128 tokens.
Outputs are class indices {0,1,2,3} (World, Sports, Business, Sci/Tech, in that order) with corresponding confidence scores.
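
A minimal sketch of how raw logits could be turned into the class index and confidence score described above. It assumes the confidence is a softmax probability and that the label order follows the AG News convention; the card does not confirm either. The `decode` and `classify` helpers are illustrative names, not part of the released model.

```python
import math

# Label order assumed from the AG News convention after shifting labels to 0..3.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (class index, label name, confidence) for one example's logits."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, ID2LABEL[index], probs[index]


def classify(texts, model_id="svenk029/mb-agn", max_length=128):
    """Run the fine-tuned model on a list of strings (downloads weights on first use)."""
    # Imported lazily so the helpers above work without transformers/torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    batch = tokenizer(texts, truncation=True, padding=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return [decode(row.tolist()) for row in logits]
```

Calling `classify(["Some headline. Some description."])` would return one `(index, label, confidence)` tuple per input string; inputs longer than 128 tokens are truncated, matching the limit stated above.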


Intended uses & limitations

  • Intended use:

    • Classify short English news headlines into one of four AG News categories.
    • Integrate into high-throughput inference pipelines where accuracy and speed are critical.
  • Limitations:

    • Trained only on AG News; performance on other domains or longer texts is not guaranteed.
    • English-only model; accuracy will degrade on non-English inputs.
    • May reflect biases present in the AG News dataset.

Training and evaluation data

  • Dataset: AG News (120 000 training and 7 600 test examples; 12 000 of the training examples are held out for validation, leaving 108 000 for training)
  • Preprocessing:
    • Loaded CSVs via Pandas, renamed columns to label,title,description.
    • Shifted labels from {1,…,4} → {0,…,3}.
    • Merged title + description into a single text field.
    • Split train→validation (90 %/10 %) using train_test_split.
    • Tokenized with AutoTokenizer.from_pretrained("svenk029/mb-agn"), truncation/padding to 128 tokens.
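
The preprocessing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's script: the three-column order, the single-space separator between title and description, and the reuse of seed 42 for the split are all guesses, and `prepare` and `split` are hypothetical helper names.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def prepare(df):
    """Rename columns, shift labels to 0..3, and merge title + description."""
    out = df.copy()
    out.columns = ["label", "title", "description"]  # assumes this column order
    out["label"] = out["label"].astype(int) - 1      # {1..4} -> {0..3}
    # The separator is an assumption; the card only says the fields were merged.
    out["text"] = out["title"].str.strip() + " " + out["description"].str.strip()
    return out[["text", "label"]]


def split(df, seed=42):
    """Hold out 10% of the training rows for validation."""
    # Reusing the training seed for the split is an assumption.
    return train_test_split(df, test_size=0.1, random_state=seed)
```

With a hypothetical `train.csv`, usage would look like `train_df, val_df = split(prepare(pd.read_csv("train.csv")))`. The resulting `text` column can then be passed to the tokenizer with `truncation=True`, `padding="max_length"` and `max_length=128`.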

Training procedure

Training hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Train batch size | 16 |
| Eval batch size | 16 |
| Learning rate | 2 × 10⁻⁵ |
| Weight decay | 0.01 |
| Optimizer | AdamW (betas=(0.9,0.999), eps=1e-8) |
| LR scheduler | Linear |
| Seed | 42 |
| Evaluation strategy | Epoch |
| Save strategy | Epoch |
| Load best model at end | True (metric: accuracy) |
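
As a rough illustration, the hyperparameters listed above map onto `transformers.TrainingArguments` keyword names as shown below. The card does not say that the `Trainer` API was used, so treat this as one plausible configuration; `output_dir` is a made-up path, and the step count assumes a single device, no gradient accumulation, no warmup, and 108 000 training examples (90 % of 120 000).

```python
import math

# Keyword names follow transformers.TrainingArguments; values come from the card.
TRAINING_KWARGS = {
    "output_dir": "mb-agn-checkpoints",  # hypothetical path, not from the card
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "seed": 42,
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    "load_best_model_at_end": True,
    "metric_for_best_model": "accuracy",
}


def total_steps(num_examples, kwargs=TRAINING_KWARGS):
    """Optimizer steps on one device with no gradient accumulation (assumed)."""
    per_epoch = math.ceil(num_examples / kwargs["per_device_train_batch_size"])
    return per_epoch * kwargs["num_train_epochs"]


def linear_lr(step, num_steps, kwargs=TRAINING_KWARGS):
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0.0, (num_steps - step) / num_steps)
    return kwargs["learning_rate"] * remaining


def build_training_arguments(kwargs=TRAINING_KWARGS):
    """Instantiate TrainingArguments; imported lazily since it needs transformers."""
    from transformers import TrainingArguments
    return TrainingArguments(**kwargs)
```

Under these assumptions, `total_steps(108_000)` gives 6 750 steps per epoch and 20 250 steps overall, and `linear_lr` falls from 2 × 10⁻⁵ at step 0 to zero at the final step.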

Training results

| Split | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Validation | 0.9421 | 0.9430 | 0.9421 | 0.9421 |
| Test | 0.9432 | 0.9436 | 0.9432 | 0.9432 |

Per-class F1 on test: World 0.95, Sports 0.99, Business 0.91, Sci/Tech 0.92
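
The card does not state how precision, recall, and F1 were averaged across classes. Recall equals accuracy in both rows, which is what support-weighted averaging always produces, so the sketch below assumes that mode. It is a dependency-free illustration of the arithmetic, not the evaluation code actually used; `weighted_report` is a hypothetical name.

```python
def weighted_report(y_true, y_pred, num_classes=4):
    """Accuracy, support-weighted precision/recall/F1, and per-class F1."""
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / total
    precision = recall = f1 = 0.0
    per_class_f1 = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        predicted = sum(p == c for p in y_pred)        # times class c was predicted
        support = sum(t == c for t in y_true)          # true examples of class c
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support if support else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        per_class_f1.append(f_c)
        weight = support / total                       # support-weighted average
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "per_class_f1": per_class_f1,
    }
```

Because each class's recall is weighted by its support, the weighted recall reduces to the number of correct predictions divided by the number of examples, which is the accuracy.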


Framework versions

  • Transformers: 4.52.3
  • PyTorch: 2.0.1+cu117
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
  • Scikit-Learn: 1.2.2
  • Pandas: 1.5.3
  • NumPy: 1.24.2

Model size: 0.1B params (Safetensors, tensor type F32)