Model Card for BERT News Headline Classifier

This model is a fine-tuned version of bert-base-uncased designed to categorize English news headlines into four distinct topics: World, Sports, Business, and Sci/Tech.

Model Details

Model Description

This model classifies short-form news text (headlines and brief descriptions) by topic, so it can serve as a routing or tagging step for news content. It was obtained by fine-tuning bert-base-uncased on the AG News dataset and reaches 94.66% accuracy on the AG News test split.

  • Developed by: Aashir Hameed
  • Model type: Transformer-based Text Classification Model
  • Language(s) (NLP): English (en)
  • License: Apache 2.0
  • Finetuned from model: bert-base-uncased

Model Sources

Uses

Direct Use

The model is intended to be used for the automated tagging and categorization of news headlines, RSS feeds, and short articles. It takes a string of text as input and outputs one of four labels along with a confidence score.

Labels:

  • LABEL_0: World
  • LABEL_1: Sports
  • LABEL_2: Business
  • LABEL_3: Sci/Tech
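Depending on how the model configuration is set up, predictions may come back with the generic LABEL_n identifiers or with the topic names directly. The following is a minimal sketch of a normalizing helper, assuming the mapping in the list above; the name readable_label is hypothetical and not part of the model or of transformers.

```python
# Mapping copied from the label list in this model card.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}


def readable_label(raw_label: str) -> str:
    """Hypothetical helper: map 'LABEL_3' to 'Sci/Tech'.

    Strings that are not of the form 'LABEL_<int>' (for example a topic
    name that is already human-readable) are returned unchanged.
    """
    prefix = "LABEL_"
    suffix = raw_label[len(prefix):]
    if raw_label.startswith(prefix) and suffix.isdigit():
        return ID2LABEL.get(int(suffix), raw_label)
    return raw_label


print(readable_label("LABEL_3"))  # Sci/Tech
print(readable_label("Sports"))   # Sports
```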

Out-of-Scope Use

This model is not intended for:

  • Long-form document classification (training inputs were truncated to a max_length of 128 tokens, so the model has not seen text beyond that length).
  • Languages other than English.
  • Fact-checking, sentiment analysis, or identifying fake news.

Bias, Risks, and Limitations

Like all language models trained on historical web data, this model may carry the inherent biases present in the AG News dataset. Furthermore, due to the overlap between 'Business' and 'Sci/Tech' in the real world (e.g., tech companies posting quarterly earnings), the model's confidence scores may occasionally split between these two categories for corporate technology news.
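One way to handle such split predictions is to inspect the gap between the two highest scores and flag narrow cases for review. The sketch below assumes the per-class scores are available as a list of label/score dictionaries (for example, the text-classification pipeline can return all class scores when called with top_k=None). The function name is_ambiguous, the 0.15 margin, and the example scores are hypothetical, chosen only for illustration.

```python
def is_ambiguous(scores, margin=0.15):
    """Return True when the top two class scores are closer than `margin`.

    `scores` is a list of {"label": str, "score": float} dictionaries,
    one per class. The default margin is an illustrative assumption,
    not a value tuned on this model.
    """
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    return (ranked[0]["score"] - ranked[1]["score"]) < margin


# Made-up scores for a headline about a tech company's quarterly earnings.
split_scores = [
    {"label": "Business", "score": 0.52},
    {"label": "Sci/Tech", "score": 0.45},
    {"label": "World", "score": 0.02},
    {"label": "Sports", "score": 0.01},
]
# Made-up scores for a clear-cut prediction.
clear_scores = [
    {"label": "Sports", "score": 0.94},
    {"label": "World", "score": 0.03},
    {"label": "Business", "score": 0.02},
    {"label": "Sci/Tech", "score": 0.01},
]

print(is_ambiguous(split_scores))  # True
print(is_ambiguous(clear_scores))  # False
```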

How to Get Started with the Model

Use the code below to get started with the model using the Hugging Face pipeline:

from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="Aashir92/News-Headline-Classifier")

# Run inference
text = "Tech giant unveils revolutionary quantum computer chip"
result = classifier(text)

print(result)
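# Example output (the exact score may vary): [{'label': 'Sci/Tech', 'score': 0.9401}]
# If the model config defines no label names, the label appears as 'LABEL_3' instead.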

Training Details

Training Data

The model was trained on the AG News dataset (ag_news), which is drawn from a collection of more than 1 million news articles gathered from more than 2,000 news sources by ComeToMyHead. Fine-tuning used the standard training split of 120,000 samples.

Training Procedure

Preprocessing

Texts were tokenized using the bert-base-uncased tokenizer. To fit within GPU memory during training, inputs were truncated to a max_length of 128 tokens and padded dynamically, meaning each batch was padded only to the length of its longest sequence.
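The effect of truncation plus dynamic padding can be illustrated in plain Python. This is a simplified sketch of the idea, not the tokenizer's or data collator's actual implementation: it operates on made-up token ID lists and, unlike a real BERT tokenizer, does not preserve the trailing [SEP] token when truncating. The function name pad_batch is hypothetical.

```python
MAX_LENGTH = 128  # truncation length stated in this model card
PAD_ID = 0        # id of the [PAD] token in the bert-base-uncased vocabulary


def pad_batch(batch, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Truncate each sequence, then pad to the longest sequence in the batch."""
    truncated = [ids[:max_length] for ids in batch]
    longest = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    # 1 marks a real token, 0 marks padding.
    attention_mask = [
        [1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated
    ]
    return input_ids, attention_mask


# A batch of short sequences is padded to 10 tokens, not to 128.
short_ids, short_mask = pad_batch([list(range(1, 6)), list(range(1, 11))])
print(len(short_ids[0]), len(short_ids[1]))  # 10 10

# A 200-token sequence is cut to 128, and its batch is padded to 128.
long_ids, long_mask = pad_batch([list(range(1, 201)), list(range(1, 6))])
print(len(long_ids[0]), len(long_ids[1]))  # 128 128
```

Padding per batch rather than to a fixed 128 tokens saves memory and compute on batches made up of short headlines.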

Training Hyperparameters

  • Training regime: Mixed precision (fp16)
  • Learning rate: 2e-05
  • Train batch size (per device): 16
  • Gradient accumulation steps: 2 (effective batch size of 32)
  • Weight decay: 0.01
  • Epochs: 3
  • Optimizer: AdamW
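For reference, the hyperparameters above can be written as keyword arguments using the parameter names of the Hugging Face TrainingArguments class, together with the arithmetic behind the effective batch size. This is a sketch, not the author's training script: the single-device setup and the "adamw_torch" optimizer variant are assumptions, since the card only states "AdamW".

```python
import math

# Keyword names follow transformers.TrainingArguments; values come from
# the hyperparameter list in this model card.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "weight_decay": 0.01,
    "num_train_epochs": 3,
    "fp16": True,            # mixed precision; requires a CUDA GPU
    "optim": "adamw_torch",  # assumption: the card only says "AdamW"
}

NUM_DEVICES = 1          # assumption: training ran on a single GPU
TRAIN_SAMPLES = 120_000  # AG News training split

effective_batch_size = (
    TRAINING_KWARGS["per_device_train_batch_size"]
    * TRAINING_KWARGS["gradient_accumulation_steps"]
    * NUM_DEVICES
)
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch_size)
total_steps = steps_per_epoch * TRAINING_KWARGS["num_train_epochs"]

print(effective_batch_size)  # 32
print(steps_per_epoch)       # 3750
print(total_steps)           # 11250
```

Under these assumptions the dictionary could be passed as TrainingArguments(output_dir=..., **TRAINING_KWARGS), where the output directory is left to the user.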

Speeds, Sizes, Times

  • Hardware: Nvidia T4/P100 (16GB VRAM) via Kaggle Notebooks
  • Total Training Time: ~1 hour 23 minutes

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on the official test split of the AG News dataset, which consists of 7,600 unseen news headlines (1,900 samples per category).

Metrics

  • Accuracy: The proportion of correctly predicted classifications.
  • Macro F1-Score: The unweighted mean of the per-class F1 scores, where each class's F1 is the harmonic mean of its precision and recall. Because every class counts equally, weak performance on any one of the four categories lowers the score.
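Both metrics can be computed in a few lines of plain Python. The sketch below is for illustration only and is not the evaluation code used for this model; the function names and the toy labels are made up, with class IDs 0 to 3 standing for the four categories.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes=4):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy example: one World (0) headline is misclassified as Sports (1).
y_true = [0, 0, 1, 1, 2, 3]
y_pred = [0, 1, 1, 1, 2, 3]

print(round(accuracy(y_true, y_pred), 4))  # 0.8333
print(round(macro_f1(y_true, y_pred), 4))  # 0.8667
```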

Results

  • Final Test Accuracy: 94.66%
  • Final Test Macro F1-Score: 94.67%

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator.

  • Hardware Type: Nvidia T4/P100 (16GB)
  • Hours used: ~1.5 hours (about 1 hour 23 minutes of training)
  • Cloud Provider: Google Cloud (via Kaggle)
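A rough estimate multiplies the energy drawn by the hardware by the carbon intensity of the electricity used. The sketch below shows that arithmetic only; it is not the calculator's implementation. The 70 W figure is the rated board power of an Nvidia T4 (the card does not say whether a T4 or a P100 was used, and actual draw is usually lower), and the carbon intensity of 0.4 kg CO2eq per kWh is a placeholder assumption, not a measured value for the data center involved.

```python
def estimate_emissions_kg(power_watts, hours, carbon_intensity_kg_per_kwh, pue=1.0):
    """Estimate emissions in kg CO2eq from power draw, runtime, and grid intensity.

    `pue` (power usage effectiveness) scales for data-center overhead;
    1.0 means no overhead is counted.
    """
    energy_kwh = power_watts / 1000 * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# Assumed inputs: T4 rated power (70 W), 1.5 hours, placeholder grid intensity.
estimate = estimate_emissions_kg(
    power_watts=70, hours=1.5, carbon_intensity_kg_per_kwh=0.4
)
print(f"{estimate:.3f} kg CO2eq")  # 0.042 kg CO2eq
```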

Author & Contact

Aashir Hameed
