DistilBERT Fake News Classifier

Model Description

Fine-tuned DistilBERT for fake news detection. Trained on the GonzaloA/fake_news dataset (24,353 articles).

Performance

Test accuracy: 99.15%
F1 score: 99.15%
Training epochs: 3
Learning rate: 2e-5
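
As a rough illustration of how figures like these are computed, the sketch below derives accuracy and F1 from true and predicted labels. The card does not say which F1 averaging was used, so binary F1 with label 1 as the positive class is an assumption, and the labels are toy values, not this model's test outputs.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, binary F1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Toy labels for illustration only (assumed encoding: 1 = positive class).
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```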

How to Use

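from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("text-classification",
                model="RazakAIhub/distilbert-fake-news-classifier")

# The pipeline returns a list with one dict per input, each holding a 'label' and a 'score'.
result = pipe("WASHINGTON (Reuters) - NASA confirms water ice on moon.")
print(result)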

Known Limitation — Important

This model was trained on Reuters articles (real) vs opinion/partisan content (fake). It learned to detect writing style, not factual accuracy.

Formal, sourced writing → predicted REAL
Casual, sensational writing → predicted FAKE

This means it will misclassify casual-but-true statements as fake and, by the same logic, formal-but-false statements as real. A more robust dataset with style-balanced examples is needed for production use.
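
One way to probe this limitation is to classify the same claim written in two registers and check whether the label flips. The sketch below is a minimal version of that idea. The helper names (`compare_styles`, `make_classifier`), the probe pair, and the casual rewrite are all hypothetical and only for illustration; the label strings the model actually emits are not stated in this card, so the helper compares labels without assuming their names.

```python
def compare_styles(classify, pairs):
    """Classify formal/casual rewrites of the same claim.

    `classify` is any callable that maps a string to a label string.
    Returns one (formal_label, casual_label, flipped) tuple per pair.
    """
    rows = []
    for formal, casual in pairs:
        formal_label = classify(formal)
        casual_label = classify(casual)
        rows.append((formal_label, casual_label, formal_label != casual_label))
    return rows


# Hypothetical probe pair: same claim, different register.
PAIRS = [
    ("WASHINGTON (Reuters) - NASA confirms water ice on moon.",
     "omg nasa just found ice on the moon!!"),
]


def make_classifier(model_name="RazakAIhub/distilbert-fake-news-classifier"):
    """Wrap the Hub model as a text -> label callable (downloads the model)."""
    # Imported lazily so compare_styles can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=model_name)
    return lambda text: pipe(text)[0]["label"]
```

Calling `compare_styles(make_classifier(), PAIRS)` runs the probe against the published checkpoint. A high share of flipped pairs would be consistent with the style dependence described above, though a handful of hand-written pairs is only a smoke test, not an evaluation.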

What I Learned

Spurious correlation: models latch onto shortcut features in training data. High accuracy does not mean the model learned the right thing.
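
The point can be shown with a toy example. In the sketch below, a hypothetical rule-based "model" looks only at a style cue (an exclamation mark) and never at content. All sentences and labels are made up for illustration and are not drawn from the training data.

```python
def style_shortcut(text):
    """Toy classifier that checks a style cue only, never the facts."""
    return "FAKE" if "!" in text else "REAL"


def accuracy(model, examples):
    """Fraction of (text, label) examples the model labels correctly."""
    return sum(model(text) == label for text, label in examples) / len(examples)


# Style and label coincide: every REAL item is formal, every FAKE item is sensational.
confounded = [
    ("Officials confirmed the budget on Tuesday.", "REAL"),
    ("The agency published its annual report.", "REAL"),
    ("You won't believe what they are hiding!", "FAKE"),
    ("Shocking secret exposed!", "FAKE"),
]

# Style and label decoupled: each label appears in both registers.
balanced = [
    ("Officials confirmed the budget on Tuesday.", "REAL"),
    ("Wow, the budget actually passed!", "REAL"),
    ("Officials confirmed the moon is made of cheese.", "FAKE"),
    ("Shocking secret exposed!", "FAKE"),
]

print("confounded:", accuracy(style_shortcut, confounded))
print("balanced:  ", accuracy(style_shortcut, balanced))
```

The shortcut scores perfectly on the confounded set and at chance on the balanced one, which is why a held-out split drawn from the same confounded distribution cannot reveal the problem.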

Author

Razak Shaik — VIT-AP University, CS Final Year

Model size: 67M params (Safetensors, tensor type F32)
