dell-research-harvard/topic-politics

emilys committed on Aug 14, 2024

Commit b933706 · verified · 1 Parent(s): 3761b80

Update README.md

Files changed (1)

README.md +38 -4

@@ -17,11 +17,11 @@ widget:
 # Fine-tuned DistilRoBERTa-base for detecting news on politics
-# Model Description
 This model is a fine-tuned RoBERTa-large for classifying whether news articles are about politics.
-# How to Use
 ```python
 from transformers import pipeline
@@ -29,8 +29,42 @@ classifier = pipeline("sentiment-analysis", model="dell-research-harvard/topic-p
 classifier("Kennedy wins election")
 ```
-# Contact
-# Reference

 # Fine-tuned DistilRoBERTa-base for detecting news on politics
+# Model Description
 This model is a fine-tuned RoBERTa-large for classifying whether news articles are about politics.
+# How to Use
 ```python
 from transformers import pipeline
 classifier("Kennedy wins election")
 ```
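
The snippet above classifies a single headline. As a minimal sketch of batch use, the helper below pairs each input text with its predicted label and score. It rests on several assumptions not confirmed by the README: the model ID is taken from the repository name (`dell-research-harvard/topic-politics`), the task name `"text-classification"` is used in place of its alias `"sentiment-analysis"`, and `load_classifier` and `classify_headlines` are hypothetical helper names. The label strings the model returns are not documented here, so the sketch passes them through unchanged.

```python
from typing import Callable, Dict, List

# Assumption: the model ID matches the repository name of this model card.
MODEL_ID = "dell-research-harvard/topic-politics"


def load_classifier():
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so classify_headlines can be used with any compatible callable.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def classify_headlines(classifier: Callable, texts: List[str]) -> List[Dict]:
    """Run the classifier over a batch and pair each text with its prediction."""
    # A pipeline called with a list of strings returns one {"label", "score"} dict per input.
    predictions = classifier(texts)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(texts, predictions)
    ]


# Example (requires network access to download the model):
# clf = load_classifier()
# for row in classify_headlines(clf, ["Kennedy wins election"]):
#     print(row)
```

Keeping the pipeline construction separate from the batching helper means the helper can be exercised with a stub classifier, without downloading the model.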
+# Training data
+The model was trained on a hand-labelled sample of data from the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+Split|Size
+-|-
+Train|2418
+Dev|498
+Test|1473
+# Test set results
+Metric|Value
+-|-
+F1|0.8492
+Accuracy|0.9593
+Precision|0.9086
+Recall|0.7972
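
As a quick sanity check on the figures above, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two reported values. The sketch below does that; `f1_score` is a hypothetical helper written for this illustration, and the tolerance is an assumption chosen to absorb the four-decimal rounding of the reported inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the test set results.
reported = {"f1": 0.8492, "precision": 0.9086, "recall": 0.7972}

recomputed = f1_score(reported["precision"], reported["recall"])

# The inputs are rounded to four decimals, so compare with a small tolerance (assumed: 1e-3).
consistent = abs(recomputed - reported["f1"]) < 1e-3
print(f"recomputed F1 = {recomputed:.4f}, consistent = {consistent}")
```

The recomputed value agrees with the reported F1 to within the rounding of the inputs. Precision is noticeably higher than recall, so the model misses some politics articles more often than it mislabels other articles as politics.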
+# Citation Information
+If you use this model, please cite the Newswire paper:
+```
+@misc{silcock2024newswirelargescalestructureddatabase,
+      title={Newswire: A Large-Scale Structured Database of a Century of Historical News},
+      author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
+      year={2024},
+      eprint={2406.09490},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2406.09490},
+}
+```
+# Applications
+We applied this model to a century of historical news articles. You can see all the classifications in the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).