Update README.md
README.md
@@ -17,11 +17,11 @@ widget:
 
 # Fine-tuned DistilRoBERTa-base for detecting news on politics
 
-# Model Description
+# Model Description
 
 This model is a finetuned RoBERTa-large, for classifying whether news articles are about politics.
 
-# How to Use
+# How to Use
 
 ```python
 from transformers import pipeline
@@ -29,8 +29,42 @@ classifier = pipeline("sentiment-analysis", model="dell-research-harvard/topic-p
 classifier("Kennedy wins election")
 ```
 
-#
+# Training data
 
+The model was trained on a hand-labelled sample of data from the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+
+Split|Size
+-|-
+Train|2418
+Dev|498
+Test|1473
+
+# Test set results
+
+F1|0.8492
+Accuracy|0.9593
+Precision|0.9086
+Recall|0.7972
+
+
+# Citation Information
+
+You can cite this dataset using
+
+```
+@misc{silcock2024newswirelargescalestructureddatabase,
+title={Newswire: A Large-Scale Structured Database of a Century of Historical News},
+author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
+year={2024},
+eprint={2406.09490},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2406.09490},
+}
+```
+
+# Applications
+
+We applied this model to a century of historical news articles. You can see all the classifications in the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
 
-# Reference
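The README says the model was applied to a large collection of news articles. As a rough illustration of that kind of batch use, the sketch below wraps a `transformers` text-classification pipeline so several headlines are scored in one call. The helper names `load_classifier` and `classify_headlines` are hypothetical, and the model id is left as a parameter because it is truncated in the diff. `"text-classification"` is the task name that `"sentiment-analysis"` aliases in `transformers`. The label strings returned depend on the model's configuration, so none are assumed here.

```python
from typing import Callable, Dict, List


def load_classifier(model_id: str):
    """Build a text-classification pipeline for the given Hub model id.

    Assumption: model_id is supplied by the reader; the id is truncated
    in the diff, so it is not hard-coded here.
    """
    # Imported lazily so classify_headlines can be used without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_headlines(classifier: Callable, headlines: List[str]) -> List[Dict]:
    """Score a list of headlines and pair each result with its input text."""
    # A pipeline called with a list returns one {"label", "score"} dict per input.
    results = classifier(headlines)
    return [
        {"text": text, "label": result["label"], "score": result["score"]}
        for text, result in zip(headlines, results)
    ]
```

With a real model id, usage would look like `classify_headlines(load_classifier(model_id), ["Kennedy wins election"])`. Keeping the pipeline as an argument lets the pairing logic be exercised with a stub, without downloading a model.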
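The test-set F1 reported in the README can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below does that arithmetic. The function name `f1_from_precision_recall` is illustrative, and the inputs are the rounded values from the README, so agreement is only expected to roughly three decimal places.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the README's test set results.
REPORTED_F1 = 0.8492
REPORTED_PRECISION = 0.9086
REPORTED_RECALL = 0.7972

recomputed = f1_from_precision_recall(REPORTED_PRECISION, REPORTED_RECALL)

# The inputs are rounded, so compare with a tolerance rather than exactly.
consistent = abs(recomputed - REPORTED_F1) < 1e-3
print(f"recomputed F1 = {recomputed:.4f}, consistent with reported: {consistent}")
```

The gap between precision and recall means the classifier misses some politics articles more often than it mislabels non-politics ones, which accuracy alone does not show.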