Commit b933706 (verified), committed by emilys
1 Parent(s): 3761b80

Update README.md

Files changed (1)
  1. README.md +38 -4
README.md CHANGED
@@ -17,11 +17,11 @@ widget:
 
  # Fine-tuned DistilRoBERTa-base for detecting news on politics
 
- # Model Description
 
  This model is a finetuned RoBERTa-large, for classifying whether news articles are about politics.
 
- # How to Use
 
  ```python
  from transformers import pipeline
@@ -29,8 +29,42 @@ classifier = pipeline("sentiment-analysis", model="dell-research-harvard/topic-p
  classifier("Kennedy wins election")
  ```
 
- # Contact
 
 
- # Reference
 
 
  # Fine-tuned DistilRoBERTa-base for detecting news on politics
 
+ # Model Description
 
  This model is a finetuned RoBERTa-large, for classifying whether news articles are about politics.
 
+ # How to Use
 
  ```python
  from transformers import pipeline
 
  classifier("Kennedy wins election")
  ```
 
+ # Training data
 
+ The model was trained on a hand-labelled sample of data from the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
+
+ Split|Size
+ -|-
+ Train|2418
+ Dev|498
+ Test|1473
+
+ # Test set results
+
+ F1|0.8492
+ Accuracy|0.9593
+ Precision|0.9086
+ Recall|0.7972
+
+
+ # Citation Information
+
+ You can cite this dataset using
+
+ ```
+ @misc{silcock2024newswirelargescalestructureddatabase,
+ title={Newswire: A Large-Scale Structured Database of a Century of Historical News},
+ author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
+ year={2024},
+ eprint={2406.09490},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2406.09490},
+ }
+ ```
+ 
+ # Applications
+
+ We applied this model to a century of historical news articles. You can see all the classifications in the [NEWSWIRE dataset](https://huggingface.co/datasets/dell-research-harvard/newswire).
 
 
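
A note on the test set results added in this commit: the reported F1 can be sanity-checked against the reported precision and recall. The snippet below is a minimal sketch, assuming the README's F1 is the standard harmonic mean of precision and recall for the positive (politics) class; the README does not state the averaging scheme, and the helper name `f1_from_pr` is purely illustrative.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard binary F1)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the "Test set results" section of the README
reported_precision = 0.9086
reported_recall = 0.7972
reported_f1 = 0.8492

recomputed_f1 = f1_from_pr(reported_precision, reported_recall)

# Inputs are rounded to four decimals, so compare with a small tolerance
is_consistent = abs(recomputed_f1 - reported_f1) < 1e-3

print(f"recomputed F1: {recomputed_f1:.4f}")
print(f"reported F1:   {reported_f1:.4f}")
print(f"consistent within tolerance: {is_consistent}")
```

Under that assumption the recomputed value agrees with the reported F1 to about three decimal places, which is what rounding of the published precision and recall would allow.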