nljubesi committed on
Commit
610b928
1 Parent(s): f335e4f

Update README.md

  1. README.md +45 -43
README.md CHANGED
@@ -17,48 +17,6 @@ widget:
 
  Text classification model based on [`roberta-base`](https://huggingface.co/roberta-base) and fine-tuned on the [FRENK dataset](https://www.clarin.si/repository/xmlui/handle/11356/1433) comprising of LGBT and migrant hatespeech. Only the English subset of the data was used for fine-tuning and the dataset has been relabeled for binary classification (offensive or acceptable).
 
-
-
- If you use the model, please cite the following paper on which the original model is based:
- ```
- @article{DBLP:journals/corr/abs-1907-11692,
- author = {Yinhan Liu and
- Myle Ott and
- Naman Goyal and
- Jingfei Du and
- Mandar Joshi and
- Danqi Chen and
- Omer Levy and
- Mike Lewis and
- Luke Zettlemoyer and
- Veselin Stoyanov},
- title = {RoBERTa: {A} Robustly Optimized {BERT} Pretraining Approach},
- journal = {CoRR},
- volume = {abs/1907.11692},
- year = {2019},
- url = {http://arxiv.org/abs/1907.11692},
- archivePrefix = {arXiv},
- eprint = {1907.11692},
- timestamp = {Thu, 01 Aug 2019 08:59:33 +0200},
- biburl = {https://dblp.org/rec/journals/corr/abs-1907-11692.bib},
- bibsource = {dblp computer science bibliography, https://dblp.org}
- }
- ```
-
- and the dataset used for fine-tuning:
- ```
- @misc{ljubešić2019frenk,
- title={The FRENK Datasets of Socially Unacceptable Discourse in Slovene and English},
- author={Nikola Ljubešić and Darja Fišer and Tomaž Erjavec},
- year={2019},
- eprint={1906.02045},
- archivePrefix={arXiv},
- primaryClass={cs.CL},
- url={https://arxiv.org/abs/1906.02045}
- }
- ```
-
-
  ## Fine-tuning hyperparameters
 
  Fine-tuning was performed with `simpletransformers`. Beforehand a brief hyperparameter optimisation was performed and the presumed optimal hyperparameters are:
@@ -125,4 +83,48 @@ predictions, logit_output = model.predict(["Build the wall",
  predictions
  ### Output:
  ### array([1, 0])
- ```
+ ```
+
+ ## Citation
+
+
+ If you use the model, please cite the following paper on which the original model is based:
+ ```
+ @article{DBLP:journals/corr/abs-1907-11692,
+ author = {Yinhan Liu and
+ Myle Ott and
+ Naman Goyal and
+ Jingfei Du and
+ Mandar Joshi and
+ Danqi Chen and
+ Omer Levy and
+ Mike Lewis and
+ Luke Zettlemoyer and
+ Veselin Stoyanov},
+ title = {RoBERTa: {A} Robustly Optimized {BERT} Pretraining Approach},
+ journal = {CoRR},
+ volume = {abs/1907.11692},
+ year = {2019},
+ url = {http://arxiv.org/abs/1907.11692},
+ archivePrefix = {arXiv},
+ eprint = {1907.11692},
+ timestamp = {Thu, 01 Aug 2019 08:59:33 +0200},
+ biburl = {https://dblp.org/rec/journals/corr/abs-1907-11692.bib},
+ bibsource = {dblp computer science bibliography, https://dblp.org}
+ }
+ ```
+
+ and the dataset used for fine-tuning:
+ ```
+ @misc{ljubešić2019frenk,
+ title={The FRENK Datasets of Socially Unacceptable Discourse in Slovene and English},
+ author={Nikola Ljubešić and Darja Fišer and Tomaž Erjavec},
+ year={2019},
+ eprint={1906.02045},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/1906.02045}
+ }
+ ```
+
+