Update README.md
README.md
CHANGED
@@ -10,7 +10,44 @@ metrics:
 - f1
 widget:
 - text: "Najkrajšia vianočná reklama: Toto milé video vám vykúzli čarovnú atmosféru: Vianoce sa nezadržateľne blížia."
 ---


-#
 - f1
 widget:
 - text: "Najkrajšia vianočná reklama: Toto milé video vám vykúzli čarovnú atmosféru: Vianoce sa nezadržateľne blížia."
+- text: "A opäť sa objavili nebezpečné výrobky. Pozrite sa, či ich nemáte doma"
 ---


+# Sentiment Analysis model based on SlovakBERT
+
+This is a sentiment analysis classifier based on [SlovakBERT](https://huggingface.co/gerulata/slovakbert). The model distinguishes three levels of sentiment:
+
+- `-1` - Negative sentiment
+- `0` - Neutral sentiment
+- `1` - Positive sentiment
+
+The model was fine-tuned on the Slovak part of the [Multilingual Twitter Sentiment Analysis Dataset](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0155036) [Mozetič et al. 2016], which contains 50k manually annotated Slovak tweets. It is therefore tuned for tweets, and using it for general-purpose sentiment analysis is not advised.
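The three label values can be turned into readable names with a small helper. This is only a minimal sketch: the integer labels `-1`, `0`, and `1` come from the README, while the helper name `describe_sentiment`, the lowercase names, and the assumption that the label arrives as an integer or numeric string are hypothetical and not part of the model's documented interface.

```python
# Label values are taken from the README; the readable names are an assumption.
SENTIMENT_NAMES = {-1: "negative", 0: "neutral", 1: "positive"}


def describe_sentiment(label):
    """Return a readable name for a sentiment label of -1, 0, or 1.

    Accepts an int or a numeric string such as "-1" (hypothetical input
    shape; check how your own pipeline reports the predicted label).
    """
    try:
        return SENTIMENT_NAMES[int(label)]
    except (KeyError, TypeError, ValueError) as err:
        raise ValueError(f"unexpected sentiment label: {label!r}") from err


if __name__ == "__main__":
    for raw in ("-1", 0, 1):
        print(raw, "->", describe_sentiment(raw))
```

Raising on unknown values, rather than silently defaulting to neutral, makes a mismatch between the expected and actual label scheme visible early.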
+
+## Results
+
+The model was evaluated in [our paper](https://arxiv.org/abs/2109.15254) [Pikuliak et al. 2021, Section 4.4]. It achieves an F1-score of \\(0.67\\) on the original dataset and \\(0.58\\) on a general reviews dataset.
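For readers who want to compute a comparable score on their own predictions, the sketch below calculates an F1-score over the three labels. The README does not say which averaging scheme was used, so macro-averaging here is an assumption, and the function name `macro_f1` is hypothetical; consult the paper for the exact evaluation protocol.

```python
def macro_f1(y_true, y_pred, labels=(-1, 0, 1)):
    """Unweighted mean of per-label F1 scores (macro averaging, an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = [-1, 0, 1, 1]
    predicted = [-1, 0, 1, 0]
    print(round(macro_f1(gold, predicted), 4))
```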
+
+## Cite
+
+```
+@article{DBLP:journals/corr/abs-2109-15254,
+  author = {Mat{\'{u}}s Pikuliak and
+            Stefan Grivalsky and
+            Martin Konopka and
+            Miroslav Blst{\'{a}}k and
+            Martin Tamajka and
+            Viktor Bachrat{\'{y}} and
+            Mari{\'{a}}n Simko and
+            Pavol Bal{\'{a}}zik and
+            Michal Trnka and
+            Filip Uhl{\'{a}}rik},
+  title = {SlovakBERT: Slovak Masked Language Model},
+  journal = {CoRR},
+  volume = {abs/2109.15254},
+  year = {2021},
+  url = {https://arxiv.org/abs/2109.15254},
+  eprinttype = {arXiv},
+  eprint = {2109.15254},
+}
+```