---
license: apache-2.0
language:
- bs
- hr
- sr
- sl
- sk
- cs
- en
tags:
- sentiment-analysis
- text-regression
- text-classification
- sentiment-regression
- sentiment-classification
- parliament
widget:
- text: >-
Poštovani potpredsjedničke Vlade i ministre hrvatskih branitelja, mislite li
da ste zapravo iznevjerili svoje suborce s kojima ste 555 dana prosvjedovali
u šatoru protiv tadašnjih dužnosnika jer ste zapravo donijeli zakon koji je
neprovediv, a birali ste si suradnike koji nemaju etički integritet.
---
# Multilingual parliament sentiment regression model XLM-R-Parla-Sent
This model is based on [xlm-r-parla](https://huggingface.co/classla/xlm-r-parla) and fine-tuned on manually annotated sentiment datasets from the United Kingdom, Czechia, Slovakia, Slovenia, Bosnia and Herzegovina, Croatia, and Serbia.
## Annotation schema
The discrete labels present in the original dataset were mapped to numeric scores as follows:
```
"Negative": 0.0,
"M_Negative": 1.0,
"N_Neutral": 2.0,
"P_Neutral": 3.0,
"M_Positive": 4.0,
"Positive": 5.0,
```
The model was then fine-tuned on these numeric labels and set up as a regressor.
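As a sketch, the mapping above can be applied to the original discrete labels before fine-tuning; the `label2score` name and the sample labels below are illustrative, not taken from the actual training code:

```python
# Illustrative mapping of the discrete sentiment labels to the
# numeric regression targets described in the annotation schema.
label2score = {
    "Negative": 0.0,
    "M_Negative": 1.0,
    "N_Neutral": 2.0,
    "P_Neutral": 3.0,
    "M_Positive": 4.0,
    "Positive": 5.0,
}

# Convert a hypothetical list of annotated labels into regression targets.
labels = ["Negative", "P_Neutral", "Positive"]
scores = [label2score[label] for label in labels]
print(scores)  # [0.0, 3.0, 5.0]
```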
## Finetuning procedure
The fine-tuning procedure is described in this paper (ARXIV SUBMISSION to be added). The hyperparameters used, presumed to be optimal, were:
```
num_train_epochs=4,
train_batch_size=32,
learning_rate=8e-6,
regression=True
```
## Results
The results reported below were obtained from 10 fine-tuning runs.
Test dataset | R²
--- | ---
BCS | 0.6146 ± 0.0104
EN | 0.6722 ± 0.0100
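The mean ± deviation figures above can be reproduced from per-run scores with the standard library; the run values below are purely illustrative, not the actual per-run results behind the table:

```python
import statistics

# Hypothetical R^2 scores from 10 fine-tuning runs (illustrative values only).
runs = [0.61, 0.62, 0.60, 0.63, 0.615, 0.608, 0.620, 0.612, 0.617, 0.614]

mean = statistics.mean(runs)
std = statistics.stdev(runs)  # sample standard deviation across runs
print(f"{mean:.4f} ± {std:.4f}")
```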
## Example
The following example uses `simpletransformers==0.64.3`:
```python
from simpletransformers.classification import ClassificationModel, ClassificationArgs
import torch
model_args = ClassificationArgs(
    regression=True,
)
model = ClassificationModel(
    model_type="xlmroberta",
    model_name="classla/xlm-r-parlasent",
    use_cuda=torch.cuda.is_available(),
    num_labels=1,
    args=model_args,
)
model.predict(["""Poštovani potpredsjedničke Vlade i ministre hrvatskih branitelja, mislite li
da ste zapravo iznevjerili svoje suborce s kojima ste 555 dana prosvjedovali
u šatoru protiv tadašnjih dužnosnika jer ste zapravo donijeli zakon koji je
neprovediv, a birali ste si suradnike koji nemaju etički integritet."""])
```
Output:
```
(array(-0.0847168), array(-0.0847168))
```
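Since the model predicts a continuous score on the 0–5 annotation scale, a natural post-processing step is to round and clamp the prediction back onto the discrete labels. The `score2label` helper below is a hypothetical sketch, not part of the released model:

```python
# Hypothetical helper: map a continuous prediction back onto the
# nearest discrete label of the 0-5 annotation schema.
SCALE = ["Negative", "M_Negative", "N_Neutral",
         "P_Neutral", "M_Positive", "Positive"]

def score2label(score: float) -> str:
    idx = int(round(score))    # nearest integer score
    idx = max(0, min(5, idx))  # clamp into the valid 0-5 range
    return SCALE[idx]

print(score2label(-0.0847168))  # Negative
print(score2label(3.6))         # M_Positive
```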