
Model description

Cased fine-tuned BERT model for Hungarian, trained on (manually annotated) parliamentary pre-agenda speeches scraped from parlament.hu.

Intended uses & limitations

The model can be used like any other (cased) BERT model. It has been tested on recognizing positive, negative, and neutral sentences in (parliamentary) pre-agenda speeches, where the labels map as follows:

  • 'Label_0': Neutral
  • 'Label_1': Positive
  • 'Label_2': Negative
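
As a quick check of this label mapping, here is a minimal sketch using the transformers text-classification pipeline (the example sentence is illustrative; the returned label name, e.g. 'LABEL_0', corresponds to the list above):

from transformers import pipeline

# Sketch: run the model through a text-classification pipeline.
# The Hungarian example sentence below is illustrative only.
classifier = pipeline("text-classification", model="poltextlab/HunEmBERT3")
print(classifier("Ez egy példa mondat."))
# e.g. [{'label': 'LABEL_0', 'score': ...}] -> 'LABEL_0' = Neutral per the list above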

Training

This is a fine-tuned version of the original huBERT model (SZTAKI-HLT/hubert-base-cc), trained on the HunEmPoli corpus. The emotion and aggregated sentiment distribution of the training data is shown below; a minimal fine-tuning sketch follows the table.

| Category | Count | Ratio  | Sentiment | Count | Ratio  |
|----------|-------|--------|-----------|-------|--------|
| Neutral  | 351   | 1.85%  | Neutral   | 351   | 1.85%  |
| Fear     | 162   | 0.85%  | Negative  | 11180 | 58.84% |
| Sadness  | 4258  | 22.41% |           |       |        |
| Anger    | 643   | 3.38%  |           |       |        |
| Disgust  | 6117  | 32.19% |           |       |        |
| Success  | 6602  | 34.74% | Positive  | 7471  | 39.32% |
| Joy      | 441   | 2.32%  |           |       |        |
| Trust    | 428   | 2.25%  |           |       |        |
| Sum      | 19002 |        |           |       |        |
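
For reference, a minimal fine-tuning sketch with the Hugging Face Trainer, assuming the HunEmPoli sentences and their 3-class sentiment labels are available as a datasets.Dataset with text and label columns (the toy data and hyperparameters below are illustrative, not the settings reported in the paper):

from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Illustrative toy data; replace with the HunEmPoli sentences and labels.
train_ds = Dataset.from_dict({
    "text": ["Példa mondat.", "Másik példa mondat."],
    "label": [0, 2],
})

# Start from the original huBERT checkpoint with a 3-class head.
tokenizer = AutoTokenizer.from_pretrained("SZTAKI-HLT/hubert-base-cc")
model = AutoModelForSequenceClassification.from_pretrained(
    "SZTAKI-HLT/hubert-base-cc", num_labels=3
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

# Illustrative hyperparameters.
args = TrainingArguments(
    output_dir="hunembert3-finetune",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()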

Eval results

| Class        | Precision | Recall | F-Score |
|--------------|-----------|--------|---------|
| Neutral      | 0.83      | 0.71   | 0.76    |
| Positive     | 0.87      | 0.91   | 0.90    |
| Negative     | 0.94      | 0.91   | 0.93    |
| Macro avg    | 0.88      | 0.85   | 0.86    |
| Weighted avg | 0.91      | 0.91   | 0.91    |

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned 3-class sentiment classifier
tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT3")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT3")
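
A minimal inference sketch continuing from the loading code above (the example sentence is illustrative; the index-to-label mapping restates the list under Intended uses & limitations):

import torch

# Illustrative Hungarian input; replace with your own sentence.
text = "Ez egy példa mondat."

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = logits.argmax(dim=-1).item()
id2label = {0: "Neutral", 1: "Positive", 2: "Negative"}
print(predicted_id, id2label[predicted_id])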

BibTeX entry and citation info

If you use the model, please cite the following paper:

Bibtex:

@ARTICLE{10149341,
  author={{"U}veges, Istv{\'a}n and Ring, Orsolya},
  journal={IEEE Access}, 
  title={HunEmBERT: a fine-tuned BERT-model for classifying sentiment and emotion in political communication}, 
  year={2023},
  volume={11},
  number={},
  pages={60267-60278},
  doi={10.1109/ACCESS.2023.3285536}
}