---
license: apache-2.0
language:
- hu
metrics:
- accuracy
model-index:
- name: huBERTPlain
  results:
  - task:
      type: text-classification
    metrics:
    - type: f1
      value: 0.91
widget:
- text: "A vegetációs időben az országban rendszeresen jelentkező jégesők ellen is van mód védekezni lokálisan, ki-ki a saját nagy értékű ültetvényén."
  example_title: "Positive"
- text: "Magyarország több évtizede küzd demográfiai válsággal, és egyre több gyermekre vágyó pár meddőségi problémákkal néz szembe."
  example_title: "Negative"
- text: "Tisztelt fideszes, KDNP-s Képviselőtársaim!"
  example_title: "Neutral"
---

## Model description

Cased fine-tuned BERT model for Hungarian, trained on manually annotated parliamentary pre-agenda speeches scraped from `parlament.hu`.

## Intended uses & limitations

The model can be used like any other (cased) BERT model. It has been tested on recognizing positive, negative, and neutral sentences in (parliamentary) pre-agenda speeches, where:

* `Label_0`: Neutral
* `Label_1`: Positive
* `Label_2`: Negative

## Training

Fine-tuned version of the original huBERT model (`SZTAKI-HLT/hubert-base-cc`), trained on the HunEmPoli corpus.

## Eval results

| Class        | Precision | Recall | F-Score |
|--------------|-----------|--------|---------|
| Neutral      | 0.83      | 0.71   | 0.76    |
| Positive     | 0.87      | 0.91   | 0.90    |
| Negative     | 0.94      | 0.91   | 0.93    |
| Macro AVG    | 0.88      | 0.85   | 0.86    |
| Weighted AVG | 0.91      | 0.91   | 0.91    |

## Usage

```py
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT3")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT3")
```

A full classification example is sketched at the end of this card.

### BibTeX entry and citation info

If you use the model, please cite the following paper:

```bibtex
@{ }
```
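
## Full classification example

A minimal inference sketch building on the snippet in the Usage section. The label-index mapping (`Label_0`/`Label_1`/`Label_2`) is taken from this card and the example sentence is one of the widget examples; the use of `torch` for the forward pass is an assumption about the runtime environment.

```py
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT3")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT3")

# Label indices as documented under "Intended uses & limitations"
labels = {0: "Neutral", 1: "Positive", 2: "Negative"}

# One of the widget examples from this card
text = "Tisztelt fideszes, KDNP-s Képviselőtársaim!"
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map it back to a label name
pred = int(logits.argmax(dim=-1))
print(labels[pred])
```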