
Model description

Fine-tuned XLM-RoBERTa model for Hungarian, trained on a dataset provided by the National Tax and Customs Administration - Hungary (NAV): Public Accessibility Programme.

Intended uses & limitations

The model can be used like any other XLM-RoBERTa model. It has been tested on distinguishing "accessible" from "original" sentences, where:

  • "accessible" - "Label_1": a sentence that can be considered comprehensible according to Plain Language guidelines
  • "original" - "Label_0": a sentence that needs to be rephrased in order to follow Plain Language guidelines

Training

The model is a fine-tuned version of XLM-RoBERTa (FacebookAI/xlm-roberta-base), trained on information materials provided by NAV linguistic experts.

Eval results

Class                 Precision  Recall  F-Score
Original / Label_0    0.76       0.71    0.73
Accessible / Label_1  0.72       0.78    0.75
accuracy                                 0.74
macro avg             0.74       0.74    0.74
weighted avg          0.74       0.74    0.74
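
The F-Score column can be cross-checked against the precision and recall columns. The sketch below assumes that "F-Score" means the F1 score (the harmonic mean of precision and recall), which the card does not state explicitly; the function name `f_score` is illustrative only.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class precision and recall copied from the table above.
reported = {
    "Original / Label_0": (0.76, 0.71),
    "Accessible / Label_1": (0.72, 0.78),
}

scores = {name: f_score(p, r) for name, (p, r) in reported.items()}

# Macro average: unweighted mean of the per-class scores.
macro_f = sum(scores.values()) / len(scores)

for name, value in scores.items():
    print(f"{name}: {value:.2f}")
print(f"macro avg: {macro_f:.2f}")
```

Under that assumption, the computed values round to 0.73, 0.75 and 0.74, matching the reported per-class and macro-average F-Scores.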

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("uvegesistvan/Hun_RoBERTa_Plain")
model = AutoModelForSequenceClassification.from_pretrained("uvegesistvan/Hun_RoBERTa_Plain")
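
Once the model is loaded, a sentence can be classified by taking the highest-scoring logit. The following is a minimal sketch, not the author's evaluation code. It assumes that output index 0 corresponds to "Label_0" (original) and index 1 to "Label_1" (accessible), as the label names above suggest. The helper names `softmax`, `logits_to_label` and `classify` are hypothetical.

```python
import math

# Assumption: output index 0 is Label_0 ("original"), index 1 is Label_1 ("accessible").
LABEL_NAMES = {0: "original", 1: "accessible"}


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_label(logits):
    """Return (label_name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_NAMES[best], probs[best]


def classify(sentence, model_name="uvegesistvan/Hun_RoBERTa_Plain"):
    """Hypothetical helper: load the model and classify one sentence."""
    # Imported lazily so the pure-Python helpers above work without these packages.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_label(logits)
```

Calling `classify("...")` with a Hungarian sentence returns a label name and its probability. For repeated use, it would be more efficient to load the tokenizer and model once and reuse them rather than reloading on every call.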

Model size: 278M params (Safetensors), tensor type F32