TurkuNLP/xlmr-qa-register


	---
	license: cc-by-sa-4.0
	library_name: transformers
	pipeline_tag: text-classification
	---

	### xlm-roberta-base for register labeling, specifically fine-tuned for question-answer document identification

	This is `xlm-roberta-base` fine-tuned on register-annotated data in English (https://github.com/TurkuNLP/CORE-corpus) and Finnish (https://github.com/TurkuNLP/FinCORE_full), as well as unpublished versions of Swedish and French data (https://github.com/TurkuNLP/multilingual-register-labeling). The model is a binary classifier: it predicts whether or not a text contains question-and-answer content.
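
A minimal usage sketch, assuming the model loads through the standard `transformers` text-classification pipeline (consistent with the `library_name` and `pipeline_tag` in the metadata above). The helper names `load_classifier`, `classify`, and `main` are hypothetical, and the label strings the model returns are not documented here, so inspect them before relying on a particular name.

```python
MODEL_ID = "TurkuNLP/xlmr-qa-register"  # repository ID of this model


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily so the helpers below stay lightweight

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier):
    """Return one (label, score) pair per input text.

    Inputs are truncated to 512 tokens, matching the max_seq_len used in training.
    """
    results = classifier(texts, truncation=True, max_length=512)
    return [(r["label"], r["score"]) for r in results]


def main():
    # Example inputs are illustrative only.
    texts = [
        "Q: How do I reset my password? A: Click 'Forgot password' on the login page.",
        "The city council approved the new budget on Tuesday.",
    ]
    clf = load_classifier()
    for text, (label, score) in zip(texts, classify(texts, clf)):
        print(f"{label}\t{score:.3f}\t{text[:50]}")
```

Calling `main()` prints the predicted label and its score for each text. Label names come from the model's own configuration and may be generic (for example `LABEL_0`), so check `clf.model.config.id2label` to see which one corresponds to the QA class.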


	### Hyperparameters
	```
	batch_size = 8
	epochs = 10 (maximum; training ended after fewer epochs)
	base_LM_model = "xlm-roberta-base"
	max_seq_len = 512
	learning_rate = 4e-6
	```
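
The original training script is not part of this card, so the following is only a sketch of how the values above might map onto `transformers` `TrainingArguments`. The mapping, the helper names, and the output directory are assumptions; `max_seq_len` would apply at tokenization time rather than in the training arguments.

```python
# Values copied from the hyperparameter list above.
HPARAMS = {
    "batch_size": 8,
    "epochs": 10,  # upper limit; the card notes training ran for fewer
    "base_LM_model": "xlm-roberta-base",
    "max_seq_len": 512,  # used by the tokenizer, not by TrainingArguments
    "learning_rate": 4e-6,
}


def to_training_kwargs(hparams, output_dir="xlmr-qa-register-out"):
    """Translate the card's names into TrainingArguments keyword names (assumed mapping)."""
    return {
        "output_dir": output_dir,  # hypothetical path
        "per_device_train_batch_size": hparams["batch_size"],
        "num_train_epochs": hparams["epochs"],
        "learning_rate": hparams["learning_rate"],
    }


def build_training_args(hparams=HPARAMS):
    """Create a TrainingArguments object; requires transformers and a backend such as PyTorch."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(**to_training_kwargs(hparams))
```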

	### Performance
	```
	F1-micro = 0.98
	F1-macro = 0.79

	F1 QA label = 0.60
	F1 not QA label = 0.99
	Precision QA label = 0.82
	Precision not QA label = 0.99
	Recall QA label = 0.47
	Recall not QA label = 1.00
	```
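
As a quick consistency check, the per-label F1 values can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the numbers listed above; the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall for each label.
f1_qa = f1_score(0.82, 0.47)
f1_not_qa = f1_score(0.99, 1.00)

# Macro F1 is the unweighted mean of the per-label F1 scores
# (computed here from the rounded values reported above).
macro_f1 = (0.60 + 0.99) / 2

print(f"F1 QA label     = {f1_qa:.2f}")      # 0.60
print(f"F1 not QA label = {f1_not_qa:.2f}")  # 0.99
print(f"F1-macro        = {macro_f1:.3f}")   # 0.795, in line with the reported 0.79 given rounded inputs
```

The gap between F1-micro (0.98) and F1-macro (0.79) comes from the QA label: a recall of 0.47 means the model finds fewer than half of the QA documents, while a precision of 0.82 means most of the documents it does flag are correct.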


	### Citing

	Citing information coming soon!