Question answering model for Estonian

This is a question answering model based on the XLM-RoBERTa base model. It was fine-tuned sequentially on:

  1. English SQuAD v1.1
  2. SQuAD v1.1 translated into Estonian
  3. Small native Estonian dataset (800 samples)

The model has retained good multilingual properties and can be used for extractive QA in all languages covered by XLM-RoBERTa. Performance is best in the two fine-tuning languages, Estonian and English.
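
A minimal usage sketch for extractive QA follows. It assumes the Hugging Face `transformers` library and its `question-answering` pipeline, which this card does not name explicitly, and it assumes the hub identifier `anukaver/xlm-roberta-est-qa`. The helper name `answer` and the sample question are illustrative only.

```python
MODEL_ID = "anukaver/xlm-roberta-est-qa"  # assumed hub identifier for this model


def answer(question, context, model_id=MODEL_ID):
    """Return (answer_text, confidence) extracted from `context`.

    Hypothetical helper: it builds a question-answering pipeline on each
    call, which is simple but slow; reuse the pipeline in real code.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id, tokenizer=model_id)
    result = qa(question=question, context=context)
    # For a single input the pipeline returns a dict with
    # "answer", "score", "start" and "end" keys.
    return result["answer"], result["score"]


# Example call (downloads the model weights on first use):
# text, score = answer(
#     question="Mis on Eesti pealinn?",
#     context="Eesti pealinn on Tallinn, mis asub Soome lahe ääres.",
# )
```

Because the model is extractive, the returned answer is always a span copied from the context; it cannot produce text that is absent from the passage.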

| Tested on          | F1   | EM   |
|--------------------|------|------|
| EstQA test set     | 82.4 | 75.3 |
| SQuAD v1.1 dev set | 86.9 | 77.9 |
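
For readers unfamiliar with the two metrics, the sketch below shows how exact match (EM) and token-level F1 are conventionally computed for SQuAD v1.1. It is an assumption that the figures on this card were produced this way; the card does not describe its evaluation script. The normalization step that drops English articles follows the SQuAD convention and has little effect on Estonian text. All function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip ASCII punctuation and English articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def score(prediction, truths):
    """Best EM and F1 over all reference answers for one question."""
    em = max(exact_match(prediction, t) for t in truths)
    best_f1 = max(f1(prediction, t) for t in truths)
    return em, best_f1
```

Dataset-level scores are the averages of these per-question values, reported as percentages. F1 gives partial credit for overlapping spans, which is why it is higher than EM in both rows of the table.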

The Estonian dataset used for fine-tuning and for validating the results is available at https://huggingface.co/datasets/anukaver/EstQA/ (version 1.0).


Datasets used to train anukaver/xlm-roberta-est-qa