bert-base-finnish-cased-v1 for QA
This is the bert-base-finnish-cased-v1 model, fine-tuned on an automatically translated Finnish version of the SQuAD 2.0 dataset combined with the Finnish partition of the TyDi-QA dataset. It was trained on question-answer pairs, excluding unanswerable questions, for the task of extractive question answering.
Another QA model, fine-tuned to also handle unanswerable questions, is available: bert-base-finnish-cased-squad2-fi.
Overview
Language model: bert-base-finnish-cased-v1
Language: Finnish
Downstream-task: Extractive QA
Training data: Answerable questions from Finnish SQuAD 2.0 + Finnish partition of TyDi-QA
Eval data: Answerable questions from Finnish SQuAD 2.0 + Finnish partition of TyDi-QA
Usage
In Transformers
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "ilmariky/bert-base-finnish-cased-squad1-fi"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Mikä tämä on?',   # "What is this?"
    'context': 'Tämä on testi.'    # "This is a test."
}
res = nlp(QA_input)

# b) Load model & tokenizer directly
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
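When loading the model and tokenizer directly (option b), the QA head returns start and end logits per token, and the answer is the highest-scoring valid span. A minimal sketch of that span selection, using toy logits rather than real model output:

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick the (start, end) token indices that maximize the combined
    start+end score, requiring end >= start and a bounded answer length."""
    best = (0, 0)
    best_score = float("-inf")
    for i, s in enumerate(start_logits):
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best

# Toy logits: the model is most confident the answer starts at token 2
# and ends at token 4.
start = [0.1, 0.2, 5.0, 0.3, 0.1]
end = [0.1, 0.2, 0.5, 0.3, 4.0]
print(best_span(start, end))  # (2, 4)
```

The pipeline in option a) performs this post-processing (plus tokenizer offset mapping back to the context string) automatically.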
Performance
Evaluated with a slightly modified version of the official SQuAD evaluation script.
{
"exact": 58.00497718788884,
"f1": 69.90891092523077,
"total": 4822,
"HasAns_exact": 58.00497718788884,
"HasAns_f1": 69.90891092523077,
"HasAns_total": 4822
}
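The "exact" metric is the percentage of predictions matching a gold answer exactly; "f1" is token-overlap F1 averaged over all questions. A simplified sketch of the F1 computation (the official script additionally lowercases and strips punctuation and articles before comparing):

```python
from collections import Counter

def squad_f1(prediction, gold):
    """Token-overlap F1 between a predicted and a gold answer string,
    simplified from the SQuAD evaluation script (no normalization)."""
    pred_toks = prediction.split()
    gold_toks = gold.split()
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(squad_f1("tämä on testi", "on testi"))  # 0.8
```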