---
language: fi
datasets:
- SQuAD_v2_fi + Finnish partition of TyDi-QA
license: gpl-3.0
---

# bert-base-finnish-cased-v1 for QA

This is the [bert-base-finnish-cased-v1](https://huggingface.co/TurkuNLP/bert-base-finnish-cased-v1) model, fine-tuned using an automatically translated [Finnish version of the SQuAD2.0 dataset](https://huggingface.co/datasets/ilmariky/SQuAD_v2_fi) in combination with the Finnish partition of the [TyDi-QA](https://github.com/google-research-datasets/tydiqa) dataset. It's been trained on question-answer pairs, **including unanswerable questions**, for the task of question answering. When the model classifies the question as unanswerable, it outputs "[CLS]".

There is also a QA model available that does not try to identify unanswerable questions, [bert-base-finnish-cased-squad1-fi](https://huggingface.co/ilmariky/bert-base-finnish-cased-squad1-fi).

## Overview

**Language model:** bert-base-finnish-cased-v1  
**Language:** Finnish  
**Downstream-task:** Extractive QA  
**Training data:** [Finnish SQuAD 2.0](https://huggingface.co/datasets/ilmariky/SQuAD_v2_fi) + Finnish partition of TyDi-QA  
**Eval data:** [Finnish SQuAD 2.0](https://huggingface.co/datasets/ilmariky/SQuAD_v2_fi) + Finnish partition of TyDi-QA  

## Usage

### In Transformers

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "ilmariky/bert-base-finnish-cased-squad2-fi"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Mikä tämä on?',
    'context': 'Tämä on testi.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

## Performance

Evaluated with a slightly modified version of the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).

```
{
  "exact": 55.53157042633567,
  "f1": 61.869335312255835,
  "total": 7412,
  "HasAns_exact": 51.26503525508088,
  "HasAns_f1": 61.006950090095565,
  "HasAns_total": 4822,
  "NoAns_exact": 63.47490347490348,
  "NoAns_f1": 63.47490347490348,
  "NoAns_total": 2590
}
```
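
The overall `exact` and `f1` values above are the averages of the `HasAns` and `NoAns` scores, weighted by the number of questions in each subset. The following is a minimal sketch that reproduces that arithmetic from the reported figures. The helper name `weighted_overall` is hypothetical and is not part of the eval script.

```python
# Per-subset scores copied from the evaluation output above.
results = {
    "HasAns_exact": 51.26503525508088,
    "HasAns_f1": 61.006950090095565,
    "HasAns_total": 4822,
    "NoAns_exact": 63.47490347490348,
    "NoAns_f1": 63.47490347490348,
    "NoAns_total": 2590,
}


def weighted_overall(results, metric):
    """Combine HasAns and NoAns scores, weighting each by its question count.

    Hypothetical helper for illustration; not taken from the eval script.
    """
    has_n = results["HasAns_total"]
    no_n = results["NoAns_total"]
    weighted_sum = (
        results[f"HasAns_{metric}"] * has_n
        + results[f"NoAns_{metric}"] * no_n
    )
    return weighted_sum / (has_n + no_n)


overall_exact = weighted_overall(results, "exact")
overall_f1 = weighted_overall(results, "f1")
print(f"exact: {overall_exact:.2f}, f1: {overall_f1:.2f}")
# prints: exact: 55.53, f1: 61.87
```

`NoAns_exact` and `NoAns_f1` are identical because a prediction on an unanswerable question scores either 1 (the model also predicted no answer) or 0, so exact match and F1 coincide on that subset.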