Task: Question Answering
Model: DeBERTa
Lang: IT
Model description
This is a DeBERTa [1] model for the Italian language, fine-tuned for Extractive Question Answering on the SQuAD-IT dataset [2]. The model is trained with a multi-phase fine-tuning procedure, described in the sections below. The latest upgrade, code-named LITEQA, makes the model more robust and keeps its results in uncased settings close to those in cased settings.
Training and Performances
The model is trained to perform question answering, given a context and a question (under the assumption that the context contains the answer to the question). It has been fine-tuned for Extractive Question Answering, using the SQuAD-IT dataset, for 2 epochs with a linearly decaying learning rate starting from 3e-5, maximum sequence length of 384 and document stride of 128.
The dataset includes 54,159 training instances and 7,609 test instances.
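As a rough illustration of how a maximum sequence length and a document stride interact, the sketch below splits a long context into overlapping windows. It assumes the stride is the number of tokens shared by consecutive windows (the meaning of the `stride` argument in Hugging Face tokenizers) and ignores the tokens taken up by the question and the special tokens. The helper name `window_spans` is hypothetical and not taken from the training script.

```python
def window_spans(n_tokens, window_len, overlap):
    """Return (start, end) token spans covering a context with overlapping windows.

    Assumption: `overlap` is the number of tokens shared by two consecutive
    windows, so each window starts `window_len - overlap` tokens after the
    previous one.
    """
    if overlap >= window_len:
        raise ValueError("overlap must be smaller than window_len")
    step = window_len - overlap
    spans = []
    start = 0
    while True:
        end = min(start + window_len, n_tokens)
        spans.append((start, end))
        if end == n_tokens:  # the last window reaches the end of the context
            break
        start += step
    return spans


# Tiny example: 10 tokens, windows of 4 tokens, 2 tokens of overlap
print(window_spans(10, 4, 2))  # [(0, 4), (2, 6), (4, 8), (6, 10)]
```

With a real tokenizer, the window budget would be the maximum sequence length minus the question and special tokens, so the exact spans depend on each question.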
update: version 2.0
Version 2.0 further improves performance with a two-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, maximum learning rate of 3e-5) and then further fine-tuned on the Italian SQuAD (2 epochs, no warmup, initial learning rate of 3e-5).
To maximize the benefits of the multilingual procedure, mdeberta-v3-base is used as the pre-trained model. Once the double fine-tuning is complete, the embedding layer is compressed as in deberta-base-italian to obtain the size of a monolingual model.
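The learning-rate settings of the two phases can be sketched with a single function, assuming the usual shape of a linear warmup followed by a linear decay to zero. The function name `lr_at_step` and the step counts are hypothetical; the actual number of optimizer steps depends on the batch size, which is not stated here.

```python
def lr_at_step(step, total_steps, peak_lr=3e-5, warmup_ratio=0.0):
    """Learning rate at a given optimizer step (a sketch, not the training code).

    Assumption: the rate rises linearly from 0 to `peak_lr` during the warmup
    steps, then decays linearly to 0 at `total_steps`.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Phase 1 (English SQuAD v2): 20% warmup, peak of 3e-5; 1000 steps is illustrative
phase_1 = [lr_at_step(s, 1000, warmup_ratio=0.2) for s in (0, 200, 600, 1000)]

# Phase 2 (Italian SQuAD): no warmup, so the rate starts at 3e-5 and only decays
phase_2 = [lr_at_step(s, 1000) for s in (0, 500, 1000)]
```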
Performance on the test set is reported in the following tables:
(version 2.0 performances)
Cased setting:
EM | F1 |
---|---|
70.04 | 80.97 |
Uncased setting:
EM | F1 |
---|---|
68.55 | 80.11 |
Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/main/osiria_deberta_italian_qa_evaluation.ipynb
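For readers unfamiliar with the EM and F1 columns, the sketch below shows how SQuAD-style Exact Match and token-level F1 are commonly computed for one prediction. The normalization here (lowercasing, removing ASCII punctuation, collapsing whitespace) is an assumption made for illustration; the linked notebook may normalize differently, so these functions are not guaranteed to reproduce the reported numbers.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation and collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction, gold):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, F1 is 1.0 only when both are empty
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Milano.", "milano"))  # 1.0
print(f1_score("a Milano", "Milano"))    # about 0.667: precision 0.5, recall 1.0
```

Dataset-level scores are then averages over all test questions, usually taking the best score across the available gold answers for each question.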
update: version 3.0 (LITEQA)
Version 3.0, nicknamed LITEQA, further improves performance with a three-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, maximum learning rate of 3e-5), then further fine-tuned on the Italian SQuAD (2 epochs, no warmup, initial learning rate of 3e-5), and lastly fine-tuned on the lowercased Italian SQuAD (1 epoch, no warmup, initial learning rate of 3e-5). This makes the model more robust in general, and particularly in uncased settings.
Version 3.0 can be downloaded from the liteqa branch of this repo. Performance on the test set is reported in the following tables:
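How the lowercased Italian SQuAD was built is not described here. One plausible way to derive it, shown as a sketch, is to lowercase each example while leaving the character offsets of the answers untouched. The field names follow the common SQuAD layout (`context`, `question`, `answers` with `text` and `answer_start`), and the helper `lowercase_squad_example` is hypothetical.

```python
def lowercase_squad_example(example):
    """Return a lowercased copy of a SQuAD-style example (illustrative only)."""
    context = example["context"].lower()
    # Offsets stay valid only if lowercasing does not change the string length
    if len(context) != len(example["context"]):
        raise ValueError("lowercasing changed the context length; offsets would shift")
    return {
        **example,
        "context": context,
        "question": example["question"].lower(),
        "answers": {
            "text": [text.lower() for text in example["answers"]["text"]],
            "answer_start": list(example["answers"]["answer_start"]),
        },
    }


example = {
    "context": "Alessandro Manzoni è nato a Milano nel 1785",
    "question": "Dove è nato Manzoni?",
    "answers": {"text": ["Milano"], "answer_start": [28]},
}
lowered = lowercase_squad_example(example)
print(lowered["context"][28:34])  # milano
```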
(version 3.0 performances)
Cased setting:
EM | F1 |
---|---|
70.19 | 81.01 |
Uncased setting:
EM | F1 |
---|---|
69.60 | 80.74 |
Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/liteqa/osiria_liteqa_evaluation.ipynb
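A branch of a Hub repository can be selected with the `revision` argument of `from_pretrained`. The sketch below wraps this in a small helper for the liteqa branch mentioned above; the helper name `load_liteqa` is hypothetical, and calling it downloads the weights from the Hub.

```python
def load_liteqa(repo_id="osiria/deberta-italian-question-answering", revision="liteqa"):
    """Load the tokenizer and model from the given branch of the repo."""
    # Imported inside the function so the helper can be defined before
    # transformers is loaded; the import runs on the first call
    from transformers import DebertaV2ForQuestionAnswering, DebertaV2TokenizerFast

    tokenizer = DebertaV2TokenizerFast.from_pretrained(repo_id, revision=revision)
    model = DebertaV2ForQuestionAnswering.from_pretrained(repo_id, revision=revision)
    return tokenizer, model
```

Calling `tokenizer, model = load_liteqa()` returns objects that can be passed to a question answering pipeline in the same way as the default branch.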
Quick usage
To get the best possible outputs from the model, it is recommended to use the following pipeline:
import re

from transformers import DebertaV2ForQuestionAnswering, DebertaV2TokenizerFast
from transformers.pipelines import QuestionAnsweringPipeline

tokenizer = DebertaV2TokenizerFast.from_pretrained("osiria/deberta-italian-question-answering")
model = DebertaV2ForQuestionAnswering.from_pretrained("osiria/deberta-italian-question-answering")


class OsiriaQA(QuestionAnsweringPipeline):
    """Question answering pipeline that trims whitespace and punctuation from the answer edges."""

    def __init__(self, punctuation=r",;.:!?()[\]{}", **kwargs):
        super().__init__(**kwargs)
        # Raw strings keep the regex escapes (\s and \]) intact
        self.post_regex_left = r"^[\s" + punctuation + r"]+"
        self.post_regex_right = r"[\s" + punctuation + r"]+$"

    def postprocess(self, output):
        output = super().postprocess(model_outputs=output)
        # Trim the left edge and move "start" forward by the number of removed characters
        output_length = len(output["answer"])
        output["answer"] = re.sub(self.post_regex_left, "", output["answer"])
        output["start"] = output["start"] + (output_length - len(output["answer"]))
        # Trim the right edge and move "end" back by the number of removed characters
        output_length = len(output["answer"])
        output["answer"] = re.sub(self.post_regex_right, "", output["answer"])
        output["end"] = output["end"] - (output_length - len(output["answer"]))
        return output


pipeline_qa = OsiriaQA(model=model, tokenizer=tokenizer)
pipeline_qa(context="Alessandro Manzoni è nato a Milano nel 1785",
            question="Dove è nato Manzoni?")

# {'score': 0.9899800419807434, 'start': 28, 'end': 34, 'answer': 'Milano'}
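The trimming step of the pipeline above can be tried without loading the model. The sketch below reproduces the same offset arithmetic on a plain string; the helper name `trim_answer` is hypothetical, and the input span is made up for illustration.

```python
import re

PUNCTUATION = r",;.:!?()[\]{}"
LEFT = re.compile(r"^[\s" + PUNCTUATION + r"]+")
RIGHT = re.compile(r"[\s" + PUNCTUATION + r"]+$")


def trim_answer(answer, start, end):
    """Strip whitespace and punctuation from both edges and adjust the offsets."""
    left_trimmed = LEFT.sub("", answer)
    start += len(answer) - len(left_trimmed)      # characters removed on the left
    trimmed = RIGHT.sub("", left_trimmed)
    end -= len(left_trimmed) - len(trimmed)       # characters removed on the right
    return trimmed, start, end


# A raw span with a leading space and a trailing period
print(trim_answer(" Milano.", 27, 35))  # ('Milano', 28, 34)
```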
You can also try the model online using this web app: https://huggingface.co/spaces/osiria/deberta-italian-question-answering
References
[1] https://arxiv.org/abs/2111.09543
[2] https://link.springer.com/chapter/10.1007/978-3-030-03840-3_29
Limitations
This model was trained on the English SQuAD v2 and on SQuAD-IT, which is mainly a machine-translated version of the original SQuAD v1.1. The quality of the training set is therefore limited by the quality of the machine translation. Moreover, the model is meant to answer questions under the assumption that the required information is actually contained in the given context (the underlying assumption of SQuAD v1.1). If this assumption is violated, the model will still return an answer, and that answer will be incorrect.
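Since the model always returns an answer, one possible mitigation is to discard low-confidence outputs on the caller's side. The sketch below does this with the `score` field of the pipeline output; the helper `answer_or_none` and the threshold of 0.5 are hypothetical, the score is not guaranteed to be calibrated, and a suitable threshold would need to be tuned on held-out data.

```python
def answer_or_none(result, min_score=0.5):
    """Return the answer text, or None when the score is below the threshold.

    Assumption: `result` is a dict with "score" and "answer" keys, as in the
    pipeline output shown in the quick usage section.
    """
    if result["score"] < min_score:
        return None
    return result["answer"]


confident = {"score": 0.9899800419807434, "start": 28, "end": 34, "answer": "Milano"}
uncertain = {"score": 0.02, "start": 0, "end": 10, "answer": "Alessandro"}  # made-up output

print(answer_or_none(confident))  # Milano
print(answer_or_none(uncertain))  # None
```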
License
The model is released under the MIT license.
Evaluation results
- Exact Match on squad_it (self-reported): 0.700
- F1 on squad_it (self-reported): 0.810