---
license: mit
language:
- it
datasets:
- squad_it
widget:
- text: Quale libro fu scritto da Alessandro Manzoni?
  context: Alessandro Manzoni pubblicò la prima versione dei Promessi Sposi nel 1827
- text: In quali competizioni gareggia la Ferrari?
  context: La Scuderia Ferrari è una squadra corse italiana di Formula 1 con sede a Maranello
- text: Quale sport è riferito alla Serie A?
  context: Il campionato di Serie A è la massima divisione professionistica del campionato italiano di calcio maschile
model-index:
- name: osiria/deberta-italian-question-answering
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_it
      type: squad_it
    metrics:
    - type: exact-match
      value: 0.7004
      name: Exact Match
    - type: f1
      value: 0.8097
      name: F1
pipeline_tag: question-answering
---

--------------------------------------------------------------------------------------------------

<body>
<span class="vertical-text" style="background-color:lightgreen;border-radius: 3px;padding: 3px;"></span>
<br>
<span class="vertical-text" style="background-color:orange;border-radius: 3px;padding: 3px;">    Task: Question Answering</span>
<br>
<span class="vertical-text" style="background-color:lightblue;border-radius: 3px;padding: 3px;">    Model: DeBERTa</span>
<br>
<span class="vertical-text" style="background-color:tomato;border-radius: 3px;padding: 3px;">    Lang: IT</span>
<br>
<span class="vertical-text" style="background-color:lightgrey;border-radius: 3px;padding: 3px;">  </span>
<br>
<span class="vertical-text" style="background-color:#CF9FFF;border-radius: 3px;padding: 3px;"></span>
</body>

--------------------------------------------------------------------------------------------------

<h3>Model description</h3>

This is a <b>DeBERTa</b> <b>[1]</b> model for the <b>Italian</b> language, fine-tuned for <b>Extractive Question Answering</b> on the [SQuAD-IT](https://huggingface.co/datasets/squad_it) dataset <b>[2]</b>.
The model is trained with a multi-phase fine-tuning procedure designed to improve performance and reliability. The latest upgrade, code-named <b>LITEQA</b>, offers increased robustness and keeps results in uncased settings within one point of the cased ones.

<h3>Training and Performance</h3>

The model is trained to perform question answering, given a context and a question (under the assumption that the context contains the answer to the question). It has been fine-tuned for Extractive Question Answering on the SQuAD-IT dataset for 2 epochs, with a linearly decaying learning rate starting from 3e-5, a maximum sequence length of 384 and a document stride of 128.
<br>The dataset includes 54,159 training instances and 7,609 test instances.
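
To illustrate what the maximum sequence length and the document stride mean in practice, here is a simplified sketch of how a long context can be split into overlapping windows. It assumes that the stride is the number of tokens shared by two consecutive windows (as in the Hugging Face tokenizers) and it ignores the question and special tokens, which also count toward the 384-token limit. `split_into_windows` is a hypothetical helper written for this example, not part of the model's code.

```python
def split_into_windows(tokens, max_len=384, stride=128):
    """Split a token list into windows of at most max_len tokens,
    where consecutive windows share `stride` tokens."""
    if not 0 <= stride < max_len:
        raise ValueError("stride must be in [0, max_len)")
    step = max_len - stride  # how far each new window advances
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the context
        start += step
    return windows


# A made-up 600-token context, split with the settings reported above
context_tokens = list(range(600))
windows = split_into_windows(context_tokens)
print([(w[0], w[-1]) for w in windows])  # [(0, 383), (256, 599)]
```

Each window is scored independently, so an answer that falls near a window boundary is still fully visible in the neighbouring window thanks to the overlap.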

<b>update: version 2.0</b>

The 2.0 version further improves performance with a two-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, and max learning rate of 3e-5), then further fine-tuned on the Italian SQuAD (2 epochs, no warmup, initial learning rate of 3e-5).
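
The learning rate settings of the two phases can be pictured with a small sketch of a linear schedule with optional warmup. It assumes that the rate decays linearly to zero after the warmup in both phases (the card only states the linear decay for the Italian phase), and the total of 1000 steps is a made-up number chosen for readability. `linear_schedule` is a hypothetical helper, not the training code.

```python
def linear_schedule(step, total_steps, max_lr=3e-5, warmup_ratio=0.0):
    """Learning rate at `step`: linear warmup up to max_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return max_lr * step / max(1, warmup_steps)
    decay_steps = max(1, total_steps - warmup_steps)
    return max_lr * max(0.0, (total_steps - step) / decay_steps)


total_steps = 1000  # hypothetical number of optimizer steps

# Phase 1 (English SQuAD v2): 20% warmup, peak of 3e-5
phase1 = [linear_schedule(s, total_steps, warmup_ratio=0.2) for s in (0, 100, 200, 600, 1000)]

# Phase 2 (Italian SQuAD): no warmup, starts at 3e-5
phase2 = [linear_schedule(s, total_steps) for s in (0, 500, 1000)]

print(phase1)
print(phase2)
```

With these assumptions, phase 1 starts at zero, peaks at 3e-5 after 20% of the steps and then decays, while phase 2 starts directly at 3e-5.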

To maximize the benefits of the multilingual procedure, [mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) is used as the pre-trained model. Once the double fine-tuning is completed, the embedding layer is compressed as in [deberta-base-italian](https://huggingface.co/osiria/deberta-base-italian) to bring the model down to the size of a monolingual one.
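
As a rough illustration of the idea, the sketch below assumes that compressing the embedding layer amounts to keeping only the embedding rows of a retained subset of the vocabulary and re-indexing the tokens. This is an assumption about the general technique, not a description of the exact procedure used for this model. The vocabulary, the vectors and the `prune_embeddings` helper are all made up for the example.

```python
def prune_embeddings(embeddings, vocab, kept_tokens):
    """Keep only the embedding rows of `kept_tokens` and re-index the vocabulary.

    embeddings:  list of vectors, one per token id
    vocab:       dict mapping token -> old id
    kept_tokens: tokens to retain, in the desired new order
    """
    new_vocab = {}
    new_embeddings = []
    for new_id, token in enumerate(kept_tokens):
        new_vocab[token] = new_id
        new_embeddings.append(embeddings[vocab[token]])
    return new_embeddings, new_vocab


# Toy multilingual vocabulary with 2-dimensional embeddings (made-up values)
vocab = {"[UNK]": 0, "casa": 1, "house": 2, "libro": 3, "Haus": 4}
embeddings = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

small_embeddings, small_vocab = prune_embeddings(
    embeddings, vocab, kept_tokens=["[UNK]", "casa", "libro"]
)
print(small_vocab)       # {'[UNK]': 0, 'casa': 1, 'libro': 2}
print(small_embeddings)  # [[0.0, 0.0], [0.1, 0.2], [0.5, 0.6]]
```

The rest of the network is untouched in this sketch: only the lookup table shrinks, which is where most of the parameters of a multilingual model with a large vocabulary sit.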

The performance on the test set is reported in the following tables:

(<b>version 2.0</b> performance)

<br>

<b>Cased setting:</b>

| EM | F1 |
| ------ | ------ |
| 70.04 | 80.97 |

<b>Uncased setting:</b>

| EM | F1 |
| ------ | ------ |
| 68.55 | 80.11 |

Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/main/osiria_deberta_italian_qa_evaluation.ipynb
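
For readers unfamiliar with the two metrics, the following is a minimal sketch of SQuAD-style Exact Match and token-level F1 for a single prediction. The normalization used here (lowercasing, removing ASCII punctuation, collapsing whitespace) is an assumption made for the example; the linked notebook may normalize answers differently, so this sketch is not guaranteed to reproduce the numbers above.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Milano", "milano."))     # 1.0
print(f1_score("nato a Milano", "Milano"))  # 0.5
```

Dataset-level scores are typically obtained by averaging these per-question values, taking the maximum over the reference answers when a question has more than one.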

<b>update: version 3.0 (LITEQA)</b>

The 3.0 version, nicknamed LITEQA, further improves performance with a three-phase fine-tuning strategy: the model is first fine-tuned on the English SQuAD v2 (1 epoch, 20% warmup ratio, and max learning rate of 3e-5), then further fine-tuned on the Italian SQuAD (2 epochs, no warmup, initial learning rate of 3e-5), and lastly fine-tuned on the lowercase Italian SQuAD (1 epoch, no warmup, initial learning rate of 3e-5).
This makes the model more robust in general, and particularly in uncased settings.
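
The card does not describe how the lowercase dataset was built. A minimal sketch, assuming it is obtained by lowercasing every text field of a SQuAD-style example, could look as follows. The `lowercase_example` helper and the example record are hypothetical; the sanity check relies on lowercasing not changing the length of the text, which holds for ordinary Italian characters.

```python
def lowercase_example(example):
    """Return a lowercased copy of a SQuAD-style example."""
    lowered = {
        "question": example["question"].lower(),
        "context": example["context"].lower(),
        "answers": {
            "text": [t.lower() for t in example["answers"]["text"]],
            "answer_start": list(example["answers"]["answer_start"]),
        },
    }
    # Sanity check: each answer must still be found at its original offset
    answers = lowered["answers"]
    for text, start in zip(answers["text"], answers["answer_start"]):
        assert lowered["context"][start:start + len(text)] == text
    return lowered


example = {
    "question": "Dove è nato Manzoni?",
    "context": "Alessandro Manzoni è nato a Milano nel 1785",
    "answers": {"text": ["Milano"], "answer_start": [28]},
}
lowered = lowercase_example(example)
print(lowered["context"])  # alessandro manzoni è nato a milano nel 1785
```

The same transformation can be applied to the question and context at inference time to reproduce the uncased setting.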

The 3.0 version can be downloaded from the <b>liteqa</b> branch of this repo.
The performance on the test set is reported in the following tables:

(<b>version 3.0</b> performance)

<br>

<b>Cased setting:</b>

| EM | F1 |
| ------ | ------ |
| 70.19 | 81.01 |

<b>Uncased setting:</b>

| EM | F1 |
| ------ | ------ |
| 69.60 | 80.74 |

Testing notebook: https://huggingface.co/osiria/deberta-italian-question-answering/blob/liteqa/osiria_liteqa_evaluation.ipynb

<h3>Quick usage</h3>

To get the best outputs from the model, use the following pipeline, which trims leading and trailing whitespace and punctuation from the extracted answer and adjusts its offsets accordingly:

```python
import re
from transformers import DebertaV2TokenizerFast, DebertaV2ForQuestionAnswering
from transformers.pipelines import QuestionAnsweringPipeline

tokenizer = DebertaV2TokenizerFast.from_pretrained("osiria/deberta-italian-question-answering")
model = DebertaV2ForQuestionAnswering.from_pretrained("osiria/deberta-italian-question-answering")

class OsiriaQA(QuestionAnsweringPipeline):

    def __init__(self, punctuation = r',;.:!?()[\]{}', **kwargs):

        QuestionAnsweringPipeline.__init__(self, **kwargs)
        # Raw strings keep the regex escapes (\s, \]) intact and avoid invalid-escape warnings
        self.post_regex_left = r"^[\s" + punctuation + r"]+"
        self.post_regex_right = r"[\s" + punctuation + r"]+$"

    def postprocess(self, output):

        output = QuestionAnsweringPipeline.postprocess(self, model_outputs=output)
        # Trim the left side and move "start" forward by the number of removed characters
        output_length = len(output["answer"])
        output["answer"] = re.sub(self.post_regex_left, "", output["answer"])
        output["start"] = output["start"] + (output_length - len(output["answer"]))
        # Trim the right side and move "end" back by the number of removed characters
        output_length = len(output["answer"])
        output["answer"] = re.sub(self.post_regex_right, "", output["answer"])
        output["end"] = output["end"] - (output_length - len(output["answer"]))

        return output
    
pipeline_qa = OsiriaQA(model = model, tokenizer = tokenizer)
pipeline_qa(context = "Alessandro Manzoni è nato a Milano nel 1785",
            question = "Dove è nato Manzoni?")

# {'score': 0.9899800419807434, 'start': 28, 'end': 34, 'answer': 'Milano'}
```
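
The trimming step of the pipeline can be tried on its own, without downloading the model. The sketch below mirrors the post-processing logic as a standalone function; `trim_answer` and the sample span are hypothetical and only serve to show how the offsets move.

```python
import re

PUNCTUATION = r',;.:!?()[\]{}'
LEFT = re.compile(r"^[\s" + PUNCTUATION + r"]+")
RIGHT = re.compile(r"[\s" + PUNCTUATION + r"]+$")


def trim_answer(answer, start, end):
    """Strip leading/trailing whitespace and punctuation, shifting the offsets."""
    trimmed = LEFT.sub("", answer)
    start += len(answer) - len(trimmed)  # characters removed on the left
    answer = trimmed
    trimmed = RIGHT.sub("", answer)
    end -= len(answer) - len(trimmed)    # characters removed on the right
    return trimmed, start, end


# A made-up raw span that includes surrounding punctuation
result = trim_answer(" (Milano),", 26, 36)
print(result)  # ('Milano', 28, 34)
```

Two characters are removed on each side, so `start` moves from 26 to 28 and `end` from 36 to 34, and the offsets keep pointing at the trimmed answer inside the context.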

You can also try the model online using this web app: https://huggingface.co/spaces/osiria/deberta-italian-question-answering

<h3>References</h3>

[1] https://arxiv.org/abs/2111.09543

[2] https://link.springer.com/chapter/10.1007/978-3-030-03840-3_29

<h3>Limitations</h3>

This model was trained on the English SQuAD v2 and on SQuAD-IT, which is mainly a machine-translated version of the original SQuAD v1.1. This means that the quality of the training set is limited by the quality of the machine translation.
Moreover, the model is meant to answer questions under the assumption that the required information is actually contained in the given context (which is the underlying assumption of SQuAD v1.1).
If this assumption is violated, the model will still return an answer, and that answer will be incorrect.

<h3>License</h3>

The model is released under the <b>MIT</b> license.