---
language: it
tags:
- DISTILbert
- Italian
license: mit
widget:
- text: Vado al [MASK] a fare la spesa
- text: Vado al parco a guardare le [MASK]
- text: Il cielo è [MASK] di stelle.
---


# BERTino: an Italian DistilBERT model
This repository hosts BERTino, an Italian DistilBERT model pre-trained by
[indigo.ai](https://indigo.ai/en/)
on a large general-domain Italian corpus. BERTino is task-agnostic and can be
fine-tuned for any downstream task.
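
Before any fine-tuning, the pre-trained model can be probed with masked-token prediction. The following is a minimal sketch, not an official usage recipe: it assumes the Hugging Face `transformers` library is installed and that this repository is published on the Hub as `indigo-ai/BERTino` (an assumption; adjust `MODEL_ID` if the identifier differs). The sentences are the widget examples from this card's metadata, and `top_predictions` is a hypothetical helper name.

```python
# Masked sentences taken from the widget examples in this card's metadata.
EXAMPLES = [
    "Vado al [MASK] a fare la spesa",
    "Vado al parco a guardare le [MASK]",
    "Il cielo è [MASK] di stelle.",
]

# Assumption: the Hub identifier of this repository; adjust if it differs.
MODEL_ID = "indigo-ai/BERTino"


def top_predictions(sentence, model_id=MODEL_ID, k=3):
    """Return the k most likely fillers for the [MASK] token in `sentence`."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Builds a fill-mask pipeline; the weights are downloaded on first use.
    fill_mask = pipeline("fill-mask", model=model_id)
    predictions = fill_mask(sentence, top_k=k)
    return [(p["token_str"], p["score"]) for p in predictions]
```

Calling `top_predictions(EXAMPLES[0])` should return up to three `(token, score)` pairs. For repeated calls it is cheaper to build the pipeline once and reuse it rather than reloading it per sentence.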
### Corpus
The pre-training corpus that we used is the union of the
[Paisa](https://www.corpusitaliano.it/) and
[ItWaC](https://corpora.dipintra.it/public/run.cgi/corp_info?corpname=itwac_full)
corpora. The final corpus contains 14 million sentences, for a total of 12 GB
of text.
### Downstream Results
To validate the pre-training that we conducted, we evaluated BERTino on the
[Italian ParTUT](https://universaldependencies.org/treebanks/it_partut/index.html),
[Italian ISDT](https://universaldependencies.org/treebanks/it_isdt/index.html),
[Italian WikiNER](https://figshare.com/articles/Learning_multilingual_named_entity_recognition_from_Wikipedia/5462500)
and multi-class sentence classification tasks. For comparison, we also report the results
obtained by the [teacher model](https://huggingface.co/dbmdz/bert-base-italian-xxl-uncased)
fine-tuned on the same tasks for the same number of epochs.

**Italian ISDT:**

| Model        | F1 score | Fine-tuning time | Evaluation time |
|--------------|----------|------------------|-----------------|
| BERTino      | 0.9801   | 9m 4s            | 3s              |
| Teacher      | 0.983    | 16m 28s          | 5s              |

**Italian ParTUT:**

| Model        | F1 score | Fine-tuning time | Evaluation time |
|--------------|----------|------------------|-----------------|
| BERTino      | 0.9268   | 1m 18s           | 1s              |
| Teacher      | 0.9688   | 2m 18s           | 1s              |

**Italian WikiNER:**

| Model        | F1 score | Fine-tuning time | Evaluation time |
|--------------|----------|------------------|-----------------|
| BERTino      | 0.9038   | 35m 35s          | 3m 1s           |
| Teacher      | 0.9178   | 67m 8s           | 5m 16s          |

**Multi-class sentence classification:**

| Model        | F1 score | Fine-tuning time | Evaluation time |
|--------------|----------|------------------|-----------------|
| BERTino      | 0.7788   | 4m 40s           | 6s              |
| Teacher      | 0.7986   | 8m 52s           | 9s              |
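
The trade-off in the tables above can be summarised per task as the share of the teacher's F1 that BERTino retains and the fine-tuning speed-up it gives. The sketch below only does arithmetic on the figures reported in this card; `RESULTS` and `compare` are illustrative names, not part of any released code.

```python
# (F1 score, fine-tuning time in seconds), copied from the tables in this card.
RESULTS = {
    "Italian ISDT": {
        "BERTino": (0.9801, 9 * 60 + 4),
        "Teacher": (0.983, 16 * 60 + 28),
    },
    "Italian ParTUT": {
        "BERTino": (0.9268, 1 * 60 + 18),
        "Teacher": (0.9688, 2 * 60 + 18),
    },
    "Italian WikiNER": {
        "BERTino": (0.9038, 35 * 60 + 35),
        "Teacher": (0.9178, 67 * 60 + 8),
    },
    "Multi-class sentence classification": {
        "BERTino": (0.7788, 4 * 60 + 40),
        "Teacher": (0.7986, 8 * 60 + 52),
    },
}


def compare(task):
    """Return (F1 retained vs. teacher, fine-tuning speed-up) for one task."""
    student_f1, student_seconds = RESULTS[task]["BERTino"]
    teacher_f1, teacher_seconds = RESULTS[task]["Teacher"]
    return student_f1 / teacher_f1, teacher_seconds / student_seconds


for task in RESULTS:
    retained, speedup = compare(task)
    print(f"{task}: {retained:.1%} of teacher F1, {speedup:.2f}x faster fine-tuning")
```

On these four tasks the computed speed-up stays between roughly 1.7x and 1.9x, while the retained F1 stays above 95% of the teacher's score.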