IT5 Cased Small Efficient EL32 for Question Answering ⁉️ 🇮🇹
Shout-out to Stefan Schweter for contributing the pre-trained efficient model!
This repository contains the checkpoint for the IT5 Cased Small Efficient EL32 model fine-tuned for extractive question answering on the SQuAD-IT corpus, as part of the experiments of the paper IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation by Gabriele Sarti and Malvina Nissim.
A comprehensive overview of other released materials is provided in the gsarti/it5 repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.
Using the model
Model checkpoints are available for use in TensorFlow, PyTorch, and JAX. They can be used directly with the pipeline API as follows:
from transformers import pipeline
qa = pipeline("text2text-generation", model='it5/it5-efficient-small-el32-question-answering')
qa("In seguito all' evento di estinzione del Cretaceo-Paleogene, l' estinzione dei dinosauri e il clima umido possono aver permesso alla foresta pluviale tropicale di diffondersi in tutto il continente. Dal 66-34 Mya, la foresta pluviale si estendeva fino a sud fino a 45°. Le fluttuazioni climatiche degli ultimi 34 milioni di anni hanno permesso alle regioni della savana di espandersi fino ai tropici. Durante l' Oligocene, ad esempio, la foresta pluviale ha attraversato una banda relativamente stretta. Si espandeva di nuovo durante il Miocene medio, poi si ritrasse ad una formazione prevalentemente interna all' ultimo massimo glaciale. Tuttavia, la foresta pluviale è riuscita ancora a prosperare durante questi periodi glaciali, consentendo la sopravvivenza e l' evoluzione di un' ampia varietà di specie. Domanda: La foresta pluviale amazzonica è diventata per lo più una foresta interna intorno a quale evento globale?")
>>> [{"generated_text": "ultimo massimo glaciale"}]
or loaded using autoclasses:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("it5/it5-efficient-small-el32-question-answering")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/it5-efficient-small-el32-question-answering")
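Once loaded, inference follows the standard seq2seq generate pattern. A minimal sketch, where the input string is abbreviated and max_new_tokens is an illustrative choice rather than a setting from the original card:

# Build the same "context ... Domanda: question" input used in the pipeline example.
inputs = tokenizer(
    "In seguito all' evento di estinzione del Cretaceo-Paleogene, ... Domanda: La foresta pluviale amazzonica è diventata per lo più una foresta interna intorno a quale evento globale?",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)  # illustrative generation budget
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# >>> ultimo massimo glaciale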
If you use this model in your research, please cite our work as:
@article{sarti-nissim-2022-it5,
title={{IT5}: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
author={Sarti, Gabriele and Nissim, Malvina},
journal={ArXiv preprint 2203.03759},
url={https://arxiv.org/abs/2203.03759},
year={2022},
month={mar}
}
Training hyperparameters
The following hyperparameters were used during training (a replication sketch follows the list):
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7.0
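For reference, these settings map roughly onto the following Hugging Face Trainer configuration. This is a hedged sketch, and output_dir (like anything else not listed above) is a placeholder assumption:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./it5-qa-finetuning",  # placeholder, not from the original card
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=7.0,
)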
Framework versions
- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3
Evaluation results
- Test F1 on SQuAD-IT (self-reported): 0.747
- Test Exact Match on SQuAD-IT (self-reported): 0.645
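Scores in this style can be computed with the SQuAD metric shipped with Datasets. A minimal sketch, assuming the squad_it dataset on the Hub with its standard id/context/question/answers fields (note that the squad metric returns percentages, whereas the scores above are reported as fractions):

from datasets import load_dataset, load_metric
from transformers import pipeline

qa = pipeline("text2text-generation", model="it5/it5-efficient-small-el32-question-answering")
squad_metric = load_metric("squad")
test_set = load_dataset("squad_it", split="test")

predictions, references = [], []
for example in test_set:
    # Same "context ... Domanda: question" format as the usage example above.
    answer = qa(f"{example['context']} Domanda: {example['question']}")[0]["generated_text"]
    predictions.append({"id": example["id"], "prediction_text": answer})
    references.append({"id": example["id"], "answers": example["answers"]})

print(squad_metric.compute(predictions=predictions, references=references))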