BART-IT: Italian pre-training for the BART sequence-to-sequence model

BART-IT is a sequence-to-sequence model based on the BART architecture, tailored specifically to the Italian language. The model is pre-trained on a large corpus of Italian text and can be fine-tuned on a variety of downstream tasks.

Model description

The model is a base-sized BART with a vocabulary of 52,000 tokens and roughly 140M parameters. It is trained from scratch on a large corpus of Italian text and can be used for any task that requires a sequence-to-sequence model.
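As a rough sanity check of the figures above, the parameter count of a base-sized BART with a 52,000-token vocabulary can be reproduced by instantiating a randomly initialized model from a config. This is a sketch: the hyperparameters below are the standard BART-base values and are assumptions, not taken from this model card.

```python
# Sketch: estimate the parameter count of a base-sized BART with a
# 52,000-token vocabulary. The hyperparameters are the standard
# BART-base values (an assumption, not from the model card).
from transformers import BartConfig, BartForConditionalGeneration

config = BartConfig(
    vocab_size=52_000,           # BART-IT vocabulary size
    d_model=768,                 # base-sized hidden dimension
    encoder_layers=6,
    decoder_layers=6,
    encoder_attention_heads=12,
    decoder_attention_heads=12,
    encoder_ffn_dim=3072,
    decoder_ffn_dim=3072,
)
model = BartForConditionalGeneration(config)  # random init, no download needed
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")    # ~140M
```

Most of the embedding budget comes from the 52,000 x 768 shared token embedding matrix; the rest is split between the 6 encoder and 6 decoder layers.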

Pre-training

The code used to pre-train BART-IT, together with additional information on the model parameters, can be found in the accompanying code repository.

Fine-tuning

The model in this repository is a pre-trained checkpoint without any fine-tuning. To use it for a specific task, fine-tune it on a task-specific dataset.

Separate checkpoints have been fine-tuned for the abstractive summarization task on three different Italian datasets.
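As a sketch of what such fine-tuning can look like with the Hugging Face Trainer API (the dataset and its "text"/"summary" column names are illustrative assumptions, not part of this model card):

```python
# Sketch: fine-tuning BART-IT for abstractive summarization with the
# Hugging Face Seq2SeqTrainer. The dataset and the "text"/"summary"
# column names are illustrative assumptions.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

def preprocess(batch, tokenizer, max_input_len=1024, max_target_len=128):
    """Tokenize (document, summary) pairs into model inputs and labels."""
    model_inputs = tokenizer(batch["text"], max_length=max_input_len, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=max_target_len, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

def finetune(dataset):
    """`dataset` is assumed to be a datasets.DatasetDict with train/validation splits."""
    tokenizer = AutoTokenizer.from_pretrained("morenolq/bart-it")
    model = AutoModelForSeq2SeqLM.from_pretrained("morenolq/bart-it")
    tokenized = dataset.map(lambda b: preprocess(b, tokenizer), batched=True)
    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(
            output_dir="bart-it-summarization",
            learning_rate=5e-5,
            per_device_train_batch_size=8,
            num_train_epochs=3,
            predict_with_generate=True,
        ),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["validation"],
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
```

The data collator dynamically pads inputs and labels per batch, which is the usual choice for seq2seq fine-tuning.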

Usage

The model can be loaded and used as follows:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the pre-trained tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("morenolq/bart-it")
model = AutoModelForSeq2SeqLM.from_pretrained("morenolq/bart-it")

# Encode an Italian input sentence and generate an output sequence with beam search
input_ids = tokenizer.encode("Il modello BART-IT è stato pre-addestrato su un corpus di testo italiano", return_tensors="pt")
outputs = model.generate(input_ids, max_length=40, num_beams=4, early_stopping=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Citation

If you find this model useful for your research, please cite the following paper:

@Article{BARTIT,
    AUTHOR = {La Quatra, Moreno and Cagliero, Luca},
    TITLE = {BART-IT: An Efficient Sequence-to-Sequence Model for Italian Text Summarization},
    JOURNAL = {Future Internet},
    VOLUME = {15},
    YEAR = {2023},
    NUMBER = {1},
    ARTICLE-NUMBER = {15},
    URL = {https://www.mdpi.com/1999-5903/15/1/15},
    ISSN = {1999-5903},
    DOI = {10.3390/fi15010015}
}