mt5-small-finetuned-xlsum-en-es

This model is a fine-tuned version of google/mt5-small on the csebuetnlp/xlsum dataset.

Reduced versions of the English and Spanish subsets were used, with a focus on examples that have shorter target summaries.
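This card does not say how the subsets were reduced. As a rough illustration only, one way to favor shorter targets is to drop examples whose reference summary exceeds a word limit. The helper name `filter_short_targets` and the cutoff of 30 words are hypothetical and are not taken from the training setup. The field names `text` and `summary` follow the XL-Sum dataset.

```python
def filter_short_targets(examples, max_summary_words=30):
    """Keep examples whose reference summary has at most max_summary_words words.

    The 30-word default is an illustrative assumption, not the cutoff used
    to build this model's training data.
    """
    return [
        ex for ex in examples
        if len(ex["summary"].split()) <= max_summary_words
    ]


# Toy records that mimic the XL-Sum "text" and "summary" fields.
toy_examples = [
    {"text": "Body of the first article.", "summary": "Short summary."},
    {"text": "Body of the second article.", "summary": " ".join(["word"] * 50)},
]

kept = filter_short_targets(toy_examples)
```

With the real data, the same predicate could be applied to each language subset before training.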

It achieves the following results on the evaluation set:

Loss: 2.9483
Rouge1: 19.42
Rouge2: 4.44
RougeL: 16.7
RougeLsum: 16.7
Mean Len: 16.3231
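For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and its reference. The sketch below is a simplified version that assumes lowercased whitespace tokenization. It is not the scorer used to produce the numbers above, which this card does not specify, and the reported scores appear to be on a 0 to 100 scale rather than 0 to 1.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1 F1 using lowercased whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Five of the six unigrams match, so precision, recall, and F1 are all 5/6.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```

Production scorers typically add stemming and language-aware tokenization, so their values will differ from this sketch.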

Model description

More information needed

Intended uses & limitations

The model may produce false information when summarizing.

This is an initial draft and is not intended for production use. Use at your own risk.
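A minimal sketch of how the model could be tried out with the Transformers summarization pipeline. The repository id is taken from this page's header. The `summarize` helper and the `max_new_tokens=32` limit are illustrative choices, the latter picked only because the reported mean output length is about 16. Calling the function downloads the model weights.

```python
MODEL_ID = "alex-atelo/mt5-small-finetuned-xlsum-en-es"


def summarize(text, max_new_tokens=32):
    """Return a summary of `text` produced by the fine-tuned model.

    max_new_tokens=32 is an arbitrary illustrative limit, not a setting
    documented in this card.
    """
    # Imported lazily so that defining the helper does not load the library.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(text, max_new_tokens=max_new_tokens)
    return result[0]["summary_text"]


# Example call (requires network access to fetch the model):
# print(summarize("Paste an English or Spanish news article here."))
```

Given the limitations above, outputs should be checked against the source article before being relied on.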


Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
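To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup steps, since none are listed above, and takes the total of 3711 steps from the training results table. The function name `linear_lr` is made up for this example.

```python
def linear_lr(step, base_lr=2e-05, total_steps=3711, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


lr_start = linear_lr(0)        # full base rate at the first step
lr_epoch_1 = linear_lr(1237)   # two thirds of the base rate after epoch 1
lr_end = linear_lr(3711)       # decays to zero at the final step
```

Under these assumptions the rate falls from 2e-05 to about 1.33e-05 after the first epoch and reaches zero at the end of the third.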

Training results

Lead-3 Baseline:

Rouge1: 12.22
Rouge2: 2.01
RougeL: 9.02
RougeLsum: 10.33
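A lead-3 baseline uses the first three sentences of the article as its summary. The sketch below shows the idea with a naive regular-expression sentence splitter. The splitter actually used for the baseline numbers above is not stated in this card, so the regex is a stand-in.

```python
import re


def lead_3(text, n=3):
    """Return the first n sentences of `text` joined by single spaces.

    Sentences are split after '.', '!' or '?' followed by whitespace,
    which is a rough heuristic rather than a proper sentence tokenizer.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:n])


article = "First sentence. Second sentence! Third sentence? Fourth sentence."
baseline_summary = lead_3(article)
```

Scoring such extracts against the reference summaries gives a simple floor to compare the fine-tuned model against.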

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Mean Len
6.7763	1.0	1237	3.1120	13.57	2.76	11.59	11.59	12.6116
4.1022	2.0	2474	2.9718	19.35	4.32	16.63	16.64	16.3084
3.9219	3.0	3711	2.9483	19.42	4.44	16.7	16.7	16.3231
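The step counts in the table allow a rough estimate of the training set size. The arithmetic below assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. None of these are stated in this card.

```python
steps_per_epoch = 1237   # step count at the end of epoch 1 in the table
train_batch_size = 16    # from the training hyperparameters
num_epochs = 3

# Every batch but the last is full; the last holds between 1 and 16 examples.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

# Total optimizer steps, matching the final row of the table.
total_steps = steps_per_epoch * num_epochs

print(f"Training examples: between {min_examples} and {max_examples}")
print(f"Total steps: {total_steps}")
```

Under these assumptions the reduced training set holds roughly 19,800 examples across the two languages.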

Framework versions

Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Citation

BibTeX:

@inproceedings{hasan-etal-2021-xl,
    title = "{XL}-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages",
    author = "Hasan, Tahmid  and
      Bhattacharjee, Abhik  and
      Islam, Md. Saiful  and
      Mubasshir, Kazi  and
      Li, Yuan-Fang  and
      Kang, Yong-Bin  and
      Rahman, M. Sohel  and
      Shahriyar, Rifat",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.413",
    pages = "4693--4703",
}

Model repository: alex-atelo/mt5-small-finetuned-xlsum-en-es
