google/flan-t5-base fine-tuned on the XSum dataset
train args
max_input_length: 512
max_tgt_length: 128
epoch: 3
optimizer: AdamW
lr: 2e-5
weight_decay: 1e-3
fp16: False
prefix: "summarize: "
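A minimal sketch of how the prefix and the two length limits above might be applied during preprocessing. The original training script is not shown here, so the function name `preprocess` and the XSum column names `document` and `summary` are assumptions; the tokenizer is expected to behave like a Hugging Face tokenizer that accepts `text_target`.

```python
# Values taken from the training arguments listed above.
PREFIX = "summarize: "
MAX_INPUT_LENGTH = 512
MAX_TARGET_LENGTH = 128


def preprocess(batch, tokenizer):
    """Tokenize a batch of XSum-style examples (hypothetical helper).

    Assumes `batch` has "document" and "summary" columns and that
    `tokenizer` is callable like a Hugging Face tokenizer.
    """
    # Prepend the task prefix to every source document.
    inputs = [PREFIX + doc for doc in batch["document"]]
    model_inputs = tokenizer(
        inputs, max_length=MAX_INPUT_LENGTH, truncation=True
    )
    # Tokenize the reference summaries as targets.
    labels = tokenizer(
        text_target=batch["summary"],
        max_length=MAX_TARGET_LENGTH,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

With the `datasets` library, a function like this would usually be passed to `Dataset.map(..., batched=True)` together with the tokenizer.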
performance
eval_rouge1: 38.6648
eval_rouge2: 15.5661
eval_rougeL: 30.6158
usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
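The import above would typically be followed by loading the checkpoint and generating a summary. The sketch below shows one way this might look. The repository id is not given in this card, so `MODEL_ID` is a hypothetical placeholder that must be replaced, and the helper names `load` and `summarize` are illustrative. The prefix and length limits mirror the training arguments.

```python
# Hypothetical placeholder: replace with the actual repository id or local path.
MODEL_ID = "path/or/hub-id-of-this-checkpoint"

PREFIX = "summarize: "      # same prefix as used in training
MAX_INPUT_LENGTH = 512      # max_input_length from the training arguments
MAX_TARGET_LENGTH = 128     # max_tgt_length from the training arguments


def load(model_id):
    """Load the tokenizer and model (requires transformers and torch)."""
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def summarize(text, tokenizer, model):
    """Generate a summary for a single document."""
    inputs = tokenizer(
        PREFIX + text,
        max_length=MAX_INPUT_LENGTH,
        truncation=True,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_new_tokens=MAX_TARGET_LENGTH)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (downloads or reads the checkpoint, so it is left commented out):
# tokenizer, model = load(MODEL_ID)
# print(summarize("Long news article text ...", tokenizer, model))
```

Generation settings such as beam search are not stated in this card, so the sketch relies on the defaults of `generate` apart from the length cap.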