flan-t5-base-samsum

This model is a fine-tuned version of google/flan-t5-base on the samsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3723
  • Rouge1: 47.2141
  • Rouge2: 23.4799
  • RougeL: 39.7474
  • RougeLsum: 43.3222
  • Gen Len (mean generated length, in tokens): 17.2589
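The ROUGE numbers above are reported ×100. A minimal sketch of how such scores can be recomputed with the evaluate library follows; the batch size and the use of trust_remote_code are illustrative assumptions, not settings recorded in this card, so exact numbers may differ slightly.

# Hedged sketch: recomputing ROUGE on the samsum test split with the
# evaluate library. Batch size and trust_remote_code are assumptions.
import evaluate
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline('summarization', model='sharmax-vikas/flan-t5-base-samsum')
test = load_dataset('samsum', split='test', trust_remote_code=True)

predictions = [out['summary_text']
               for out in summarizer(test['dialogue'], batch_size=16)]

rouge = evaluate.load('rouge')
print(rouge.compute(predictions=predictions, references=test['summary']))
# Scores are returned in [0, 1]; multiply by 100 to compare with the card.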

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged mapping onto Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
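The sketch below shows one way these settings map onto Hugging Face Seq2SeqTrainingArguments. Only the values listed above come from the card; output_dir, eval_strategy, and predict_with_generate are assumptions added for illustration.

# Hedged sketch: the listed hyperparameters as Seq2SeqTrainingArguments.
# Only the values from the list above are taken from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir='flan-t5-base-samsum',  # assumption: any local path works
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type='linear',
    num_train_epochs=5,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer's
    # default optimizer settings, so no extra optimizer arguments are needed.
    eval_strategy='epoch',             # assumption: matches per-epoch results
    predict_with_generate=True,        # assumption: needed for ROUGE eval
)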

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 1.4665        | 1.0   | 921  | 1.3915          | 46.9661 | 23.1441 | 39.2886 | 43.1249   | 17.2894 |
| 1.3722        | 2.0   | 1842 | 1.3778          | 47.1196 | 23.1221 | 39.6222 | 43.3404   | 17.1905 |
| 1.3145        | 3.0   | 2763 | 1.3723          | 47.2141 | 23.4799 | 39.7474 | 43.3222   | 17.2589 |
| 1.2767        | 4.0   | 3684 | 1.3787          | 47.1852 | 23.5757 | 39.7355 | 43.4915   | 17.4554 |
| 1.2570        | 5.0   | 4605 | 1.3742          | 47.4921 | 23.6605 | 39.9254 | 43.7327   | 17.3529 |

Framework versions

  • Transformers 4.42.4
  • Pytorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.19.1

How to use the model

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

ckpt = 'sharmax-vikas/flan-t5-base-samsum'
tokenizer = AutoTokenizer.from_pretrained(ckpt)
# Use AutoModelForSeq2SeqLM for sequence-to-sequence tasks such as summarization
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)

# Build a summarization pipeline around the fine-tuned checkpoint
summarize = pipeline('summarization', tokenizer=tokenizer, model=model)

result = summarize('''Hannah: Hey, do you have Betty's number?
Amanda: Lemme check
Hannah: <file_gif>
Amanda: Sorry, can't find it.
Amanda: Ask Larry
Amanda: He called her last time we were at the park together
Hannah: I don't know him well
Hannah: <file_gif>
Amanda: Don't be shy, he's very nice
Hannah: If you say so..
Hannah: I'd rather you texted him
Amanda: Just text him 🙂
Hannah: Urgh.. Alright
Hannah: Bye
Amanda: Bye bye''')

print(result[0])

#{'summary_text': "Amanda can't find Betty's number. Amanda will ask Larry. Larry called Betty last time they were at the park together."}
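
For finer control over decoding than the pipeline exposes, the tokenizer and model loaded above can also be called directly. The sketch below is illustrative only: the sample dialogue, beam count, and length limit are assumptions, not settings recorded in this card.

# Hedged sketch: calling the model directly for finer decoding control.
# The dialogue, num_beams, and max_new_tokens values are illustrative.
dialogue = "Hannah: Hey, do you have Betty's number?\nAmanda: Ask Larry, he called her last time."
inputs = tokenizer(dialogue, return_tensors='pt', truncation=True)
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=60)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))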