t5_xsum_samsum_billsum_cnn_dailymail

The t5_xsum_samsum_billsum_cnn_dailymail model is a text summarization model fine-tuned from t5-base, a text-to-text transfer transformer (T5) checkpoint. It generates abstractive summaries from input text. It has been fine-tuned on multiple datasets: CNN/Daily Mail (cnn_dailymail), XSum (xsum), SamSum (samsum), BillSum (billsum), and the MeetingBank-transcript dataset by lytang.

Intended Uses & Limitations

Intended Uses

  • Document summarization: The model is intended for summarizing long documents or articles, for example in content curation and information extraction tasks.
  • Content generation: It can generate concise summaries of input text, which is useful for producing short, informative snippets.
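
A minimal usage sketch for the uses listed above. It assumes the checkpoint can be loaded from the Hugging Face Hub under the repository ID that appears on this page (madushakv/t5_xsum_samsum_billsum_cnn_dailymail) and that it works with the standard Transformers summarization pipeline. The helper names and the generation lengths are illustrative assumptions, not settings documented for this model.

```python
from functools import lru_cache

# Repository ID as it appears on this page (assumed to be loadable from the Hub).
MODEL_ID = "madushakv/t5_xsum_samsum_billsum_cnn_dailymail"


def generation_kwargs(max_length=64, min_length=8):
    """Illustrative generation settings (assumed values, not from the card)."""
    if min_length >= max_length:
        raise ValueError("min_length must be smaller than max_length")
    return {
        "max_length": max_length,   # upper bound on summary length in tokens
        "min_length": min_length,   # lower bound on summary length in tokens
        "do_sample": False,         # deterministic decoding
        "truncation": True,         # cut inputs that exceed the tokenizer limit
    }


@lru_cache(maxsize=1)
def load_summarizer():
    # Imported lazily so the settings helper works without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=MODEL_ID)


def summarize(text, **overrides):
    """Return one summary string for `text`; keyword overrides replace the defaults."""
    settings = {**generation_kwargs(), **overrides}
    result = load_summarizer()(text, **settings)
    return result[0]["summary_text"]
```

Calling summarize(article_text) downloads the checkpoint on first use and reuses the loaded pipeline afterwards. Inputs longer than the tokenizer limit are truncated rather than split, so very long documents may need to be divided into sections before summarizing.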

Limitations

  • Model size: t5-base has roughly 220 million parameters, so deployment may require more compute and memory than resource-constrained environments can provide.
  • Domain-specific content: The model targets general text summarization, and its performance may vary on domain-specific content. The reported ROUGE scores already differ widely between datasets (see Training results).

Training and Evaluation Data

The model was fine-tuned on CNN/Daily Mail, XSum, SamSum, BillSum, and the MeetingBank-transcript dataset. Together these cover news articles, dialogues, legislative bills, and meeting transcripts, exposing the model to a range of domains and summary styles.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
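
As a rough illustration of what lr_scheduler_type: linear implies, the sketch below decays the learning rate linearly from 2e-05 to zero. It assumes no warmup (none is listed above) and takes the total step count of 32300 from the CNN/Daily Mail results row. The second helper estimates how many examples were processed, assuming a single device and no gradient accumulation, neither of which the card states. All function names are hypothetical.

```python
LEARNING_RATE = 2e-05   # from the hyperparameter list
TRAIN_BATCH_SIZE = 2    # from the hyperparameter list
TOTAL_STEPS = 32300     # assumption: final step in the CNN/Daily Mail results row


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any), then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def examples_seen(steps=TOTAL_STEPS, batch_size=TRAIN_BATCH_SIZE,
                  grad_accum=1, devices=1):
    """Examples processed, under the assumed accumulation and device counts."""
    return steps * batch_size * grad_accum * devices


if __name__ == "__main__":
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
    print("examples seen:", examples_seen())
```

Under these assumptions, one epoch of 32300 steps at batch size 2 corresponds to 64600 training examples, and the learning rate at the midpoint is half of its initial value.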

Training results

samsum

Rouge1   Rouge2   RougeL   RougeLsum
0.0138   0.0002   0.0138   0.0138

CNN_Dailymail

Training Loss   Epoch   Step    Validation Loss   Rouge1   Rouge2   RougeL   RougeLsum   Gen Len
1.8486          1.0     32300   1.6478            0.2373   0.1086   0.1972   0.1971      18.9674
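
To make the ROUGE columns easier to interpret, here is a simplified sketch of ROUGE-1 F1, the unigram-overlap score between a reference summary and a generated one. It assumes lowercased whitespace tokenization and is not the scorer used to produce the numbers above; standard ROUGE implementations apply their own tokenization and optional stemming, so their values will differ. The function name is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat on the mat", "the cat"))
```

Scores lie between 0 and 1, matching the scale of the tables above. A short candidate that copies two of six reference words has perfect precision but low recall, which is why the F1 in the example lands well below 1.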

Framework versions

  • Transformers 4.33.0
  • PyTorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.13.3

Datasets used to train madushakv/t5_xsum_samsum_billsum_cnn_dailymail
