
flan-t5-base-tldr_news

A fine-tuned Flan-T5 (base) model for text summarization and title generation on TLDR (Too Long; Didn't Read) news articles.

Introduction

flan-t5-base-tldr_news is a sequence-to-sequence model fine-tuned on a dataset of TLDR news articles for two tasks: text summarization and title generation.

T5 is a transformer-based encoder-decoder architecture that casts NLP tasks as text-to-text problems and has achieved strong results across many of them. By fine-tuning it on a dataset of TLDR news articles, we aim to produce a model that generates concise, informative summaries and titles for news articles.

Task

The model targets two NLP tasks: text summarization and title generation. Text summarization produces a shortened version of a longer text that retains its most important information and ideas. Title generation produces a headline that accurately and concisely captures the main theme of the text.

Architecture

flan-t5-base-tldr_news uses the T5 architecture, which has been shown to be effective for a variety of NLP tasks. It consists of an encoder, which reads the input text, and a decoder, which generates the summary or title one token at a time.

Model Size

The model has 247,577,856 trainable parameters. Model size affects speed and memory requirements during training and inference, as well as performance on specific tasks.
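To make the memory impact concrete, the sketch below estimates the space needed to hold the weights alone, using the parameter count stated above. The storage precisions (float32, float16, int8) are assumptions for illustration; the card does not say which precision the checkpoint uses, and real memory use during training or inference is higher because of activations, optimizer state, and framework overhead.

```python
# Parameter count stated in this model card.
NUM_PARAMETERS = 247_577_856

# Bytes per parameter for some common storage precisions (assumed, not from the card).
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_bytes(num_parameters: int, dtype: str) -> int:
    """Return the bytes needed to store the weights alone in the given dtype."""
    return num_parameters * BYTES_PER_PARAMETER[dtype]


def to_mebibytes(num_bytes: int) -> float:
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


# Print a rough weight-only footprint for each assumed precision.
for dtype in BYTES_PER_PARAMETER:
    size_mib = to_mebibytes(weight_memory_bytes(NUM_PARAMETERS, dtype))
    print(f"{dtype}: {size_mib:.1f} MiB")
```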

Training Data

The model was fine-tuned on a dataset of TLDR news articles. This dataset was selected because it contains a large number of news articles that have been condensed into short summaries, which makes it well suited to training a summarization model. The training data went through standard preprocessing steps, including tokenization, before being fed to the model.

Evaluation Metrics

To evaluate the model on text summarization and title generation, we used the ROUGE metric. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) measures the overlap between the generated text and a reference text, which in this case is the reference summary or title for the article. ROUGE is widely used in NLP evaluation and gives a reasonable, if imperfect, measure of the quality of generated summaries and titles.
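As an illustration of what "overlap" means here, the following is a minimal sketch of ROUGE-N and ROUGE-L as F1 scores. It assumes lowercased whitespace tokenization and no stemming, so it is a simplification and not the evaluation script that produced this card's scores; the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, candidate_total, reference_total):
    """Combine precision and recall into F1; 0.0 when there is no overlap."""
    if matches == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    matches = sum((cand & ref).values())  # Counter intersection clips repeated n-grams
    return f1(matches, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))
```

For example, `rouge_n("the cat lay on the mat", "the cat sat on the mat", n=1)` matches five of six unigrams on each side. These functions return values between 0 and 1, whereas the scores reported in this card appear to be on a 0 to 100 scale.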

The following table shows the ROUGE scores for the model on the test set, which provides a good indication of its overall performance on the text summarization and title generation tasks:

Metric      Score
Rouge1      45.04
Rouge2      25.24
RougeL      41.89
RougeLsum   41.84

These scores are a snapshot of the model's performance on one specific test set. Performance may vary with the input text, the quality of the training data, and the application in which the model is used.

How to use via API

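from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
summarizer = pipeline(
    "summarization",
    model="ybagoury/flan-t5-base-tldr_news",
)

raw_text = """ your text here... """
# The pipeline returns a list of dicts, each with a "summary_text" key.
results = summarizer(raw_text)
print(results)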
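When summarizing several articles at once, it can help to wrap the pipeline in a small helper that returns plain strings. The sketch below is one possible way to do that; `load_summarizer` and `summarize_all` are hypothetical helper names, not part of this model or of transformers. It assumes the summarizer is called with a list of strings and returns one dict with a "summary_text" key per input.

```python
def load_summarizer(model_name="ybagoury/flan-t5-base-tldr_news"):
    """Build the summarization pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so that summarize_all can be used and tested without transformers.
    from transformers import pipeline
    return pipeline("summarization", model=model_name)


def summarize_all(summarizer, texts, **generation_kwargs):
    """Summarize each non-empty text and return the summary strings.

    `summarizer` is any callable that behaves like the pipeline: it takes a list
    of strings and returns a list of dicts with a "summary_text" key.
    Extra keyword arguments are passed through to the summarizer unchanged.
    """
    cleaned = [text.strip() for text in texts if text.strip()]  # drop blank inputs
    if not cleaned:
        return []
    results = summarizer(cleaned, **generation_kwargs)
    return [result["summary_text"] for result in results]
```

A call might look like `summarize_all(load_summarizer(), articles, max_length=60)`, where `articles` is your own list of strings and the `max_length` value is an arbitrary choice for illustration rather than a recommended setting.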
