ybagoury committed on
Commit
a833ac1
1 Parent(s): 752248b

Rename README.md to Readme.md

Files changed (2)
  1. README.md +0 -19
  2. Readme.md +60 -0
README.md DELETED
@@ -1,19 +0,0 @@
- ---
- datasets:
- - JulesBelveze/tldr_news
- metrics:
- - rouge
- pipeline_tag: summarization
- ---
-
- # TLDR News: FLAN T-5
-
- A fine-tuned version of google/flan-t5-base for summarisation and title generation trained on the JulesBelveze/tldr_news dataset.
-
-
- # How to use with the pipeline API
-
- ```
- from transformers install pipeline
-
- ```
Readme.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ datasets:
+ - JulesBelveze/tldr_news
+ metrics:
+ - rouge
+ pipeline_tag: summarization
+ language:
+ - en
+ tags:
+ - tldr
+ ---
+
+ # flan-t5-base-tldr_news
+
+ A version of google/flan-t5-base fine-tuned for text summarization and title generation on TLDR (Too Long; Didn't Read) news articles.
+
+ ## Introduction
+ flan-t5-base-tldr_news is a deep learning model fine-tuned on a dataset of TLDR news articles for two tasks: text summarization and title generation.
+
+ T5 is a transformer-based architecture that has achieved state-of-the-art results on a variety of NLP tasks. Fine-tuning it on TLDR news articles aims to produce a model that generates concise, informative summaries and titles for news articles.
+
+ ## Task
+ The model performs two NLP tasks: text summarization and title generation. Text summarization produces a shortened version of a longer text that retains its most important information and ideas. Title generation produces a headline that accurately and concisely captures the main theme of the text.
+
+ ## Architecture
+ flan-t5-base-tldr_news uses the T5 architecture, which has been shown to be effective for a variety of NLP tasks. T5 consists of an encoder, which reads the input text, and a decoder, which generates the summary or title token by token.
+
+ ## Model Size
+ The model has 247,577,856 parameters (tunable weights). Model size affects speed and memory requirements during training and inference, as well as performance on specific tasks.
+
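As a rough illustration of what the parameter count means for memory, the sketch below estimates the size of the weights alone. The bytes-per-parameter values are assumptions about the numeric precision used when loading the model, and the helper names are hypothetical. The estimate ignores activations, optimizer state, and framework overhead, so real usage will be higher.

```python
# Parameter count as stated in the model card.
PARAM_COUNT = 247_577_856


def weight_bytes(num_params, bytes_per_param):
    """Bytes needed to store the weights alone at a given precision."""
    return num_params * bytes_per_param


def to_mib(num_bytes):
    """Convert bytes to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


# Assumed precisions: 4 bytes for float32, 2 bytes for float16.
for name, width in (("float32", 4), ("float16", 2)):
    size = to_mib(weight_bytes(PARAM_COUNT, width))
    print(f"{name}: {size:.1f} MiB")
```

Under these assumptions the weights take a little under 1 GiB in float32 and about half that in float16.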
+ ## Training Data
+ The model was fine-tuned on the JulesBelveze/tldr_news dataset of TLDR news articles. This dataset was selected because it contains a large number of news articles condensed into short summaries, making it a good fit for training a summarization model. The training data went through standard preprocessing steps, including tokenization, before being fed to the model.
+
+ ## Evaluation Metrics
+ To evaluate the model on text summarization and title generation, we used the ROUGE metric. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) measures the overlap between the generated text and the reference text, which in this case is the original news article or its summary. ROUGE is commonly used in NLP evaluations to measure the quality of generated summaries and titles.
+
+ The following table shows the ROUGE scores for the model on the test set, as an indication of its overall performance on the text summarization and title generation tasks:
+
+ | Metric    | Score |
+ | --------- | ----- |
+ | Rouge1    | 45.04 |
+ | Rouge2    | 25.24 |
+ | RougeL    | 41.89 |
+ | RougeLsum | 41.84 |
+
+ These scores are a snapshot of the model's performance on one specific test set. Performance may vary with the input text, the quality of the training data, and the application the model is used for.
+
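To make the idea of n-gram overlap concrete, here is a minimal sketch of a ROUGE-N F1 score in plain Python. It is a simplification for illustration, not the implementation used to produce the scores in the model card: it assumes whitespace tokenization and lowercasing only, with no stemming, and the function names are hypothetical. It returns values in the range 0 to 1, whereas the table appears to report scores scaled to 0 to 100.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N F1 between a generated text and a reference text."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


print(rouge_n("the cat sat", "the cat sat on the mat", n=1))
print(rouge_n("the cat sat", "the cat sat on the mat", n=2))
```

A generated text that shares more words and word pairs with the reference gets a higher score, which is why ROUGE-2 is usually lower than ROUGE-1 for the same pair of texts.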
+ ## How to use via API
+
+ ```python
+ from transformers import pipeline
+
+ summarizer = pipeline(
+     'summarization',
+     'ybagoury/flan-t5-base-tldr_news',
+ )
+ raw_text = """ your text here... """
+ results = summarizer(raw_text)
+ print(results)
+ ```