---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- samsum
metrics:
- rouge
model-index:
- name: flan-t5-base-samsum
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: test
      args: samsum
    metrics:
    - name: Rouge1
      type: rouge
      value: 47.4798
---


# flan-t5-base-samsum

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the samsum dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3772
- Rouge1: 47.4798
- Rouge2: 23.9756
- Rougel: 40.0392
- Rougelsum: 43.6545
- Gen Len: 17.3162

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
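
The linear schedule listed above can be sketched in plain Python. This is a minimal illustration, not the training code itself: it assumes zero warmup steps (the card does not state a warmup setting) and takes the total step count from the results table (5 epochs of 1842 steps each). The function name `linear_lr` is hypothetical.

```python
# Values taken from this card; WARMUP_STEPS = 0 is an assumption.
BASE_LR = 5e-5
NUM_EPOCHS = 5
STEPS_PER_EPOCH = 1842  # per the training results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 9210
WARMUP_STEPS = 0


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"end of epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate falls by one fifth of its initial value per epoch and reaches zero at step 9210.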

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.4403        | 1.0   | 1842 | 1.3829          | 46.5346 | 23.1326 | 39.4401 | 42.8272   | 17.0977 |
| 1.3534        | 2.0   | 3684 | 1.3732          | 47.0911 | 23.5074 | 39.5951 | 43.2279   | 17.4554 |
| 1.2795        | 3.0   | 5526 | 1.3709          | 46.8895 | 23.3243 | 39.5909 | 43.1286   | 17.2027 |
| 1.2313        | 4.0   | 7368 | 1.3736          | 47.4946 | 23.7802 | 39.9999 | 43.5903   | 17.2198 |
| 1.1934        | 5.0   | 9210 | 1.3772          | 47.4798 | 23.9756 | 40.0392 | 43.6545   | 17.3162 |
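
The per-epoch results above can be compared programmatically, for example to see which epoch is best under each criterion. The sketch below copies the numbers from the table; the variable and function names are illustrative only, and it makes no claim about which checkpoint was actually published.

```python
# Rows copied from the training results table above.
COLUMNS = ("train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougeL", "rougeLsum", "gen_len")
ROWS = [
    (1.4403, 1, 1842, 1.3829, 46.5346, 23.1326, 39.4401, 42.8272, 17.0977),
    (1.3534, 2, 3684, 1.3732, 47.0911, 23.5074, 39.5951, 43.2279, 17.4554),
    (1.2795, 3, 5526, 1.3709, 46.8895, 23.3243, 39.5909, 43.1286, 17.2027),
    (1.2313, 4, 7368, 1.3736, 47.4946, 23.7802, 39.9999, 43.5903, 17.2198),
    (1.1934, 5, 9210, 1.3772, 47.4798, 23.9756, 40.0392, 43.6545, 17.3162),
]
results = [dict(zip(COLUMNS, row)) for row in ROWS]


def best_epoch(metric, higher_is_better=True):
    """Return the epoch whose value for `metric` is best."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda r: r[metric])["epoch"]


if __name__ == "__main__":
    print("lowest validation loss: epoch", best_epoch("val_loss", False))
    for name in ("rouge1", "rouge2", "rougeL", "rougeLsum"):
        print(f"highest {name}: epoch", best_epoch(name))
```

Validation loss is lowest at epoch 3 and Rouge1 peaks at epoch 4, while the final epoch has the best Rouge2, Rougel, and Rougelsum, so the choice of checkpoint depends on which metric matters most.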


### Framework versions

- Transformers 4.26.0
- Pytorch 1.13.1+cu116
- Datasets 2.9.0
- Tokenizers 0.13.2

### Papers With Code Results

As of 2 February 2023, the Papers with Code page for this task showed the leaderboard below.

With a Rouge1 score of 47.4798, this model would rank between fourth and fifth place on that leaderboard:


![PwC leaderboard](https://i.imgur.com/Nea77uL.jpg)



## Model Recycling

[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=9.04&mnli_lp=nan&20_newsgroup=3.55&ag_news=1.66&amazon_reviews_multi=0.19&anli=14.53&boolq=16.60&cb=24.91&cola=10.35&copa=25.50&dbpedia=5.73&esnli=5.31&financial_phrasebank=19.96&imdb=0.05&isear=0.59&mnli=11.74&mrpc=15.89&multirc=5.99&poem_sentiment=23.27&qnli=3.93&qqp=5.54&rotten_tomatoes=3.54&rte=23.90&sst2=-0.14&sst_5bins=5.12&stsb=20.58&trec_coarse=4.15&trec_fine=10.93&tweet_ev_emoji=12.87&tweet_ev_emotion=6.02&tweet_ev_hate=-0.04&tweet_ev_irony=7.12&tweet_ev_offensive=2.16&tweet_ev_sentiment=-0.00&wic=12.03&wnli=9.44&wsc=9.37&yahoo_answers=3.04&model_name=andreaparker%2Fflan-t5-base-samsum&base_name=google%2Ft5-v1_1-base) using andreaparker/flan-t5-base-samsum as a base model yields an average score of 77.86, compared with 68.82 for google/t5-v1_1-base.

The model is ranked 2nd among all tested models for the google/t5-v1_1-base architecture as of 07/02/2023.

Results:

|   20_newsgroup |   ag_news |   amazon_reviews_multi |    anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |   sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |   wnli |     wsc |   yahoo_answers |
|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|-------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|-------:|--------:|----------------:|
|        86.4312 |   89.8333 |                   67.1 | 52.5937 | 82.1713 | 80.3571 | 80.5369 |     66 |      76.5 | 90.8897 |                   86.7 | 93.044 | 71.6428 | 87.2457 | 88.7255 |   62.1287 |          91.3462 | 93.3004 | 89.1393 |           89.5872 | 84.4765 | 93.578 |     56.9683 | 89.3674 |          97.4 |          93 |           46.334 |            81.6327 |         51.4815 |          74.7449 |              84.7674 |              69.8795 | 67.8683 | 56.338 | 57.6923 |            72.3 |
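
As a quick consistency check, the average quoted above can be recomputed from the per-dataset scores in the table. The sketch below copies the 36 values from the table and the 68.82 baseline from the text; it assumes the reported average is a plain unweighted mean, which the card does not state explicitly.

```python
# Per-dataset scores copied from the results table above.
SCORES = {
    "20_newsgroup": 86.4312, "ag_news": 89.8333, "amazon_reviews_multi": 67.1,
    "anli": 52.5937, "boolq": 82.1713, "cb": 80.3571, "cola": 80.5369,
    "copa": 66.0, "dbpedia": 76.5, "esnli": 90.8897,
    "financial_phrasebank": 86.7, "imdb": 93.044, "isear": 71.6428,
    "mnli": 87.2457, "mrpc": 88.7255, "multirc": 62.1287,
    "poem_sentiment": 91.3462, "qnli": 93.3004, "qqp": 89.1393,
    "rotten_tomatoes": 89.5872, "rte": 84.4765, "sst2": 93.578,
    "sst_5bins": 56.9683, "stsb": 89.3674, "trec_coarse": 97.4,
    "trec_fine": 93.0, "tweet_ev_emoji": 46.334, "tweet_ev_emotion": 81.6327,
    "tweet_ev_hate": 51.4815, "tweet_ev_irony": 74.7449,
    "tweet_ev_offensive": 84.7674, "tweet_ev_sentiment": 69.8795,
    "wic": 67.8683, "wnli": 56.338, "wsc": 57.6923, "yahoo_answers": 72.3,
}
BASELINE_AVG = 68.82  # google/t5-v1_1-base average quoted in the text

# Unweighted mean over all datasets (an assumption about how the
# reported average was computed).
mean_score = sum(SCORES.values()) / len(SCORES)
gain = round(mean_score, 2) - BASELINE_AVG

if __name__ == "__main__":
    print(f"datasets: {len(SCORES)}")
    print(f"mean score: {mean_score:.2f}")
    print(f"gain over baseline: {gain:.2f}")
```

The unweighted mean of the 36 values rounds to 77.86, and the resulting gain of 9.04 points matches the `avg=9.04` parameter in the evaluation link.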


For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)