---
license: apache-2.0
base_model: t5-small
tags:
- generated_from_trainer
datasets:
- bills-summarization
metrics:
- rouge
model-index:
- name: ft-t5-with-dill-sum
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: billsum
      type: bills-summarization
    metrics:
    - name: Rouge1
      type: rouge
      value: 0.1886
---

# ft-t5-with-dill-sum

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the billsum dataset.
It achieves the following results on the evaluation set:
- Loss: 2.3109
- Rouge1: 0.1886
- Rouge2: 0.104
- Rougel: 0.166
- Rougelsum: 0.1659
- Gen Len: 19.0
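
A minimal inference sketch with the `transformers` pipeline; the model path below is a placeholder for wherever this checkpoint lives (a local directory or a Hub repo id), and `max_length=20` mirrors the ~19-token generation length reported above:

```python
from transformers import pipeline

# Placeholder path: point at the local output dir or Hub id of this checkpoint.
summarizer = pipeline("summarization", model="ft-t5-with-dill-sum")

# T5 checkpoints conventionally expect a task prefix on the input.
bill = "summarize: The Act amends title XVIII of the Social Security Act to ..."
print(summarizer(bill, max_length=20, truncation=True)[0]["summary_text"])
```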

## Model description

`t5-small` (a ~60M-parameter encoder-decoder model) fine-tuned for abstractive summarization of United States congressional bills. As a T5 checkpoint it follows the text-to-text convention, so source documents are typically prefixed with `summarize: ` at inference time.

## Intended uses & limitations

Intended for generating short abstractive summaries of legislative text in the style of billsum. Note that the evaluation generation length is 19.0 tokens, so the reported ROUGE scores reflect very short summaries; longer outputs require raising `max_length` at generation time, and performance outside the legislative domain has not been evaluated here.

## Training and evaluation data

Training and evaluation used the billsum dataset of United States congressional bills paired with reference summaries. The exact split is not recorded in this card, but the 31 optimizer steps per epoch at an effective batch size of 32 imply roughly 990 training examples.
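
A hedged preprocessing sketch, assuming the standard billsum summarization recipe (the `ca_test` subset split 80/20, which yields ~989 training examples and matches the step count above):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumption: the standard billsum recipe (ca_test subset, 80/20 train/test split).
billsum = load_dataset("billsum", split="ca_test").train_test_split(test_size=0.2)
tokenizer = AutoTokenizer.from_pretrained("t5-small")

def preprocess(batch):
    # T5 text-to-text convention: prefix each source document with the task name.
    inputs = ["summarize: " + doc for doc in batch["text"]]
    model_inputs = tokenizer(inputs, max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = billsum.map(preprocess, batched=True)
```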

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 15
- mixed_precision_training: Native AMP
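
A sketch of the corresponding `Seq2SeqTrainingArguments` (the `output_dir` is a placeholder and `fp16=True` stands in for "Native AMP"):

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="ft-t5-with-dill-sum",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,     # effective train batch size: 8 * 4 = 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=15,
    fp16=True,                         # "Native AMP" mixed precision
    predict_with_generate=True,        # generate during eval so ROUGE can be computed
    evaluation_strategy="epoch",       # the log below shows one eval per epoch
)
```

Worth noting: the run has only 465 optimizer steps in total (31 per epoch × 15 epochs), fewer than the 500 warmup steps, so the learning rate was still ramping up linearly when training ended and never reached the nominal 5e-05 peak.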

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.5462        | 1.0   | 31   | 2.4185          | 0.187  | 0.1023 | 0.1637 | 0.1639    | 19.0    |
| 2.5478        | 2.0   | 62   | 2.4166          | 0.187  | 0.1018 | 0.1637 | 0.1639    | 19.0    |
| 2.5729        | 3.0   | 93   | 2.4114          | 0.1868 | 0.1015 | 0.1637 | 0.1638    | 19.0    |
| 2.5806        | 4.0   | 124  | 2.4072          | 0.1855 | 0.1006 | 0.1626 | 0.1627    | 19.0    |
| 2.5231        | 5.0   | 155  | 2.4025          | 0.1877 | 0.1042 | 0.165  | 0.165     | 19.0    |
| 2.5245        | 6.0   | 186  | 2.3948          | 0.1869 | 0.1024 | 0.1642 | 0.1642    | 19.0    |
| 2.5273        | 7.0   | 217  | 2.3860          | 0.1886 | 0.1032 | 0.1652 | 0.1653    | 19.0    |
| 2.4941        | 8.0   | 248  | 2.3765          | 0.188  | 0.1033 | 0.1649 | 0.165     | 19.0    |
| 2.4612        | 9.0   | 279  | 2.3698          | 0.19   | 0.1057 | 0.1671 | 0.1671    | 19.0    |
| 2.463         | 10.0  | 310  | 2.3578          | 0.1882 | 0.1039 | 0.1662 | 0.1663    | 19.0    |
| 2.4539        | 11.0  | 341  | 2.3491          | 0.1898 | 0.1057 | 0.1667 | 0.1667    | 19.0    |
| 2.441         | 12.0  | 372  | 2.3392          | 0.1901 | 0.1055 | 0.1669 | 0.1668    | 19.0    |
| 2.4389        | 13.0  | 403  | 2.3292          | 0.1893 | 0.1053 | 0.1666 | 0.1665    | 19.0    |
| 2.3945        | 14.0  | 434  | 2.3203          | 0.1903 | 0.1051 | 0.1676 | 0.1675    | 19.0    |
| 2.4148        | 15.0  | 465  | 2.3109          | 0.1886 | 0.104  | 0.166  | 0.1659    | 19.0    |
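
A minimal sketch of the per-epoch metric computation, assuming the standard 🤗 `evaluate` ROUGE setup (the scores above are in the 0-1 range this package returns):

```python
import evaluate

rouge = evaluate.load("rouge")
result = rouge.compute(
    predictions=["the act amends the social security act"],  # model outputs
    references=["this act amends title xviii of the social security act"],
    use_stemmer=True,
)
print(result)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```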


### Framework versions

- Transformers 4.41.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1