---
license: apache-2.0
tags:
- generated_from_trainer
- gov report
- long document
metrics:
- rouge
model-index:
- name: long-t5-base-govreport
  results: []
datasets:
- pszemraj/govreport-summarization-8192
language:
- en
library_name: transformers
pipeline_tag: summarization
---


# long-t5-base-govreport

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the [pszemraj/govreport-summarization-8192](https://huggingface.co/datasets/pszemraj/govreport-summarization-8192) dataset.
It achieves the following results on the evaluation set:
- Gen Len: 787.34
- Loss: 1.5448
- Rouge1: 57.2303
- Rouge2: 24.9705
- Rougel: 26.8081
- Rougelsum: 54.2747

## Model description

This model is [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) fine-tuned for abstractive summarization of long documents, specifically U.S. government reports. LongT5's transient-global attention lets the encoder process inputs far beyond the usual 512-token limit; the fine-tuning data packs source documents of up to 8,192 tokens.

## Intended uses & limitations

The model is intended for summarizing long, formal English documents such as government reports. Summaries are abstractive: they can contain factual errors or omissions and should be checked against the source text. Quality will likely degrade on domains and writing styles that differ substantially from GovReport.
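
A minimal usage sketch with the `transformers` summarization pipeline; the model ID below is a placeholder for this checkpoint's Hub path, and the generation settings are illustrative assumptions rather than tuned values:

```python
from transformers import pipeline

# Minimal sketch; MODEL_ID is a placeholder for this checkpoint's Hub path.
MODEL_ID = "long-t5-base-govreport"

summarizer = pipeline("summarization", model=MODEL_ID)

# The model was fine-tuned on source documents of up to ~8,192 tokens.
long_report = open("report.txt").read()

result = summarizer(
    long_report,
    max_length=1024,         # evaluation summaries averaged ~790 tokens
    no_repeat_ngram_size=3,  # assumption: helps reduce repetition in long outputs
    truncation=True,
)
print(result[0]["summary_text"])
```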

## Training and evaluation data

Refer to the [pszemraj/govreport-summarization-8192](https://huggingface.co/datasets/pszemraj/govreport-summarization-8192) dataset.
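
To inspect the data, it can be loaded with the `datasets` library; check the splits and column names directly rather than assuming them:

```python
from datasets import load_dataset

# Load the fine-tuning corpus and inspect its structure before use.
dataset = load_dataset("pszemraj/govreport-summarization-8192")
print(dataset)  # shows the available splits and column names
```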

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 1
- seed: 4299
- gradient_accumulation_steps: 128
- total_train_batch_size: 384
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 25.0
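
As a reproducibility aid, the list above maps onto `Seq2SeqTrainingArguments` roughly as sketched below; `output_dir` and `predict_with_generate` are assumptions, and the Adam betas/epsilon match the library defaults:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameter list above.
# output_dir and predict_with_generate are assumptions; Adam betas/epsilon
# are the library defaults (0.9, 0.999, 1e-8), matching the optimizer listed.
training_args = Seq2SeqTrainingArguments(
    output_dir="./long-t5-base-govreport",
    learning_rate=2e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=1,
    seed=4299,
    gradient_accumulation_steps=128,  # 3 x 128 = 384 total train batch size
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=25.0,
    predict_with_generate=True,  # needed to report Gen Len / ROUGE during eval
)
```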

### Training results

| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 2.1198        | 0.39  | 25   | 805.336 | 1.8720          | 29.4332 | 7.3761  | 17.0816 | 25.065    |
| 1.8609        | 0.78  | 50   | 833.404 | 1.7601          | 35.3533 | 10.6624 | 18.643  | 31.6979   |
| 1.7805        | 1.17  | 75   | 866.356 | 1.6833          | 36.5786 | 11.1185 | 20.0358 | 33.2116   |
| 1.7352        | 1.56  | 100  | 822.348 | 1.6524          | 40.5489 | 13.0695 | 20.1256 | 37.1369   |
| 1.7371        | 1.95  | 125  | 765.6   | 1.6294          | 43.8594 | 15.2962 | 20.7807 | 40.3461   |
| 1.6428        | 2.34  | 150  | 844.184 | 1.6055          | 44.5054 | 15.731  | 21.2582 | 40.9775   |
| 1.6567        | 2.73  | 175  | 857.236 | 1.6031          | 47.3641 | 16.9664 | 21.4998 | 43.994    |
| 1.5773        | 3.12  | 200  | 841.86  | 1.5855          | 47.2284 | 17.3099 | 21.6793 | 43.9018   |
| 1.5614        | 3.52  | 225  | 832.8   | 1.5883          | 46.4612 | 17.1368 | 21.5931 | 43.1184   |
| 1.5328        | 3.91  | 250  | 790.056 | 1.5730          | 46.5685 | 17.5423 | 22.2082 | 43.1811   |
| 1.5194        | 4.3   | 275  | 825.868 | 1.5690          | 47.6205 | 18.377  | 22.7639 | 44.3701   |
| 1.571         | 4.69  | 300  | 794.032 | 1.5676          | 49.2203 | 19.1109 | 22.8005 | 46.0679   |
| 1.4275        | 5.08  | 325  | 833.068 | 1.5656          | 50.6982 | 20.0278 | 23.5585 | 47.5036   |
| 1.4912        | 5.47  | 350  | 793.068 | 1.5625          | 50.3371 | 19.8639 | 23.3666 | 47.1898   |
| 1.4764        | 5.86  | 375  | 819.86  | 1.5532          | 50.9702 | 20.7532 | 23.8765 | 47.9915   |
| 1.3972        | 6.25  | 400  | 770.78  | 1.5564          | 49.279  | 19.4781 | 23.1018 | 46.1942   |
| 1.4479        | 6.64  | 425  | 806.244 | 1.5529          | 50.3317 | 20.2888 | 23.4454 | 47.3491   |
| 1.4567        | 7.03  | 450  | 787.48  | 1.5590          | 52.2209 | 21.2868 | 23.9284 | 49.1691   |
| 1.3933        | 7.42  | 475  | 842.664 | 1.5561          | 51.9578 | 20.5806 | 23.7177 | 48.9121   |
| 1.4245        | 7.81  | 500  | 813.772 | 1.5420          | 52.3725 | 21.7787 | 24.5209 | 49.4003   |
| 1.3033        | 8.2   | 525  | 824.66  | 1.5499          | 52.7839 | 21.589  | 24.5617 | 49.8609   |
| 1.3673        | 8.59  | 550  | 807.348 | 1.5530          | 53.2339 | 22.152  | 24.7587 | 50.2502   |
| 1.3634        | 8.98  | 575  | 767.952 | 1.5458          | 53.0293 | 22.3194 | 25.174  | 50.078    |
| 1.3095        | 9.37  | 600  | 856.252 | 1.5412          | 53.7658 | 22.5229 | 25.0448 | 50.708    |
| 1.3492        | 9.76  | 625  | 826.064 | 1.5389          | 51.8662 | 21.6229 | 24.6819 | 48.8648   |
| 1.3007        | 10.16 | 650  | 843.544 | 1.5404          | 53.6692 | 22.154  | 24.6218 | 50.6864   |
| 1.2729        | 10.55 | 675  | 808.764 | 1.5428          | 54.6479 | 23.3029 | 25.5647 | 51.6394   |
| 1.3758        | 10.94 | 700  | 800.152 | 1.5403          | 54.9418 | 23.3323 | 25.6087 | 51.9256   |
| 1.3357        | 11.33 | 725  | 814.496 | 1.5455          | 55.2511 | 23.5606 | 25.8237 | 52.3183   |
| 1.2817        | 11.72 | 750  | 811.144 | 1.5412          | 55.2847 | 23.6632 | 25.9341 | 52.3146   |
| 1.2771        | 12.11 | 775  | 852.704 | 1.5450          | 55.1956 | 23.5545 | 25.677  | 52.1841   |
| 1.2892        | 12.5  | 800  | 805.844 | 1.5369          | 54.9563 | 23.5105 | 25.8876 | 51.9568   |
| 1.2757        | 12.89 | 825  | 813.476 | 1.5467          | 56.4728 | 24.6875 | 26.4415 | 53.4939   |
| 1.2382        | 13.28 | 850  | 787.34  | 1.5448          | 57.2303 | 24.9705 | 26.8081 | 54.2747   |
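
The ROUGE figures above are on the 0-100 scale. Below is a sketch of how such scores are typically computed with the `evaluate` library; the exact metric wiring inside the training script is assumed:

```python
import evaluate

# Placeholder predictions/references; real evaluation decodes model outputs first.
rouge = evaluate.load("rouge")
predictions = ["the generated summary of a report ..."]
references = ["the reference summary of the same report ..."]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # rouge1/rouge2/rougeL/rougeLsum
```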


### Framework versions

- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.7.0
- Tokenizers 0.13.2