---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flan-t5-large-work-filters
  results: []
---


# flan-t5-large-work-filters

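This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base). The Trainer did not record the name of the training dataset. Note that the model name says `large`, while the recorded base checkpoint is `flan-t5-base`.
It achieves the following results on the evaluation set (these values match the epoch 21 row, step 4473, of the training results table):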
- Loss: 0.0362
- Rouge1: 41.8961
- Rouge2: 31.4402
- RougeL: 41.841
- RougeLsum: 41.9024
- Gen Len: 18.9259
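
For readers unfamiliar with the metric, the following is a minimal sketch of a ROUGE-1 F-measure, assuming plain whitespace tokenization and lowercasing. It is not the evaluation code used for this model: the `rouge` metric typically applies its own tokenization and optional stemming, so its values can differ. The function name `rouge1_f` is hypothetical. The scores in this card appear to be reported on a 0 to 100 scale, so the example multiplies by 100.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Unigram-overlap F-measure between two strings, in the range 0.0 to 1.0.

    Assumption: whitespace tokenization and lowercasing only (no stemming).
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f("the cat sat", "the cat sat on the mat")
print(round(100 * score, 4))  # 66.6667
```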

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
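
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule and bounds the training set size. It assumes zero warmup steps, a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated in this card. The 213 steps per epoch come from the training results table. The helper names are hypothetical.

```python
# Values taken from this card.
LEARNING_RATE = 3e-05
NUM_EPOCHS = 30
TRAIN_BATCH_SIZE = 8
STEPS_PER_EPOCH = 213  # from the "Step" column of the training results table
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 6390


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to 0.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


def train_set_size_bounds() -> tuple:
    """Smallest and largest example counts that give STEPS_PER_EPOCH batches.

    Assumption: one optimizer step per batch and the final partial batch kept.
    """
    upper = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
    lower = upper - TRAIN_BATCH_SIZE + 1
    return lower, upper


print(TOTAL_STEPS)              # 6390
print(linear_lr(0))             # 3e-05
print(linear_lr(TOTAL_STEPS))   # 0.0
print(train_set_size_bounds())  # (1697, 1704)
```

Under these assumptions the learning rate falls to half its initial value at step 3195 (the end of epoch 15), and the training set holds between 1,697 and 1,704 examples.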

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.5256        | 1.0   | 213  | 0.2899          | 41.8953 | 29.9382 | 41.4023 | 41.4413   | 18.9788 |
| 0.2377        | 2.0   | 426  | 0.1172          | 42.3662 | 31.0031 | 41.997  | 42.0997   | 19.0    |
| 0.1501        | 3.0   | 639  | 0.1091          | 42.0009 | 31.2986 | 41.8735 | 41.9067   | 19.0    |
| 0.1256        | 4.0   | 852  | 0.0905          | 43.6233 | 32.9567 | 43.5606 | 43.6054   | 18.9788 |
| 0.0997        | 5.0   | 1065 | 0.0936          | 43.4929 | 32.9118 | 43.5026 | 43.5589   | 18.9577 |
| 0.0792        | 6.0   | 1278 | 0.0743          | 43.3921 | 32.8388 | 43.3863 | 43.4487   | 18.9577 |
| 0.0738        | 7.0   | 1491 | 0.0613          | 42.3912 | 31.6893 | 42.3324 | 42.375    | 18.9577 |
| 0.0621        | 8.0   | 1704 | 0.0753          | 42.4408 | 31.7954 | 42.391  | 42.4501   | 18.9577 |
| 0.0664        | 9.0   | 1917 | 0.0568          | 42.0348 | 31.4631 | 41.9591 | 42.0159   | 18.9577 |
| 0.0575        | 10.0  | 2130 | 0.0576          | 43.0601 | 32.8756 | 42.9724 | 43.0502   | 18.9577 |
| 0.0488        | 11.0  | 2343 | 0.0473          | 42.3785 | 31.845  | 42.2759 | 42.37     | 18.9577 |
| 0.0528        | 12.0  | 2556 | 0.0503          | 43.1495 | 32.7992 | 43.1017 | 43.1919   | 18.9577 |
| 0.0392        | 13.0  | 2769 | 0.0407          | 42.0459 | 31.7063 | 41.9685 | 42.0368   | 18.9259 |
| 0.0462        | 14.0  | 2982 | 0.0446          | 43.473  | 33.1682 | 43.4607 | 43.5482   | 18.9259 |
| 0.0449        | 15.0  | 3195 | 0.0426          | 43.2263 | 32.5799 | 43.2171 | 43.255    | 18.9577 |
| 0.0432        | 16.0  | 3408 | 0.0419          | 42.2094 | 31.7081 | 42.1549 | 42.2244   | 18.9577 |
| 0.037         | 17.0  | 3621 | 0.0398          | 42.2089 | 31.5243 | 42.1439 | 42.213    | 18.9259 |
| 0.0376        | 18.0  | 3834 | 0.0402          | 42.624  | 31.7967 | 42.5462 | 42.6104   | 18.9259 |
| 0.0423        | 19.0  | 4047 | 0.0406          | 42.6076 | 31.9496 | 42.5665 | 42.6086   | 18.9259 |
| 0.0364        | 20.0  | 4260 | 0.0406          | 43.4863 | 33.0331 | 43.4492 | 43.5222   | 18.9259 |
| 0.0326        | 21.0  | 4473 | 0.0362          | 41.8961 | 31.4402 | 41.841  | 41.9024   | 18.9259 |
| 0.0302        | 22.0  | 4686 | 0.0410          | 42.9891 | 32.761  | 42.9509 | 42.9624   | 18.9259 |
| 0.0318        | 23.0  | 4899 | 0.0411          | 42.861  | 32.4727 | 42.8046 | 42.8544   | 18.9259 |
| 0.034         | 24.0  | 5112 | 0.0387          | 42.6177 | 32.1915 | 42.4974 | 42.5653   | 18.9259 |
| 0.0307        | 25.0  | 5325 | 0.0373          | 43.2371 | 32.9299 | 43.2075 | 43.2857   | 18.9259 |
| 0.0308        | 26.0  | 5538 | 0.0377          | 42.8476 | 32.5806 | 42.7802 | 42.837    | 18.9259 |
| 0.0282        | 27.0  | 5751 | 0.0381          | 42.9285 | 32.4737 | 42.8965 | 42.945    | 18.9259 |
| 0.0277        | 28.0  | 5964 | 0.0383          | 42.6384 | 31.6781 | 42.567  | 42.6305   | 18.9259 |
| 0.0316        | 29.0  | 6177 | 0.0380          | 42.9983 | 32.5656 | 42.9407 | 42.9974   | 18.9259 |
| 0.0357        | 30.0  | 6390 | 0.0378          | 43.0447 | 32.5656 | 43.0102 | 43.0802   | 18.9259 |
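
The table can be read in two ways when picking a checkpoint, and the sketch below makes the difference concrete. It copies the Epoch, Validation Loss, and Rouge1 columns from the table and selects the best epoch under each criterion. Which criterion was actually used to choose the reported checkpoint is not stated in this card; the variable names are illustrative.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 0.2899, 41.8953), (2, 0.1172, 42.3662), (3, 0.1091, 42.0009),
    (4, 0.0905, 43.6233), (5, 0.0936, 43.4929), (6, 0.0743, 43.3921),
    (7, 0.0613, 42.3912), (8, 0.0753, 42.4408), (9, 0.0568, 42.0348),
    (10, 0.0576, 43.0601), (11, 0.0473, 42.3785), (12, 0.0503, 43.1495),
    (13, 0.0407, 42.0459), (14, 0.0446, 43.4730), (15, 0.0426, 43.2263),
    (16, 0.0419, 42.2094), (17, 0.0398, 42.2089), (18, 0.0402, 42.6240),
    (19, 0.0406, 42.6076), (20, 0.0406, 43.4863), (21, 0.0362, 41.8961),
    (22, 0.0410, 42.9891), (23, 0.0411, 42.8610), (24, 0.0387, 42.6177),
    (25, 0.0373, 43.2371), (26, 0.0377, 42.8476), (27, 0.0381, 42.9285),
    (28, 0.0383, 42.6384), (29, 0.0380, 42.9983), (30, 0.0378, 43.0447),
]

# Criterion 1: lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[1])

# Criterion 2: highest Rouge1.
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print(best_by_loss)    # (21, 0.0362, 41.8961)
print(best_by_rouge1)  # (4, 0.0905, 43.6233)
```

The two criteria disagree: epoch 21 has the lowest validation loss but the second-lowest Rouge1 in the table, while Rouge1 peaks at epoch 4. Rouge1 varies within a band of under two points across all 30 epochs, so the differences between checkpoints on this metric are small.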


### Framework versions

- Transformers 4.27.2
- PyTorch 2.0.0+cu118
- Datasets 2.9.0
- Tokenizers 0.13.3