---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: long_t5
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# long_t5

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.5158
- Rouge1: 0.5214
- Rouge2: 0.3347
- Rougel: 0.4751
- Rougelsum: 0.4746
- Gen Len: 25.9513

## Model description

More information needed

## Intended uses & limitations

More information needed. Based on the reported ROUGE metrics and the average generation length of roughly 26 tokens, this checkpoint appears intended for abstractive summarization of long input documents.
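
As a sketch of how a fine-tuned LongT5 summarization checkpoint like this one might be used (the checkpoint path and the generation settings below are illustrative assumptions, not values confirmed by this card):

```python
# Illustrative usage sketch: loading a LongT5 summarization fine-tune.
# "long_t5" below is an assumed local path / repo ID; substitute the real one.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "long_t5"  # hypothetical; replace with the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

document = "..."  # a long input document to summarize
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# Eval Gen Len is ~26 tokens, so a modest max_new_tokens seems reasonable;
# beam search is a common (assumed) choice for summarization decoding.
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```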

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
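
The hyperparameters above roughly correspond to a `Seq2SeqTrainingArguments` configuration like the following sketch; `output_dir`, `eval_strategy`, and `predict_with_generate` are assumptions not stated in this card (per-epoch evaluation and generation-based metrics are inferred from the results table):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reconstructing the listed hyperparameters; unlisted arguments are guesses.
training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5",        # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",  # linear decay toward 0 over training
    num_train_epochs=20,
    eval_strategy="epoch",       # assumption: the table reports per-epoch eval
    predict_with_generate=True,  # assumption: needed to compute ROUGE / Gen Len
)
```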

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.232         | 1.0   | 1600  | 1.6810          | 0.4704 | 0.2861 | 0.4256 | 0.4251    | 26.6112 |
| 2.0229        | 2.0   | 3200  | 1.6167          | 0.4859 | 0.2991 | 0.4412 | 0.4407    | 26.1006 |
| 1.9239        | 3.0   | 4800  | 1.5805          | 0.4924 | 0.3049 | 0.4475 | 0.4468    | 26.8169 |
| 1.8454        | 4.0   | 6400  | 1.5669          | 0.4968 | 0.3093 | 0.4517 | 0.4511    | 25.925  |
| 1.7626        | 5.0   | 8000  | 1.5432          | 0.4973 | 0.3132 | 0.453  | 0.4525    | 26.4362 |
| 1.6995        | 6.0   | 9600  | 1.5352          | 0.5045 | 0.3188 | 0.4596 | 0.459     | 26.1219 |
| 1.682         | 7.0   | 11200 | 1.5255          | 0.5066 | 0.3198 | 0.4613 | 0.4609    | 26.1581 |
| 1.6286        | 8.0   | 12800 | 1.5210          | 0.5113 | 0.3245 | 0.4663 | 0.466     | 26.1725 |
| 1.593         | 9.0   | 14400 | 1.5195          | 0.5102 | 0.3235 | 0.464  | 0.4638    | 25.8944 |
| 1.5784        | 10.0  | 16000 | 1.5166          | 0.5133 | 0.3265 | 0.4665 | 0.4661    | 25.685  |
| 1.5615        | 11.0  | 17600 | 1.5135          | 0.5161 | 0.3284 | 0.47   | 0.4695    | 25.8681 |
| 1.5391        | 12.0  | 19200 | 1.5106          | 0.5156 | 0.3303 | 0.4703 | 0.4701    | 26.1781 |
| 1.5077        | 13.0  | 20800 | 1.5095          | 0.5177 | 0.3317 | 0.4724 | 0.4721    | 26.0456 |
| 1.4923        | 14.0  | 22400 | 1.5163          | 0.5185 | 0.3321 | 0.4728 | 0.4723    | 26.17   |
| 1.4545        | 15.0  | 24000 | 1.5128          | 0.5181 | 0.3337 | 0.4727 | 0.4724    | 25.8219 |
| 1.4489        | 16.0  | 25600 | 1.5135          | 0.5209 | 0.3349 | 0.4744 | 0.4743    | 26.0369 |
| 1.4481        | 17.0  | 27200 | 1.5153          | 0.5218 | 0.3349 | 0.4751 | 0.4748    | 26.1744 |
| 1.4287        | 18.0  | 28800 | 1.5134          | 0.521  | 0.335  | 0.4752 | 0.4747    | 25.9525 |
| 1.389         | 19.0  | 30400 | 1.5155          | 0.5212 | 0.3348 | 0.4756 | 0.4751    | 26.0369 |
| 1.4215        | 20.0  | 32000 | 1.5158          | 0.5214 | 0.3347 | 0.4751 | 0.4746    | 25.9513 |
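
Note that the final epoch is not necessarily the best checkpoint: validation loss bottoms out at epoch 13 and Rouge1 peaks at epoch 17. A quick check over the numbers transcribed from the table above:

```python
# Per-epoch validation Rouge1 and loss, transcribed from the results table.
rouge1 = [0.4704, 0.4859, 0.4924, 0.4968, 0.4973, 0.5045, 0.5066, 0.5113,
          0.5102, 0.5133, 0.5161, 0.5156, 0.5177, 0.5185, 0.5181, 0.5209,
          0.5218, 0.5210, 0.5212, 0.5214]
val_loss = [1.6810, 1.6167, 1.5805, 1.5669, 1.5432, 1.5352, 1.5255, 1.5210,
            1.5195, 1.5166, 1.5135, 1.5106, 1.5095, 1.5163, 1.5128, 1.5135,
            1.5153, 1.5134, 1.5155, 1.5158]

# Epochs are 1-indexed, so add 1 to the argmax/argmin positions.
best_rouge_epoch = max(range(len(rouge1)), key=rouge1.__getitem__) + 1
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_rouge_epoch, best_loss_epoch)  # → 17 13
```

If checkpoint selection matters for your use case, loading an intermediate checkpoint (where saved) may give slightly better metrics than the final one.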


### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.1+cu118
- Datasets 2.20.0
- Tokenizers 0.19.1