---
license: mit
tags:
- generated_from_trainer
datasets:
- it5/datasets
metrics:
- rouge
model-index:
- name: it5-efficient-small-el32-fst-i2f-0.0003
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: it5/datasets fst
      type: it5/datasets
      args: fst
    metrics:
    - name: Rouge1
      type: rouge
      value: 56.585
---


# it5-efficient-small-el32-fst-i2f-0.0003

This model is a fine-tuned version of [stefan-it/it5-efficient-small-el32](https://huggingface.co/stefan-it/it5-efficient-small-el32) on the `fst` configuration of the it5/datasets dataset.
It achieves the following results on the evaluation set:
- Loss: 2.2160
- Rouge1: 56.585
- Rouge2: 36.9335
- Rougel: 53.7782
- Rougelsum: 53.7779
- Gen Len: 13.0891

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
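
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay from 0.0003 to zero over training. It follows the shape of the linear schedule with warmup in Transformers, but it is an illustration rather than the Trainer's actual code. Two things are assumptions: `warmup_steps=0` (the card lists no warmup) and the total step count, which is estimated from the last logged row of the results table (step 140000 at epoch 9.75) and is approximate because the logged epochs are rounded.

```python
def linear_lr(step, total_steps, base_lr=3e-4, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: steps per epoch estimated from the last logged row
# (step 140000 at epoch 9.75); the epoch value is rounded in the log.
steps_per_epoch = round(140000 / 9.75)
total_steps = steps_per_epoch * 10  # num_epochs: 10.0

if __name__ == "__main__":
    for step in (0, 50000, 100000, 140000):
        print(step, f"{linear_lr(step, total_steps):.2e}")
```

Under these assumptions the rate is halved roughly midway through epoch 5 and is close to zero by the final logged checkpoint.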

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.9377        | 0.35  | 5000   | 2.5157          | 54.6148 | 35.1518 | 51.8908 | 51.8957   | 12.8717 |
| 2.803         | 0.7   | 10000  | 2.4086          | 55.641  | 36.1214 | 52.8683 | 52.8572   | 12.7513 |
| 2.5483        | 1.05  | 15000  | 2.3420          | 55.6604 | 36.0085 | 52.9599 | 52.9433   | 12.7754 |
| 2.4978        | 1.39  | 20000  | 2.3145          | 56.204  | 36.5896 | 53.338  | 53.3351   | 12.8804 |
| 2.5383        | 1.74  | 25000  | 2.2697          | 56.1356 | 36.6963 | 53.3579 | 53.3664   | 12.795  |
| 2.3368        | 2.09  | 30000  | 2.2603          | 56.0271 | 36.4249 | 53.3113 | 53.3272   | 12.7478 |
| 2.371         | 2.44  | 35000  | 2.2328          | 56.5041 | 36.8718 | 53.8064 | 53.7995   | 12.8243 |
| 2.3567        | 2.79  | 40000  | 2.2079          | 56.5318 | 36.9437 | 53.8359 | 53.8254   | 12.6851 |
| 2.1753        | 3.14  | 45000  | 2.2168          | 56.3831 | 36.8896 | 53.6542 | 53.6708   | 12.67   |
| 2.2069        | 3.48  | 50000  | 2.2055          | 56.7171 | 37.1665 | 53.9299 | 53.9259   | 12.8014 |
| 2.2396        | 3.83  | 55000  | 2.1801          | 56.936  | 37.5465 | 54.1064 | 54.1125   | 12.7989 |
| 2.0657        | 4.18  | 60000  | 2.1915          | 56.6312 | 37.1618 | 53.8646 | 53.8791   | 12.6987 |
| 2.0806        | 4.53  | 65000  | 2.1809          | 56.6599 | 37.1282 | 53.8838 | 53.8781   | 12.715  |
| 2.0933        | 4.88  | 70000  | 2.1771          | 56.5891 | 36.9461 | 53.8058 | 53.8087   | 12.6593 |
| 1.9949        | 5.23  | 75000  | 2.1932          | 56.4956 | 36.9679 | 53.7634 | 53.7731   | 12.6723 |
| 1.9954        | 5.57  | 80000  | 2.1813          | 56.4827 | 36.8319 | 53.6397 | 53.6399   | 12.6599 |
| 1.9912        | 5.92  | 85000  | 2.1755          | 56.6723 | 37.0432 | 53.8339 | 53.8233   | 12.7534 |
| 1.9068        | 6.27  | 90000  | 2.1849          | 56.6574 | 37.0691 | 53.9029 | 53.892    | 12.7037 |
| 1.9173        | 6.62  | 95000  | 2.1787          | 56.5701 | 36.861  | 53.6855 | 53.6699   | 12.6467 |
| 1.9131        | 6.97  | 100000 | 2.1862          | 56.7175 | 37.0749 | 53.8761 | 53.8794   | 12.7072 |
| 1.8164        | 7.32  | 105000 | 2.1999          | 56.6104 | 37.0809 | 53.8098 | 53.8216   | 12.6364 |
| 1.8489        | 7.66  | 110000 | 2.1945          | 56.6645 | 37.1267 | 53.9009 | 53.9008   | 12.5741 |
| 1.82          | 8.01  | 115000 | 2.2075          | 56.6075 | 37.0359 | 53.8792 | 53.8833   | 12.6428 |
| 1.772         | 8.36  | 120000 | 2.2067          | 56.4716 | 36.8675 | 53.6826 | 53.6742   | 12.6591 |
| 1.7795        | 8.71  | 125000 | 2.2056          | 56.4112 | 36.9011 | 53.6554 | 53.6495   | 12.608  |
| 1.72          | 9.06  | 130000 | 2.2197          | 56.4735 | 36.9255 | 53.6592 | 53.6463   | 12.6758 |
| 1.7174        | 9.41  | 135000 | 2.2169          | 56.4209 | 36.8139 | 53.5778 | 53.5685   | 12.6568 |
| 1.7466        | 9.75  | 140000 | 2.2165          | 56.3715 | 36.767  | 53.555  | 53.5468   | 12.6416 |
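
The table is easier to read with the best checkpoints picked out. The sketch below copies the step, validation loss, and Rouge1 columns from the table above and finds the row with the lowest validation loss and the row with the highest Rouge1. The names `ROWS`, `best_loss`, and `best_rouge1` are illustrative and are not part of the training code.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (5000, 2.5157, 54.6148),
    (10000, 2.4086, 55.641),
    (15000, 2.3420, 55.6604),
    (20000, 2.3145, 56.204),
    (25000, 2.2697, 56.1356),
    (30000, 2.2603, 56.0271),
    (35000, 2.2328, 56.5041),
    (40000, 2.2079, 56.5318),
    (45000, 2.2168, 56.3831),
    (50000, 2.2055, 56.7171),
    (55000, 2.1801, 56.936),
    (60000, 2.1915, 56.6312),
    (65000, 2.1809, 56.6599),
    (70000, 2.1771, 56.5891),
    (75000, 2.1932, 56.4956),
    (80000, 2.1813, 56.4827),
    (85000, 2.1755, 56.6723),
    (90000, 2.1849, 56.6574),
    (95000, 2.1787, 56.5701),
    (100000, 2.1862, 56.7175),
    (105000, 2.1999, 56.6104),
    (110000, 2.1945, 56.6645),
    (115000, 2.2075, 56.6075),
    (120000, 2.2067, 56.4716),
    (125000, 2.2056, 56.4112),
    (130000, 2.2197, 56.4735),
    (135000, 2.2169, 56.4209),
    (140000, 2.2165, 56.3715),
]

# Lowest validation loss and highest Rouge1 across all logged checkpoints.
best_loss = min(ROWS, key=lambda row: row[1])
best_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("highest Rouge1:", best_rouge1)
```

In this table the lowest validation loss (2.1755) is at step 85000 and the highest Rouge1 (56.936) is at step 55000. Training loss keeps falling after both points, which may indicate mild overfitting in the later epochs.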


### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3