---
license: mit
tags:
- generated_from_trainer
datasets:
- it5/datasets
metrics:
- rouge
model-index:
- name: it5-efficient-small-el32-st_r2g-0.0003
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: it5/datasets st_r2g
      type: it5/datasets
      args: st_r2g
    metrics:
    - name: Rouge1
      type: rouge
      value: 30.0502
---

# it5-efficient-small-el32-st_r2g-0.0003

This model is a fine-tuned version of [stefan-it/it5-efficient-small-el32](https://huggingface.co/stefan-it/it5-efficient-small-el32) on the it5/datasets st_r2g dataset.
It achieves the following results on the evaluation set:
- Loss: 2.6135
- Rouge1: 30.0502
- Rouge2: 11.5687
- Rougel: 26.5953
- Rougelsum: 27.0402
- Gen Len: 16.9578
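
A minimal sketch of how a checkpoint like this one could be loaded for inference with the Transformers seq2seq classes. The helper name `generate` is hypothetical, and `model_id` is deliberately left as a required argument: this card does not state the Hub namespace the weights are published under, so the caller has to supply the full repository id or a local path. The `max_length=32` default is an assumption, chosen only to sit above the reported average generation length of about 17 tokens.

```python
def generate(text: str, model_id: str, max_length: int = 32) -> str:
    """Run one input string through a fine-tuned seq2seq checkpoint.

    model_id is a Hub repository id or a local directory; it is NOT given
    in this card and must be supplied by the caller.
    """
    # Imported lazily so that defining the helper does not require
    # transformers/torch to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long inputs to the tokenizer's maximum length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Default generation settings; max_length caps the output length.
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Loading the tokenizer and model on every call keeps the sketch short; for repeated use they would normally be loaded once and reused.
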

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
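
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step using the same multiplier as the Transformers linear schedule: a linear ramp during warmup, then a linear decay to zero. The function name `linear_lr` is hypothetical. The card lists no warmup setting, so `warmup_steps=0` is an assumption, and the total number of training steps is not stated either, so it is left as an argument.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 3e-4, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear warmup/decay schedule.

    base_lr defaults to the learning_rate listed above (0.0003).
    warmup_steps=0 is an assumption; the card does not list warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0.0, float(total_steps - step))
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Illustrative total of 1000 steps, not the real run length.
    for s in (0, 250, 500, 1000):
        print(s, linear_lr(s, total_steps=1000))
```

With no warmup, the rate starts at the full 0.0003 and reaches half that value at the midpoint of training.
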

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.1265        | 0.74  | 5000  | 2.7247          | 26.8378 | 9.3464  | 23.9521 | 24.2837   | 15.5914 |
| 2.8786        | 1.49  | 10000 | 2.6532          | 27.5869 | 10.0861 | 24.7406 | 25.0245   | 15.3272 |
| 2.6587        | 2.23  | 15000 | 2.6080          | 28.2336 | 10.5229 | 25.3053 | 25.6716   | 15.4338 |
| 2.664         | 2.98  | 20000 | 2.5630          | 28.6673 | 10.8421 | 25.7032 | 26.0245   | 15.6255 |
| 2.4896        | 3.72  | 25000 | 2.5679          | 28.842  | 10.885  | 25.6757 | 26.0633   | 16.1841 |
| 2.34          | 4.47  | 30000 | 2.5564          | 29.3246 | 11.1981 | 26.1637 | 26.5392   | 15.7826 |
| 2.2204        | 5.21  | 35000 | 2.5744          | 29.5545 | 11.3806 | 26.3237 | 26.6993   | 15.8374 |
| 2.2301        | 5.96  | 40000 | 2.5614          | 29.5872 | 11.4227 | 26.3139 | 26.7196   | 15.7213 |
| 2.1219        | 6.7   | 45000 | 2.5617          | 29.8256 | 11.3702 | 26.4156 | 26.8465   | 15.936  |
| 2.007         | 7.45  | 50000 | 2.6014          | 29.743  | 11.4336 | 26.38   | 26.772    | 15.7144 |
| 1.9398        | 8.19  | 55000 | 2.6080          | 29.9478 | 11.4801 | 26.5352 | 26.9746   | 15.9308 |
| 1.9426        | 8.94  | 60000 | 2.6022          | 30.097  | 11.5602 | 26.705  | 27.1092   | 15.8598 |
| 1.8853        | 9.68  | 65000 | 2.6138          | 30.1588 | 11.5823 | 26.6984 | 27.1371   | 15.803  |
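
The Step and Epoch columns can be used to estimate the number of optimizer steps per epoch, and the metric columns to find the strongest logged checkpoint. The sketch below copies four columns from the table. The names `LOG`, `steps_per_epoch`, and `best_by` are hypothetical. The training-set size estimate assumes each optimizer step consumes exactly `train_batch_size` = 8 examples (single device, no gradient accumulation), which the card does not confirm. Because the Epoch column is rounded to two decimals, all derived figures are approximate.

```python
# (step, epoch, validation_loss, rouge1), copied from the table above.
LOG = [
    (5000, 0.74, 2.7247, 26.8378),
    (10000, 1.49, 2.6532, 27.5869),
    (15000, 2.23, 2.6080, 28.2336),
    (20000, 2.98, 2.5630, 28.6673),
    (25000, 3.72, 2.5679, 28.842),
    (30000, 4.47, 2.5564, 29.3246),
    (35000, 5.21, 2.5744, 29.5545),
    (40000, 5.96, 2.5614, 29.5872),
    (45000, 6.7, 2.5617, 29.8256),
    (50000, 7.45, 2.6014, 29.743),
    (55000, 8.19, 2.6080, 29.9478),
    (60000, 8.94, 2.6022, 30.097),
    (65000, 9.68, 2.6138, 30.1588),
]

TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def steps_per_epoch(log):
    """Estimate steps per epoch from the last row, where the
    two-decimal rounding of the epoch value matters least."""
    step, epoch, _, _ = log[-1]
    return step / epoch


def best_by(log, index, lowest):
    """Return the row with the lowest (or highest) value in column `index`."""
    pick = min if lowest else max
    return pick(log, key=lambda row: row[index])


if __name__ == "__main__":
    spe = steps_per_epoch(LOG)
    print("approx. steps per epoch:", round(spe))
    # Assumes one step == TRAIN_BATCH_SIZE examples (see lead-in).
    print("approx. training examples:", round(spe * TRAIN_BATCH_SIZE))
    print("lowest validation loss at step:", best_by(LOG, 2, lowest=True)[0])
    print("highest Rouge1 at step:", best_by(LOG, 3, lowest=False)[0])
```

On these numbers the validation loss is lowest at step 30000 while Rouge1 keeps rising through step 65000, so which checkpoint counts as best depends on the selection metric.
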

### Framework versions

- Transformers 4.15.0
- PyTorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3