metadata
license: mit
tags:
  - generated_from_trainer
datasets:
  - it5/datasets
metrics:
  - rouge
model-index:
  - name: it5-efficient-small-el32-st_r2g-0.0003
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: it5/datasets st_r2g
          type: it5/datasets
          args: st_r2g
        metrics:
          - name: Rouge1
            type: rouge
            value: 30.0502

it5-efficient-small-el32-st_r2g-0.0003

This model is a fine-tuned version of stefan-it/it5-efficient-small-el32 on the it5/datasets st_r2g dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6135
  • Rouge1: 30.0502
  • Rouge2: 11.5687
  • Rougel: 26.5953
  • Rougelsum: 27.0402
  • Gen Len: 16.9578
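
For readers unfamiliar with the metric, the sketch below shows roughly what a Rouge1 score measures: the F1 of unigram overlap between a reference and a generated text. It is a simplified illustration that assumes whitespace tokenization and lowercasing. It is not the scoring code used for this card, which would normally apply its own tokenization, so it will not reproduce the numbers above exactly. The function name `rouge1_f1` is hypothetical. The scores in this card appear to be on a 0-100 scale, so the value returned here would be multiplied by 100 to compare.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on whitespace tokens (illustrative only)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up Italian example strings, not taken from the dataset.
    score = rouge1_f1("il governo approva la nuova legge", "approvata la nuova legge")
    print(f"Rouge1 (0-100 scale): {100 * score:.2f}")
```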

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
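
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate above, the following sketch computes a linearly decaying rate in plain Python. It assumes no warmup steps, which the card does not state, and it estimates the total number of steps from the last logged row of the training results (step 65000 at epoch 9.68), so the total is approximate. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-4, warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Assumption: total steps estimated from the last logged row
    # (step 65000 at epoch 9.68) and num_epochs = 10.
    estimated_total = round(65000 / 9.68 * 10)
    for step in (0, 5000, 30000, 65000):
        print(step, f"{linear_lr(step, estimated_total):.6f}")
```

Under these assumptions the rate starts at 0.0003 and reaches zero at the end of the tenth epoch.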

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 3.1265        | 0.74  | 5000  | 2.7247          | 26.8378 | 9.3464  | 23.9521 | 24.2837   | 15.5914 |
| 2.8786        | 1.49  | 10000 | 2.6532          | 27.5869 | 10.0861 | 24.7406 | 25.0245   | 15.3272 |
| 2.6587        | 2.23  | 15000 | 2.6080          | 28.2336 | 10.5229 | 25.3053 | 25.6716   | 15.4338 |
| 2.664         | 2.98  | 20000 | 2.5630          | 28.6673 | 10.8421 | 25.7032 | 26.0245   | 15.6255 |
| 2.4896        | 3.72  | 25000 | 2.5679          | 28.842  | 10.885  | 25.6757 | 26.0633   | 16.1841 |
| 2.34          | 4.47  | 30000 | 2.5564          | 29.3246 | 11.1981 | 26.1637 | 26.5392   | 15.7826 |
| 2.2204        | 5.21  | 35000 | 2.5744          | 29.5545 | 11.3806 | 26.3237 | 26.6993   | 15.8374 |
| 2.2301        | 5.96  | 40000 | 2.5614          | 29.5872 | 11.4227 | 26.3139 | 26.7196   | 15.7213 |
| 2.1219        | 6.7   | 45000 | 2.5617          | 29.8256 | 11.3702 | 26.4156 | 26.8465   | 15.936  |
| 2.007         | 7.45  | 50000 | 2.6014          | 29.743  | 11.4336 | 26.38   | 26.772    | 15.7144 |
| 1.9398        | 8.19  | 55000 | 2.6080          | 29.9478 | 11.4801 | 26.5352 | 26.9746   | 15.9308 |
| 1.9426        | 8.94  | 60000 | 2.6022          | 30.097  | 11.5602 | 26.705  | 27.1092   | 15.8598 |
| 1.8853        | 9.68  | 65000 | 2.6138          | 30.1588 | 11.5823 | 26.6984 | 27.1371   | 15.803  |
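
The logged results can be scanned for the best checkpoint under different criteria. The sketch below copies the Step, Validation Loss and Rouge1 columns from the results above and picks the row with the lowest validation loss and the row with the highest Rouge1. The card does not say which criterion, if any, was used to select the released checkpoint, so this is only a way to read the numbers, not a description of the training setup. The variable names are hypothetical.

```python
# (step, validation_loss, rouge1) copied from the training results above.
ROWS = [
    (5000, 2.7247, 26.8378),
    (10000, 2.6532, 27.5869),
    (15000, 2.6080, 28.2336),
    (20000, 2.5630, 28.6673),
    (25000, 2.5679, 28.842),
    (30000, 2.5564, 29.3246),
    (35000, 2.5744, 29.5545),
    (40000, 2.5614, 29.5872),
    (45000, 2.5617, 29.8256),
    (50000, 2.6014, 29.743),
    (55000, 2.6080, 29.9478),
    (60000, 2.6022, 30.097),
    (65000, 2.6138, 30.1588),
]

# Lowest validation loss and highest Rouge1 need not fall on the same step.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

In these logs the validation loss is lowest at step 30000, while Rouge1 keeps improving up to the last logged step, so the two criteria point at different checkpoints.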

Framework versions

  • Transformers 4.15.0
  • PyTorch 1.10.0+cu102
  • Datasets 1.17.0
  • Tokenizers 0.10.3