---
license: mit
tags:
  - generated_from_trainer
datasets:
  - it5/datasets
metrics:
  - rouge
model-index:
  - name: it5-efficient-small-el32-fst-i2f-0.0003
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: it5/datasets fst
          type: it5/datasets
          args: fst
        metrics:
          - name: Rouge1
            type: rouge
            value: 56.585
---

it5-efficient-small-el32-fst-i2f-0.0003

This model is a fine-tuned version of stefan-it/it5-efficient-small-el32 on the it5/datasets fst dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2160
  • Rouge1: 56.585
  • Rouge2: 36.9335
  • Rougel: 53.7782
  • Rougelsum: 53.7779
  • Gen Len: 13.0891
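
Since the base checkpoint stefan-it/it5-efficient-small-el32 is a T5-style encoder-decoder, the fine-tuned model can be loaded as a standard seq2seq checkpoint with Hugging Face Transformers. The snippet below is a minimal sketch: the repository id, the example sentence, and the generation settings are assumptions, not values taken from this card.

```python
# Minimal inference sketch; the repository id is assumed from the model name in
# this card and may differ from the actual Hub id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "it5-efficient-small-el32-fst-i2f-0.0003"  # assumed Hub id or local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical informal Italian input; "i2f" presumably denotes informal-to-formal
# style transfer, so the model should rewrite it in a formal register.
text = "ciao, volevo sapere se il pacco arriva domani"

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```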

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
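
For readers who want to set up a comparable run with the Hugging Face Trainer, the values above map onto Seq2SeqTrainingArguments roughly as sketched below. This is not the original training script; the output directory is a placeholder and every argument simply restates the listed hyperparameters.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters listed above;
# output_dir is a placeholder, not taken from the original configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="it5-efficient-small-el32-fst-i2f-0.0003",
    learning_rate=3e-4,             # learning_rate: 0.0003
    per_device_train_batch_size=8,  # train_batch_size: 8
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=42,
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # epsilon: 1e-08
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
    predict_with_generate=True,     # generate during eval so ROUGE can be computed
)
```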

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.9377 | 0.35 | 5000 | 2.5157 | 54.6148 | 35.1518 | 51.8908 | 51.8957 | 12.8717 |
| 2.803 | 0.7 | 10000 | 2.4086 | 55.641 | 36.1214 | 52.8683 | 52.8572 | 12.7513 |
| 2.5483 | 1.05 | 15000 | 2.3420 | 55.6604 | 36.0085 | 52.9599 | 52.9433 | 12.7754 |
| 2.4978 | 1.39 | 20000 | 2.3145 | 56.204 | 36.5896 | 53.338 | 53.3351 | 12.8804 |
| 2.5383 | 1.74 | 25000 | 2.2697 | 56.1356 | 36.6963 | 53.3579 | 53.3664 | 12.795 |
| 2.3368 | 2.09 | 30000 | 2.2603 | 56.0271 | 36.4249 | 53.3113 | 53.3272 | 12.7478 |
| 2.371 | 2.44 | 35000 | 2.2328 | 56.5041 | 36.8718 | 53.8064 | 53.7995 | 12.8243 |
| 2.3567 | 2.79 | 40000 | 2.2079 | 56.5318 | 36.9437 | 53.8359 | 53.8254 | 12.6851 |
| 2.1753 | 3.14 | 45000 | 2.2168 | 56.3831 | 36.8896 | 53.6542 | 53.6708 | 12.67 |
| 2.2069 | 3.48 | 50000 | 2.2055 | 56.7171 | 37.1665 | 53.9299 | 53.9259 | 12.8014 |
| 2.2396 | 3.83 | 55000 | 2.1801 | 56.936 | 37.5465 | 54.1064 | 54.1125 | 12.7989 |
| 2.0657 | 4.18 | 60000 | 2.1915 | 56.6312 | 37.1618 | 53.8646 | 53.8791 | 12.6987 |
| 2.0806 | 4.53 | 65000 | 2.1809 | 56.6599 | 37.1282 | 53.8838 | 53.8781 | 12.715 |
| 2.0933 | 4.88 | 70000 | 2.1771 | 56.5891 | 36.9461 | 53.8058 | 53.8087 | 12.6593 |
| 1.9949 | 5.23 | 75000 | 2.1932 | 56.4956 | 36.9679 | 53.7634 | 53.7731 | 12.6723 |
| 1.9954 | 5.57 | 80000 | 2.1813 | 56.4827 | 36.8319 | 53.6397 | 53.6399 | 12.6599 |
| 1.9912 | 5.92 | 85000 | 2.1755 | 56.6723 | 37.0432 | 53.8339 | 53.8233 | 12.7534 |
| 1.9068 | 6.27 | 90000 | 2.1849 | 56.6574 | 37.0691 | 53.9029 | 53.892 | 12.7037 |
| 1.9173 | 6.62 | 95000 | 2.1787 | 56.5701 | 36.861 | 53.6855 | 53.6699 | 12.6467 |
| 1.9131 | 6.97 | 100000 | 2.1862 | 56.7175 | 37.0749 | 53.8761 | 53.8794 | 12.7072 |
| 1.8164 | 7.32 | 105000 | 2.1999 | 56.6104 | 37.0809 | 53.8098 | 53.8216 | 12.6364 |
| 1.8489 | 7.66 | 110000 | 2.1945 | 56.6645 | 37.1267 | 53.9009 | 53.9008 | 12.5741 |
| 1.82 | 8.01 | 115000 | 2.2075 | 56.6075 | 37.0359 | 53.8792 | 53.8833 | 12.6428 |
| 1.772 | 8.36 | 120000 | 2.2067 | 56.4716 | 36.8675 | 53.6826 | 53.6742 | 12.6591 |
| 1.7795 | 8.71 | 125000 | 2.2056 | 56.4112 | 36.9011 | 53.6554 | 53.6495 | 12.608 |
| 1.72 | 9.06 | 130000 | 2.2197 | 56.4735 | 36.9255 | 53.6592 | 53.6463 | 12.6758 |
| 1.7174 | 9.41 | 135000 | 2.2169 | 56.4209 | 36.8139 | 53.5778 | 53.5685 | 12.6568 |
| 1.7466 | 9.75 | 140000 | 2.2165 | 56.3715 | 36.767 | 53.555 | 53.5468 | 12.6416 |
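
The ROUGE columns above can be reproduced for new predictions with the `rouge` metric from the Datasets library (version 1.17.0, listed below). The snippet is a sketch: the example strings are invented, and it assumes the `rouge_score` package is installed.

```python
# Sketch of ROUGE scoring with the Datasets 1.x metric API; the prediction and
# reference strings are placeholders, not examples from it5/datasets.
from datasets import load_metric

rouge = load_metric("rouge")  # requires the rouge_score package

predictions = ["gentile cliente, il pacco sarà consegnato domani"]
references = ["gentile cliente, il suo pacco verrà consegnato domani"]

scores = rouge.compute(predictions=predictions, references=references)

# Report the aggregated mid F-measure scaled to 0-100, matching the
# Rouge1 / Rouge2 / RougeL / RougeLsum convention used in the table above.
for key in ("rouge1", "rouge2", "rougeL", "rougeLsum"):
    print(key, round(scores[key].mid.fmeasure * 100, 4))
```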

Framework versions

  • Transformers 4.15.0
  • Pytorch 1.10.0+cu102
  • Datasets 1.17.0
  • Tokenizers 0.10.3