metadata
license: mit
tags:
  - generated_from_trainer
datasets:
  - it5/datasets
metrics:
  - rouge
model-index:
  - name: it5-efficient-small-el32-qa-0.0003
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: it5/datasets qa
          type: it5/datasets
          args: qa
        metrics:
          - name: Rouge1
            type: rouge
            value: 74.2234

it5-efficient-small-el32-qa-0.0003

This model is a fine-tuned version of stefan-it/it5-efficient-small-el32 on the qa configuration of the it5/datasets dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8225
  • Rouge1: 74.2234
  • Rouge2: 40.5909
  • Rougel: 74.1327
  • Rougelsum: 74.2081
  • Gen Len: 4.7055
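
As a rough illustration of what the Rouge1 figure measures, the sketch below computes a unigram-overlap F1 between a predicted answer and a reference answer. It is a simplification and not the evaluation code used for this model: the function name `rouge1_f1` is hypothetical, tokenization is plain whitespace splitting, and the result is on a 0 to 1 scale, whereas the scores above appear to be on a 0 to 100 scale. The example strings are invented for illustration.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap with whitespace tokenization.

    Assumption: real ROUGE implementations apply their own tokenization
    and optional stemming, so values will not match them exactly.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical short answers, similar in length to the reported Gen Len.
exact = rouge1_f1("Roma", "Roma")
partial = rouge1_f1("nel 1492", "1492")
wrong = rouge1_f1("Milano", "Roma")
```

With answers averaging under five generated tokens, a single extra or missing word shifts this score considerably, which is worth keeping in mind when reading the numbers above.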

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 7.0
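
The linear scheduler listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the training script: `linear_lr` is a hypothetical helper, `warmup_steps=0` is assumed because the card lists no warmup, and `total_steps` is estimated from the training log (step 40000 at epoch 6.38 suggests roughly 6270 steps per epoch, so about 43890 steps over 7 epochs).

```python
def linear_lr(
    step: int,
    base_lr: float = 3e-4,      # learning_rate from the card
    total_steps: int = 43_890,  # assumption: ~6270 steps/epoch * 7 epochs
    warmup_steps: int = 0,      # assumption: no warmup is listed
) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Learning rate at the start, the midpoint, and the end of training.
lr_start = linear_lr(0)
lr_mid = linear_lr(21_945)
lr_end = linear_lr(43_890)
```

Under these assumptions the rate falls from 0.0003 to half that value at the midpoint and reaches zero at the final step.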

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 1.1164        | 0.8   | 5000  | 0.8244          | 66.4678 | 35.3554 | 66.4543 | 66.4522   | 4.541   |
| 0.9097        | 1.59  | 10000 | 0.7299          | 70.0574 | 37.5535 | 69.9512 | 70.0084   | 4.5548  |
| 0.6637        | 2.39  | 15000 | 0.7314          | 72.0767 | 39.2263 | 72.0257 | 72.0473   | 4.703   |
| 0.5015        | 3.19  | 20000 | 0.7147          | 73.0185 | 39.9998 | 72.9347 | 72.9576   | 4.75    |
| 0.5101        | 3.99  | 25000 | 0.7055          | 73.7898 | 40.5481 | 73.7235 | 73.7901   | 4.8728  |
| 0.3903        | 4.78  | 30000 | 0.7442          | 74.0845 | 39.9841 | 74.0172 | 74.0635   | 4.5938  |
| 0.2993        | 5.58  | 35000 | 0.8184          | 73.8405 | 40.2569 | 73.7756 | 73.7972   | 4.7412  |
| 0.2227        | 6.38  | 40000 | 0.8278          | 74.0159 | 40.6403 | 73.9412 | 73.9722   | 4.742   |
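
One way to read the log above is to compare checkpoints by validation loss and by Rouge1. The sketch below copies four columns from the table into a small structure and picks the best row under each criterion. The names `EvalRow`, `ROWS`, `best_by_loss`, and `best_by_rouge1` are hypothetical, and nothing here implies which checkpoint was actually published.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    rouge1: float


# Values copied from the training results table.
ROWS = [
    EvalRow(5000, 1.1164, 0.8244, 66.4678),
    EvalRow(10000, 0.9097, 0.7299, 70.0574),
    EvalRow(15000, 0.6637, 0.7314, 72.0767),
    EvalRow(20000, 0.5015, 0.7147, 73.0185),
    EvalRow(25000, 0.5101, 0.7055, 73.7898),
    EvalRow(30000, 0.3903, 0.7442, 74.0845),
    EvalRow(35000, 0.2993, 0.8184, 73.8405),
    EvalRow(40000, 0.2227, 0.8278, 74.0159),
]

# Lowest validation loss versus highest Rouge1 among logged checkpoints.
best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_rouge1 = max(ROWS, key=lambda r: r.rouge1)

print(f"lowest validation loss: step {best_by_loss.step} ({best_by_loss.val_loss})")
print(f"highest Rouge1: step {best_by_rouge1.step} ({best_by_rouge1.rouge1})")
```

In these logged rows the two criteria disagree: validation loss is lowest at step 25000 and rises afterwards while training loss keeps falling, whereas Rouge1 peaks at step 30000 and then stays roughly flat.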

Framework versions

  • Transformers 4.15.0
  • Pytorch 1.10.0+cu102
  • Datasets 1.17.0
  • Tokenizers 0.10.3