metadata

license: mit
tags:
  - generated_from_trainer
datasets:
  - it5/datasets
metrics:
  - rouge
model-index:
  - name: it5-efficient-small-el32-qa-0.0003
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: it5/datasets qa
          type: it5/datasets
          args: qa
        metrics:
          - name: Rouge1
            type: rouge
            value: 74.2234

it5-efficient-small-el32-qa-0.0003

This model is a fine-tuned version of stefan-it/it5-efficient-small-el32 on the qa configuration of the it5/datasets dataset. On the evaluation set it achieves the following results:

Loss: 0.8225
Rouge1: 74.2234
Rouge2: 40.5909
Rougel: 74.1327
Rougelsum: 74.2081
Gen Len: 4.7055
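
With a mean generated length of about 4.7 tokens, the Rouge1 score above is essentially the unigram overlap between a short predicted answer and the reference answer. The following is a minimal sketch of a unigram F1 score in that spirit. It assumes lowercasing and whitespace tokenization, and it is not the scorer that produced the numbers in this card, which applies its own tokenization. The function name `rouge1_f1` and the example strings are illustrative only.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 on a 0-100 scale (a simplified ROUGE-1)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each reference token can be matched at most once.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pairs, not taken from the dataset.
exact = rouge1_f1("nel 1492", "nel 1492")
partial = rouge1_f1("1492", "nel 1492")
print(f"exact match: {exact:.2f}, partial match: {partial:.2f}")
```

Because the answers are so short, a single missing or extra word moves the per-example score a long way, which is worth keeping in mind when comparing checkpoints.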

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 7.0
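
The list above can be read as a training configuration. As a rough sketch, the dictionary below maps the reported values onto the keyword names of `transformers.Seq2SeqTrainingArguments`, and a small helper evaluates a linear learning-rate decay. Several things here are assumptions not stated in this card: that training ran on a single device (so the reported batch size equals the per-device size), that no warmup steps were used, and that training took roughly 43,890 optimizer steps, an estimate based on about 6,270 steps per epoch over 7 epochs.

```python
# Reported hyperparameters, keyed by transformers.Seq2SeqTrainingArguments names.
# Assumption: single device, so train_batch_size == per_device_train_batch_size.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7.0,
}

# Estimate, not a value from the card: ~6,270 steps per epoch * 7 epochs.
ESTIMATED_TOTAL_STEPS = 43_890


def linear_lr(step: int, base_lr: float = 3e-4,
              total_steps: int = ESTIMATED_TOTAL_STEPS) -> float:
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


for step in (0, 20_000, 40_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.2e}")
```

The keyword arguments could be unpacked into `Seq2SeqTrainingArguments` together with an output directory of your choice. The output directory is left out here because the card does not report one.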

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.1164	0.8	5000	0.8244	66.4678	35.3554	66.4543	66.4522	4.541
0.9097	1.59	10000	0.7299	70.0574	37.5535	69.9512	70.0084	4.5548
0.6637	2.39	15000	0.7314	72.0767	39.2263	72.0257	72.0473	4.703
0.5015	3.19	20000	0.7147	73.0185	39.9998	72.9347	72.9576	4.75
0.5101	3.99	25000	0.7055	73.7898	40.5481	73.7235	73.7901	4.8728
0.3903	4.78	30000	0.7442	74.0845	39.9841	74.0172	74.0635	4.5938
0.2993	5.58	35000	0.8184	73.8405	40.2569	73.7756	73.7972	4.7412
0.2227	6.38	40000	0.8278	74.0159	40.6403	73.9412	73.9722	4.742
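
In the table, validation loss bottoms out around step 25000 and rises afterwards, while Rouge1 plateaus near 74 and training loss keeps falling. That pattern is consistent with mild overfitting in the later epochs, although the card does not discuss it. As a small sketch, the snippet below copies the step, validation loss, and Rouge1 columns from the table and picks the best logged checkpoint under each criterion. The variable names are illustrative.

```python
# (step, validation_loss, rouge1) copied from the training results table.
rows = [
    (5000, 0.8244, 66.4678),
    (10000, 0.7299, 70.0574),
    (15000, 0.7314, 72.0767),
    (20000, 0.7147, 73.0185),
    (25000, 0.7055, 73.7898),
    (30000, 0.7442, 74.0845),
    (35000, 0.8184, 73.8405),
    (40000, 0.8278, 74.0159),
]

# The two criteria can disagree, so compute both.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

Neither row matches the final evaluation reported at the top of the card (loss 0.8225, Rouge1 74.2234), which presumably was measured at the end of the seventh epoch, after the last logged step.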

Framework versions

Transformers 4.15.0
PyTorch 1.10.0+cu102
Datasets 1.17.0
Tokenizers 0.10.3