metadata

license: mit
tags:
  - generated_from_trainer
datasets:
  - it5/datasets
metrics:
  - rouge
model-index:
  - name: it5-efficient-small-el32-st_r2g-0.0003
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: it5/datasets st_r2g
          type: it5/datasets
          args: st_r2g
        metrics:
          - name: Rouge1
            type: rouge
            value: 30.0502

it5-efficient-small-el32-st_r2g-0.0003

This model is a fine-tuned version of stefan-it/it5-efficient-small-el32 on the st_r2g configuration of it5/datasets. It achieves the following results on the evaluation set (ROUGE scores are reported on a 0-100 scale):

Loss: 2.6135
Rouge1: 30.0502
Rouge2: 11.5687
Rougel: 26.5953
Rougelsum: 27.0402
Gen Len: 16.9578
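
To make the ROUGE figures above easier to interpret, here is a minimal sketch of ROUGE-1 as a unigram-overlap F1 score. It assumes lowercasing and plain whitespace tokenization, and the function name `rouge1_f1` is hypothetical. The scores reported for this model were presumably produced by the `rouge` metric listed in the metadata, which may tokenize and aggregate differently, so this sketch is not expected to reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on a 0-100 scale."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts only as many times as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Made-up example strings for illustration, not taken from the dataset.
score = rouge1_f1(
    "il governo approva la riforma",
    "il governo vara la riforma fiscale",
)
print(round(score, 2))
```

Four of the five predicted tokens appear in the six-token reference, so precision is 4/5 and recall is 4/6, which gives an F1 of 8/11 before scaling to the 0-100 range.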

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10.0
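
The `linear` scheduler setting can be read as follows. This is a minimal sketch, assuming no warmup (the card lists no warmup setting) and a learning rate that decays linearly from 0.0003 to zero over the whole run. The names `linear_lr` and `total_steps` are hypothetical, and the total step count is only a rough estimate extrapolated from the last logged row of the training results table (step 65000 at epoch 9.68).

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-4,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: extrapolate the last logged row (step 65000, epoch 9.68) to 10 epochs.
total_steps = round(65000 / 9.68 * 10)

for step in (0, 20000, 40000, 65000):
    print(step, f"{linear_lr(step, total_steps):.2e}")
```

Under these assumptions the learning rate is already close to zero by the last logged evaluation step, which is worth keeping in mind when comparing late checkpoints.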

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
3.1265	0.74	5000	2.7247	26.8378	9.3464	23.9521	24.2837	15.5914
2.8786	1.49	10000	2.6532	27.5869	10.0861	24.7406	25.0245	15.3272
2.6587	2.23	15000	2.6080	28.2336	10.5229	25.3053	25.6716	15.4338
2.664	2.98	20000	2.5630	28.6673	10.8421	25.7032	26.0245	15.6255
2.4896	3.72	25000	2.5679	28.842	10.885	25.6757	26.0633	16.1841
2.34	4.47	30000	2.5564	29.3246	11.1981	26.1637	26.5392	15.7826
2.2204	5.21	35000	2.5744	29.5545	11.3806	26.3237	26.6993	15.8374
2.2301	5.96	40000	2.5614	29.5872	11.4227	26.3139	26.7196	15.7213
2.1219	6.7	45000	2.5617	29.8256	11.3702	26.4156	26.8465	15.936
2.007	7.45	50000	2.6014	29.743	11.4336	26.38	26.772	15.7144
1.9398	8.19	55000	2.6080	29.9478	11.4801	26.5352	26.9746	15.9308
1.9426	8.94	60000	2.6022	30.097	11.5602	26.705	27.1092	15.8598
1.8853	9.68	65000	2.6138	30.1588	11.5823	26.6984	27.1371	15.803
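
One way to read the table is to compare which logged step is best under each criterion. The sketch below copies the Step, Validation Loss, and Rouge1 columns from the rows above and picks the best step for each; the variable names are hypothetical. On these numbers, validation loss is lowest at step 30000 while Rouge1 is highest at step 65000, so the two criteria would select different checkpoints.

```python
# (step, validation_loss, rouge1) copied from the training results table.
rows = [
    (5000, 2.7247, 26.8378),
    (10000, 2.6532, 27.5869),
    (15000, 2.6080, 28.2336),
    (20000, 2.5630, 28.6673),
    (25000, 2.5679, 28.8420),
    (30000, 2.5564, 29.3246),
    (35000, 2.5744, 29.5545),
    (40000, 2.5614, 29.5872),
    (45000, 2.5617, 29.8256),
    (50000, 2.6014, 29.7430),
    (55000, 2.6080, 29.9478),
    (60000, 2.6022, 30.0970),
    (65000, 2.6138, 30.1588),
]

# Lower is better for validation loss, higher is better for Rouge1.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_rouge1 = max(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss[0], best_by_loss[1])
print("highest Rouge1:", best_by_rouge1[0], best_by_rouge1[2])
```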

Framework versions

Transformers 4.15.0
Pytorch 1.10.0+cu102
Datasets 1.17.0
Tokenizers 0.10.3