---
license: mit
tags:
  - generated_from_trainer
datasets:
  - it5/datasets
metrics:
  - rouge
model-index:
  - name: it5-efficient-small-el32-fst-i2f-0.0003
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: it5/datasets fst
          type: it5/datasets
          args: fst
        metrics:
          - name: Rouge1
            type: rouge
            value: 56.585
---

it5-efficient-small-el32-fst-i2f-0.0003

This model is a fine-tuned version of stefan-it/it5-efficient-small-el32 on the it5/datasets fst dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2160
  • Rouge1: 56.585
  • Rouge2: 36.9335
  • Rougel: 53.7782
  • Rougelsum: 53.7779
  • Gen Len: 13.0891
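
Since the base checkpoint stefan-it/it5-efficient-small-el32 is a T5-style encoder-decoder, the fine-tuned model can be loaded as a standard seq2seq checkpoint with Hugging Face Transformers. The snippet below is a minimal sketch: the repository id, the example sentence, and the generation settings are assumptions, not values taken from this card.

```python
# Minimal inference sketch; the repository id is assumed from the model name in
# this card and may differ from the actual Hub id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "it5-efficient-small-el32-fst-i2f-0.0003"  # assumed Hub id or local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical informal Italian input; "i2f" presumably denotes informal-to-formal
# style transfer, so the model should rewrite it in a formal register.
text = "ciao, volevo sapere se il pacco arriva domani"

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```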

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
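
For readers who want to set up a comparable run with the Hugging Face Trainer, the values above map onto Seq2SeqTrainingArguments roughly as sketched below. This is not the original training script; the output directory is a placeholder and every argument simply restates the listed hyperparameters.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters listed above;
# output_dir is a placeholder, not taken from the original configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="it5-efficient-small-el32-fst-i2f-0.0003",
    learning_rate=3e-4,             # learning_rate: 0.0003
    per_device_train_batch_size=8,  # train_batch_size: 8
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=42,
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # epsilon: 1e-08
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
    predict_with_generate=True,     # generate during eval so ROUGE can be computed
)
```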

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.9377 | 0.35 | 5000 | 2.5157 | 54.6148 | 35.1518 | 51.8908 | 51.8957 | 12.8717 |
| 2.803 | 0.7 | 10000 | 2.4086 | 55.641 | 36.1214 | 52.8683 | 52.8572 | 12.7513 |
| 2.5483 | 1.05 | 15000 | 2.3420 | 55.6604 | 36.0085 | 52.9599 | 52.9433 | 12.7754 |
| 2.4978 | 1.39 | 20000 | 2.3145 | 56.204 | 36.5896 | 53.338 | 53.3351 | 12.8804 |
| 2.5383 | 1.74 | 25000 | 2.2697 | 56.1356 | 36.6963 | 53.3579 | 53.3664 | 12.795 |
| 2.3368 | 2.09 | 30000 | 2.2603 | 56.0271 | 36.4249 | 53.3113 | 53.3272 | 12.7478 |
| 2.371 | 2.44 | 35000 | 2.2328 | 56.5041 | 36.8718 | 53.8064 | 53.7995 | 12.8243 |
| 2.3567 | 2.79 | 40000 | 2.2079 | 56.5318 | 36.9437 | 53.8359 | 53.8254 | 12.6851 |
| 2.1753 | 3.14 | 45000 | 2.2168 | 56.3831 | 36.8896 | 53.6542 | 53.6708 | 12.67 |
| 2.2069 | 3.48 | 50000 | 2.2055 | 56.7171 | 37.1665 | 53.9299 | 53.9259 | 12.8014 |
| 2.2396 | 3.83 | 55000 | 2.1801 | 56.936 | 37.5465 | 54.1064 | 54.1125 | 12.7989 |
| 2.0657 | 4.18 | 60000 | 2.1915 | 56.6312 | 37.1618 | 53.8646 | 53.8791 | 12.6987 |
| 2.0806 | 4.53 | 65000 | 2.1809 | 56.6599 | 37.1282 | 53.8838 | 53.8781 | 12.715 |
| 2.0933 | 4.88 | 70000 | 2.1771 | 56.5891 | 36.9461 | 53.8058 | 53.8087 | 12.6593 |
| 1.9949 | 5.23 | 75000 | 2.1932 | 56.4956 | 36.9679 | 53.7634 | 53.7731 | 12.6723 |
| 1.9954 | 5.57 | 80000 | 2.1813 | 56.4827 | 36.8319 | 53.6397 | 53.6399 | 12.6599 |
| 1.9912 | 5.92 | 85000 | 2.1755 | 56.6723 | 37.0432 | 53.8339 | 53.8233 | 12.7534 |
| 1.9068 | 6.27 | 90000 | 2.1849 | 56.6574 | 37.0691 | 53.9029 | 53.892 | 12.7037 |
| 1.9173 | 6.62 | 95000 | 2.1787 | 56.5701 | 36.861 | 53.6855 | 53.6699 | 12.6467 |
| 1.9131 | 6.97 | 100000 | 2.1862 | 56.7175 | 37.0749 | 53.8761 | 53.8794 | 12.7072 |
| 1.8164 | 7.32 | 105000 | 2.1999 | 56.6104 | 37.0809 | 53.8098 | 53.8216 | 12.6364 |
| 1.8489 | 7.66 | 110000 | 2.1945 | 56.6645 | 37.1267 | 53.9009 | 53.9008 | 12.5741 |
| 1.82 | 8.01 | 115000 | 2.2075 | 56.6075 | 37.0359 | 53.8792 | 53.8833 | 12.6428 |
| 1.772 | 8.36 | 120000 | 2.2067 | 56.4716 | 36.8675 | 53.6826 | 53.6742 | 12.6591 |
| 1.7795 | 8.71 | 125000 | 2.2056 | 56.4112 | 36.9011 | 53.6554 | 53.6495 | 12.608 |
| 1.72 | 9.06 | 130000 | 2.2197 | 56.4735 | 36.9255 | 53.6592 | 53.6463 | 12.6758 |
| 1.7174 | 9.41 | 135000 | 2.2169 | 56.4209 | 36.8139 | 53.5778 | 53.5685 | 12.6568 |
| 1.7466 | 9.75 | 140000 | 2.2165 | 56.3715 | 36.767 | 53.555 | 53.5468 | 12.6416 |
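
The ROUGE columns above can be reproduced for new predictions with the `rouge` metric from the Datasets library (version 1.17.0, listed below). The snippet is a sketch: the example strings are invented, and it assumes the `rouge_score` package is installed.

```python
# Sketch of ROUGE scoring with the Datasets 1.x metric API; the prediction and
# reference strings are placeholders, not examples from it5/datasets.
from datasets import load_metric

rouge = load_metric("rouge")  # requires the rouge_score package

predictions = ["gentile cliente, il pacco sarà consegnato domani"]
references = ["gentile cliente, il suo pacco verrà consegnato domani"]

scores = rouge.compute(predictions=predictions, references=references)

# Report the aggregated mid F-measure scaled to 0-100, matching the
# Rouge1 / Rouge2 / RougeL / RougeLsum convention used in the table above.
for key in ("rouge1", "rouge2", "rougeL", "rougeLsum"):
    print(key, round(scores[key].mid.fmeasure * 100, 4))
```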

Framework versions

  • Transformers 4.15.0
  • Pytorch 1.10.0+cu102
  • Datasets 1.17.0
  • Tokenizers 0.10.3