metadata

license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
metrics:
  - rouge
  - wer
model-index:
  - name: bart_bertsum_1024_375_1000
    results: []

bart_bertsum_1024_375_1000

This model is a fine-tuned version of facebook/bart-large; the training dataset is not recorded. At the final checkpoint (step 3750, epoch 1.99), it achieves the following results on the evaluation set:

Loss: 1.0535
Rouge1: 0.6801
Rouge2: 0.4119
Rougel: 0.6159
Rougelsum: 0.616
Wer: 0.4729
Bleurt: -0.3664
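
The card does not say which metric implementations produced these numbers. As a rough illustration of what the Wer and Rouge1 figures measure, the sketch below computes word error rate and a unigram-overlap F1 in plain Python. The function names `word_error_rate` and `rouge1_f1` are hypothetical, tokenization is plain whitespace splitting, and no stemming is applied, so the results will not exactly match a library scorer.

```python
from collections import Counter


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def rouge1_f1(reference: str, hypothesis: str) -> float:
    """F1 over clipped unigram overlap (simplified: no stemming, no normalization)."""
    ref_counts = Counter(reference.split())
    hyp_counts = Counter(hypothesis.split())
    overlap = sum((ref_counts & hyp_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on a mat"
    print(word_error_rate(reference, hypothesis))
    print(rouge1_f1(reference, hypothesis))
```

Lower is better for word error rate and higher is better for ROUGE, so the two metrics move in opposite directions as the model improves.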

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 6
eval_batch_size: 6
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 2
mixed_precision_training: Native AMP
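
As a sketch of what `lr_scheduler_type: linear` implies for the learning rate, the function below decays it linearly from the peak of 2e-05 to zero. The card states neither the warmup length nor the total step count, so `warmup_steps=0` and `total_steps=3760` are assumptions chosen for illustration, and `linear_lr` is a hypothetical helper, not a library call.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = 3760  # assumed; not stated in the card
    for step in (0, 1000, 2000, 3000, 3750):
        print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate is halved at the midpoint of training and is close to zero by the last logged evaluation.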

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Wer	Bleurt
No log	0.13	250	1.2919	0.636	0.3519	0.567	0.567	0.5296	-0.0182
2.2326	0.27	500	1.2002	0.6503	0.3707	0.5816	0.5817	0.5113	-0.7073
2.2326	0.4	750	1.1735	0.6564	0.3791	0.5898	0.5898	0.5048	-0.3421
1.2886	0.53	1000	1.1476	0.661	0.3843	0.594	0.5939	0.4994	0.0835
1.2886	0.66	1250	1.1289	0.6615	0.3863	0.5938	0.5938	0.4945	-0.5247
1.2306	0.8	1500	1.1197	0.67	0.3952	0.6046	0.6045	0.4909	-0.192
1.2306	0.93	1750	1.1077	0.6734	0.3989	0.6068	0.6067	0.4876	-0.3867
1.1852	1.06	2000	1.0917	0.6731	0.4027	0.609	0.609	0.4833	-0.6453
1.1852	1.2	2250	1.0852	0.6707	0.4013	0.6054	0.6054	0.4824	-0.5589
1.0875	1.33	2500	1.0785	0.6738	0.4049	0.6096	0.6096	0.4794	-0.5107
1.0875	1.46	2750	1.0709	0.6743	0.4046	0.6096	0.6095	0.478	-0.3387
1.0857	1.6	3000	1.0627	0.6778	0.41	0.6137	0.6137	0.4757	-0.4275
1.0857	1.73	3250	1.0636	0.675	0.4088	0.6121	0.612	0.4745	-0.3664
1.0634	1.86	3500	1.0552	0.6775	0.4103	0.6136	0.6136	0.4729	-0.3664
1.0634	1.99	3750	1.0535	0.6801	0.4119	0.6159	0.616	0.4729	-0.3664
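
Although the dataset is not named, the Epoch and Step columns allow a rough estimate of its size. The sketch below bounds the number of optimizer steps per epoch from the last row (step 3750 at epoch 1.99, rounded to two decimals) and multiplies by the training batch size of 6. It assumes a single device and no gradient accumulation, neither of which the card confirms; `steps_per_epoch_bounds` is a hypothetical helper.

```python
def steps_per_epoch_bounds(step: int, epoch_rounded: float, decimals: int = 2) -> tuple[float, float]:
    """Bound steps per epoch given a step count and an epoch value rounded to `decimals` places."""
    half_unit = 0.5 * 10 ** (-decimals)
    # The true epoch lies within half a rounding unit of the printed value.
    low = step / (epoch_rounded + half_unit)
    high = step / (epoch_rounded - half_unit)
    return low, high


if __name__ == "__main__":
    train_batch_size = 6  # from the hyperparameter list
    low, high = steps_per_epoch_bounds(step=3750, epoch_rounded=1.99)
    print(f"steps per epoch: {low:.0f} to {high:.0f}")
    print(f"training examples: {low * train_batch_size:.0f} to {high * train_batch_size:.0f}")
```

Under these assumptions the training set holds on the order of 11,300 examples. If several devices or gradient accumulation were used, the true figure would be a corresponding multiple of that.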

Framework versions

Transformers 4.38.2
PyTorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2