---
tags:
  - generated_from_trainer
datasets:
  - mlsum
metrics:
  - rouge
model-index:
  - name: eval-bart-turkish
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: mlsum tu
          type: mlsum
          args: tu
        metrics:
          - name: Rouge1
            type: rouge
            value: 43.2049
---

mukayese/bart-turkish-mlsum

This model is initialized from scratch and trained only on the mlsum/tu dataset, with no pre-training.

It achieves the following results on the evaluation set:

  • Rouge1: 43.2049
  • Rouge2: 30.7082
  • Rougel: 38.1981
  • Rougelsum: 39.9453
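
The snippet below is a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under mukayese/bart-turkish-mlsum; the input article and the generation settings are placeholders, not values taken from this card.

```python
from transformers import pipeline

# Load the checkpoint as a summarization pipeline (downloads the model on first use).
summarizer = pipeline("summarization", model="mukayese/bart-turkish-mlsum")

article = "..."  # replace with a Turkish news article
outputs = summarizer(article, max_length=120, min_length=20, truncation=True)
print(outputs[0]["summary_text"])
```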

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15.0
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
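
As a rough guide, the hyperparameters above map onto Seq2SeqTrainingArguments roughly as sketched below. The output_dir is a placeholder, the data loading and Trainer wiring are omitted, and the Adam betas/epsilon are left at their defaults, which match the values listed above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration listed above (Adam betas=(0.9, 0.999),
# epsilon=1e-08 are the library defaults).
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-turkish-mlsum",   # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=4,     # 4 per device x 8 GPUs x 2 accumulation steps = 64 total
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=15.0,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                         # "Native AMP" mixed precision
    label_smoothing_factor=0.1,
)
```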

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 4.4304        | 1.0   | 3895  | 4.3749          | 33.2844 | 22.8262 | 29.9423 | 30.7953   | 19.7732 |
| 3.65          | 2.0   | 7790  | 3.7414          | 33.8392 | 23.517  | 30.4871 | 31.3309   | 19.9031 |
| 3.397         | 3.0   | 11685 | 3.5651          | 34.2335 | 23.9113 | 30.9237 | 31.7434   | 19.894  |
| 3.2202        | 4.0   | 15580 | 3.5054          | 34.2535 | 23.9595 | 30.9811 | 31.7961   | 19.9212 |
| 3.0827        | 5.0   | 19475 | 3.4547          | 34.5545 | 24.1991 | 31.2609 | 32.085    | 19.9195 |
| 2.9801        | 6.0   | 23370 | 3.4328          | 34.6721 | 24.2537 | 31.372  | 32.1777   | 19.9331 |
| 2.8689        | 7.0   | 27265 | 3.4377          | 34.6764 | 24.3314 | 31.4376 | 32.1981   | 19.9278 |
| 2.7813        | 8.0   | 31160 | 3.4407          | 34.746  | 24.345  | 31.4511 | 32.2708   | 19.9468 |
| 2.6848        | 9.0   | 35055 | 3.4539          | 34.7376 | 24.3224 | 31.4784 | 32.2817   | 19.9096 |
| 2.5974        | 10.0  | 38950 | 3.4683          | 34.9174 | 24.4716 | 31.5641 | 32.4039   | 19.9384 |
| 2.5228        | 11.0  | 42845 | 3.4903          | 34.9845 | 24.4972 | 31.6585 | 32.4753   | 19.93   |
| 2.4633        | 12.0  | 46740 | 3.5105          | 34.8496 | 24.3559 | 31.5256 | 32.3635   | 19.9275 |
| 2.4022        | 13.0  | 50635 | 3.5234          | 34.9109 | 24.4008 | 31.5449 | 32.4021   | 19.9374 |
| 2.3605        | 14.0  | 54530 | 3.5306          | 34.9545 | 24.4365 | 31.6208 | 32.4711   | 19.9366 |
| 2.3216        | 15.0  | 58425 | 3.5379          | 34.9079 | 24.4077 | 31.5734 | 32.4287   | 19.9365 |
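
For a rough sanity check of the scores, something along the following lines can re-run ROUGE on the MLSUM Turkish test split. The split slice and generation settings are assumptions, recent versions of the `evaluate` library report ROUGE on a 0-1 scale (multiply by 100 to compare with the numbers above), and exact reproduction also depends on the generation and tokenization setup used for this card.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Small slice of the Turkish test split for a quick check; use split="test" for the full run.
dataset = load_dataset("mlsum", "tu", split="test[:100]")
summarizer = pipeline("summarization", model="mukayese/bart-turkish-mlsum")

predictions = [
    summarizer(article, max_length=120, truncation=True)[0]["summary_text"]
    for article in dataset["text"]
]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=dataset["summary"]))
```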

Framework versions

  • Transformers 4.11.3
  • Pytorch 1.8.2+cu111
  • Datasets 1.14.0
  • Tokenizers 0.10.3