---
tags:
  - generated_from_trainer
datasets:
  - mlsum
metrics:
  - rouge
model-index:
  - name: eval-bart-turkish
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: mlsum tu
          type: mlsum
          args: tu
        metrics:
          - name: Rouge1
            type: rouge
            value: 43.2049
---

mukayese/bart-turkish-mlsum

This model is initialized from scratch and trained only on the mlsum/tu dataset, with no pre-training.

It achieves the following results on the evaluation set:

  • Rouge1: 43.2049
  • Rouge2: 30.7082
  • Rougel: 38.1981
  • Rougelsum: 39.9453
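
The snippet below is a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under mukayese/bart-turkish-mlsum; the input article and the generation settings are placeholders, not values taken from this card.

```python
from transformers import pipeline

# Load the checkpoint as a summarization pipeline (downloads the model on first use).
summarizer = pipeline("summarization", model="mukayese/bart-turkish-mlsum")

article = "..."  # replace with a Turkish news article
outputs = summarizer(article, max_length=120, min_length=20, truncation=True)
print(outputs[0]["summary_text"])
```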

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15.0
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
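
As a rough guide, the hyperparameters above map onto Seq2SeqTrainingArguments roughly as sketched below. The output_dir is a placeholder, the data loading and Trainer wiring are omitted, and the Adam betas/epsilon are left at their defaults, which match the values listed above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration listed above (Adam betas=(0.9, 0.999),
# epsilon=1e-08 are the library defaults).
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-turkish-mlsum",   # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=4,     # 4 per device x 8 GPUs x 2 accumulation steps = 64 total
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=15.0,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                         # "Native AMP" mixed precision
    label_smoothing_factor=0.1,
)
```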

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 4.4304        | 1.0   | 3895  | 4.3749          | 33.2844 | 22.8262 | 29.9423 | 30.7953   | 19.7732 |
| 3.65          | 2.0   | 7790  | 3.7414          | 33.8392 | 23.517  | 30.4871 | 31.3309   | 19.9031 |
| 3.397         | 3.0   | 11685 | 3.5651          | 34.2335 | 23.9113 | 30.9237 | 31.7434   | 19.894  |
| 3.2202        | 4.0   | 15580 | 3.5054          | 34.2535 | 23.9595 | 30.9811 | 31.7961   | 19.9212 |
| 3.0827        | 5.0   | 19475 | 3.4547          | 34.5545 | 24.1991 | 31.2609 | 32.085    | 19.9195 |
| 2.9801        | 6.0   | 23370 | 3.4328          | 34.6721 | 24.2537 | 31.372  | 32.1777   | 19.9331 |
| 2.8689        | 7.0   | 27265 | 3.4377          | 34.6764 | 24.3314 | 31.4376 | 32.1981   | 19.9278 |
| 2.7813        | 8.0   | 31160 | 3.4407          | 34.746  | 24.345  | 31.4511 | 32.2708   | 19.9468 |
| 2.6848        | 9.0   | 35055 | 3.4539          | 34.7376 | 24.3224 | 31.4784 | 32.2817   | 19.9096 |
| 2.5974        | 10.0  | 38950 | 3.4683          | 34.9174 | 24.4716 | 31.5641 | 32.4039   | 19.9384 |
| 2.5228        | 11.0  | 42845 | 3.4903          | 34.9845 | 24.4972 | 31.6585 | 32.4753   | 19.93   |
| 2.4633        | 12.0  | 46740 | 3.5105          | 34.8496 | 24.3559 | 31.5256 | 32.3635   | 19.9275 |
| 2.4022        | 13.0  | 50635 | 3.5234          | 34.9109 | 24.4008 | 31.5449 | 32.4021   | 19.9374 |
| 2.3605        | 14.0  | 54530 | 3.5306          | 34.9545 | 24.4365 | 31.6208 | 32.4711   | 19.9366 |
| 2.3216        | 15.0  | 58425 | 3.5379          | 34.9079 | 24.4077 | 31.5734 | 32.4287   | 19.9365 |
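
For a rough sanity check of the scores, something along the following lines can re-run ROUGE on the MLSUM Turkish test split. The split slice and generation settings are assumptions, recent versions of the `evaluate` library report ROUGE on a 0-1 scale (multiply by 100 to compare with the numbers above), and exact reproduction also depends on the generation and tokenization setup used for this card.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# Small slice of the Turkish test split for a quick check; use split="test" for the full run.
dataset = load_dataset("mlsum", "tu", split="test[:100]")
summarizer = pipeline("summarization", model="mukayese/bart-turkish-mlsum")

predictions = [
    summarizer(article, max_length=120, truncation=True)[0]["summary_text"]
    for article in dataset["text"]
]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=dataset["summary"]))
```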

Framework versions

  • Transformers 4.11.3
  • Pytorch 1.8.2+cu111
  • Datasets 1.14.0
  • Tokenizers 0.10.3