---
tags:
- generated_from_trainer
datasets:
- mlsum
metrics:
- rouge
model-index:
- name: eval-bart-turkish
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: mlsum tu
      type: mlsum
      args: tu
    metrics:
    - name: Rouge1
      type: rouge
      value: 43.2049
---

# mukayese/bart-turkish-mlsum

This model was initialized from scratch and trained only on the mlsum/tu dataset, with no pre-training.

It achieves the following results on the evaluation set:

- Rouge1: 43.2049
- Rouge2: 30.7082
- Rougel: 38.1981
- Rougelsum: 39.9453
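
A minimal usage sketch follows. It assumes the checkpoint is available on the Hugging Face Hub under the ID in the title above and uses the standard `transformers` summarization pipeline; the generation lengths are illustrative, not the authors' settings.

```python
from transformers import pipeline

# Load the checkpoint through the standard summarization pipeline.
summarizer = pipeline("summarization", model="mukayese/bart-turkish-mlsum")

article = "..."  # a Turkish news article (mlsum-style input text)
summary = summarizer(article, max_length=128, min_length=16, do_sample=False)
print(summary[0]["summary_text"])
```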

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

The model was trained and evaluated on the Turkish (`tu`) configuration of the mlsum dataset (see the loading sketch below); no further preprocessing details are documented.
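
As a hedged sketch (assuming the public `mlsum` dataset on the Hub with its `tu` configuration and its standard `text`/`summary` fields), the data can be loaded like this:

```python
from datasets import load_dataset

# Load the Turkish portion of mlsum; it ships with train/validation/test splits.
mlsum_tu = load_dataset("mlsum", "tu")

example = mlsum_tu["validation"][0]
print(example["text"][:200])   # article body
print(example["summary"])      # reference summary
```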

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15.0
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
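
For reference, here is a hedged sketch of `Seq2SeqTrainingArguments` mirroring the list above; the authors' actual launch script and any additional arguments are not documented, and `output_dir` is a hypothetical placeholder.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-turkish-mlsum",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=4,    # 8 GPUs x 2 accumulation steps -> effective 64
    per_device_eval_batch_size=8,     # 8 GPUs -> effective 64
    gradient_accumulation_steps=2,
    num_train_epochs=15,
    lr_scheduler_type="linear",
    label_smoothing_factor=0.1,
    seed=42,
    fp16=True,                        # "Native AMP" mixed precision
    predict_with_generate=True,       # generate summaries during evaluation for ROUGE
)
```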

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 4.4304        | 1.0   | 3895  | 4.3749          | 33.2844 | 22.8262 | 29.9423 | 30.7953   | 19.7732 |
| 3.65          | 2.0   | 7790  | 3.7414          | 33.8392 | 23.517  | 30.4871 | 31.3309   | 19.9031 |
| 3.397         | 3.0   | 11685 | 3.5651          | 34.2335 | 23.9113 | 30.9237 | 31.7434   | 19.894  |
| 3.2202        | 4.0   | 15580 | 3.5054          | 34.2535 | 23.9595 | 30.9811 | 31.7961   | 19.9212 |
| 3.0827        | 5.0   | 19475 | 3.4547          | 34.5545 | 24.1991 | 31.2609 | 32.085    | 19.9195 |
| 2.9801        | 6.0   | 23370 | 3.4328          | 34.6721 | 24.2537 | 31.372  | 32.1777   | 19.9331 |
| 2.8689        | 7.0   | 27265 | 3.4377          | 34.6764 | 24.3314 | 31.4376 | 32.1981   | 19.9278 |
| 2.7813        | 8.0   | 31160 | 3.4407          | 34.746  | 24.345  | 31.4511 | 32.2708   | 19.9468 |
| 2.6848        | 9.0   | 35055 | 3.4539          | 34.7376 | 24.3224 | 31.4784 | 32.2817   | 19.9096 |
| 2.5974        | 10.0  | 38950 | 3.4683          | 34.9174 | 24.4716 | 31.5641 | 32.4039   | 19.9384 |
| 2.5228        | 11.0  | 42845 | 3.4903          | 34.9845 | 24.4972 | 31.6585 | 32.4753   | 19.93   |
| 2.4633        | 12.0  | 46740 | 3.5105          | 34.8496 | 24.3559 | 31.5256 | 32.3635   | 19.9275 |
| 2.4022        | 13.0  | 50635 | 3.5234          | 34.9109 | 24.4008 | 31.5449 | 32.4021   | 19.9374 |
| 2.3605        | 14.0  | 54530 | 3.5306          | 34.9545 | 24.4365 | 31.6208 | 32.4711   | 19.9366 |
| 2.3216        | 15.0  | 58425 | 3.5379          | 34.9079 | 24.4077 | 31.5734 | 32.4287   | 19.9365 |

### Framework versions

- Transformers 4.11.3
- Pytorch 1.8.2+cu111
- Datasets 1.14.0
- Tokenizers 0.10.3