---
license: apache-2.0
base_model: facebook/bart-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: xsum_1677_bart-base
  results: []
---

# xsum_1677_bart-base

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unknown dataset.
It achieves the following results on the evaluation set (ROUGE scores are reported on a 0–1 scale):
- Loss: 0.6469
- ROUGE-1: 0.3879
- ROUGE-2: 0.1787
- ROUGE-L: 0.3238
- ROUGE-Lsum: 0.3238
- Gen Len (average generated length, tokens): 19.6644

## Model description

More information needed

## Intended uses & limitations

More information needed
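Although the card does not document intended uses, the base model and metrics indicate an abstractive summarization fine-tune. As a minimal usage sketch (the model id below is illustrative and assumes the checkpoint has been pushed to the Hub under this repo's name; substitute a local path otherwise):

```python
from transformers import pipeline

# Illustrative model id; replace with the actual repo id or local checkpoint path.
summarizer = pipeline("summarization", model="xsum_1677_bart-base")

article = (
    "The full text of a news article goes here. The average generated length "
    "reported above (~19.7 tokens) suggests short, single-sentence summaries."
)
print(summarizer(article, max_length=20, min_length=5, do_sample=False))
```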

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of equivalent `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
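
As a hedged sketch, these settings map onto `transformers` `Seq2SeqTrainingArguments` roughly as follows. The `output_dir` is illustrative, and the Adam betas and epsilon listed above are the library defaults, so they need no explicit arguments:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="xsum_1677_bart-base",  # illustrative output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,    # 8 x 16 = 128 effective train batch size
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    predict_with_generate=True,        # needed to decode outputs for ROUGE evaluation
)
```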

### Training results

| Training Loss | Epoch | Step  | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:----------:|:-------:|
| 0.8336        | 0.31  | 500   | 0.7274          | 0.3493 | 0.139  | 0.2847 | 0.2847    | 19.511  |
| 0.7963        | 0.63  | 1000  | 0.6994          | 0.3637 | 0.1506 | 0.2977 | 0.2976    | 19.6179 |
| 0.7543        | 0.94  | 1500  | 0.6876          | 0.365  | 0.1531 | 0.2999 | 0.2999    | 19.5356 |
| 0.7461        | 1.25  | 2000  | 0.6795          | 0.3709 | 0.1584 | 0.3052 | 0.3051    | 19.6224 |
| 0.7193        | 1.57  | 2500  | 0.6739          | 0.3684 | 0.1593 | 0.3048 | 0.3047    | 19.5721 |
| 0.7225        | 1.88  | 3000  | 0.6666          | 0.371  | 0.16   | 0.3063 | 0.3063    | 19.5672 |
| 0.6779        | 2.2   | 3500  | 0.6660          | 0.3745 | 0.1632 | 0.31   | 0.31      | 19.5619 |
| 0.673         | 2.51  | 4000  | 0.6618          | 0.3763 | 0.1653 | 0.3117 | 0.3117    | 19.6738 |
| 0.6848        | 2.82  | 4500  | 0.6578          | 0.3803 | 0.168  | 0.3145 | 0.3145    | 19.6308 |
| 0.6526        | 3.14  | 5000  | 0.6581          | 0.3803 | 0.1679 | 0.3141 | 0.3141    | 19.6503 |
| 0.6497        | 3.45  | 5500  | 0.6555          | 0.3776 | 0.1681 | 0.3132 | 0.3133    | 19.643  |
| 0.6483        | 3.76  | 6000  | 0.6520          | 0.3803 | 0.17   | 0.3153 | 0.3152    | 19.6666 |
| 0.6249        | 4.08  | 6500  | 0.6535          | 0.383  | 0.1736 | 0.3186 | 0.3185    | 19.6371 |
| 0.628         | 4.39  | 7000  | 0.6531          | 0.3825 | 0.1728 | 0.3181 | 0.318     | 19.6159 |
| 0.6288        | 4.7   | 7500  | 0.6495          | 0.3827 | 0.1727 | 0.3181 | 0.3181    | 19.6695 |
| 0.5921        | 5.02  | 8000  | 0.6509          | 0.3825 | 0.173  | 0.318  | 0.318     | 19.6447 |
| 0.6003        | 5.33  | 8500  | 0.6513          | 0.3833 | 0.1742 | 0.3198 | 0.3197    | 19.6866 |
| 0.5922        | 5.65  | 9000  | 0.6482          | 0.3837 | 0.1737 | 0.3195 | 0.3195    | 19.719  |
| 0.5878        | 5.96  | 9500  | 0.6483          | 0.3824 | 0.1737 | 0.3185 | 0.3185    | 19.6156 |
| 0.5646        | 6.27  | 10000 | 0.6503          | 0.3851 | 0.1754 | 0.3203 | 0.3204    | 19.6693 |
| 0.5753        | 6.59  | 10500 | 0.6473          | 0.3855 | 0.1761 | 0.3206 | 0.3206    | 19.6873 |
| 0.579         | 6.9   | 11000 | 0.6467          | 0.3861 | 0.1769 | 0.3223 | 0.3223    | 19.6635 |
| 0.5865        | 7.21  | 11500 | 0.6480          | 0.3862 | 0.176  | 0.3213 | 0.3212    | 19.7016 |
| 0.5746        | 7.53  | 12000 | 0.6480          | 0.3878 | 0.1785 | 0.3235 | 0.3236    | 19.6531 |
| 0.5678        | 7.84  | 12500 | 0.6460          | 0.3868 | 0.1776 | 0.3221 | 0.322     | 19.7039 |
| 0.5584        | 8.15  | 13000 | 0.6485          | 0.3875 | 0.178  | 0.3233 | 0.3233    | 19.6565 |
| 0.5484        | 8.47  | 13500 | 0.6477          | 0.3867 | 0.1777 | 0.3223 | 0.3224    | 19.6937 |
| 0.558         | 8.78  | 14000 | 0.6468          | 0.3873 | 0.1781 | 0.323  | 0.323     | 19.6823 |
| 0.5482        | 9.1   | 14500 | 0.6475          | 0.3878 | 0.1787 | 0.3231 | 0.3232    | 19.6896 |
| 0.5551        | 9.41  | 15000 | 0.6475          | 0.388  | 0.1783 | 0.3238 | 0.3237    | 19.666  |
| 0.5488        | 9.72  | 15500 | 0.6469          | 0.3879 | 0.1787 | 0.3238 | 0.3238    | 19.6644 |
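
The ROUGE columns above match the 0–1 scale returned by the `evaluate` library's `rouge` metric. A minimal sketch of that computation, with purely hypothetical predictions and references:

```python
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical strings purely for illustration.
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```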


### Framework versions

- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1