---
tags:
- generated_from_trainer
datasets:
- multi_news
metrics:
- rouge
model-index:
- name: Centrum-multinews
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: multi_news
      type: multi_news
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 46.2987
---


# Centrum-multinews

This model is a fine-tuned version of [Centrum](https://huggingface.co/ratishsp/Centrum) on the multi_news dataset. The model is described in the preprint [Multi-Document Summarization with Centroid-Based Pretraining](https://arxiv.org/abs/2208.01006) (Ratish Puduppully and Mark Steedman).
It achieves the following results on the evaluation set:
- Loss: 3.2740
- Rouge1: 46.2987
- Rouge2: 18.4863
- Rougel: 24.2428
- Rougelsum: 42.5102
- Gen Len: 308.6606
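
For readers unfamiliar with the metric, the following is a minimal sketch of how a ROUGE-1 F1 score is formed from unigram overlap, reported on the same 0 to 100 scale as the numbers above. It is a simplification for illustration only: the function name `rouge1_f1` is hypothetical, tokenization is plain lowercased whitespace splitting, and the official ROUGE tooling applies its own tokenization and optional stemming, so it will not reproduce the reported scores exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 on lowercased, whitespace-split tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
    # Five of six unigrams match in both directions, so F1 = 5/6.
    print(round(100 * score, 2))
```

ROUGE-2 follows the same pattern with bigrams, and ROUGE-L replaces the n-gram overlap with the longest common subsequence.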

## Model description

The script for training Centrum-multinews and running inference with it is available at https://github.com/ratishsp/centrum.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2500
- training_steps: 25000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
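
The totals in the list above follow from the per-device settings, and the learning rate at any step follows from the warmup and step counts. The sketch below works through both. It assumes that the `linear` scheduler has the usual shape of a linear ramp from zero over the warmup steps followed by a linear decay to zero at the final step; the helper names `effective_batch_size` and `linear_lr` are hypothetical and are not part of the training script.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer update (or per evaluation step)."""
    return per_device * num_devices * grad_accum


def linear_lr(step: int, base_lr: float = 3e-5,
              warmup: int = 2500, total: int = 25000) -> float:
    """Assumed schedule: linear warmup to base_lr, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    # train: 1 per device x 4 devices x 4 accumulation steps
    print("total_train_batch_size:", effective_batch_size(1, 4, 4))
    # eval: 4 per device x 4 devices, no accumulation
    print("total_eval_batch_size:", effective_batch_size(4, 4))
    for step in (0, 1250, 2500, 13750, 25000):
        print(step, linear_lr(step))
```

Under this assumption the learning rate peaks at 3e-05 at step 2500, is back to half of that at step 13750, and reaches zero at step 25000.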

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 3.2702        | 1.78  | 5000  | 3.2853          | 44.0203 | 16.6061 | 23.3846 | 40.3853   | 277.1855 |
| 3.2762        | 1.96  | 5500  | 3.2853          | 44.725  | 16.9262 | 23.475  | 41.0003   | 288.4173 |
| 3.2114        | 2.14  | 6000  | 3.2857          | 44.6456 | 17.0245 | 23.7328 | 40.9131   | 257.2761 |
| 3.1981        | 2.31  | 6500  | 3.2817          | 44.7869 | 17.0849 | 23.8372 | 41.0669   | 254.8618 |
| 3.2298        | 2.49  | 7000  | 3.2802          | 45.2657 | 17.2618 | 23.8204 | 41.5807   | 263.0854 |
| 3.2167        | 2.67  | 7500  | 3.2773          | 44.9516 | 17.0538 | 23.7894 | 41.1673   | 244.6939 |
| 3.2069        | 2.85  | 8000  | 3.2712          | 45.2153 | 17.2766 | 23.9883 | 41.4558   | 245.4036 |
| 3.1822        | 3.02  | 8500  | 3.2786          | 45.4747 | 17.6754 | 24.1878 | 41.7304   | 254.6624 |
| 3.1529        | 3.2   | 9000  | 3.2740          | 44.9033 | 17.1386 | 23.8511 | 41.177    | 246.0157 |
| 3.1407        | 3.38  | 9500  | 3.2704          | 45.1045 | 17.2335 | 23.9124 | 41.3243   | 243.4922 |
| 3.1376        | 3.56  | 10000 | 3.2721          | 45.2694 | 17.4797 | 24.1072 | 41.5441   | 243.8396 |
| 3.1545        | 3.74  | 10500 | 3.2720          | 45.3105 | 17.6338 | 24.1547 | 41.5731   | 231.1805 |
| 3.1307        | 3.91  | 11000 | 3.2684          | 45.4309 | 17.2665 | 23.8954 | 41.6518   | 250.1039 |
| 3.1022        | 4.09  | 11500 | 3.2719          | 45.1959 | 17.4017 | 24.056  | 41.5363   | 242.5923 |
| 3.1139        | 4.27  | 12000 | 3.2711          | 45.3864 | 17.4653 | 24.028  | 41.6797   | 240.5701 |
| 3.0978        | 4.45  | 12500 | 3.2722          | 45.5694 | 17.501  | 24.1452 | 41.7894   | 232.1149 |
| 3.1082        | 4.63  | 13000 | 3.2687          | 45.504  | 17.5137 | 24.1067 | 41.7686   | 245.1845 |
| 3.1059        | 4.8   | 13500 | 3.2686          | 45.3603 | 17.1619 | 23.8655 | 41.5953   | 248.6327 |
| 3.1141        | 4.98  | 14000 | 3.2658          | 45.2741 | 17.3814 | 24.0377 | 41.5263   | 234.0194 |
| 3.0294        | 5.16  | 14500 | 3.2716          | 45.7203 | 17.5962 | 24.1367 | 41.9119   | 244.4207 |
| 3.0613        | 5.34  | 15000 | 3.2697          | 45.775  | 17.6959 | 24.1867 | 42.0018   | 242.0381 |
| 3.0549        | 5.52  | 15500 | 3.2703          | 45.8193 | 17.686  | 24.1997 | 42.0109   | 242.5493 |
| 3.0725        | 5.69  | 16000 | 3.2655          | 45.3515 | 17.3438 | 24.0586 | 41.6126   | 240.2812 |
| 3.0728        | 5.87  | 16500 | 3.2671          | 45.6791 | 17.5028 | 24.0691 | 41.9219   | 250.455  |
| 3.0142        | 6.05  | 17000 | 3.2708          | 46.0287 | 17.8079 | 24.2916 | 42.2369   | 245.6204 |
| 3.0312        | 6.23  | 17500 | 3.2701          | 45.5731 | 17.5404 | 24.0925 | 41.7584   | 236.2234 |
| 3.0231        | 6.41  | 18000 | 3.2719          | 46.1094 | 17.7117 | 24.1117 | 42.2882   | 260.1686 |
| 3.0414        | 6.58  | 18500 | 3.2703          | 45.9178 | 17.6987 | 24.1882 | 42.1382   | 245.0961 |
| 3.0434        | 6.76  | 19000 | 3.2715          | 46.0129 | 17.7545 | 24.2235 | 42.245    | 247.8225 |
| 3.0456        | 6.94  | 19500 | 3.2682          | 45.8634 | 17.6462 | 24.1366 | 42.1194   | 256.9835 |
| 3.0188        | 7.12  | 20000 | 3.2752          | 45.8366 | 17.6771 | 24.165  | 42.0438   | 240.1866 |
| 3.0227        | 7.3   | 20500 | 3.2722          | 46.0509 | 17.8248 | 24.2389 | 42.2681   | 245.8337 |
| 2.9895        | 7.47  | 21000 | 3.2726          | 45.7896 | 17.5833 | 24.1226 | 42.016    | 243.867  |
| 3.0146        | 7.65  | 21500 | 3.2693          | 46.0179 | 17.6952 | 24.2204 | 42.2436   | 244.0598 |
| 3.014         | 7.83  | 22000 | 3.2708          | 46.0704 | 17.75   | 24.2308 | 42.2591   | 240.4804 |
| 3.0427        | 8.01  | 22500 | 3.2734          | 46.0662 | 17.7231 | 24.1915 | 42.2227   | 242.4203 |
| 2.9835        | 8.19  | 23000 | 3.2740          | 46.165  | 17.8947 | 24.366  | 42.3521   | 236.6266 |
| 2.987         | 8.36  | 23500 | 3.2719          | 45.9025 | 17.7625 | 24.2432 | 42.1257   | 238.479  |
| 2.9922        | 8.54  | 24000 | 3.2731          | 46.1971 | 17.7962 | 24.2279 | 42.3853   | 245.2081 |
| 2.9788        | 8.72  | 24500 | 3.2718          | 46.0806 | 17.8417 | 24.3261 | 42.264    | 240.1747 |
| 2.9878        | 8.9   | 25000 | 3.2715          | 46.0618 | 17.7725 | 24.2234 | 42.2574   | 242.5598 |
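
The Epoch and Step columns can be cross-checked against the total training batch size of 16. The sketch below estimates the number of optimizer steps per epoch, and from it the approximate number of training examples seen per epoch. The helper name `steps_per_epoch` is hypothetical, and because the Epoch column is rounded to two decimals the result is only an estimate.

```python
TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list

# (step, epoch) pairs copied from three rows of the table
checkpoints = [(5000, 1.78), (15000, 5.34), (25000, 8.9)]


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimated optimizer steps in one pass over the training set."""
    return step / epoch


if __name__ == "__main__":
    for step, epoch in checkpoints:
        spe = steps_per_epoch(step, epoch)
        # Each optimizer step consumes TOTAL_TRAIN_BATCH_SIZE examples.
        print(step, round(spe), round(spe * TOTAL_TRAIN_BATCH_SIZE))
```

All three rows give roughly 2,809 steps per epoch, which corresponds to about 45 thousand training examples per epoch and to just under nine epochs over the 25,000 training steps.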


### Framework versions

- Transformers 4.20.0.dev0
- Pytorch 1.11.0
- Datasets 2.2.2
- Tokenizers 0.12.1