---
tags:
- generated_from_trainer
datasets:
- multi_news
metrics:
- rouge
model-index:
- name: Centrum-multinews
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: multi_news
      type: multi_news
      args: default
    metrics:
    - name: Rouge1
      type: rouge
      value: 46.2987
---

# Centrum-multinews

This model is a fine-tuned version of [Centrum](https://huggingface.co/ratishsp/Centrum) on the multi_news dataset. The details of the model are mentioned in the preprint [Multi-Document Summarization with Centroid-Based Pretraining](https://arxiv.org/abs/2208.01006) (Ratish Puduppully and Mark Steedman).
It achieves the following results on the evaluation set:
- Loss: 3.2740
- Rouge1: 46.2987
- Rouge2: 18.4863
- Rougel: 24.2428
- Rougelsum: 42.5102
- Gen Len: 308.6606

## Model description

The script for training and inference of Centrum-multinews is available on https://github.com/ratishsp/centrum

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 1
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2500
- training_steps: 25000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| 3.2702        | 1.78  | 5000  | 3.2853          | 44.0203 | 16.6061 | 23.3846 | 40.3853   | 277.1855 |
| 3.2762        | 1.96  | 5500  | 3.2853          | 44.725  | 16.9262 | 23.475  | 41.0003   | 288.4173 |
| 3.2114        | 2.14  | 6000  | 3.2857          | 44.6456 | 17.0245 | 23.7328 | 40.9131   | 257.2761 |
| 3.1981        | 2.31  | 6500  | 3.2817          | 44.7869 | 17.0849 | 23.8372 | 41.0669   | 254.8618 |
| 3.2298        | 2.49  | 7000  | 3.2802          | 45.2657 | 17.2618 | 23.8204 | 41.5807   | 263.0854 |
| 3.2167        | 2.67  | 7500  | 3.2773          | 44.9516 | 17.0538 | 23.7894 | 41.1673   | 244.6939 |
| 3.2069        | 2.85  | 8000  | 3.2712          | 45.2153 | 17.2766 | 23.9883 | 41.4558   | 245.4036 |
| 3.1822        | 3.02  | 8500  | 3.2786          | 45.4747 | 17.6754 | 24.1878 | 41.7304   | 254.6624 |
| 3.1529        | 3.2   | 9000  | 3.2740          | 44.9033 | 17.1386 | 23.8511 | 41.177    | 246.0157 |
| 3.1407        | 3.38  | 9500  | 3.2704          | 45.1045 | 17.2335 | 23.9124 | 41.3243   | 243.4922 |
| 3.1376        | 3.56  | 10000 | 3.2721          | 45.2694 | 17.4797 | 24.1072 | 41.5441   | 243.8396 |
| 3.1545        | 3.74  | 10500 | 3.2720          | 45.3105 | 17.6338 | 24.1547 | 41.5731   | 231.1805 |
| 3.1307        | 3.91  | 11000 | 3.2684          | 45.4309 | 17.2665 | 23.8954 | 41.6518   | 250.1039 |
| 3.1022        | 4.09  | 11500 | 3.2719          | 45.1959 | 17.4017 | 24.056  | 41.5363   | 242.5923 |
| 3.1139        | 4.27  | 12000 | 3.2711          | 45.3864 | 17.4653 | 24.028  | 41.6797   | 240.5701 |
| 3.0978        | 4.45  | 12500 | 3.2722          | 45.5694 | 17.501  | 24.1452 | 41.7894   | 232.1149 |
| 3.1082        | 4.63  | 13000 | 3.2687          | 45.504  | 17.5137 | 24.1067 | 41.7686   | 245.1845 |
| 3.1059        | 4.8   | 13500 | 3.2686          | 45.3603 | 17.1619 | 23.8655 | 41.5953   | 248.6327 |
| 3.1141        | 4.98  | 14000 | 3.2658          | 45.2741 | 17.3814 | 24.0377 | 41.5263   | 234.0194 |
| 3.0294        | 5.16  | 14500 | 3.2716          | 45.7203 | 17.5962 | 24.1367 | 41.9119   | 244.4207 |
| 3.0613        | 5.34  | 15000 | 3.2697          | 45.775  | 17.6959 | 24.1867 | 42.0018   | 242.0381 |
| 3.0549        | 5.52  | 15500 | 3.2703          | 45.8193 | 17.686  | 24.1997 | 42.0109   | 242.5493 |
| 3.0725        | 5.69  | 16000 | 3.2655          | 45.3515 | 17.3438 | 24.0586 | 41.6126   | 240.2812 |
| 3.0728        | 5.87  | 16500 | 3.2671          | 45.6791 | 17.5028 | 24.0691 | 41.9219   | 250.455  |
| 3.0142        | 6.05  | 17000 | 3.2708          | 46.0287 | 17.8079 | 24.2916 | 42.2369   | 245.6204 |
| 3.0312        | 6.23  | 17500 | 3.2701          | 45.5731 | 17.5404 | 24.0925 | 41.7584   | 236.2234 |
| 3.0231        | 6.41  | 18000 | 3.2719          | 46.1094 | 17.7117 | 24.1117 | 42.2882   | 260.1686 |
| 3.0414        | 6.58  | 18500 | 3.2703          | 45.9178 | 17.6987 | 24.1882 | 42.1382   | 245.0961 |
| 3.0434        | 6.76  | 19000 | 3.2715          | 46.0129 | 17.7545 | 24.2235 | 42.245    | 247.8225 |
| 3.0456        | 6.94  | 19500 | 3.2682          | 45.8634 | 17.6462 | 24.1366 | 42.1194   | 256.9835 |
| 3.0188        | 7.12  | 20000 | 3.2752          | 45.8366 | 17.6771 | 24.165  | 42.0438   | 240.1866 |
| 3.0227        | 7.3   | 20500 | 3.2722          | 46.0509 | 17.8248 | 24.2389 | 42.2681   | 245.8337 |
| 2.9895        | 7.47  | 21000 | 3.2726          | 45.7896 | 17.5833 | 24.1226 | 42.016    | 243.867  |
| 3.0146        | 7.65  | 21500 | 3.2693          | 46.0179 | 17.6952 | 24.2204 | 42.2436   | 244.0598 |
| 3.014         | 7.83  | 22000 | 3.2708          | 46.0704 | 17.75   | 24.2308 | 42.2591   | 240.4804 |
| 3.0427        | 8.01  | 22500 | 3.2734          | 46.0662 | 17.7231 | 24.1915 | 42.2227   | 242.4203 |
| 2.9835        | 8.19  | 23000 | 3.2740          | 46.165  | 17.8947 | 24.366  | 42.3521   | 236.6266 |
| 2.987         | 8.36  | 23500 | 3.2719          | 45.9025 | 17.7625 | 24.2432 | 42.1257   | 238.479  |
| 2.9922        | 8.54  | 24000 | 3.2731          | 46.1971 | 17.7962 | 24.2279 | 42.3853   | 245.2081 |
| 2.9788        | 8.72  | 24500 | 3.2718          | 46.0806 | 17.8417 | 24.3261 | 42.264    | 240.1747 |
| 2.9878        | 8.9   | 25000 | 3.2715          | 46.0618 | 17.7725 | 24.2234 | 42.2574   | 242.5598 |

### Framework versions

- Transformers 4.20.0.dev0
- Pytorch 1.11.0
- Datasets 2.2.2
- Tokenizers 0.12.1
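
As a rough illustration of how the values listed under "Training hyperparameters" fit together, the sketch below recomputes the total train batch size and evaluates a linear warmup-then-decay learning-rate schedule. It is a minimal, dependency-free approximation and not the training script from the repository. The function names `total_train_batch_size` and `linear_lr` are hypothetical, and the schedule shape is an assumption based on the usual meaning of `lr_scheduler_type: linear` with warmup steps.

```python
# Values copied from the "Training hyperparameters" list in this card.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 1           # per-device batch size
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 2500
TRAINING_STEPS = 25000


def total_train_batch_size(per_device: int, num_devices: int, grad_accum: int) -> int:
    """Hypothetical helper: examples consumed per optimizer update."""
    return per_device * num_devices * grad_accum


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TRAINING_STEPS) -> float:
    """Hypothetical helper: linear warmup from 0 to base_lr, then linear decay to 0.

    Assumed shape of a "linear" schedule with warmup; the real run may differ.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    # 1 * 4 * 4 = 16, matching total_train_batch_size in the list above.
    print(total_train_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES, GRADIENT_ACCUMULATION_STEPS))
    # Learning rate at the start, end of warmup, first logged step, and final step.
    for step in (0, 2500, 5000, 25000):
        print(step, linear_lr(step))
```

Under this assumed schedule, the learning rate peaks at step 2500, so every row of the training results table (steps 5000 to 25000) falls in the decay phase.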