---
license: apache-2.0
---

# GPT-2 fine-tuned on the CNN/DailyMail summarization dataset

Training arguments:

```json
{
  "learning_rate": 0.0001,
  "logging_steps": 5000,
  "lr_scheduler_type": "cosine",
  "num_train_epochs": 2,
  "per_device_train_batch_size": 12,
  "weight_decay": 0.1
}
```

The per-device train batch size of 12 corresponds to a total batch size of 36.

Generation arguments:

```json
{
  "generation_kwargs": {
    "do_sample": true,
    "max_new_tokens": 100,
    "min_length": 50
  }
}
```

Pre-processing truncates each article to its first 500 tokens. Post-processing keeps only the first three sentences of the generated text as the summary.

Test split metrics:

| Metric     | Value               |
|------------|---------------------|
| METEOR     | 0.2562237219960531  |
| ROUGE-1    | 0.3754558158439447  |
| ROUGE-2    | 0.15532626375157227 |
| ROUGE-L    | 0.25813023509572597 |
| ROUGE-Lsum | 0.3489472885043494  |
| BLEU       | 0.09285941365815623 |
| BERTScore  | 0.87570951795246    |
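A minimal inference sketch that mirrors the reported pre-processing (500-token truncation), generation kwargs, and post-processing (keeping the first three sentences). `MODEL_ID` is a placeholder for this checkpoint's repository id, and the sentence splitter is an assumption, since the exact post-processing code is not shown in this card.

```python
import re
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

MODEL_ID = "gpt2-cnn-dailymail"  # placeholder: replace with the actual repo id
tokenizer = GPT2TokenizerFast.from_pretrained(MODEL_ID)
model = GPT2LMHeadModel.from_pretrained(MODEL_ID)

article = "(news article text here)"

# Pre-processing: truncate the article to its first 500 tokens.
inputs = tokenizer(article, truncation=True, max_length=500, return_tensors="pt")

# Generate with the kwargs reported above.
output_ids = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=100,
    min_length=50,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens (GPT-2 echoes its prompt).
generated = tokenizer.decode(
    output_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Post-processing: keep only the first three sentences as the summary.
sentences = re.split(r"(?<=[.!?])\s+", generated.strip())
summary = " ".join(sentences[:3])
print(summary)
```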
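For reference, the training arguments above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the original training script: `output_dir` is hypothetical, and `gradient_accumulation_steps=3` is an assumption, since a total batch size of 36 with a per-device batch size of 12 implies a factor of 3 from gradient accumulation, multiple devices, or both.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-cnn-dailymail",  # hypothetical output path
    learning_rate=1e-4,
    logging_steps=5000,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
    per_device_train_batch_size=12,
    gradient_accumulation_steps=3,  # assumed: 12 * 3 = 36 total batch size
    weight_decay=0.1,
)
```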
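The reported test-split metrics could be recomputed with the Hugging Face `evaluate` library along these lines, assuming `predictions` and `references` are lists of generated and reference summaries for the test split; the original evaluation script is not shown here, so treat this as a sketch.

```python
import evaluate

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

results = {
    **rouge.compute(predictions=predictions, references=references),
    **meteor.compute(predictions=predictions, references=references),
    "bleu": bleu.compute(predictions=predictions, references=references)["bleu"],
}

# BERTScore returns per-example F1 scores; report the mean.
bs = bertscore.compute(predictions=predictions, references=references, lang="en")
results["bert_score"] = sum(bs["f1"]) / len(bs["f1"])
print(results)
```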