
GPT-2 fine-tuned on the CNN/DailyMail (CNN/DM) summarization dataset.

Training args:
{ "learning_rate": 0.0001
"logging_steps": 5000
"lr_scheduler_type": "cosine"
"num_train_epochs": 2
"per_device_train_batch_size": 12, # Total batch size: 36
"weight_decay": 0.1
}
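
These arguments map directly onto 🤗 Transformers `TrainingArguments`. A minimal sketch, assuming the standard `Trainer` API was used; the `output_dir` is a placeholder, not taken from this card:

```python
# Minimal sketch, assuming the standard 🤗 Transformers Trainer was used;
# output_dir is a placeholder, not taken from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-cnn-dm-summarization",  # assumed
    learning_rate=1e-4,
    logging_steps=5000,
    lr_scheduler_type="cosine",
    num_train_epochs=2,
    per_device_train_batch_size=12,  # total batch size 36 across devices
    weight_decay=0.1,
)
```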

{"generation_kwargs": {"do_sample": true, "max_new_tokens": 100, "min_length": 50}

Pre-processing: articles are truncated to their first 500 tokens. Post-processing: only the first three sentences of the generated text are kept as the summary.
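
A minimal sketch of these two steps; the GPT-2 tokenizer for truncation and NLTK's sentence tokenizer for the three-sentence cut are assumed choices:

```python
# Sketch of the described pre-/post-processing; NLTK's sentence tokenizer is
# one assumed way to split sentences (run nltk.download("punkt") once).
import nltk
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def truncate_article(article: str, max_tokens: int = 500) -> str:
    """Pre-processing: keep only the first 500 tokens of the article."""
    ids = tokenizer(article, truncation=True, max_length=max_tokens)["input_ids"]
    return tokenizer.decode(ids, skip_special_tokens=True)

def first_sentences(generated: str, n: int = 3) -> str:
    """Post-processing: treat only the first three sentences as the summary."""
    return " ".join(nltk.sent_tokenize(generated)[:n])
```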

Test split metrics:

| Metric     | Value  |
|------------|--------|
| METEOR     | 0.2562 |
| ROUGE-1    | 0.3755 |
| ROUGE-2    | 0.1553 |
| ROUGE-L    | 0.2581 |
| ROUGE-Lsum | 0.3489 |
| BLEU       | 0.0929 |
| BERTScore  | 0.8757 |
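
Metrics of this kind can be computed with the 🤗 `evaluate` library; in the sketch below, `predictions` and `references` are placeholders for the model's actual test-split outputs:

```python
# Sketch of metric computation with the 🤗 evaluate library; predictions and
# references are placeholders for the real test-split outputs.
import evaluate

predictions = ["a generated summary ..."]   # placeholder
references = ["the reference summary ..."]  # placeholder

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

scores = rouge.compute(predictions=predictions, references=references)
scores.update(meteor.compute(predictions=predictions, references=references))
scores["bleu"] = bleu.compute(predictions=predictions, references=references)["bleu"]
bs = bertscore.compute(predictions=predictions, references=references, lang="en")
scores["bertscore_f1"] = sum(bs["f1"]) / len(bs["f1"])  # average per-example F1
print(scores)
```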
