
GPT-2 fine-tuned on the CNN/DailyMail summarization dataset.

Training args:
{
  "learning_rate": 0.0001,
  "logging_steps": 5000,
  "lr_scheduler_type": "cosine",
  "num_train_epochs": 2,
  "per_device_train_batch_size": 12,
  "weight_decay": 0.1
}
(Effective total batch size: 36.)

Generation args:
{"generation_kwargs": {"do_sample": true, "max_new_tokens": 100, "min_length": 50}}

Pre-processing: each article is truncated to its first 500 tokens. Post-processing: only the first three sentences of the generated text are kept as the summary.
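The pre- and post-processing steps above can be sketched as two small helpers. This is a minimal illustration, not the actual pipeline: the function names are hypothetical, and whitespace tokenization stands in for the GPT-2 tokenizer's token count in the truncation step.

```python
import re

def truncate_article(article: str, max_tokens: int = 500) -> str:
    # Pre-processing sketch: keep only the first `max_tokens` tokens.
    # NOTE: whitespace splitting is an approximation; the real pipeline
    # presumably counted GPT-2 tokenizer tokens.
    return " ".join(article.split()[:max_tokens])

def first_n_sentences(generated: str, n: int = 3) -> str:
    # Post-processing sketch: keep only the first `n` sentences,
    # splitting on sentence-final punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", generated.strip())
    return " ".join(sentences[:n])
```

For example, `first_n_sentences("A. B. C. D.")` keeps only `"A. B. C."`.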

Test split metrics:

Meteor: 0.2562237219960531
Rouge1: 0.3754558158439447
Rouge2: 0.15532626375157227
RougeL: 0.25813023509572597
RougeLsum: 0.3489472885043494
BLEU: 0.09285941365815623
BERTScore: 0.87570951795246
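For reference, ROUGE-1 measures unigram overlap between a generated summary and the reference. The sketch below is a simplified, pure-Python illustration of the F1 variant; the scores above were presumably computed with an official implementation, which also applies stemming and other normalization this sketch omits.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    # Simplified ROUGE-1 F1: clipped unigram overlap, lowercased,
    # whitespace-tokenized (no stemming, unlike standard ROUGE tooling).
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

E.g. for candidate "the cat sat" against reference "the cat sat on the mat", precision is 1.0 and recall is 0.5, giving an F1 of 2/3.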
