Paper-Summarization-ArXiv

This model is a fine-tuned version of google/pegasus-x-base on the arxiv-summarization dataset.

Base Model: Pegasus-x-base (State-of-the-art for Long Context Summarization)

Finetuning Dataset:

  • We used the full ArXiv dataset (Cohan et al., 2018, NAACL-HLT 2018) [PDF]
    • (The full dataset contains more than 200,000 examples)

GPU: 1 x RTX A6000

Train time: About 120 hours for 5 epochs

Test time: About 8 hours on the test set

Intended uses & limitations

  • Research Paper Summarization
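
A minimal usage sketch for this intended use, assuming the checkpoint is loaded from the Hub under the repository name UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv. The helper name `summarize` and the `max_input_tokens` default are illustrative assumptions, not values stated in this card. The decoding settings mirror the 2-beam configuration reported in the comparison section; `top_k` and `top_p` are left out because, as far as we know, they only influence sampling and `do_sample` is never enabled in the reported calls.

```python
MODEL_ID = "UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv"

# Beam-search settings taken from the 2-beam result reported in this card.
# top_k / top_p are omitted: they should only matter when do_sample=True.
GENERATION_KWARGS = dict(
    length_penalty=1,
    num_beams=2,
    max_length=128 * 4,
    min_length=150,
    no_repeat_ngram_size=3,
)


def summarize(text, max_input_tokens=4096, device=None):
    """Summarize one paper body. max_input_tokens is an assumed default."""
    # Imported lazily so the settings above can be inspected without torch.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID).to(device)
    model.eval()

    # Truncate long papers to the chosen input budget.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"].to(device),
            attention_mask=inputs["attention_mask"].to(device),
            **GENERATION_KWARGS,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(paper_text)` downloads the checkpoint on first use. For repeated calls it is probably better to load the tokenizer and model once and reuse them.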

Comparison with Baseline

  • Pegasus-X-base zero-shot performance:

    • R-1 | R-2 | R-L | R-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
  • This model:

    • R-1 | R-2 | R-L | R-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399 with
    model.generate(input_ids=inputs["input_ids"].to(device),
                   attention_mask=inputs["attention_mask"].to(device),
                   length_penalty=1, num_beams=2, max_length=128*4, min_length=150,
                   no_repeat_ngram_size=3, top_k=25, top_p=0.95)

    • R-1 | R-2 | R-L | R-LSUM : 40.8486 | 16.3717 | 25.2937 | 33.6923 (see the PEGASUS-X paper) with
    model.generate(input_ids=inputs["input_ids"].to(device),
                   attention_mask=inputs["attention_mask"].to(device),
                   length_penalty=1, num_beams=1, max_length=128*2, top_p=1)

    • R-1 | R-2 | R-L | R-LSUM : 38.1317 | 15.0357 | 23.0286 | 30.9938 (diverse beam search decoding) with
    model.generate(input_ids=inputs["input_ids"].to(device),
                   attention_mask=inputs["attention_mask"].to(device),
                   num_beam_groups=5, diversity_penalty=1.0, num_beams=5,
                   min_length=150, max_length=128*4)

    • R-1 | R-2 | R-L | R-LSUM : 43.3017 | 16.6023 | 24.1867 | 33.7019 with
    model.generate(input_ids=inputs["input_ids"].to(device),
                   attention_mask=inputs["attention_mask"].to(device),
                   length_penalty=1.2, num_beams=4, max_length=128*4, min_length=150,
                   no_repeat_ngram_size=3, temperature=0.9, top_k=50, top_p=0.92)
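
To make the four decoding configurations above easier to compare, the sketch below collects them in one table together with their reported scores. The names `DECODING_RUNS`, `deterministic_kwargs`, and `best_run` are hypothetical helpers introduced only for illustration. One caveat, stated as our understanding of the Transformers generation API rather than something this card confirms: `temperature`, `top_k`, and `top_p` are sampling parameters, and none of the calls sets `do_sample=True`, so they likely had no effect on the reported numbers.

```python
# Keyword arguments and ROUGE scores (R-1, R-2, R-L, R-LSUM) as reported above.
DECODING_RUNS = {
    "beam2": {
        "kwargs": dict(length_penalty=1, num_beams=2, max_length=128 * 4,
                       min_length=150, no_repeat_ngram_size=3,
                       top_k=25, top_p=0.95),
        "rouge": (43.2305, 16.6571, 24.4315, 33.9399),
    },
    "greedy": {
        "kwargs": dict(length_penalty=1, num_beams=1, max_length=128 * 2,
                       top_p=1),
        "rouge": (40.8486, 16.3717, 25.2937, 33.6923),
    },
    "diverse_beam5": {
        "kwargs": dict(num_beam_groups=5, diversity_penalty=1.0, num_beams=5,
                       min_length=150, max_length=128 * 4),
        "rouge": (38.1317, 15.0357, 23.0286, 30.9938),
    },
    "beam4": {
        "kwargs": dict(length_penalty=1.2, num_beams=4, max_length=128 * 4,
                       min_length=150, no_repeat_ngram_size=3,
                       temperature=0.9, top_k=50, top_p=0.92),
        "rouge": (43.3017, 16.6023, 24.1867, 33.7019),
    },
}

# Assumption: these keys only matter when do_sample=True.
SAMPLING_ONLY = {"temperature", "top_k", "top_p"}


def deterministic_kwargs(kwargs):
    """Drop sampling-only keys, leaving the beam/greedy search settings."""
    return {k: v for k, v in kwargs.items() if k not in SAMPLING_ONLY}


def best_run(metric_index):
    """Name of the run with the highest score for the given metric column."""
    return max(DECODING_RUNS, key=lambda n: DECODING_RUNS[n]["rouge"][metric_index])


for name, run in DECODING_RUNS.items():
    print(name, run["rouge"], deterministic_kwargs(run["kwargs"]))
```

Under this reading, the 4-beam run has the best R-1, the 2-beam run the best R-2 and R-LSUM, and greedy decoding the best R-L, so the choice of configuration depends on which metric matters most.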
     
    

Training procedure

We trained in a Hugging Face environment, using the datasets library, the Trainer API, and related tooling.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1586
  • num_epochs: 5
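
The hyperparameters above imply two derived quantities: the effective batch size and the learning rate at each optimizer step. The sketch below works them out in plain Python, assuming the usual "linear warmup, then linear decay to zero" shape for a linear scheduler and taking the total of 15860 steps from the training results table. The function names are illustrative, and the single-GPU count comes from the hardware note earlier in this card.

```python
BASE_LR = 1e-05
WARMUP_STEPS = 1586
TOTAL_STEPS = 15860  # last step listed in the training results
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 64
NUM_GPUS = 1


def effective_batch_size():
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_GPUS


def linear_lr(step):
    """Learning rate at a given optimizer step under warmup + linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay linearly from BASE_LR to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


print("effective batch size:", effective_batch_size())
for step in (0, 793, 1586, 8723, 15860):
    print(step, linear_lr(step))
```

With these numbers the warmup covers the first 10% of training (1586 of 15860 steps), which is half of the first epoch.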

Training results

Training Loss | Epoch | Step  | Validation Loss
2.6153        | 1.0   | 3172  | 2.1045
2.202         | 2.0   | 6344  | 2.0511
2.1547        | 3.0   | 9516  | 2.0282
2.132         | 4.0   | 12688 | 2.0164
2.1222        | 5.0   | 15860 | 2.0127
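
As a quick sanity check on the table above, the sketch below relates the step counts to the dataset size and measures how much the validation loss improves per epoch. It assumes every optimizer step consumes exactly 64 examples (the total_train_batch_size listed earlier), so the per-epoch example count is approximate if a final partial batch is dropped. All variable names are illustrative.

```python
# (epoch, step, training_loss, validation_loss) rows from the table above.
RESULTS = [
    (1, 3172, 2.6153, 2.1045),
    (2, 6344, 2.2020, 2.0511),
    (3, 9516, 2.1547, 2.0282),
    (4, 12688, 2.1320, 2.0164),
    (5, 15860, 2.1222, 2.0127),
]
TOTAL_TRAIN_BATCH_SIZE = 64

steps_per_epoch = RESULTS[0][1]
# Approximate number of training examples seen in one epoch.
examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE


def validation_improvements(rows):
    """Drop in validation loss from each epoch to the next."""
    return [
        (cur[0], round(prev[3] - cur[3], 4))
        for prev, cur in zip(rows, rows[1:])
    ]


print("examples per epoch:", examples_per_epoch)
for epoch, gain in validation_improvements(RESULTS):
    print(f"epoch {epoch}: validation loss improved by {gain}")
```

The per-epoch example count of about 203,000 agrees with the "more than 200,000" dataset size, and the shrinking improvements suggest that training was close to converged after five epochs.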

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.0.1
  • Datasets 2.12.0
  • Tokenizers 0.13.2