Edit model card
YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

This model is developed with transformers v4.13 with minor patch in this fork.

Setup

git clone https://github.com/vuiseng9/transformers
cd transformers
git checkout pegasus-v4p13 && git reset --hard 3db4b452
# installation, set summarization dependency
# . . .

Train

#!/usr/bin/env bash

export CUDA_VISIBLE_DEVICES=0,1 # 2 cards on xsum

NEPOCH=10
RUNID=pegasus-xsum-${NEPOCH}eph-run1
OUTDIR=/data1/vchua/pegasus-hf4p13/pegasus/${RUNID}
mkdir -p $OUTDIR

nohup python run_summarization.py \
    --model_name_or_path google/pegasus-large \
    --dataset_name xsum \
    --do_train \
    --adafactor \
    --learning_rate 1e-4 \
    --label_smoothing_factor 0.1 \
    --num_train_epochs $NEPOCH \
    --per_device_train_batch_size 8 \
    --do_eval \
    --per_device_eval_batch_size 8 \
    --num_beams 8 \
    --max_source_length 512 \
    --max_target_length 64 \
    --evaluation_strategy steps \
    --eval_steps 1000 \
    --save_strategy steps \
    --save_steps 2000 \
    --logging_steps 1 \
    --overwrite_output_dir \
    --run_name $RUNID \
    --output_dir $OUTDIR > $OUTDIR/run.log 2>&1

Eval

#!/usr/bin/env bash

export CUDA_VISIBLE_DEVICES=3

DT=$(date +%F_%H-%M)
RUNID=pegasus-xsum-${DT}
OUTDIR=/data1/vchua/pegasus-hf4p13/pegasus-test/${RUNID}
mkdir -p $OUTDIR

nohup python run_summarization.py \
    --model_name_or_path vuiseng9/pegasus-xsum \
    --dataset_name xsum \
    --max_source_length 512 \
    --max_target_length 64 \
    --do_predict \
    --per_device_eval_batch_size 16 \
    --predict_with_generate \
    --num_beams 8 \
    --overwrite_output_dir \
    --run_name $RUNID \
    --output_dir $OUTDIR > $OUTDIR/run.log 2>&1 &

Although fine-tuning is carried out for 10 epochs, this model is the checkpoint (@62000 steps, 4.9epoch, 20hrs) with lower loss during training. Test/predict with this checkpoint should give results below.

***** predict metrics *****
  predict_gen_len            =    24.0499
  predict_loss               =     1.5801
  predict_rouge1             =    47.2124
  predict_rouge2             =    24.3673
  predict_rougeL             =    39.0055
  predict_rougeLsum          =    39.0007
  predict_runtime            = 0:34:23.32
  predict_samples            =      11334
  predict_samples_per_second =      5.493
  predict_steps_per_second   =      0.344
Downloads last month
5
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.