pszemraj/pegasus-x-large-book-summary

Get SparkNotes-esque summaries of arbitrary text! Because the model is large, it's recommended to try it out in Colab (linked above), as the API textbox may time out.

This model is a fine-tuned version of google/pegasus-x-large on the kmfoda/booksum dataset, trained for approximately eight epochs.
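
As a rough starting point, the model can be called through the `transformers` summarization pipeline. The sketch below is a minimal example, not a tuned recipe: the helper names `load_summarizer` and `summarize` are hypothetical, and the generation settings (`max_length=256`, `no_repeat_ngram_size=3`) are illustrative assumptions rather than values taken from this card.

```python
MODEL_NAME = "pszemraj/pegasus-x-large-book-summary"


def load_summarizer(model_name=MODEL_NAME):
    """Build a summarization pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=model_name)


def summarize(text, summarizer=None, **generate_kwargs):
    """Return a summary string for `text`.

    `summarizer` is any callable with the pipeline's interface; passing one in
    avoids reloading the model on every call.
    """
    if summarizer is None:
        summarizer = load_summarizer()
    # Illustrative defaults (assumptions, not values from the model card).
    settings = {"truncation": True, "max_length": 256, "no_repeat_ngram_size": 3}
    settings.update(generate_kwargs)
    result = summarizer(text, **settings)
    return result[0]["summary_text"]
```

Typical use would be `summarizer = load_summarizer()` once, followed by `summarize(long_text, summarizer=summarizer)` for each document. Loading is slow because of the model size, so reusing the pipeline object is worthwhile.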

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

Epochs 1-4

TODO

Epochs 5 & 6

The following hyperparameters were used during training:

learning_rate: 6e-05
train_batch_size: 4
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 32
total_train_batch_size: 128
optimizer: ADAN using lucidrains' adan-pytorch with default betas
lr_scheduler_type: constant_with_warmup
data type: TF32
num_epochs: 2
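
The relationship between the batch-size entries above can be checked with a small sketch. The card does not list a device count, so `num_devices` below is an assumption; the listed total of 128 is consistent with a value of 1, but that is an inference from the arithmetic, not something the card states.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of samples contributing to each optimizer step.

    num_devices is an assumed parameter: the model card does not report it.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values from the list above: train_batch_size=4, gradient_accumulation_steps=32.
total = effective_batch_size(train_batch_size=4, gradient_accumulation_steps=32)
print(total)  # 128, matching total_train_batch_size
```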

Epochs 7 & 8

Epochs 5 & 6 were trained with an input length of 12288 tokens.
These 2 epochs fix that by training with an input length of 16384 tokens.

The following hyperparameters were used during training:

learning_rate: 0.0004
train_batch_size: 4
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 16
total_train_batch_size: 64
optimizer: ADAN using lucidrains' adan-pytorch with default betas
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
num_epochs: 2
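
To make the `cosine` scheduler with `lr_scheduler_warmup_ratio: 0.03` concrete, here is a minimal pure-Python sketch of a linear-warmup, cosine-decay schedule. It approximates the general shape of such a schedule and is not claimed to reproduce the exact training run: the total step count is not given in this card, so `total_steps=1000` in the usage lines is a hypothetical value, and the function name is made up for illustration.

```python
import math


def cosine_lr_with_warmup(step, total_steps, peak_lr=4e-4, warmup_ratio=0.03):
    """Learning rate at `step`: linear warmup to peak_lr, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length; the real number of optimizer steps is not in the card.
total_steps = 1000
for step in (0, 15, 30, 500, 1000):
    print(step, cosine_lr_with_warmup(step, total_steps))
```

With a 3% warmup, the rate reaches the peak of 0.0004 early in training and then decays smoothly toward zero by the final step.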

Framework versions

Transformers 4.22.0
PyTorch 1.11.0a0+17540c5
Datasets 2.4.0
Tokenizers 0.12.1


Safetensors

Model size: 569M params
Tensor type: F32
