pszemraj
/

pegasus-large-book-summary

text2text-generation

Model card Files Files and versions

checkpoints

This model is a fine-tuned version of google/pegasus-large on the booksum dataset.

Model description

More information needed

Intended uses & limitations

standard pegasus has a max input length of 1024 tokens, therefore the model only saw the first 1024 tokens of a chapter when training, and learned to try to make the chapter's summary from that. Keep this in mind when using this model, as information at the end of a text sequence longer than 1024 tokens may be excluded from the final summary/the model will be biased towards information presented first.
this was only trained on the dataset for an epoch but still provides reasonable results.

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 16
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 16
total_train_batch_size: 256
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine_with_restarts
lr_scheduler_warmup_ratio: 0.03
num_epochs: 1

Training results

Framework versions

Transformers 4.16.1
Pytorch 1.10.0+cu111
Datasets 1.18.2
Tokenizers 0.10.3

Downloads last month: 22

Safetensors

Model size

569M params

Tensor type

F32

·

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Dataset used to train pszemraj/pegasus-large-book-summary