pszemraj/pegasus-large-summary-explain

This model is a fine-tuned version of google/pegasus-large on the booksum dataset for four total epochs.

It achieves the following results on the evaluation set:

A 1-epoch checkpoint can be found at pszemraj/pegasus-large-book-summary, which is where the second training session started from.

Model description

After some initial tests, it was found that models trained on the booksum dataset seem to inherit the summaries' SparkNotes-style explanations; so the user gets a shorter and easier-to-understand version of the text instead of just more compact.
This quality (anecdotally) is favourable for learning/comprehension because summarization datasets that simply make the information more compact (* cough * arXiv) can be so dense that the overall time spent trying to comprehend what it is saying can be the same as just reading the original material.

standard pegasus has a max input length of 1024 tokens, therefore the model only saw the first 1024 tokens of a chapter when training, and learned to try to make the chapter's summary from that. Keep this in mind when using this model, as information at the end of a text sequence longer than 1024 tokens may be excluded from the final summary/the model will be biased towards information presented first.

More information needed

The following hyperparameters were used during training: