pszemraj/pegasus-large-summary-explain
This model is a fine-tuned version of google/pegasus-large on the booksum dataset for four total epochs.
It achieves the following results on the evaluation set:
- eval_loss: 1.1193
- eval_runtime: 6.6754
- eval_samples_per_second: 27.714
- eval_steps_per_second: 1.798
- epoch: 3.0
- step: 900
A 1-epoch checkpoint is available at pszemraj/pegasus-large-book-summary; it served as the starting point for the second training session.
Model description
- After some initial tests, it appears that models trained on the booksum dataset inherit the SparkNotes-style explanatory tone of its summaries, so the user gets a shorter, easier-to-understand version of the text rather than one that is merely more compact.
- This quality is (anecdotally) favourable for learning and comprehension: summaries from datasets that simply compress the information (*cough* arXiv) can be so dense that understanding them takes roughly as long as reading the original material.
Intended uses & limitations
- Standard PEGASUS has a maximum input length of 1024 tokens, so during training the model only saw the first 1024 tokens of each chapter and learned to produce the chapter's summary from that. Keep this in mind when using the model: for inputs longer than 1024 tokens, information near the end may be excluded from the summary, and the model will be biased towards information presented first. A minimal usage sketch follows below.
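A minimal usage sketch (not taken from the original card): it truncates the input to the 1024-token limit described above, and the generation settings (beam count, output length) are illustrative assumptions rather than the values used during training.

```python
# Minimal sketch: summarize a long passage with the model, truncating the
# input to PEGASUS' 1024-token limit. Generation settings are illustrative.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/pegasus-large-summary-explain"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "..."  # a chapter or other long passage
inputs = tokenizer(text, truncation=True, max_length=1024, return_tensors="pt")
summary_ids = model.generate(**inputs, num_beams=4, max_length=512)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```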
Training and evaluation data
The model was fine-tuned on the kmfoda/booksum dataset; see the model description above for more detail.
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them to Seq2SeqTrainingArguments follows the list):
- learning_rate: 4e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 4
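A minimal sketch of how these values might map onto transformers' Seq2SeqTrainingArguments, assuming the standard seq2seq fine-tuning setup; the output directory is hypothetical and the rest of the training script wiring is omitted.

```python
# Sketch only: maps the hyperparameters listed above onto Seq2SeqTrainingArguments.
# The Adam betas/epsilon listed above are the transformers defaults, so they
# need no explicit arguments here.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="pegasus-large-summary-explain",  # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # gives the total train batch size of 32 listed above
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    seed=42,
)
```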
Framework versions
- Transformers 4.16.2
- Pytorch 1.10.2+cu113
- Datasets 1.18.3
- Tokenizers 0.11.0
Dataset used to train pszemraj/pegasus-large-summary-explain
- kmfoda/booksum
Evaluation results
The following scores were obtained on the kmfoda/booksum test set (verified); a sketch of how such scores might be recomputed follows the list:
- ROUGE-1: 29.102
- ROUGE-2: 6.244
- ROUGE-L: 14.750
- ROUGE-LSUM: 27.238
- loss: 2.979
- gen_len: 467.269
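A minimal sketch (not part of the original card) of recomputing ROUGE scores with the evaluate library. The column names "chapter" and "summary_text" are assumptions about the kmfoda/booksum schema, and only a small slice of the test split is scored for illustration.

```python
# Sketch only: score a few test examples with ROUGE via the evaluate library.
# Column names ("chapter", "summary_text") are assumptions about the dataset schema.
import evaluate
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline("summarization", model="pszemraj/pegasus-large-summary-explain")
rouge = evaluate.load("rouge")

test = load_dataset("kmfoda/booksum", split="test").select(range(8))  # small slice for illustration
predictions = [summarizer(row["chapter"], truncation=True)[0]["summary_text"] for row in test]
references = [row["summary_text"] for row in test]
print(rouge.compute(predictions=predictions, references=references))
```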