pszemraj committed
Commit 4da593a
1 Parent(s): 09eda8e

peterszemraj@gmail.com

Files changed (1)
  1. README.md +9 -3
README.md CHANGED
@@ -62,7 +62,8 @@ inference:
 
 # checkpoints
 
-This model is a fine-tuned version of [pszemraj/pegasus-large-book-summary](https://huggingface.co/pszemraj/pegasus-large-book-summary) on an unknown dataset.
+This model is a fine-tuned version of [google/pegasus-large](https://huggingface.co/google/pegasus-large) on the [booksum](https://github.com/salesforce/booksum) dataset for four total epochs.
+
 It achieves the following results on the evaluation set:
 - eval_loss: 1.1193
 - eval_runtime: 6.6754
@@ -71,13 +72,18 @@ It achieves the following results on the evaluation set:
 - epoch: 3.0
 - step: 900
 
+A 1-epoch checkpoint can be found at [pszemraj/pegasus-large-book-summary](https://huggingface.co/pszemraj/pegasus-large-book-summary), which is where this second training session started.
+
 ## Model description
 
-More information needed
+- After some initial tests, models trained on the [booksum](https://github.com/salesforce/booksum) dataset appear to inherit the SparkNotes style of its reference summaries, so the user gets a shorter _and_ easier-to-understand version of the text rather than one that is merely more compact.
+- Anecdotally, this quality is favourable for learning/comprehension, because summarization datasets that simply make the information more compact (* cough * arXiv) can be so dense that the time spent trying to _comprehend_ the summary is about the same as reading the original material.
+
 ## Intended uses & limitations
 
-More information needed
+- Standard Pegasus has a maximum input length of 1024 tokens, so during training the model only saw the first 1024 tokens of each chapter and learned to write the chapter's summary from those. Keep this in mind when using the model: information past the first 1024 tokens of a long input may be excluded from the final summary, and the model is biased toward information presented first.
+- This model was only trained on the dataset for a few epochs but still provides reasonable results.
 
 ## Training and evaluation data
 
 
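For illustration, here is a minimal sketch of how the 1024-token input limit described in the updated README plays out at inference time with `transformers`. The checkpoint id below is the 1-epoch model linked in the diff; the id of the repo this commit belongs to is not shown on this page, so substitute it if that is the model you want. The generation settings are assumptions for the sketch, not values from the model card.

```python
# Minimal sketch: loading a Pegasus booksum checkpoint and summarizing a long
# text. Because Pegasus accepts at most 1024 input tokens, everything past
# that point is truncated away, which is why summaries are biased toward the
# beginning of the input.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumption: using the 1-epoch checkpoint linked in the README diff; swap in
# the id of the checkpoint this commit updates as appropriate.
checkpoint = "pszemraj/pegasus-large-book-summary"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

long_text = "..."  # placeholder: e.g., the full text of a book chapter

# Tokens beyond max_length=1024 are dropped here, mirroring what the model saw
# during training (only the first 1024 tokens of each chapter).
inputs = tokenizer(long_text, truncation=True, max_length=1024, return_tensors="pt")

# Illustrative generation settings, not taken from the model card.
summary_ids = model.generate(
    **inputs,
    num_beams=4,
    max_length=256,
    no_repeat_ngram_size=3,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```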