pszemraj committed
Commit c74dcb6 (parent: 205ea6f)

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -30,6 +30,7 @@ A summary of the [infamous navy seals copypasta](https://knowyourmeme.com/memes/
 
 > In this chapter, the monster explains how he intends to exact revenge on "the little b****" who insulted him. He tells the kiddo that he is a highly trained and experienced killer who will use his arsenal of weapons--including his access to the internet--to exact justice on the little brat.
 
+While a somewhat crude example, try running this copypasta through other summarization models to see the difference in comprehension (_despite it not even being a "long" text!_)
 
 ---
 
@@ -72,14 +73,14 @@ Pass [other parameters related to beam search textgen](https://huggingface.co/bl
 
 While this model seems to improve upon factual consistency, **do not take summaries to be foolproof and check things that seem odd**.
 
-Specifically: negation statements (i.e. model says: _This thing does not have [ATTRIBUTE]_ where instead it should have said _This thing has a lot of [ATTRIBUTE]_).
+Specifically: negation statements (i.e., model says: _This thing does not have [ATTRIBUTE]_ where instead it should have said _This thing has a lot of [ATTRIBUTE]_).
 - I'm sure someone will write a paper on this eventually (if there isn't one already), but you can usually fact-check this by comparing a specific claim to what the surrounding sentences imply.
 
 ### Training and evaluation data
 
 `kmfoda/booksum` dataset on HuggingFace - read [the original paper here](https://arxiv.org/abs/2105.08209).
 
-- **Initial fine-tuning** only used input text with 12288 tokens input or less and 1024 tokens output or less (_i.e. rows with longer were dropped prior to train_) for memory reasons. Per brief analysis, summaries in the 12288-16384 range in this dataset are in the **small** minority
+- **Initial fine-tuning** only used input text with 12288 tokens input or less and 1024 tokens output or less (_i.e. rows with longer were dropped before training_) for memory reasons. Per brief analysis, summaries in the 12288-16384 range in this dataset are in the **small** minority
 - In addition, this initial training combined the training and validation sets and trained on these in aggregate to increase the functional dataset size. **Therefore, take the validation set results with a grain of salt; primary metrics should be (always) the test set.**
 - **final phases of fine-tuning** used the standard conventions of 16384 input/1024 output keeping everything (truncating longer sequences). This did not appear to change the loss/performance much.
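To make the two training-data bullets in the second hunk concrete, here is a minimal sketch of that kind of length filtering plus train/validation pooling. It is a hedged sketch under assumptions, not the author's actual preprocessing script: the booksum column names (`chapter`, `summary_text`) and the `google/long-t5-tglobal-base` tokenizer are assumptions, since the commit names neither.

```python
# Hedged sketch, not the author's actual preprocessing script.
# Assumptions: booksum's source/target columns are named "chapter" and
# "summary_text", and google/long-t5-tglobal-base is a reasonable stand-in
# for the tokenizer used during fine-tuning.
from datasets import concatenate_datasets, load_dataset
from transformers import AutoTokenizer

MAX_INPUT_TOKENS = 12288   # rows with longer source text were dropped
MAX_OUTPUT_TOKENS = 1024   # rows with longer summaries were dropped

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")

# Pool train + validation, as the README describes for the initial phase.
train = load_dataset("kmfoda/booksum", split="train")
val = load_dataset("kmfoda/booksum", split="validation")
pooled = concatenate_datasets([train, val])

def within_limits(row):
    """Keep only rows whose tokenized lengths fit the memory budget."""
    n_in = len(tokenizer(row["chapter"]).input_ids)
    n_out = len(tokenizer(row["summary_text"]).input_ids)
    return n_in <= MAX_INPUT_TOKENS and n_out <= MAX_OUTPUT_TOKENS

filtered = pooled.filter(within_limits)
print(f"kept {len(filtered)} of {len(pooled)} rows under the length cutoffs")
```

Because validation rows end up in the training pool here, only the untouched test split gives an unbiased evaluation, which is why the README tells readers to trust test-set metrics over validation-set ones.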
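The second hunk's context line refers to passing beam-search text-generation parameters. Below is a minimal usage sketch with the `transformers` summarization pipeline; the checkpoint id is a stand-in (the diff does not name the fine-tuned repo) and the parameter values are illustrative, not the README's recommended settings.

```python
# Hedged usage sketch: beam-search generation parameters passed through
# the summarization pipeline are forwarded to model.generate().
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="google/long-t5-tglobal-base",  # stand-in; substitute the fine-tuned repo id
)

long_text = (
    "In this chapter, the monster explains how he intends to exact revenge "
    "on the one who insulted him, detailing his training and his arsenal."
)

result = summarizer(
    long_text,
    max_length=128,           # upper bound on generated summary tokens
    min_length=8,
    num_beams=4,              # beam-search width
    early_stopping=True,      # stop once all beams are finished
    no_repeat_ngram_size=3,   # discourage verbatim repetition
    repetition_penalty=2.5,
)
print(result[0]["summary_text"])
```

Wider beams trade speed for quality; larger `num_beams` values explore more candidate summaries per step at a roughly proportional cost in generation time.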