pszemraj committed
Commit 9b33f3e
1 Parent(s): 5e6d18e

add details on use cases

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -69,7 +69,8 @@ inference:
 
  # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
 
- - an 'upgraded' version of [`pszemraj/led-base-16384-finetuned-booksum`](https://huggingface.co/pszemraj/led-base-16384-finetuned-booksum), it was trained for an additional epoch with a max summary length of 1024 tokens (original was trained with 512) as a small portion of the summaries are between 512-1024 tokens long.
+ - **Use cases:** long narrative summarization (think stories - as the dataset intended), article/paper/textbook/other summarization, technical:simple summarization. Models trained on this dataset tend to also _explain_ what they are summarizing, which IMO is awesome.
+ - This is an 'upgraded' version of [`pszemraj/led-base-16384-finetuned-booksum`](https://huggingface.co/pszemraj/led-base-16384-finetuned-booksum), it was trained for an additional epoch with a max summary length of 1024 tokens (original was trained with 512) as a small portion of the summaries are between 512-1024 tokens long.
  - all the parameters for generation on the API are the same for easy comparison between versions.
  - works well on lots of text, can handle 16384 tokens/batch.
 
@@ -83,7 +84,7 @@ inference:
  # Usage - Basics
 
  - it is recommended to use `encoder_no_repeat_ngram_size=3` when calling the pipeline object to improve summary quality.
- - this param forces the model to use new vocabulary and create an abstractive summary, otherwise it may compile the best _extractive_ summary from the input provided.
+ - this param forces the model to use new vocabulary and create an abstractive summary otherwise it may compile the best _extractive_ summary from the input provided.
  - create the pipeline object:
 
  ```