pszemraj committed on
Commit
3381aec
1 Parent(s): 683bb6c

add details

Files changed (1)
  1. README.md +53 -6
README.md CHANGED
@@ -69,19 +69,66 @@ inference:
69
 
70
  # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
71
 
72
- This model is a fine-tuned version of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on an unknown dataset.
73
 
74
- ## Model description
75
 
76
- More information needed
77
 
78
- ## Intended uses & limitations
79
 
80
- More information needed
81
 
82
  ## Training and evaluation data
83
 
84
- More information needed
85
 
86
  ## Training procedure
87
 
69
 
70
  # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
71
 
72
+ - This model is a fine-tuned version of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on the booksum dataset.
73
+ - the goal was to create a model that generalizes well and is useful for summarizing long texts in academic and everyday use.
74
+ - all generation parameters on the inference API match those of [the base model](https://huggingface.co/pszemraj/led-base-book-summary) for easy comparison between versions.
75
+ - works well on long inputs and can handle up to 16384 tokens per batch.
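
The card does not say what to do with inputs longer than 16384 tokens. One possible approach, sketched here purely as an assumption, is to split the token IDs into windows that each fit the limit and summarize them separately. The helper name `split_into_windows` and the constant `MAX_TOKENS` are hypothetical, and the integer list stands in for real tokenizer output.

```python
from typing import List, Sequence

MAX_TOKENS = 16384  # input limit stated in the model card


def split_into_windows(
    token_ids: Sequence[int], max_tokens: int = MAX_TOKENS
) -> List[List[int]]:
    """Split token IDs into consecutive windows of at most `max_tokens` items."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [
        list(token_ids[i:i + max_tokens])
        for i in range(0, len(token_ids), max_tokens)
    ]


# Stand-in for tokenizer output: 40000 fake token IDs.
token_ids = list(range(40000))
windows = split_into_windows(token_ids)
print([len(w) for w in windows])  # [16384, 16384, 7232]
```

This naive split ignores sentence and chapter boundaries, so a real workflow would likely want to cut at natural breaks instead.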
76
 
77
+ ---
78
+
79
+ # Usage - Basics
80
+
81
+ - use `encoder_no_repeat_ngram_size=3` when calling the pipeline object to improve summary quality.
82
+ - this parameter forces the model to use new vocabulary and produce an abstractive summary; otherwise it may compile the best _extractive_ summary from the input provided.
83
+ - create the pipeline object:
84
+
85
+ ```
86
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
87
+ from transformers import pipeline
88
+
89
+ hf_name = 'pszemraj/led-base-book-summary'
90
+
91
+ _model = AutoModelForSeq2SeqLM.from_pretrained(
92
+     hf_name,
93
+     low_cpu_mem_usage=True,
94
+ )
95
+
96
+ _tokenizer = AutoTokenizer.from_pretrained(
97
+     hf_name
98
+ )
99
+
100
+
101
+ summarizer = pipeline(
102
+     "summarization",
103
+     model=_model,
104
+     tokenizer=_tokenizer
105
+ )
106
+ ```
107
+
108
+ - put words into the pipeline object:
109
+
110
+ ```
111
+ wall_of_text = "your words here"
112
 
113
+ result = summarizer(
114
+     wall_of_text,
115
+     min_length=16,
116
+     max_length=256,
117
+     no_repeat_ngram_size=3,
118
+     encoder_no_repeat_ngram_size=3,
119
+     clean_up_tokenization_spaces=True,
120
+     repetition_penalty=3.7,
121
+     num_beams=4,
122
+     early_stopping=True,
123
+ )
124
 
 
125
 
126
+ ```
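
To see what `encoder_no_repeat_ngram_size=3` is meant to prevent, the following rough sketch lists the 3-grams a summary shares verbatim with its source. It is an illustration only: the generation constraint works on token IDs, while this check uses whitespace-separated words, and the helper names `word_ngrams` and `copied_ngrams` are hypothetical.

```python
from __future__ import annotations


def word_ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of lowercase word-level n-grams in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def copied_ngrams(source: str, summary: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the n-grams of `summary` that also occur verbatim in `source`."""
    return word_ngrams(source, n) & word_ngrams(summary, n)


source = "the quick brown fox jumps over the lazy dog"
extractive = "the quick brown fox"
abstractive = "a fast fox leaps over a sleepy dog"

# The extractive summary reuses two 3-grams from the source.
print(len(copied_ngrams(source, extractive)))   # 2
# The abstractive summary shares no 3-gram with the source.
print(len(copied_ngrams(source, abstractive)))  # 0
```

An empty result for a generated summary suggests it is not copying three-word runs from the input, which is the behavior the parameter aims for.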
127
 
128
  ## Training and evaluation data
129
 
130
+ - the [booksum](https://arxiv.org/abs/2105.08209) dataset
131
+ - During training, the input was the chapter text and the target output was the corresponding summary.
132
 
133
  ## Training procedure
134