pszemraj committed on
Commit 1bb84f7
1 Parent(s): 25f8f89

Update README.md

Files changed (1): README.md +7 -3
README.md CHANGED
@@ -173,13 +173,15 @@ long_text = "Here is a lot of text I don't want to read. Replace me"
 result = summarizer(long_text)
 print(result[0]["summary_text"])
 ```
-### beyond the basics
+### Beyond the basics
 
-### decoding performance
+There are two additional points to consider beyond simple inference: adjusting decoding parameters for improved performance, and quantization for decreased memory devouring.
+
+#### Adjusting parameters
 
 Pass [other parameters related to beam search textgen](https://huggingface.co/blog/how-to-generate) when calling `summarizer` to get even higher quality results.
 
-### LLM.int8 Quantization
+#### LLM.int8 Quantization
 
 > alternate section title: how to get this monster to run inference on free Colab runtimes
 
@@ -211,6 +213,8 @@ model = AutoModelForSeq2SeqLM.from_pretrained(
 )
 ```
 
+The above is already present in the Colab demo linked at the top of the model card.
+
 Do you love to ask questions? Awesome. But first, check out the [how LLM.int8 works blog post](https://huggingface.co/blog/hf-bitsandbytes-integration) by huggingface.
 
 \* More rigorous metric-based investigation into comparing beam-search summarization with and without LLM.int8 will take place over time.
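For the "Adjusting parameters" heading this commit introduces, a short sketch of how those beam-search arguments are passed may help. The parameter names below are standard `transformers` generation arguments; the specific values are illustrative assumptions, not tuned recommendations from the model card:

```python
# Illustrative beam-search settings for the summarization pipeline.
# The parameter names are standard transformers generation arguments;
# the values here are assumptions for demonstration only.
gen_kwargs = {
    "num_beams": 4,             # wider beam search -> better summaries, slower decoding
    "no_repeat_ngram_size": 3,  # block verbatim 3-gram repetition in the output
    "early_stopping": True,     # stop once every beam has produced an end token
}

# With the pipeline from the model card, the call would look like:
# result = summarizer(long_text, **gen_kwargs)
# print(result[0]["summary_text"])
```

The linked how-to-generate blog post covers what each argument controls and the quality/speed trade-offs involved.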
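Likewise, for the LLM.int8 section: the `from_pretrained` call in the diff has its arguments elided, but the load-time flags look roughly like the sketch below. `MODEL_ID` is a placeholder (the actual checkpoint name is not shown in the diff context), and really loading with these flags requires a GPU runtime with `bitsandbytes` and `accelerate` installed:

```python
# Sketch of LLM.int8 load-time kwargs. MODEL_ID is a placeholder; actually
# calling from_pretrained with these flags requires a GPU runtime with the
# bitsandbytes and accelerate packages installed.
MODEL_ID = "<checkpoint-id-from-this-model-card>"

int8_kwargs = {
    "load_in_8bit": True,   # quantize linear-layer weights to int8 at load time
    "device_map": "auto",   # let accelerate place weights on available devices
}

# from transformers import AutoModelForSeq2SeqLM
# model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID, **int8_kwargs)
```

The linked LLM.int8 blog post explains the mixed-precision decomposition behind `load_in_8bit`.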