Update README.md
README.md CHANGED
````diff
@@ -173,13 +173,15 @@ long_text = "Here is a lot of text I don't want to read. Replace me"
 result = summarizer(long_text)
 print(result[0]["summary_text"])
 ```
-###
+### Beyond the basics
 
-
+There are two points to consider beyond simple inference: adjusting decoding parameters for higher-quality output, and quantization so the model devours less memory.
+
+#### Adjusting parameters
 
 Pass [other parameters related to beam search textgen](https://huggingface.co/blog/how-to-generate) when calling `summarizer` to get even higher quality results.
 
-
+#### LLM.int8 Quantization
 
 > alternate section title: how to get this monster to run inference on free Colab runtimes
 
@@ -211,6 +213,8 @@ model = AutoModelForSeq2SeqLM.from_pretrained(
 )
 ```
 
+The above is already included in the Colab demo linked at the top of the model card.
+
 Do you love to ask questions? Awesome. But first, check out the [how LLM.int8 works blog post](https://huggingface.co/blog/hf-bitsandbytes-integration) by Hugging Face.
 
 \* A more rigorous, metric-based comparison of beam-search summarization with and without LLM.int8 will be added over time.
````
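The "Adjusting parameters" subsection added above can be made concrete with a minimal sketch of passing beam-search generation parameters through the `summarizer` pipeline call. The checkpoint name is a placeholder and the parameter values are illustrative assumptions, not this card's tuned recommendations:

```python
# Generation kwargs forwarded by the pipeline to model.generate().
# Values are example assumptions, not tuned recommendations.
generation_kwargs = {
    "num_beams": 4,             # wider beam search: higher quality, slower
    "no_repeat_ngram_size": 3,  # forbid repeating any trigram in the output
    "early_stopping": True,     # stop once all beams have finished
}

if __name__ == "__main__":
    from transformers import pipeline

    # Placeholder checkpoint -- substitute the model this card describes.
    summarizer = pipeline("summarization", model="your-org/your-summarization-model")
    long_text = "Here is a lot of text I don't want to read. Replace me"
    result = summarizer(long_text, **generation_kwargs)
    print(result[0]["summary_text"])
```

Any of these keyword arguments can be varied independently; the linked how-to-generate blog post explains the quality/speed trade-offs of each.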
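Likewise, the LLM.int8 section's `from_pretrained(...)` call can be sketched as follows. This is a hedged example, not the card's exact code: the checkpoint name is a placeholder, and `load_in_8bit`/`device_map` follow the usage described in the linked bitsandbytes integration blog post:

```python
# Kwargs for 8-bit loading via bitsandbytes (LLM.int8),
# per the linked integration blog post.
int8_kwargs = {
    "load_in_8bit": True,  # store weights in int8; matmuls run in mixed precision
    "device_map": "auto",  # let accelerate place layers on available devices
}

if __name__ == "__main__":
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Placeholder checkpoint -- substitute the model this card describes.
    checkpoint = "your-org/your-summarization-model"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, **int8_kwargs)
```

With 8-bit weights the memory footprint drops roughly by half versus fp16, which is what makes inference feasible on free Colab runtimes.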