Set "use_cache": true for faster generation
#2 opened by srowen
On the related Dolly v2 model, we found that `use_cache` had been set to false during training, but it really should be true for faster generation. Switching it back to true worked well there, and the same change could be applied to all of these similar models.
Ex: https://huggingface.co/databricks/dolly-v2-12b/commit/a7077365ca9caa324d6fdda760e953f2f75fac54
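
For anyone who wants to make the same change on a local copy of a model, here is a minimal sketch. It assumes the setting lives in a JSON `config.json` as in the linked commit; the helper name `enable_use_cache` is just illustrative and not part of any library.

```python
import json
from pathlib import Path


def enable_use_cache(config_path):
    """Set "use_cache" to true in a JSON config file.

    Hypothetical helper for illustration. Returns True if the file was
    rewritten, False if "use_cache" was already true.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Nothing to do if the flag is already enabled.
    if config.get("use_cache") is True:
        return False

    config["use_cache"] = True
    # Write the config back with two-space indentation and a trailing newline.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

If the model is already loaded with `transformers`, setting `model.config.use_cache = True` or passing `use_cache=True` to `generate()` should have the same effect without touching the file.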