Set "use_cache": true for faster generation

#2
by srowen - opened

On the related Dolly v2 model, we found that use_cache had been set to false during training and was left that way afterwards, but it really should be true for faster generation. Setting it back to true worked well there, and the same change could be applied to all of these similar models.
Ex: https://huggingface.co/databricks/dolly-v2-12b/commit/a7077365ca9caa324d6fdda760e953f2f75fac54
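
For anyone who wants to apply the same fix to a local copy of a model, here is a minimal sketch. It assumes the flag lives in the model's config.json (as the title suggests) and uses a hypothetical helper name, enable_use_cache; it is not taken from the linked commit.

```python
import json
from pathlib import Path


def enable_use_cache(config_path):
    """Set "use_cache": true in a model config file.

    Hypothetical helper for illustration. Returns True if the file was
    changed, False if use_cache was already true.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    if config.get("use_cache") is True:
        return False  # already enabled, leave the file untouched

    config["use_cache"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    # Assumed location of a locally downloaded model's config file.
    config_file = Path("config.json")
    if config_file.exists():
        changed = enable_use_cache(config_file)
        print("updated" if changed else "already enabled")
```

If editing the file is not an option, passing use_cache=True to generate() in transformers should have the same effect for a single call.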

Ready to merge
This branch is ready to get merged automatically.
