Set "use_cache": true for faster generation
We found that on the related Dolly v2 model, use_cache had been set to false during training, but it should be true for faster generation (it lets generate() reuse past key/value states instead of recomputing them each step). This worked well there, and the same change could be applied to all of these similar models.
Ex: https://huggingface.co/databricks/dolly-v2-12b/commit/a7077365ca9caa324d6fdda760e953f2f75fac54
config.json CHANGED (+1 -1)

@@ -41,6 +41,6 @@
   },
   "torch_dtype": "float16",
   "transformers_version": "4.25.1",
-  "use_cache": false,
+  "use_cache": true,
   "vocab_size": 50260
 }
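The one-line change above can also be made programmatically. A minimal sketch using only the standard library, assuming a local copy of the model's config.json with the fields shown in the diff (the inline JSON here is illustrative, not the full file):

```python
import json

# Illustrative stand-in for the model's config.json (assumption:
# the real file has many more fields than shown here).
raw = '{"torch_dtype": "float16", "use_cache": false, "vocab_size": 50260}'

config = json.loads(raw)
config["use_cache"] = True  # enable the KV cache for faster generation
print(json.dumps(config, indent=2))
```

After writing the updated dict back to disk, models loaded from that config will default to cached generation without needing `use_cache=True` passed at every `generate()` call.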