Fix the bug in generation config

#2

With the default generation configuration, the model does not use the key-value cache during inference, which makes text generation unnecessarily slow.
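Presumably the fix amounts to enabling `use_cache` in the repository's `generation_config.json`. A minimal sketch of the same effect at load time, assuming a standard transformers setup (the model id below is a placeholder, not the actual repo):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "aisingapore/model-name"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Explicitly enable the key-value cache so each decoding step reuses the
# attention keys/values from previous steps instead of recomputing them.
generation_config = GenerationConfig.from_pretrained(model_name)
generation_config.use_cache = True

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Without the cache, each new token requires recomputing attention over the entire sequence so far, so generation cost grows roughly quadratically with output length instead of linearly.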

AI Singapore org

Thank you very much for the fix.

RaymondAISG changed pull request status to merged
