Why is "use_cache" disabled by default in the generation_config.json?

#10
by jdpressman - opened
```json
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.35.2",
  "use_cache": false
}
```

This confused me as a new user because my inference got slower with each generated token. Is there a specific reason why it's disabled?
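For anyone else hitting this, here is a rough sketch of why the slowdown happens. It is a toy cost model, not a measurement of this model: the `tokens_processed` helper and the prompt/output lengths are hypothetical, and it assumes that without a key/value cache the whole sequence is re-encoded at every step. It uses only the standard library and the config shown above.

```python
import json

# The generation_config.json from this post, copied verbatim.
GENERATION_CONFIG = """{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.35.2",
  "use_cache": false
}"""


def tokens_processed(prompt_len, new_tokens, use_cache):
    """Count token positions pushed through the model (hypothetical cost model)."""
    total = 0
    for step in range(new_tokens):
        seq_len = prompt_len + step
        if use_cache and step > 0:
            # Earlier keys/values are reused, so only the newest token is processed.
            total += 1
        else:
            # No cache (or the first step): the whole sequence is encoded.
            total += seq_len
    return total


config = json.loads(GENERATION_CONFIG)

# With the shipped setting ("use_cache": false), per-step work grows with length.
without_cache = tokens_processed(10, 100, config["use_cache"])

# Override the setting locally and compare.
config["use_cache"] = True
with_cache = tokens_processed(10, 100, config["use_cache"])

print(f"without cache: {without_cache} token positions")
print(f"with cache:    {with_cache} token positions")
```

In `transformers`, the setting can be overridden per call by passing `use_cache=True` to `model.generate(...)`, which takes precedence over the value in `generation_config.json`. Whether the author disabled it deliberately is still the open question.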
