Fix for slow speed

#20
by CyberTimon - opened

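You need to set use_cache to true in config.json. It matters a lot for generation speed: without the key/value cache, every new token recomputes attention over the whole sequence so far. Enabling it fixes the slow generation.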

Thanks for pointing this out. I've set it to true by default now.
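
For anyone working from an older local copy of the model, here is a minimal sketch of the config change discussed above. It only uses the Python standard library, and it assumes config.json is a plain JSON object with a top-level use_cache key, as in the usual Transformers layout. The helper name enable_use_cache and the example path are made up for illustration.

```python
import json
from pathlib import Path


def enable_use_cache(config_path):
    """Set "use_cache": true in a config.json file.

    Hypothetical helper, not part of any library.
    Returns True if the file was modified, False if it was already set.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Nothing to do if the cache is already enabled.
    if config.get("use_cache") is True:
        return False

    config["use_cache"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


# Example call (the path is an assumption, adjust it to your local copy):
# enable_use_cache("path/to/model/config.json")
```

If you would rather not edit the file, passing use_cache=True to model.generate(...) in Transformers should have the same effect for that call.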
