Slow generation

by darkandpure - opened Oct 12, 2023

Oct 12, 2023

I was trying a prompt in which I feed in a codebase, and I found that generation takes a very long time on a GPU with 22 GB of memory, whereas Mistral took much less time. What is the reason for this, and what can I do to get better latency?

lewtun

Hugging Face H4 org Oct 13, 2023

Hello @darkandpure can you please share a code snippet of what you're running for inference, along with the tokens / s you're getting? It would also be useful to know what hardware you're running on. Thank you!
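
For anyone unsure how to report tokens / s, here is a minimal sketch of one way to measure it. It is not taken from this thread: the helper name `measure_tokens_per_second` and the zero-argument `generate` callable are hypothetical, and the callable is assumed to return the full sequence of output token ids (prompt plus completion).

```python
import time
from typing import Callable, Sequence


def measure_tokens_per_second(
    generate: Callable[[], Sequence[int]], prompt_len: int
) -> float:
    """Time one call to `generate` and return newly generated tokens per second.

    `generate` is a hypothetical zero-argument callable that returns the full
    output token ids (prompt + completion). `prompt_len` is the number of
    prompt tokens, which are excluded from the count.
    """
    start = time.perf_counter()
    output_ids = generate()
    elapsed = time.perf_counter() - start

    # Count only the tokens produced by the model, not the prompt itself.
    new_tokens = len(output_ids) - prompt_len

    # Guard against a zero elapsed time for trivially fast callables.
    return new_tokens / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Stand-in for a real model call: pretend to take 50 ms and to return
    # 10 prompt tokens followed by 100 generated tokens.
    def fake_generate() -> list:
        time.sleep(0.05)
        return list(range(110))

    rate = measure_tokens_per_second(fake_generate, prompt_len=10)
    print(f"{rate:.1f} tokens / s")
```

With transformers, the callable could be something like `lambda: model.generate(**inputs, max_new_tokens=256)[0]` with `prompt_len = inputs["input_ids"].shape[1]`, assuming `model` and `inputs` are already loaded on the GPU. Timings on a GPU tend to be more reliable if the callable also waits for the device to finish, for example with `torch.cuda.synchronize()`, and if a warm-up call is made first.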
