Slower than standard Llama 8b?

#10
by Sijuade

I ran a speed test of this model compared to meta-llama/Meta-Llama-3-8B-Instruct, and it seems slower. Is it just me?
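For reference, a minimal tokens-per-second comparison along these lines could look like the sketch below (not the exact benchmark I used; the 1.58-bit repo id is an assumption, so swap in the model you are testing):

```python
# Rough tokens-per-second comparison sketch.
# Model ids are placeholders/assumptions -- replace with the repos you are comparing.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def tokens_per_second(model_id: str,
                      prompt: str = "Explain quantization in one paragraph.",
                      max_new_tokens: int = 128) -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Warm-up pass so one-time kernel setup doesn't skew the timing.
    model.generate(**inputs, max_new_tokens=8)

    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
    return new_tokens / elapsed

print("1.58-bit model:", tokens_per_second("HF1BitLLM/Llama3-8B-1.58-100B-tokens"))  # assumed repo id
print("fp16 baseline: ", tokens_per_second("meta-llama/Meta-Llama-3-8B-Instruct"))
```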

Hugging Face 1Bit LLMs org

Yes, you are right. You can look at this issue for context: https://github.com/huggingface/transformers/issues/34277
