How many tokens per second when using DeepSeek-V2 (236B) as the inference model on 8*A100?

#7
by harvin-cn - opened

I want to deploy DeepSeek-V2 (236B) on 8*A100 GPUs, but inference speed is an important consideration for me. I can't find the inference speed (tokens per second) in the README. Thanks!
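Since the README does not publish throughput numbers, one option is to measure tokens per second directly on the target hardware. Below is a minimal, hedged sketch: `generate_fn` and `dummy_generate` are hypothetical stand-ins, to be replaced with a real call into whatever serving stack is used (e.g. a vLLM or transformers `generate()` wrapper).

```python
import time

def measure_tokens_per_second(generate_fn, prompt: str) -> float:
    """Time one generation call and return throughput in generated tokens/s.

    generate_fn is assumed to take a prompt string and return the list of
    generated token ids (prompt tokens excluded).
    """
    start = time.perf_counter()
    output_tokens = generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return len(output_tokens) / elapsed

# Hypothetical stand-in generator for demonstration only; swap in a real
# model call when benchmarking on the 8*A100 machine.
def dummy_generate(prompt: str):
    time.sleep(0.01)               # simulate generation latency
    return list(range(128))        # pretend 128 tokens were generated

if __name__ == "__main__":
    tps = measure_tokens_per_second(dummy_generate, "Hello")
    print(f"{tps:.1f} tokens/s")
```

In practice, averaging over several runs (and warming up the model first) gives a more stable number, since the first call usually includes weight loading and CUDA graph/kernel compilation overhead.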
