How many tokens per second when using DeepSeek-V2 (236B) as the inference model on 8*A100?
#7 by harvin-cn - opened
I want to deploy on 8*A100, but inference speed is a consideration. I can't find the inference speed of DeepSeek-V2 (236B) in the README. Thanks!
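Not a benchmark, but a quick back-of-envelope check of whether the weights even fit on 8 A100s may help frame the question. The sketch below assumes BF16 weights, 80GB A100s, and DeepSeek-V2's published size of 236B total parameters (it is an MoE, so only a fraction of those are activated per token); all numbers are rough assumptions, not measured throughput.

```python
# Rough memory feasibility check for DeepSeek-V2 on 8 x A100-80GB.
# Assumptions (not measurements): 236B total params, BF16 weights.

total_params = 236e9      # total parameters, all MoE experts included
bytes_per_param = 2       # BF16/FP16
num_gpus = 8
gpu_mem_gb = 80           # A100-80GB; the 40GB variant would not fit

weight_gb = total_params * bytes_per_param / 1e9
per_gpu_gb = weight_gb / num_gpus

print(f"weights: {weight_gb:.0f} GB total, {per_gpu_gb:.0f} GB per GPU")
print(f"headroom per GPU: {gpu_mem_gb - per_gpu_gb:.0f} GB for KV cache/activations")
```

With BF16 weights this leaves only about 20 GB per GPU for KV cache and activations, so batch size and context length will constrain throughput heavily; actual tokens/second would still need to be measured on the deployed setup.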