How many tokens per second when running DeepSeek-V2 (236B) as an inference model on 8×A100?

#7
by harvin-cn - opened

I want to deploy DeepSeek-V2 (236B) on 8×A100, but inference speed is a concern, and I can't find throughput numbers in the README. Thanks!

I tried it; I get about 30 tokens/s. @harvin-cn
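For anyone who wants to benchmark throughput on their own hardware, here is a minimal sketch. The timing helper is generic; the commented `model.generate` usage is an assumption about a typical Hugging Face `transformers` setup, not the exact setup used above:

```python
import time

def tokens_per_second(num_new_tokens: int, elapsed_s: float) -> float:
    """Throughput in generated tokens per second."""
    return num_new_tokens / elapsed_s

# Hypothetical usage with a loaded model/tokenizer (not run here):
#   start = time.perf_counter()
#   out = model.generate(**inputs, max_new_tokens=256)
#   elapsed = time.perf_counter() - start
#   new_tokens = out.shape[-1] - inputs["input_ids"].shape[-1]
#   print(f"{tokens_per_second(new_tokens, elapsed):.1f} tokens/s")

# Example with made-up numbers: 256 new tokens in 8.5 s
print(round(tokens_per_second(256, 8.5), 1))  # ≈ 30.1
```

Note that throughput depends heavily on batch size, sequence length, quantization, and the serving framework, so single-request numbers like the 30 tokens/s above are only a rough guide.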
