How many tokens per second does DeepSeek-V2 (236B) reach as an inference model on 8*A100?
#7 opened 22 days ago
by
harvin-cn
Can DeepSeek-V2 run on two nodes (each with 4 A100)?
#5 opened 25 days ago
by
jy395
Calculation of _mscale during YARN RoPE scaling
1
#4 opened about 1 month ago
by
sszymczyk
KeyError: 'sdpa'
1
#3 opened about 1 month ago
by
fengzi258
Smaller Models
1
#2 opened about 1 month ago
by
puffy310
KV Cache for compress_kv or key-value states
5
#1 opened about 1 month ago
by
House-99