How many tokens per second with DeepSeek-V2 (236B) as the inference model on 8×A100?
#7 opened 5 days ago by harvin-cn
Can DeepSeek-V2 run on two nodes (each with 4 A100)?
#5 opened 8 days ago by jy395
Calculation of _mscale during YARN RoPE scaling
1
#4 opened 17 days ago by sszymczyk
KeyError: 'sdpa'
1
#3 opened 26 days ago by fengzi258
Smaller Models
1
#2 opened 26 days ago by puffy310
KV Cache for compress_kv or key-value states
5
#1 opened 27 days ago by House-99