How much memory is needed for the 128k context length?
#13 opened 3 days ago by ggbondcxk
Implement MLA inference optimizations to DeepseekV2Attention
#12 opened 12 days ago by sy-chen
Join LMSYS Chatbot Arena?
1
#11 opened 12 days ago by Light4Bear
Can you provide a sample code for training with DeepSpeed ZeRO3?
#10 opened 25 days ago by SupercarryNg
Ollama support
1
#9 opened 25 days ago by Dao3
MoE offloading strategy?
2
#8 opened 27 days ago by Minami-su
Update README.md
#7 opened 28 days ago by VanishingPsychopath
kv cache
1
#6 opened about 1 month ago by FrankWu
function/tool calling support
6
#5 opened about 1 month ago by kaijietti
fail to run the example
8
#4 opened about 1 month ago by Leymore
GPTQ plz
9
#3 opened about 1 month ago by xuchen123
vllm support
5
#2 opened about 1 month ago by Sihangli
llama.cpp support
5
#1 opened about 1 month ago by cpumaxx