Serving with TGI or vLLM?

1
#3 opened 6 months ago by kno10

Only use one GPU?

2
#2 opened 7 months ago by jgbrblmd

persist dequantized model

1
#1 opened 7 months ago by nudelbrot