Use ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ with vLLM instead (see the loading sketch after this list) · 1 reply · #10 opened 10 months ago by blobpenguin
Inference taking too much time · 3 replies · #9 opened 12 months ago by tariksetia
Update README.md · #8 opened 12 months ago by skoita
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0 · 2 replies · #7 opened about 1 year ago by aaganaie
TGI - response is an empty string · 2 replies · #6 opened about 1 year ago by p-christ
OC is not a multiple of cta_N = 64 · 2 replies · #5 opened about 1 year ago by lazyDataScientist
Not supported with TGI · 1 reply · #4 opened about 1 year ago by abhishek3jangid
Always getting 0 in output · 15 replies · #3 opened about 1 year ago by xubuild
OOM under vLLM even with 80GB GPU · 5 replies · #2 opened about 1 year ago by mike-ravkine
Not supported for TGI > 1.3? · 20 replies · #1 opened about 1 year ago by paulcx
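
The suggestion in thread #10 is concrete enough to sketch. Below is a minimal, untested example of loading the AWQ-quantized checkpoint with vLLM; the model ID comes from the thread title, while the `quantization` flag, dtype, and sampling settings are illustrative assumptions, not values taken from the discussion.

```python
# Minimal sketch (assumed settings): serve the AWQ checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ",  # model ID from thread #10
    quantization="awq",  # tell vLLM the weights are AWQ-quantized
    dtype="half",        # AWQ kernels run with fp16 activations
)

params = SamplingParams(temperature=0.7, max_tokens=256)  # illustrative values
outputs = llm.generate(["[INST] Summarize AWQ in one sentence. [/INST]"], params)
print(outputs[0].outputs[0].text)
```

The `[INST] ... [/INST]` wrapper follows the Mixtral-Instruct prompt format; without it, instruct-tuned checkpoints often produce weak completions.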