Errors when quanting Llama 3 based models (#62, opened 4 days ago by YorkieOH10)
How to use it as an API (#57, opened 13 days ago by tushar310, 6 comments)
Split/shard support (#38, opened about 1 month ago by phymbert, 3 comments)
Add IQ Quantization support with the help of imatrix and GPUs (#35, opened about 1 month ago by qnixsynapse, 3 comments)
Test responsiveness when queue is busy (#34, opened about 1 month ago by pcuenq)
Maybe impose a max model size? (#33, opened about 1 month ago by pcuenq, 2 comments)