- GGUF version? (#13, opened 9 months ago by CeeGee)
- Using llama_cpp (#12, opened 10 months ago by axcelkuhn)
- Particularly Censored. (#11, opened 10 months ago by BingoBird · 1 reply)
- Weird responses from the LLM (#10, opened 11 months ago by PoyBoi)
- How to generate token by token? (#8, opened about 1 year ago by YaTharThShaRma999 · 2 replies)
- Question about which .bin file to use and quantization (#7, opened about 1 year ago by florestankorp · 4 replies)
- New k-quants formats (#6, opened about 1 year ago by mudler · 1 reply)
- GGML models become dumb when used in Python (#5, opened about 1 year ago by supercharge19 · 2 replies)
- New 8-bit quant method: how is it performing on your CPU? (share your tokens/s, CPU model, and `--threads` setting) (#2, opened about 1 year ago by alphaprime90 · 24 replies)