Upload tinyllama-1.1b-chat-v1.0.Q4_1.gguf

#3 opened by jbochi

I've been working on adding GGUF support to MLX, and Q4_1 seems like the format most aligned with MLX's quantization scheme. Its quantization error is also slightly lower than Q4_0's (tested with gguf-tools).
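For anyone curious why Q4_1 lines up with MLX: both are affine schemes that store a per-group scale and bias, i.e. x ≈ d·q + m with a 4-bit q, whereas Q4_0 stores only a scale. Below is a minimal numpy sketch of that idea, assuming the ggml block size of 32; the layout is simplified (real Q4_1 packs two 4-bit values per byte and stores d and m as fp16), and the function names are just for illustration:

```python
import numpy as np

BLOCK = 32  # ggml Q4_1 block size

def q4_1_quantize(x):
    """Affine 4-bit quantization per block: x ~= d * q + m (Q4_1-style)."""
    blocks = x.reshape(-1, BLOCK)
    m = blocks.min(axis=1, keepdims=True)             # per-block minimum (bias)
    d = (blocks.max(axis=1, keepdims=True) - m) / 15  # scale over the 4-bit range
    d = np.where(d == 0, 1.0, d)                      # guard against flat blocks
    q = np.clip(np.round((blocks - m) / d), 0, 15).astype(np.uint8)
    return q, d, m

def q4_1_dequantize(q, d, m):
    """Reconstruct approximate weights from quants, scales, and minimums."""
    return q * d + m

# Measure reconstruction error on random weights
x = np.random.randn(1024, 256).astype(np.float32)
q, d, m = q4_1_quantize(x)
x_hat = q4_1_dequantize(q, d, m).reshape(x.shape)
print("Q4_1 RMSE:", np.sqrt(np.mean((x - x_hat) ** 2)))
```

The per-block minimum m is the piece Q4_0 lacks, which is why Q4_1 generally reconstructs weights with lower error for the same 4 bits, and why it maps cleanly onto MLX's scale-and-bias quantization.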

Ready to merge
This branch is ready to be merged automatically.
