Upload tinyllama-1.1b-chat-v1.0.Q4_1.gguf

#3 opened by jbochi

I've been working on adding GGUF support to MLX, and Q4_1 seems like the format most aligned with MLX's quantization scheme. Its quantization error is also slightly lower than Q4_0's (tested with gguf-tools).
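For anyone curious why Q4_1 lines up with MLX: both are affine schemes that store a per-group scale and bias, i.e. x ≈ d·q + m with a 4-bit q, whereas Q4_0 stores only a scale. Below is a minimal numpy sketch of that idea, assuming the ggml block size of 32; the layout is simplified (real Q4_1 packs two 4-bit values per byte and stores d and m as fp16), and the function names are just for illustration:

```python
import numpy as np

BLOCK = 32  # ggml Q4_1 block size

def q4_1_quantize(x):
    """Affine 4-bit quantization per block: x ~= d * q + m (Q4_1-style)."""
    blocks = x.reshape(-1, BLOCK)
    m = blocks.min(axis=1, keepdims=True)             # per-block minimum (bias)
    d = (blocks.max(axis=1, keepdims=True) - m) / 15  # scale over the 4-bit range
    d = np.where(d == 0, 1.0, d)                      # guard against flat blocks
    q = np.clip(np.round((blocks - m) / d), 0, 15).astype(np.uint8)
    return q, d, m

def q4_1_dequantize(q, d, m):
    """Reconstruct approximate weights from quants, scales, and minimums."""
    return q * d + m

# Measure reconstruction error on random weights
x = np.random.randn(1024, 256).astype(np.float32)
q, d, m = q4_1_quantize(x)
x_hat = q4_1_dequantize(q, d, m).reshape(x.shape)
print("Q4_1 RMSE:", np.sqrt(np.mean((x - x_hat) ** 2)))
```

The per-block minimum m is the piece Q4_0 lacks, which is why Q4_1 generally reconstructs weights with lower error for the same 4 bits, and why it maps cleanly onto MLX's scale-and-bias quantization.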

Ready to merge
This branch is ready to be merged automatically.
