
I am still building the structure of these descriptions.

Over time they will contain more content to help you find the best model for a given purpose.

falcon-7b-instruct-grammar - GGUF

mzbac provides fine-tuned model variants specialized in grammar correction.

This is their Falcon version.

About GGUF format

GGUF is the current file format used by the ggml library. A growing list of software uses it and can therefore use this model. The core project making use of the ggml library is the llama.cpp project by Georgi Gerganov.
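As a quick sanity check before loading a downloaded file, you can confirm that it really is GGUF. The sketch below relies only on the fact that a GGUF file begins with the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number; the function name `read_gguf_version` is a hypothetical helper for illustration, not part of ggml or llama.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)  # 4 bytes magic + 4 bytes version
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # The version is stored as an unsigned 32-bit little-endian integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Calling `read_gguf_version("model.gguf")` (the path is a placeholder) returns the version number for a valid file and raises `ValueError` for anything else, for example an older ggml-format file.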

Quantization variants

Several quantized files are available. Here is how to choose the best one for you:

legacy quants

Q4_0, Q4_1, Q5_0, Q5_1 and Q8_0 are legacy quantization types. They are nevertheless fully supported, because several circumstances can make certain models incompatible with the modern K-quants. Falcon 7B models cannot be quantized to K-quants.

K-quants

K-quants are based on the idea that quantizing different parts of a model affects quality to different degrees. If you quantize some parts more aggressively and others less, you get either a more capable model at the same file size, or a smaller file and lower memory load at comparable quality. So, if possible, use K-quants. With Q6_K you should find it very hard to detect a quality difference from the original model: ask your model the same question twice and you may see bigger quality differences.
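The advice above can be turned into a small selection helper. This is only a sketch under stated assumptions: it treats a larger file as a proxy for higher quality, it recognizes K-quants purely by `_K` in the variant name, and the function `pick_quant` and the sizes in the usage example are hypothetical placeholders, not the actual sizes of the files in this repository.

```python
def pick_quant(files, budget_bytes, k_quants_supported=False):
    """Pick the largest quantized file that fits into the memory budget.

    files: mapping of quantization name (e.g. "Q5_1") to file size in bytes.
    budget_bytes: how much memory you are willing to spend on the weights.
    k_quants_supported: set to True only if the model can use K-quants.
    Returns the chosen quantization name, or None if nothing fits.
    """
    candidates = {
        name: size
        for name, size in files.items()
        if size <= budget_bytes and (k_quants_supported or "_K" not in name)
    }
    if not candidates:
        return None
    # Assumption: among files that fit, the largest loses the least quality.
    return max(candidates, key=candidates.get)


# Placeholder sizes for illustration only.
example_files = {
    "Q4_0": 4_000_000_000,
    "Q5_1": 5_500_000_000,
    "Q8_0": 7_500_000_000,
}
choice = pick_quant(example_files, budget_bytes=6_000_000_000)
```

With these placeholder sizes and a 6 GB budget, `choice` is `"Q5_1"`. For a Falcon 7B model, leave `k_quants_supported` at `False` so that only the legacy types are considered.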

Original Model Card:

  <center>

GitHub profile for maddes8cht on Stack Exchange, a network of free, community-driven Q&A sites GitHub HuggingFace
