Update README.md
README.md
CHANGED
```diff
@@ -1,22 +1,58 @@
 ---
 language:
--
+- sr
-license:
+license: cc
 tags:
 - text-generation-inference
 - transformers
-- unsloth
 - mistral
 - gguf
-base_model:
+base_model: gordicaleksa/YugoGPT
+model_creator: Gordic Aleksa
+model_type: mistral
+quantized_by: datatab
+datasets:
+- datatab/alpaca-cleaned-serbian-full
 ---
 
-#
+# Yugo45-GPT-Quantized-GGUF
 
-- **
+- **Quantized by:** datatab
 - **License:** apache-2.0
-- **Finetuned from model :** datatab/Yugo45-GPT
 
-
+<!-- description start -->
+## Description
 
-
+This repo contains GGUF format model files for [Yugo45-GPT](https://huggingface.co/datatab/Yugo45-GPT).
+
+<!-- description end -->
+
+# Quant. preference
+
+| Quant. | Description |
+|---------------|---------------------------------------------------------------------------------------|
+| not_quantized | Recommended. Fast conversion. Slow inference, big files. |
+| fast_quantized | Recommended. Fast conversion. OK inference, OK file size. |
+| quantized | Recommended. Slow conversion. Fast inference, small files. |
+| f32 | Not recommended. Retains 100% accuracy, but super slow and memory hungry. |
+| f16 | Fastest conversion + retains 100% accuracy. Slow and memory hungry. |
+| q8_0 | Fast conversion. High resource use, but generally acceptable. |
+| q4_k_m | Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K |
+| q5_k_m | Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K |
+| q2_k | Uses Q4_K for the attention.wv and feed_forward.w2 tensors, Q2_K for the other tensors. |
+| q3_k_l | Uses Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else Q3_K |
+| q3_k_m | Uses Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else Q3_K |
+| q3_k_s | Uses Q3_K for all tensors |
+| q4_0 | Original quant method, 4-bit. |
+| q4_1 | Higher accuracy than q4_0, but not as high as q5_0; quicker inference than the q5 models. |
+| q4_k_s | Uses Q4_K for all tensors |
+| q4_k | Alias for q4_k_m |
+| q5_k | Alias for q5_k_m |
+| q5_0 | Higher accuracy, higher resource usage, and slower inference. |
+| q5_1 | Even higher accuracy and resource usage, and slower inference. |
+| q5_k_s | Uses Q5_K for all tensors |
+| q6_k | Uses Q8_K for all tensors |
+| iq2_xxs | 2.06 bpw quantization |
+| iq2_xs | 2.31 bpw quantization |
+| iq3_xxs | 3.06 bpw quantization |
+| q3_k_xs | 3-bit extra small quantization |
```
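The quant table added by this commit ranks the variants but doesn't say how to pick one programmatically. A minimal sketch, assuming the files in this repo follow the common `<name>.<QUANT>.gguf` naming and that the repo id is `datatab/Yugo45-GPT-Quantized-GGUF` (both assumptions, not confirmed by the card):

```python
# Sketch: pick a .gguf file from the repo, roughly following the table's
# recommendations (q5_k_m / q4_k_m first, then smaller fallbacks).
# The repo id and the preference order are assumptions, not from the card.
from huggingface_hub import list_repo_files

PREFERRED = ("q5_k_m", "q4_k_m", "q8_0", "q3_k_m", "q2_k")

def pick_quant(repo_id: str) -> str:
    """Return the first .gguf filename matching the preference order."""
    gguf_files = [f for f in list_repo_files(repo_id) if f.endswith(".gguf")]
    for quant in PREFERRED:
        for name in gguf_files:
            if quant in name.lower():
                return name
    raise FileNotFoundError(f"no preferred .gguf quant found in {repo_id}")

print(pick_quant("datatab/Yugo45-GPT-Quantized-GGUF"))  # assumed repo id
```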
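The description says the repo holds GGUF files but shows no usage snippet. A minimal sketch of downloading one quant and running it with llama-cpp-python; the filename below is hypothetical, so check the repo's file list for the real names:

```python
# Sketch: fetch one quant file and run it with llama-cpp-python.
# The filename is hypothetical; see the repo's file list for real names.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="datatab/Yugo45-GPT-Quantized-GGUF",  # assumed repo id
    filename="yugo45-gpt.Q4_K_M.gguf",            # hypothetical filename
)

llm = Llama(model_path=model_path, n_ctx=2048)

# The card declares `language: sr`, so prompt in Serbian.
out = llm("Objasni ukratko šta je kvantizacija modela.", max_tokens=128)
print(out["choices"][0]["text"])
```

Q4_K_M is used here because the table above marks it "Recommended" as a balance of size and quality; any other quant file from the repo would work the same way.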
|