Update README.md
README.md CHANGED
@@ -15,7 +15,6 @@ This repository contains four quantized versions of Mistral-NeMo-Instruct-2407,

Models were quantized using llama.cpp (release [b3922](https://github.com/ggerganov/llama.cpp/releases/tag/b3922)). The imatrix versions used an `imatrix.dat` file created from Bartowski's [calibration dataset](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8), mentioned [here](https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF).

-
## Models

| Filename | Size | Description |
@@ -29,7 +28,7 @@ I've also included the `imatrix.dat` (7.05 MB) file used to create the imatrix-q

## Findings

-Prompt sensitivity was
+Prompt sensitivity was observed in 5-bit models quantized with an imatrix, but not in 5-bit models quantized with llama.cpp's default settings. It was not observed in 8-bit models under either quantization method.

For further discussion please see my accompanying [blog post](https://www.drsimonbarnes.com/posts/prompt-sensitivity-revisited-open-source-models/).
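For reference, the quantization workflow described above can be reproduced with llama.cpp's `llama-imatrix` and `llama-quantize` tools. This is a minimal sketch rather than the exact commands used for this repository: the F16 source filename, the local `calibration.txt` path, the output filenames, and the `Q5_K_M` quantization type are assumptions.

```bash
# Build an importance matrix from the calibration dataset
# (calibration.txt is an assumed local copy of Bartowski's calibration data)
./llama-imatrix -m Mistral-Nemo-Instruct-2407-F16.gguf -f calibration.txt -o imatrix.dat

# 5-bit quantization with llama.cpp's default settings
./llama-quantize Mistral-Nemo-Instruct-2407-F16.gguf Mistral-Nemo-Q5_K_M.gguf Q5_K_M

# 5-bit quantization guided by the importance matrix
./llama-quantize --imatrix imatrix.dat Mistral-Nemo-Instruct-2407-F16.gguf Mistral-Nemo-Q5_K_M-imatrix.gguf Q5_K_M
```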