---
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
quantized_by: Simon Barnes
---
# Quantized Mistral-NeMo-Instruct-2407 versions for Prompt Sensitivity Blog
This repository contains four quantized versions of Mistral-NeMo-Instruct-2407, created using [llama.cpp](https://github.com/ggerganov/llama.cpp/). The goal was to examine how different quantization methods affect prompt sensitivity on sentiment classification tasks.
## Quantization Details
Models were quantized using llama.cpp (release [b3922](https://github.com/ggerganov/llama.cpp/releases/tag/b3922)). The imatrix versions used an `imatrix.dat` file generated from Bartowski's [calibration dataset](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8), which is referenced in his [Mistral-Nemo-Instruct-2407-GGUF](https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF) repository.
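For reference, the overall workflow looks roughly like the sketch below, which drives the llama.cpp command-line tools (`llama-quantize` and `llama-imatrix`) from Python via `subprocess`. The file names and the calibration text path are placeholders, not the exact commands used to produce the files in this repository.

```python
import subprocess

# Placeholder paths; not the exact files used for this repository.
BASE_GGUF = "Mistral-NeMo-12B-Instruct-2407-F16.gguf"  # full-precision GGUF export
CALIBRATION_TXT = "calibration_data.txt"               # Bartowski's calibration text (see gist above)

def run(cmd: list[str]) -> None:
    """Run a llama.cpp CLI tool and raise if it fails."""
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Default quantization (no importance matrix).
for qtype in ("Q8_0", "Q5_0"):
    run(["llama-quantize", BASE_GGUF,
         f"Mistral-NeMo-12B-Instruct-2407-{qtype}.gguf", qtype])

# 2. Build the importance matrix from the calibration dataset.
run(["llama-imatrix", "-m", BASE_GGUF, "-f", CALIBRATION_TXT, "-o", "imatrix.dat"])

# 3. Quantize again, this time guided by the importance matrix.
for qtype in ("Q8_0", "Q5_0"):
    run(["llama-quantize", "--imatrix", "imatrix.dat", BASE_GGUF,
         f"Mistral-NeMo-12B-Instruct-2407-imatrix-{qtype}.gguf", qtype])
```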
## Models
| Filename | Size | Description |
|----------|------|-------------|
| Mistral-NeMo-12B-Instruct-2407-Q8_0.gguf | 13 GB | 8-bit default quantization |
| Mistral-NeMo-12B-Instruct-2407-Q5_0.gguf | 8.73 GB | 5-bit default quantization |
| Mistral-NeMo-12B-Instruct-2407-imatrix-Q8_0.gguf | 13 GB | 8-bit with imatrix quantization |
| Mistral-NeMo-12B-Instruct-2407-imatrix-Q5_0.gguf | 8.73 GB | 5-bit with imatrix quantization |
I've also included the `imatrix.dat` file (7.05 MB) used to create the imatrix-quantized versions.
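As a quick usage sketch (not part of the original experiments), the GGUF files can be loaded with the `llama-cpp-python` bindings; the sentiment-classification prompt below is purely illustrative and is not the wording used in the blog post.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Load one of the quantized models from this repository.
llm = Llama(
    model_path="Mistral-NeMo-12B-Instruct-2407-imatrix-Q5_0.gguf",
    n_ctx=4096,       # context window; adjust to available memory
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

# Illustrative sentiment-classification prompt.
response = llm.create_chat_completion(
    messages=[
        {"role": "user",
         "content": "Classify the sentiment of this review as Positive or Negative:\n"
                    "\"The battery life is dreadful, but the screen is gorgeous.\""}
    ],
    max_tokens=16,
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```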
## Findings
Prompt sensitivity was observed only in the 5-bit model quantized with the imatrix method; the 5-bit model produced with llama.cpp's default quantization settings did not show it. Neither 8-bit model exhibited prompt sensitivity, regardless of quantization method.
For further discussion please see my accompanying [blog post](https://www.drsimonbarnes.com/posts/prompt-sensitivity-revisited-open-source-models/).
## Author
Simon Barnes