---
license: llama2
library_name: gguf
pipeline_tag: text-generation
---
|
GGUF importance matrix (imatrix) quants for https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf
|
The importance matrix was trained on 100K tokens (200 batches of 512 tokens) from wiki.train.raw.
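For reference, an importance matrix like this one can be produced with llama.cpp's `imatrix` tool. This is a sketch of the 200 × 512-token setup described above; the file paths are placeholders and the exact flag spellings should be verified against your llama.cpp build:

```shell
# Sketch: compute an importance matrix over 200 chunks of 512 tokens
# (~100K tokens total). Binary name and flags follow llama.cpp's
# imatrix example; model/output paths are placeholders.
./imatrix \
  -m codellama-70b-instruct.gguf \
  -f wiki.train.raw \
  -o imatrix.dat \
  -c 512 \
  --chunks 200
```

The resulting `imatrix.dat` is then passed to `quantize` when producing the imatrix-aware quants.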
|
|
|
**NOTE**: This model's prompt template is very sensitive and must be reproduced exactly.
|
All whitespace is intentional, and the special tokens `<s>` and `<step>` must be encoded as token IDs `1` and `32015` respectively, not tokenized as plain text.
|
|
|
| Layers | Context | Template |
| --- | --- | --- |
| <pre>80</pre> | <pre>4096</pre> | <pre>\<s\> Source: system<br><br> {instructions}\<step\> Source: user<br><br> {prompt}\<step\> Source: assistant<br>Destination: user<br><br> {response}</pre> |
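As a concrete illustration, the template in the table can be assembled in code. This is a hedged sketch: `build_prompt` is a hypothetical helper, and the literal `<s>`/`<step>` strings stand in for the special tokens, which a real pipeline must emit as token IDs 1 and 32015 rather than letting the tokenizer split them into ordinary text pieces.

```python
def build_prompt(instructions: str, prompt: str) -> str:
    """Assemble a single-turn prompt following the template above.

    "<s>" and "<step>" are written as literal text here for clarity only;
    when tokenizing, they must map to the special token IDs 1 and 32015.
    """
    return (
        "<s> Source: system\n\n "                # BOS, system turn, two newlines + space
        f"{instructions}<step> Source: user\n\n "
        f"{prompt}<step> Source: assistant\nDestination: user\n\n "
        # generation continues from here; the model's reply fills {response}
    )

text = build_prompt(
    "You are a helpful coding assistant.",
    "Write a function that reverses a string.",
)
```

Note the single leading space before each `{instructions}`/`{prompt}` slot and the `Destination: user` line in the assistant turn; dropping any of them degrades output quality.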