README.md · dranger003/CodeLlama-70b-Instruct-iMat.GGUF at e67b83f862feb999beb1788c2041e596fabf9120

metadata

license: llama2
library_name: gguf
pipeline_tag: text-generation

GGUF importance matrix (imatrix) quants for https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf
The importance matrix was trained on about 100K tokens (200 batches of 512 tokens) using wiki.train.raw.

The template for this model is very sensitive and must be reproduced exactly.
All whitespace is intentional, and the special tokens `<s>` and `<step>` must be encoded properly.

| Layers | Context | Template |
| --- | --- | --- |
| 0 | 4096 | `<s> Source: system {instructions}<step> Source: user {prompt}<step> Source: assistant Destination: user {response}` |
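
As a minimal sketch of filling in the template, the Python snippet below substitutes the placeholders with `str.format`. The function name `build_prompt` and the sample strings are hypothetical, and the template string is copied exactly as it appears in the table: if the table rendering collapsed any line breaks into spaces, the literal will need adjusting against the upstream model card.

```python
# Template copied verbatim from the table in this README.
# Assumption: the whitespace shown there is exact. If line breaks were
# collapsed to spaces when the table was rendered, adjust this literal.
TEMPLATE = (
    "<s> Source: system {instructions}"
    "<step> Source: user {prompt}"
    "<step> Source: assistant Destination: user {response}"
)


def build_prompt(instructions: str, prompt: str, response: str = "") -> str:
    """Fill the template (hypothetical helper).

    Leave `response` empty so the string ends where the model should
    start generating its answer.
    """
    return TEMPLATE.format(
        instructions=instructions, prompt=prompt, response=response
    )


if __name__ == "__main__":
    text = build_prompt(
        "You are a helpful coding assistant.",
        "Write hello world in C.",
    )
    # repr() makes every space visible, which matters for this template.
    print(repr(text))
```

This only produces the text. The runtime still has to tokenize `<s>` and `<step>` as special tokens rather than as plain characters, and some runtimes add `<s>` on their own, in which case it should be dropped from the string to avoid a doubled token.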