---
license: llama2
library_name: gguf
pipeline_tag: text-generation
---

GGUF importance matrix (imatrix) quants for https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf
The importance matrix was computed over roughly 100K tokens (200 batches of 512 tokens, 102,400 in total) from wiki.train.raw.

The template for this model is very sensitive and must be reproduced exactly.
All whitespace is intentional, and the special tokens `<s>` and `<step>` must be encoded properly.

| Layers | Context | Template |
| --- | --- | --- |
| 0 | 4096 | <pre>&lt;s&gt; Source: system<br><br>{instructions}&lt;step&gt; Source: user<br><br>{prompt}&lt;step&gt; Source: assistant<br>Destination: user<br><br>{response}</pre> |
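
Because the template is so sensitive to whitespace, it may help to build the prompt programmatically rather than by hand. The following is a minimal sketch, not part of the original model card. The function name `build_prompt` is hypothetical. It assumes that each line break in the template above is a single newline character and that the prompt should end right before `{response}`, so the model generates the response itself.

```python
def build_prompt(instructions: str, prompt: str) -> str:
    """Fill in the template above, stopping where {response} would begin.

    Assumption: line breaks in the template are single "\n" characters,
    and a blank line is therefore "\n\n".
    """
    return (
        "<s> Source: system\n\n"
        f"{instructions}<step> Source: user\n\n"
        f"{prompt}<step> Source: assistant\n"
        "Destination: user\n\n"
    )


if __name__ == "__main__":
    text = build_prompt(
        "You are a helpful coding assistant.",
        "Write a function that reverses a string.",
    )
    # repr() makes every space and newline visible for checking.
    print(repr(text))
```

This only produces the prompt string. Whether `<s>` and `<step>` are then encoded as special tokens instead of plain text depends on the runtime and its tokenizer settings, so that should be verified separately.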