---
license: llama2
library_name: gguf
pipeline_tag: text-generation
---

GGUF importance matrix (imatrix) quants for https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf

The importance matrix was trained for 100K tokens (200 batches of 512 tokens) using wiki.train.raw.

The template for this model is very sensitive and must be set exactly as shown below. All whitespace is intentional, and the special tokens `<s>` and `<step>` must be encoded properly.
| Layers | Context | Template |
| --- | --- | --- |
| <pre>80</pre> | <pre>4096</pre> | <pre>\<s\>Source: system<br><br> {instructions}\<step\> Source: user<br><br> {prompt}\<step\> Source: assistant<br>Destination: user<br><br> {response}</pre> |
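To make the exact whitespace visible, here is a minimal Python sketch that assembles a single-turn prompt following the layout in the table above. The function and argument names (`build_prompt`, `instructions`, `prompt`) are illustrative and not part of the model or any library; only the string layout comes from the template.

```python
def build_prompt(instructions: str, prompt: str) -> str:
    """Assemble a single-turn prompt per the template table above."""
    # Note: no space before <step>, a single leading space after each blank
    # line, and no space between <s> and "Source:".
    return (
        "<s>Source: system\n"
        "\n"
        f" {instructions}<step> Source: user\n"
        "\n"
        f" {prompt}<step> Source: assistant\n"
        "Destination: user\n"
        "\n"
        " "  # generation starts here; the model fills in {response}
    )


if __name__ == "__main__":
    print(build_prompt(
        "You are a helpful coding assistant.",
        "Write a function that reverses a string.",
    ))
```

Keep in mind that `<s>` and `<step>` need to be encoded as special tokens rather than as literal text, as noted above.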
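For reference, an importance matrix like this is the kind of output produced by llama.cpp's `imatrix` tool run over a calibration text file. The sketch below is only an assumption about how such a run could be scripted: the binary name (`llama-imatrix` in recent llama.cpp builds, plain `imatrix` in older ones) and all file paths are placeholders, and 200 chunks of 512 tokens corresponds to the ~100K tokens mentioned above.

```python
import subprocess

# Placeholder paths -- adjust to your local setup.
model_path = "codellama-70b-instruct.fp16.gguf"  # unquantized source model
calib_path = "wiki.train.raw"                    # calibration text noted above
output_path = "imatrix.dat"                      # resulting importance matrix

# 200 chunks x 512 tokens per chunk is roughly the 100K tokens stated in this card.
subprocess.run(
    [
        "llama-imatrix",
        "-m", model_path,
        "-f", calib_path,
        "-o", output_path,
        "--chunks", "200",
    ],
    check=True,
)
```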