
GGUF quants with iMatrix for: https://huggingface.co/NousResearch/Yarn-Llama-2-70b-32k

iMatrix (Wiki-c512-ch1k) courtesy of Artefact2.

Quant:

  • IQ1_S ("v3") for full offload on 16GB VRAM, and a good partial offload on 12GB VRAM
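As a rough sanity check on the VRAM figures above, here is a back-of-the-envelope size estimate. It is only a sketch: the 1.56 bits-per-weight value is an assumed nominal figure for IQ1_S, the helper name `estimated_weight_gb` is made up for illustration, and the parameter count is the 69B reported for this model. The real file will be somewhat larger, since some tensors are kept at higher precision, and the KV cache and compute buffers need VRAM on top of the weights.

```python
def estimated_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 69e9    # "69B params", as reported for this model
IQ1_S_BPW = 1.56   # assumption: nominal bits per weight for IQ1_S

weights_gb = estimated_weight_gb(N_PARAMS, IQ1_S_BPW)

# Compare the estimate against the two VRAM sizes mentioned above.
for vram_gb in (16, 12):
    headroom = vram_gb - weights_gb
    print(f"{vram_gb} GB VRAM: weights ~{weights_gb:.1f} GB, headroom {headroom:+.1f} GB")
```

Under these assumptions the weights alone fit within 16GB with a little room left for context, while 12GB comes up short, which is consistent with a partial offload there.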

Other quants:

https://huggingface.co/Artefact2/Yarn-Llama-2-70b-32k-GGUF

Model size: 69B params

Architecture: llama