---
datasets:
  - togethercomputer/RedPajama-Data-1T-Sample
library_name: transformers
pipeline_tag: text-generation
tags:
  - text-generation-inference
---

This is Llama2-22b in a couple of GGML formats. I have no idea what I'm doing, so if something doesn't work as it should, or at all, that's likely on me rather than the models themselves.
While I haven't had any issues so far, do note that the original repo states: "Not intended for use as-is - this model is meant to serve as a base for further tuning".

Approximate VRAM requirements at 4K context:

| Model  | File size | VRAM    |
|--------|-----------|---------|
| q5_1   | 16.4 GB   | 21.5 GB |
| q4_K_M | 13.2 GB   | 18.3 GB |
| q3_K_M | 10.6 GB   | 16.1 GB |
| q2_K   | 9.22 GB   | 14.5 GB |
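
For reference, here's a minimal sketch of loading one of these files with llama-cpp-python at 4K context (a GGML-era build, since newer releases only read GGUF). The filename and layer count below are assumptions; point it at whichever quant you downloaded and adjust offloading to your GPU:

```python
# Minimal sketch: load a GGML quant at 4K context with llama-cpp-python.
# Requires a version of llama-cpp-python that still reads GGML (pre-GGUF).
from llama_cpp import Llama

llm = Llama(
    model_path="llama2-22b.ggmlv3.q4_K_M.bin",  # hypothetical filename, use your download
    n_ctx=4096,        # 4K context, matching the VRAM figures above
    n_gpu_layers=100,  # set high to offload all layers; lower it if you run out of VRAM
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```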