Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

h2ogpt-4096-llama2-7b - GGUF
- Model creator: https://huggingface.co/h2oai/
- Original model: https://huggingface.co/h2oai/h2ogpt-4096-llama2-7b/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [h2ogpt-4096-llama2-7b.Q2_K.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q2_K.gguf) | Q2_K | 2.36GB |
| [h2ogpt-4096-llama2-7b.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.IQ3_XS.gguf) | IQ3_XS | 2.6GB |
| [h2ogpt-4096-llama2-7b.IQ3_S.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.IQ3_S.gguf) | IQ3_S | 2.75GB |
| [h2ogpt-4096-llama2-7b.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q3_K_S.gguf) | Q3_K_S | 2.75GB |
| [h2ogpt-4096-llama2-7b.IQ3_M.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.IQ3_M.gguf) | IQ3_M | 2.9GB |
| [h2ogpt-4096-llama2-7b.Q3_K.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q3_K.gguf) | Q3_K | 3.07GB |
| [h2ogpt-4096-llama2-7b.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q3_K_M.gguf) | Q3_K_M | 3.07GB |
| [h2ogpt-4096-llama2-7b.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q3_K_L.gguf) | Q3_K_L | 3.35GB |
| [h2ogpt-4096-llama2-7b.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.IQ4_XS.gguf) | IQ4_XS | 3.4GB |
| [h2ogpt-4096-llama2-7b.Q4_0.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q4_0.gguf) | Q4_0 | 3.56GB |
| [h2ogpt-4096-llama2-7b.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.IQ4_NL.gguf) | IQ4_NL | 3.58GB |
| [h2ogpt-4096-llama2-7b.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q4_K_S.gguf) | Q4_K_S | 3.59GB |
| [h2ogpt-4096-llama2-7b.Q4_K.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q4_K.gguf) | Q4_K | 3.8GB |
| [h2ogpt-4096-llama2-7b.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q4_K_M.gguf) | Q4_K_M | 3.8GB |
| [h2ogpt-4096-llama2-7b.Q4_1.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q4_1.gguf) | Q4_1 | 3.95GB |
| [h2ogpt-4096-llama2-7b.Q5_0.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q5_0.gguf) | Q5_0 | 4.33GB |
| [h2ogpt-4096-llama2-7b.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q5_K_S.gguf) | Q5_K_S | 4.33GB |
| [h2ogpt-4096-llama2-7b.Q5_K.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q5_K.gguf) | Q5_K | 4.45GB |
| [h2ogpt-4096-llama2-7b.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q5_K_M.gguf) | Q5_K_M | 4.45GB |
| [h2ogpt-4096-llama2-7b.Q5_1.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q5_1.gguf) | Q5_1 | 4.72GB |
| [h2ogpt-4096-llama2-7b.Q6_K.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q6_K.gguf) | Q6_K | 5.15GB |
| [h2ogpt-4096-llama2-7b.Q8_0.gguf](https://huggingface.co/RichardErkhov/h2oai_-_h2ogpt-4096-llama2-7b-gguf/blob/main/h2ogpt-4096-llama2-7b.Q8_0.gguf) | Q8_0 | 6.67GB |

Original model description:

---
inference: false
language:
- en
license: llama2
model_type: llama
pipeline_tag: text-generation
tags:
- facebook
- meta
- pytorch
- llama
- llama-2
- h2ogpt
---

h2oGPT clone of [Meta's Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b-hf).

This model can be fine-tuned with [H2O.ai](https://h2o.ai/) open-source software:
- h2oGPT https://github.com/h2oai/h2ogpt/
- H2O LLM Studio https://h2o.ai/platform/ai-cloud/make/llm-studio/

Try our live [h2oGPT demo](https://gpt.h2o.ai) with side-by-side LLM comparisons and private document chat!

## Model Architecture

```
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096, padding_idx=0)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
```
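
The layer shapes in the printout above are enough to estimate the model's parameter count by hand. The following is a minimal sketch in plain Python, not part of the original model card. It assumes that each `LlamaRMSNorm` holds one weight per hidden unit (the printout does not show its shape) and that `LlamaRotaryEmbedding` has no trainable weights. The helper names `linear_params`, `count_parameters`, and `bits_per_weight` are made up for illustration. The last helper gives a rough bits-per-weight figure for a file in the quantization table, assuming "GB" there means GiB (2**30 bytes) and ignoring GGUF metadata overhead.

```python
# Dimensions read off the module printout above.
VOCAB_SIZE = 32000
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 11008
NUM_LAYERS = 32


def linear_params(in_features: int, out_features: int) -> int:
    """Weight count of a Linear layer with bias=False."""
    return in_features * out_features


def count_parameters() -> int:
    """Sum the weights of every module listed in the printout."""
    embed_tokens = VOCAB_SIZE * HIDDEN_SIZE
    # q_proj, k_proj, v_proj, o_proj are all 4096 x 4096.
    attention = 4 * linear_params(HIDDEN_SIZE, HIDDEN_SIZE)
    # gate_proj and up_proj go 4096 -> 11008, down_proj goes 11008 -> 4096.
    mlp = (2 * linear_params(HIDDEN_SIZE, INTERMEDIATE_SIZE)
           + linear_params(INTERMEDIATE_SIZE, HIDDEN_SIZE))
    # Assumption: each LlamaRMSNorm has one weight per hidden unit.
    norms_per_layer = 2 * HIDDEN_SIZE
    per_layer = attention + mlp + norms_per_layer
    final_norm = HIDDEN_SIZE
    lm_head = linear_params(HIDDEN_SIZE, VOCAB_SIZE)
    return embed_tokens + NUM_LAYERS * per_layer + final_norm + lm_head


def bits_per_weight(size_gb: float, n_params: int) -> float:
    """Rough average bits per weight for a quantized file.

    Assumption: the table's "GB" means GiB (2**30 bytes).
    """
    return size_gb * 2**30 * 8 / n_params


if __name__ == "__main__":
    total = count_parameters()
    print(f"{total:,} parameters")
    # File sizes taken from the quantization table above.
    for name, size_gb in [("Q4_0", 3.56), ("Q8_0", 6.67)]:
        print(f"{name}: ~{bits_per_weight(size_gb, total):.2f} bits per weight")
```

Under these assumptions the count comes to 6,738,415,616, which is consistent with the "7B" in the model name. The bits-per-weight figures are only estimates, since a GGUF file also stores metadata and may keep some tensors at a different precision.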