---
license: apache-2.0
language:
- en
---

# Model Card for Llama 2 7B GPTQ 2-bit

This is Meta's Llama 2 7B quantized to 2-bit with AutoGPTQ through Hugging Face Transformers.

## Model Details

### Model Description

- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Model type:** Causal language model (Llama 2)
- **Language(s) (NLP):** English
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)

### Model Sources

The method and code used to quantize the model are explained here:
[Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL](https://kaitchup.substack.com/p/quantize-and-fine-tune-llms-with)

## Uses

This model is pre-trained and not fine-tuned. You may fine-tune it with PEFT using adapters, as sketched at the end of this card. Note that 2-bit quantization significantly degrades the performance of Llama 2.

## Other versions

- [kaitchup/Llama-2-7b-gptq-4bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-4bit)
- [kaitchup/Llama-2-7b-gptq-3bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-3bit)

## Model Card Contact

[The Kaitchup](https://kaitchup.substack.com/)
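
## How to Get Started with the Model

A minimal loading-and-generation sketch. The repository id below is an assumption inferred from the other versions listed above; `auto-gptq` and `optimum` must be installed alongside `transformers` (and `accelerate` for `device_map="auto"`) so that the GPTQ checkpoint can be loaded.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, inferred from the 4-bit and 3-bit versions above.
model_id = "kaitchup/Llama-2-7b-gptq-2bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers reads the GPTQ quantization config stored in the repo
# and dequantizes on the fly during the forward pass.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```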
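
A minimal fine-tuning sketch with PEFT and LoRA adapters, as mentioned in the Uses section. The LoRA hyperparameters and target modules are illustrative defaults, not the exact settings from the linked article; the adapter weights are trained while the 2-bit base weights stay frozen.

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM

model_id = "kaitchup/Llama-2-7b-gptq-2bit"  # assumed repository id

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
# Freezes the quantized weights and enables gradient checkpointing-friendly
# casting for k-bit (here GPTQ-quantized) training.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # typical Llama 2 attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter parameters are trainable
```

The wrapped model can then be passed to a standard `transformers` `Trainer` or TRL's `SFTTrainer`, which is the route the linked article follows.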