kaitchup/Llama-2-7b-gptq-2bit


bnjmnmarie committed on Sep 9, 2023

Commit a8185f7 • 1 Parent(s): 1dba9fc

Create README.md

Files changed (1)

README.md (added) +45 -0

	@@ -0,0 +1,45 @@

+---
+license: apache-2.0
+language:
+- en
+---
+# Model Card for Llama-2-7b-gptq-2bit
+This is Meta's Llama 2 7B quantized to 2-bit precision with GPTQ, using the AutoGPTQ integration in Hugging Face Transformers.
+## Model Details
+### Model Description
+- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
+- **Model type:** Causal language model (Llama 2)
+- **Language(s) (NLP):** English
+- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
+### Model Sources
+The method and code used to quantize the model are explained here:
+[Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL](https://kaitchup.substack.com/p/quantize-and-fine-tune-llms-with)
+## Uses
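+This is a quantized version of the pre-trained base model; it has not been fine-tuned. You can fine-tune it with PEFT by training adapters on top of the quantized weights.
+Note that 2-bit quantization significantly degrades output quality compared with the original Llama 2 7B.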
+## Other versions
+- [kaitchup/Llama-2-7b-gptq-4bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-4bit)
+- [kaitchup/Llama-2-7b-gptq-3bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-3bit)
+## Model Card Contact
+[The Kaitchup](https://kaitchup.substack.com/)
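
The fine-tuning route described under "Uses" (adapters trained with PEFT on top of the quantized checkpoint) might look roughly like the sketch below. It is not the author's code: the LoRA hyperparameters in `LORA_SETTINGS`, the `target_modules` choice, and the helper name `build_peft_model` are illustrative assumptions. It also assumes that `transformers`, `peft`, `optimum`, and `auto-gptq` are installed, that a CUDA GPU is available, and that the repository stores its quantization config alongside the weights. The linked article remains the reference for the actual procedure.

```python
# Sketch only: hyperparameters and helper names are assumptions,
# not values taken from the model card or the linked article.

MODEL_ID = "kaitchup/Llama-2-7b-gptq-2bit"  # repository named in this commit

# Hypothetical LoRA settings, chosen for illustration.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "target_modules": ["q_proj", "v_proj"],  # Llama attention projections
}


def build_peft_model(model_id: str = MODEL_ID):
    """Load the GPTQ checkpoint and attach trainable LoRA adapters."""
    # Imported lazily so the settings above can be inspected
    # on a machine without the GPU stack installed.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM

    # Assumption: the quantization config saved in the repository
    # is read automatically when the checkpoint is loaded.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Prepare the quantized base model for adapter training.
    model = prepare_model_for_kbit_training(model)

    # Wrap the base model so that only the adapter weights are trainable.
    return get_peft_model(model, LoraConfig(**LORA_SETTINGS))
```

Calling `build_peft_model()` returns a PEFT-wrapped model; its `print_trainable_parameters()` method reports how many parameters the adapters add relative to the quantized base. Given the quality loss noted above for 2-bit quantization, the same sketch can be pointed at the 3-bit or 4-bit repositories by changing `model_id`.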