alokabhishek committed on
Commit ea26a32
1 Parent(s): e76b6f3

Updated Readme

Files changed (1):
  1. README.md +129 -6
README.md CHANGED
@@ -1,15 +1,138 @@
  ---
  license: apache-2.0
  pipeline_tag: text-generation
  tags:
- - finetuned
- inference: true
- widget:
-   - messages:
-       - role: user
-         content: What is your favorite condiment?
  ---
12
  # Model Card for Mistral-7B-Instruct-v0.2

  The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.
 
---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- ExLlamaV2
- 5bit
- Mistral
- Mistral-7B
- quantized
- exl2
- 5.0-bpw
---

# Model Card for alokabhishek/Mistral-7B-Instruct-v0.2-5.0-bpw-exl2

<!-- Provide a quick summary of what the model is/does. -->
This repo contains a 5.0-bpw (bits per weight) ExLlamaV2 quantization of Mistral AI_'s [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).


## Model Details

- Model creator: [Mistral AI_](https://huggingface.co/mistralai)
- Original model: [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)


### About 5.0-bpw quantization using ExLlamaV2

- ExLlamaV2 GitHub repo: [turboderp/exllamav2](https://github.com/turboderp/exllamav2)


# How to Get Started with the Model

Use the code below to get started with the model.

## How to run from Python code

#### First install the package
```shell
# Install ExLlamaV2 (the leading "!" runs the command from a Jupyter/Colab cell; drop it in a regular shell)
!git clone https://github.com/turboderp/exllamav2
!pip install -e exllamav2
```

#### Import

```python
import os

import torch
from huggingface_hub import login  # optional: authenticate for gated/private repos
```

#### Set up variables

```python
# Define the model ID for the desired model
model_id = "alokabhishek/Mistral-7B-Instruct-v0.2-5.0-bpw-exl2"
BPW = 5.0

# Derive the local directory name from the model ID
model_name = model_id.split("/")[-1]
```
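The split above simply drops the owner prefix from the repo ID to get a local folder name; a quick pure-Python sanity check (no downloads, same names as above):

```python
import os

# Same derivation as above: the part after "/" becomes the local folder name
model_id = "alokabhishek/Mistral-7B-Instruct-v0.2-5.0-bpw-exl2"
model_name = model_id.split("/")[-1]
model_dir = os.path.join(os.getcwd(), model_name)  # where the clone will live

print(model_name)  # Mistral-7B-Instruct-v0.2-5.0-bpw-exl2
```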

#### Download the quantized model
```shell
!git lfs install
# Download the model to a local directory (replace {username} and {HF_TOKEN} with your credentials)
!git clone https://{username}:{HF_TOKEN}@huggingface.co/{model_id} {model_name}
```
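If you prefer not to use git-lfs, the same files can be fetched with `snapshot_download` from the `huggingface_hub` library. A minimal sketch, assuming `pip install huggingface_hub`:

```python
def download_model(repo_id: str, local_dir: str) -> str:
    """Fetch all files of a Hub repo into local_dir and return the snapshot path."""
    # Lazy import so the helper can be defined without huggingface_hub installed
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    # Downloads several GB of weights; run only when you actually want the model
    download_model(
        "alokabhishek/Mistral-7B-Instruct-v0.2-5.0-bpw-exl2",
        "Mistral-7B-Instruct-v0.2-5.0-bpw-exl2",
    )
```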

#### Run inference on the quantized model
```shell
# Run the model via ExLlamaV2's bundled test script
!python exllamav2/test_inference.py -m {model_name}/ -p "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."
```
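Beyond the test script, generation can also run in-process. The sketch below uses ExLlamaV2's base generator classes as found in the exllamav2 repo; treat it as an untested outline against a moving API. `format_prompt` hand-rolls the Mistral-Instruct `[INST]` template rather than using a chat template:

```python
def format_prompt(user_message: str) -> str:
    # Mistral-7B-Instruct expects user turns wrapped in [INST] ... [/INST]
    return f"<s>[INST] {user_message.strip()} [/INST]"


def generate(model_dir: str, prompt: str, max_new_tokens: int = 200) -> str:
    # Lazy imports so format_prompt stays usable without exllamav2 installed
    from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
    from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

    config = ExLlamaV2Config()
    config.model_dir = model_dir
    config.prepare()

    model = ExLlamaV2(config)
    cache = ExLlamaV2Cache(model, lazy=True)
    model.load_autosplit(cache)  # load weights, splitting across available GPUs
    tokenizer = ExLlamaV2Tokenizer(config)

    generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
    settings = ExLlamaV2Sampler.Settings()
    settings.temperature = 0.85

    return generator.generate_simple(format_prompt(prompt), settings, max_new_tokens)


if __name__ == "__main__":
    print(generate("Mistral-7B-Instruct-v0.2-5.0-bpw-exl2", "Tell me a joke."))
```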


## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]


### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]


## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->


#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]


## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]


# Model Card for Mistral-7B-Instruct-v0.2

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.