|
--- |
|
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1 |
|
inference: false |
|
language: |
|
- fr |
|
- it |
|
- de |
|
- es |
|
- en |
|
license: apache-2.0 |
|
model_creator: Mistral AI_ |
|
model_name: Mixtral 8X7B Instruct v0.1 |
|
model_type: mixtral |
|
prompt_template: '[INST] {prompt} [/INST] |
|
|
|
' |
|
quantized_by: OptimizeLLM |
|
--- |
|
|
|
This is Mistral AI's Mixtral-8x7B-Instruct-v0.1 model, quantized to GGUF (q5_k_m) with llama.cpp on 02/24/2024. It loads and runs well in llama.cpp and compatible front ends.
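As a quick, hedged example of using the quantized file (this assumes you have the llama-cpp-python package installed with GPU support; the file name matches the quantize step further down):

```python
# Minimal sketch: load the quantized GGUF with llama-cpp-python and run one prompt.
from llama_cpp import Llama

llm = Llama(
    model_path="D:/Mixtral/Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf",
    n_ctx=4096,        # context window; kept modest here to save VRAM
    n_gpu_layers=15,   # layers to offload to the GPU; tune for your VRAM
)

# Mixtral Instruct expects the [INST] ... [/INST] prompt template.
output = llm("[INST] Write a haiku about quantization. [/INST]", max_tokens=128)
print(output["choices"][0]["text"])
```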
|
|
|
# How to quantize your own models with Windows and an RTX GPU: |
|
|
|
## Requirements: |
|
* git |
|
* python |
|
|
|
# Instructions: |
|
The following example starts at the root of the D: drive and quantizes Mistral AI's Mixtral-8x7B-Instruct-v0.1.
|
|
|
## Windows command prompt - folder setup and git clone llama.cpp |
|
* D: |
|
* mkdir Mixtral |
|
* git clone https://github.com/ggerganov/llama.cpp |
|
|
|
## Download the llama.cpp binaries
|
Assuming you want CUDA support for your NVIDIA RTX GPU(s), use the links below, or grab the latest compiled executables from https://github.com/ggerganov/llama.cpp/releases
|
|
|
### Latest version as of Feb 24, 2024: |
|
* https://github.com/ggerganov/llama.cpp/releases/download/b2253/cudart-llama-bin-win-cu12.2.0-x64.zip |
|
* https://github.com/ggerganov/llama.cpp/releases/download/b2253/llama-b2253-bin-win-cublas-cu12.2.0-x64.zip |
|
|
|
Extract the two .zip files directly into the llama.cpp folder you just git cloned. Overwrite files as prompted. |
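If you'd rather script this step, here is a minimal Python sketch (assuming llama.cpp was cloned to D:\llama.cpp as above) that downloads both archives and extracts them over the cloned folder:

```python
# Download the two llama.cpp release .zip files and extract them into D:\llama.cpp.
import urllib.request
import zipfile

RELEASE = "https://github.com/ggerganov/llama.cpp/releases/download/b2253"
ZIPS = [
    "cudart-llama-bin-win-cu12.2.0-x64.zip",
    "llama-b2253-bin-win-cublas-cu12.2.0-x64.zip",
]
DEST = r"D:\llama.cpp"

for name in ZIPS:
    urllib.request.urlretrieve(f"{RELEASE}/{name}", name)  # saved to the current folder
    with zipfile.ZipFile(name) as zf:
        zf.extractall(DEST)  # overwrites matching files, same as extracting by hand
```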
|
|
|
## Download Mixtral |
|
* Download the full-precision (unquantized) model: grab all of the .safetensors, .json, and .model files and save them to D:\Mixtral\ (or use the Python sketch after this list):
|
* https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 |
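If you prefer to script the download, the huggingface_hub package (`pip install huggingface_hub`) can fetch just the needed file types; run `huggingface-cli login` first if the repo asks for it. A minimal sketch:

```python
# Fetch only the .safetensors/.json/.model files into D:\Mixtral.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    local_dir=r"D:\Mixtral",
    allow_patterns=["*.safetensors", "*.json", "*.model"],
)
```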
|
|
|
|
|
## Windows command prompt - Convert the model to fp16: |
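If convert.py complains about missing Python packages, install llama.cpp's converter dependencies first (assuming your checkout includes the standard requirements.txt):

* D:\llama.cpp>pip install -r requirements.txt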
|
* D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin |
|
|
|
## Windows command prompt - Quantize the fp16 model to q5_k_m: |
|
* D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m |
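To sanity-check the result, you can run a quick prompt through main.exe from the binaries you extracted earlier (a hedged example; -ngl sets how many layers are offloaded to the GPU, so tune it for your VRAM):

* D:\llama.cpp>main.exe -m D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf -p "[INST] Say hello. [/INST]" -n 64 -ngl 15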
|
|
|
That's it! |
|
|