|
--- |
|
base_model: mistralai/Mixtral-8x7B-Instruct-v0.1 |
|
inference: false |
|
language: |
|
- fr |
|
- it |
|
- de |
|
- es |
|
- en |
|
license: apache-2.0 |
|
model_creator: Mistral AI_ |
|
model_name: Mixtral 8X7B Instruct v0.1 |
|
model_type: mixtral |
|
prompt_template: '[INST] {prompt} [/INST] |
|
|
|
' |
|
quantized_by: OptimizeLLM |
|
--- |
|
|
|
This is Mistral AI's Mixtral-8x7B-Instruct-v0.1 model, quantized to GGUF (q5_k_m) with llama.cpp on 02/24/2024. It loads and runs well in llama.cpp and compatible front ends.
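As a quick, hedged example of using the quantized file (this assumes you have the llama-cpp-python package installed with GPU support; the file name matches the quantize step further down):

```python
# Minimal sketch: load the quantized GGUF with llama-cpp-python and run one prompt.
from llama_cpp import Llama

llm = Llama(
    model_path="D:/Mixtral/Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf",
    n_ctx=4096,        # context window; kept modest here to save VRAM
    n_gpu_layers=15,   # layers to offload to the GPU; tune for your VRAM
)

# Mixtral Instruct expects the [INST] ... [/INST] prompt template.
output = llm("[INST] Write a haiku about quantization. [/INST]", max_tokens=128)
print(output["choices"][0]["text"])
```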
|
|
|
# How to quantize your own models with Windows and an RTX GPU: |
|
|
|
## Requirements: |
|
* git |
|
* python |
|
|
|
# Instructions: |
|
The following example starts at the root of the D: drive and quantizes Mistral AI's Mixtral-8x7B-Instruct-v0.1.
|
|
|
## Windows command prompt - folder setup and git clone llama.cpp |
|
* D: |
|
* mkdir Mixtral |
|
* git clone https://github.com/ggerganov/llama.cpp |
|
|
|
## Download the llama.cpp binaries
|
Assuming you want CUDA support for your NVIDIA RTX GPU(s), use the links below, or grab the latest compiled executables from https://github.com/ggerganov/llama.cpp/releases
|
|
|
### Latest version as of Feb 24, 2024: |
|
* https://github.com/ggerganov/llama.cpp/releases/download/b2253/cudart-llama-bin-win-cu12.2.0-x64.zip |
|
* https://github.com/ggerganov/llama.cpp/releases/download/b2253/llama-b2253-bin-win-cublas-cu12.2.0-x64.zip |
|
|
|
Extract the two .zip files directly into the llama.cpp folder you just git cloned. Overwrite files as prompted. |
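If you'd rather script this step, here is a minimal Python sketch (assuming llama.cpp was cloned to D:\llama.cpp as above) that downloads both archives and extracts them over the cloned folder:

```python
# Download the two llama.cpp release .zip files and extract them into D:\llama.cpp.
import urllib.request
import zipfile

RELEASE = "https://github.com/ggerganov/llama.cpp/releases/download/b2253"
ZIPS = [
    "cudart-llama-bin-win-cu12.2.0-x64.zip",
    "llama-b2253-bin-win-cublas-cu12.2.0-x64.zip",
]
DEST = r"D:\llama.cpp"

for name in ZIPS:
    urllib.request.urlretrieve(f"{RELEASE}/{name}", name)  # saved to the current folder
    with zipfile.ZipFile(name) as zf:
        zf.extractall(DEST)  # overwrites matching files, same as extracting by hand
```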
|
|
|
## Download Mixtral |
|
* Download the full-precision (unquantized) model: grab all of the .safetensors, .json, and .model files and save them to D:\Mixtral\ (or use the Python sketch after this list):
|
* https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 |
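If you prefer to script the download, the huggingface_hub package (`pip install huggingface_hub`) can fetch just the needed file types; run `huggingface-cli login` first if the repo asks for it. A minimal sketch:

```python
# Fetch only the .safetensors/.json/.model files into D:\Mixtral.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    local_dir=r"D:\Mixtral",
    allow_patterns=["*.safetensors", "*.json", "*.model"],
)
```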
|
|
|
|
|
## Windows command prompt - Convert the model to fp16: |
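If convert.py complains about missing Python packages, install llama.cpp's converter dependencies first (assuming your checkout includes the standard requirements.txt):

* D:\llama.cpp>pip install -r requirements.txt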
|
* D:\llama.cpp>python convert.py D:\Mixtral --outtype f16 --outfile D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin |
|
|
|
## Windows command prompt - Quantize the fp16 model to q5_k_m: |
|
* D:\llama.cpp>quantize.exe D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.fp16.bin D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf q5_k_m |
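To sanity-check the result, you can run a quick prompt through main.exe from the binaries you extracted earlier (a hedged example; -ngl sets how many layers are offloaded to the GPU, so tune it for your VRAM):

* D:\llama.cpp>main.exe -m D:\Mixtral\Mixtral-8x7B-Instruct-v0.1.q5_k_m.gguf -p "[INST] Say hello. [/INST]" -n 64 -ngl 15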
|
|
|
That's it! |
|
|