
# Mixtral 8x7B Instruct-v0.1 - bitsandbytes 4-bit

This repository contains the bitsandbytes 4-bit quantized version of mistralai/Mixtral-8x7B-Instruct-v0.1. To use it, make sure you have the latest bitsandbytes release and a source install of transformers:
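A typical setup looks like this (the exact commands are assumed here, not taken from the original card):

```bash
# latest bitsandbytes release
pip install -U bitsandbytes
# transformers from source
pip install git+https://github.com/huggingface/transformers.git
```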

Loading the model as shown below will directly load it in 4-bit precision:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The quantization config is stored in the checkpoint, so no extra
# arguments are needed: the weights load directly in 4-bit.
model = AutoModelForCausalLM.from_pretrained(model_id)
```
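Alternatively (this is not from the original card), you can obtain an equivalent 4-bit model by quantizing the base checkpoint on the fly with a BitsAndBytesConfig; the NF4 settings below are common choices and are assumed rather than read from this repository:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed 4-bit NF4 configuration; adjust to your needs.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=quant_config,
    device_map="auto",  # requires accelerate
)
```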

Note that you need a CUDA-compatible GPU to run low-bit precision models with bitsandbytes.
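For a quick end-to-end check, here is a minimal generation sketch (not part of the original card; the prompt and generation settings are illustrative, and device_map="auto" assumes accelerate is installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# apply_chat_template renders Mixtral Instruct's [INST] ... [/INST] format.
messages = [{"role": "user", "content": "What is 4-bit quantization?"}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```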

Safetensors model size: 24.2B params
Tensor types: F32, FP16, U8