ImportError: Using `load_in_8bit=True` requires Accelerate

#78
by aimananees - opened

Hi, I'm trying to load a model using quantization, but it constantly errors out. I have pip-installed all the required libraries, including accelerate and bitsandbytes, and tried multiple times with no luck. Is anyone else facing the same issue?

Here's my code snippet:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

# 4-bit NF4 quantization with double quantization and fp16 compute
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model_id = "tiiuae/falcon-7b"

model_4bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
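
For reference, this error usually means the interpreter that raised it cannot import accelerate, even if pip lists it as installed (for example, because it was installed into a different environment or after the kernel started). A quick sanity check, as a minimal sketch using only the standard library:

import importlib.metadata

# Check that the quantization dependencies are visible to *this*
# interpreter, not just installed somewhere else on the machine.
for pkg in ("accelerate", "bitsandbytes", "transformers"):
    try:
        print(pkg, importlib.metadata.version(pkg))
    except importlib.metadata.PackageNotFoundError:
        print(pkg, "not found in this environment")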

Yeah, I am also facing the same issue. I've tried all the options available on HF, Stack Overflow, etc.

Use a previous version of transformers:
!pip install transformers==4.30
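
If you pin transformers, it may also help to install the companion packages in the same step so everything lands in the same environment (standard PyPI names; this combination is a suggestion, not a tested pairing):

!pip install transformers==4.30 accelerate bitsandbytes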

If you are in a notebook, restarting the session worked for me. See below.
https://github.com/huggingface/transformers/issues/23323#issuecomment-1568464656
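
If you are on Colab and want to restart programmatically right after the pip install, one common trick (assuming a Colab-style runtime that reconnects automatically) is:

import os

# Kill the current process; Colab restarts the runtime, and the newly
# installed packages are importable in the fresh session.
os.kill(os.getpid(), 9)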
