I keep getting this: ImportError: Using `load_in_8bit=True` requires Accelerate

#11
by Zelan - opened

I have already installed Accelerate and made sure it was in the PATH, yet I keep getting this message when I try to load the model. I also tried installing bitsandbytes, but after I did, the webui wouldn't work at all.

You need to restart the kernel.

I have the same issue but I found no solution for it. Is there any solution?

I was trying this in Google Colab and kept getting this error as well. I made sure accelerate and bitsandbytes were installed, but the issue persisted. Then I realized that a particular line in my train command was causing it:

autotrain llm --train --project_name '<project_name>' \
--model TinyPixel/Llama-2-7B-bf16-sharded \
--data_path timdettmers/openassistant-guanaco \
--text_column text \
--use_int4 \
--use_peft \
--learning_rate 2e-4 \
--train_batch_size 2 \
--num_train_epochs 3 \
--trainer sft \
--model_max_length 2048 \
--push_to_hub \
--repo_id <repo_id>/<project_name> \
--block_size 2048 > training.log &

After I removed the --use_int4 line and ran the command again, the issue was resolved. I hope this helps. Make sure to put '\' at the end of every line if you use this command to train.

Thanks.

Having this issue as well. I tried restarting the kernel like 1000 times and it still didn't work.

Here is the solution that worked for me:

  1. Install Nvidia docker + docker.
  2. Download an nvidia PyTorch 2.0 docker image with Cuda 12.
  3. Create and execute a container.
  4. Install all packages within the container and run your code.
    (you might still need to try different versions of the dependencies, but it finally worked for me)

That was not the issue. My apologies >>>

In import_utils.py, the code used to check whether a package exists does not work for all packages.

# TODO: This doesn't work for all packages (bs4, faiss, etc.) Talk to Sylvain to see how to do with it better.
def _is_package_available(pkg_name: str, return_version: bool = False) -> Union[Tuple[bool, str], bool]:
    # Check we're not importing a "pkg_name" directory somewhere but the actual library by trying to grab the version
    package_exists = importlib.util.find_spec(pkg_name) is not None
    package_version = "N/A"
    if package_exists:
        try:
            package_version = importlib.metadata.version(pkg_name)
            package_exists = True
        except importlib.metadata.PackageNotFoundError:
            package_exists = False
        logger.debug(f"Detected {pkg_name} version {package_version}")
    if return_version:
        return package_exists, package_version
    else:
        return package_exists
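
To see how a check like this can disagree with a plain import, here is a minimal sketch using only the standard library. The helper name check_package is hypothetical, not part of transformers. It reports separately whether a module can be found and whether installed-package metadata exists under the same name; a module that is importable but has no metadata under that name would be treated as unavailable by the logic quoted above.

```python
import importlib.metadata
import importlib.util


def check_package(pkg_name):
    """Return (module_found, version_or_None) for pkg_name.

    Hypothetical helper for illustration; it mirrors the two lookups
    used in the quoted function but keeps their results separate.
    """
    module_found = importlib.util.find_spec(pkg_name) is not None
    try:
        version = importlib.metadata.version(pkg_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return module_found, version


if __name__ == "__main__":
    for name in ("accelerate", "bitsandbytes", "json"):
        found, version = check_package(name)
        print(f"{name}: module found={found}, metadata version={version}")
```

If a package shows module found=True but no metadata version, reinstalling it in the active environment is a reasonable first thing to try.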

Did this work for anyone? I am facing the same issue

same issue

I think there might be an underlying problem with bitsandbytes.

I'm in a slightly different scenario, but with the same library and the same results.
I'm on a CPU-only machine and have been relying on tutorials from:
https://huggingface.co/blog/4bit-transformers-bitsandbytes
https://huggingface.co/docs/transformers/main_classes/quantization#general-usage

$ sudo pip install bitsandbytes accelerate transformers

>>> from pathlib import Path
>>> from accelerate import Accelerator
>>> from transformers import AutoModelForCausalLM

>>> path = Path('/models/summarization/bart-large-cnn')
>>> model_8bit = AutoModelForCausalLM.from_pretrained(path, load_in_8bit=True, device_map="auto")
>>> model_4bit = AutoModelForCausalLM.from_pretrained(path, load_in_4bit=True, device_map="auto")

ImportError: Using load_in_8bit=True requires Accelerate: pip install accelerate and the latest version of bitsandbytes pip install -i https://test.pypi.org/simple/ bitsandbytes or pip install bitsandbytes`

After more digging around, I found you have to downgrade your version of transformers: pip install transformers==4.32.0
That enabled load_in_8bit to be recognised, but it still doesn't work on CPU; it accepts GPU only.

I'm getting this error too
"ImportError: Using load_in_8bit=True requires Accelerate: pip install accelerate and the latest version of bitsandbytes pip install -i https://test.pypi.org/simple/ bitsandbytes or pip install bitsandbytes`" ~ while using AutoModelForCausalLM.
Downgrading transformers didn't work out, and installing/upgrading accelerate and bitsandbytes didn't work either. I'm using VS Code on a Mac M2.

You might also need scipy.

I downgraded transformers library to version 4.30 using the following command:
pip install transformers==4.30
Then I restarted the kernel and it worked.

Worked after downgrading transformers to 4.30.

Was getting the same issue, resolved after downgrading transformers.
pip install transformers==4.30

It turns out that the ImportError: Using load_in_8bit=True requires Accelerate... error message is shown when:

  1. The accelerate module is not found,
  2. The bitsandbytes module is not found, or
  3. torch does not recognize CUDA (PR).

If you can import both accelerate and bitsandbytes and still get this error, it might be that PyTorch is unable to see CUDA.

You can check the CUDA availability with:

torch.cuda.is_available()

If this evaluates to False, you might want to head to StackOverflow to see why PyTorch cannot recognize CUDA.

(In my case, Docker misconfiguration prevented CUDA from loading, which resulted in nvidia-smi showing CUDA Version: ERR!, causing the cryptic ImportError...)
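
The three conditions listed above can be checked in one go. This is a minimal sketch, not the check transformers itself performs; the function name diagnose_quantization_requirements is made up for illustration, and it only calls importlib.util.find_spec and torch.cuda.is_available().

```python
import importlib.util


def diagnose_quantization_requirements() -> dict:
    """Report each of the three conditions as a boolean.

    Hypothetical helper: it approximates the conditions described in
    the post and is not copied from the transformers source.
    """
    report = {
        "accelerate_importable": importlib.util.find_spec("accelerate") is not None,
        "bitsandbytes_importable": importlib.util.find_spec("bitsandbytes") is not None,
        "cuda_visible_to_torch": False,
    }
    # Only ask torch about CUDA if torch itself is installed.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["cuda_visible_to_torch"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in diagnose_quantization_requirements().items():
        print(f"{name}: {'OK' if ok else 'NOT OK'}")
```

Run it in the same kernel or environment that raises the ImportError; any line that prints NOT OK is the place to start looking.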

Hi, I got the same error using a wrapper library called Ludwig.
It had a configuration parameter
quantization:
  bits: 4
By removing this configuration setting, the error disappeared.

At first, I installed the current version of Accelerate and had the error. I then backdated by installing an old version of Accelerate which worked, but it didn't support another pkg, so I installed the current version again, and it worked. Behavior like this suggests there is a history of versions that are required, with lurking version control issues in Accelerate and/or one of the packages it depends on. This is a mess, but I'm finally running after many hours.

This worked for me:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
python -m pip install bitsandbytes --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui
pip install transformers==4.34.0
pip install trl==0.7.1
pip install datasets==2.14.5

Basically you have to use that bitsandbytes instead of pip install bitsandbytes

I guess this could be a bug in how packages are imported. I ran into the same problem, and adding from peft import PeftModel, PeftConfig at the beginning worked for me.

I ran pip install transformers==4.28.0, and it works! I found that the requirements.txt for the environment installation lists transformers>=4.28.0, so I did that.

I used llm_int8_enable_fp32_cpu_offload=True instead of load_in_8bit=True, with transformers version 4.30, and it worked.

I tried all the above solutions, but I am still getting this error.
Here is my model.py code:

from config import config
from prompts import get_vlm_prompt, get_llm_prompt

import torch

from transformers import (
    BitsAndBytesConfig,
    InstructBlipProcessor,
    InstructBlipForConditionalGeneration,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
double_quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

Here is my configuration (requirements.txt)

fastapi==0.103.2
langchain==0.0.311
multion==0.2.2
openai==0.27.10
Pillow==10.0.1
pydantic==2.4.2
python-dotenv==1.0.0
torch==1.13.1
transformers==4.33.3
sentencepiece==0.1.99
accelerate==0.23.0
bitsandbytes==0.41.1
pydantic-settings==2.0.3
python-multipart==0.0.6

I encountered a similar problem where I was facing an ImportError with the message "Using load_in_8bit=True requires Accelerate."

To resolve this issue, I checked my transformers version, which was initially at 4.30.2. After downgrading it to version 4.30, the problem was successfully resolved.

Anyone here with a Mac with an M2 chip? That's apparently the cause of my problems (everything seems to be made for NVIDIA GPUs).

Couldn't find a workaround for CPU yet, but worked with transformers==4.35.0 on CUDA

I am using an apple M2 chip and can't get around this error

I'm on an M1 and also can't get this to work. I dug around a bit quickly in the Transformers source code and found this which seems to indicate there is no hope to get this working for us Mac users:

def is_bitsandbytes_available():
    if not is_torch_available():
        return False

    # bitsandbytes throws an error if cuda is not available
    # let's avoid that by adding a simple check
    import torch

    return _bitsandbytes_available and torch.cuda.is_available()

Note the requirement for torch.cuda.is_available()

Also, bitsandbytes seems to officially only support CUDA, with this issue about supporting MPS being open and not acted upon: https://github.com/TimDettmers/bitsandbytes/issues/252
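
Given that check, one way to avoid the ImportError on machines without CUDA is to request 8-bit loading only when CUDA is visible. The sketch below assumes you are willing to fall back to an unquantized load otherwise; build_load_kwargs and cuda_is_available are hypothetical helper names, while load_in_8bit and device_map are the same from_pretrained arguments used earlier in this thread.

```python
def build_load_kwargs(cuda_available: bool) -> dict:
    """Return keyword arguments for from_pretrained.

    Hypothetical helper: requests 8-bit loading only when CUDA is
    available, otherwise returns no quantization arguments at all.
    """
    if cuda_available:
        return {"load_in_8bit": True, "device_map": "auto"}
    return {}


def cuda_is_available() -> bool:
    """Return True only if torch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print(build_load_kwargs(cuda_is_available()))
```

The result can then be passed as AutoModelForCausalLM.from_pretrained(path, **build_load_kwargs(cuda_is_available())). On a Mac this loads the full-precision model, so memory use will be much higher than with quantization.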

Apologies if any of the above analysis is wrong. I'm pretty new to this all.

This is the kind of error that can occur with newer models, like the Mistral-based Zephyr-7b-beta, after downgrading transformers to work around this error.

----> 8 llm = HuggingFaceLLM(
9 model_name="HuggingFaceH4/zephyr-7b-beta",
10 tokenizer_name="HuggingFaceH4/zephyr-7b-beta",

/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py in __getitem__(self, key)
669 return self._extra_content[key]
670 if key not in self._mapping:
--> 671 raise KeyError(key)
672 value = self._mapping[key]
673 module_name = model_type_to_module_name(key)
KeyError: 'mistral'
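
Before pinning an older transformers release, it can help to compare it against the minimum version a model needs. The sketch below is a plain version comparison with hypothetical helper names. The minimum of 4.34.0 for Mistral-based models is an assumption here; this thread only reports 4.34.0 and 4.35.0 working.

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn '4.34.0' or '4.35.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.34' and '4.34.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # ASSUMPTION: 4.34.0 is used as the minimum for Mistral-based models.
    print(meets_minimum("4.30.0", "4.34.0"))
```

The installed version can be read with importlib.metadata.version("transformers") and passed in as the first argument.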

If you are using Colab or any other notebook, make sure to change the runtime from CPU to GPU. This solved the error in my case!

@vishyrjun's answer above (removing the --use_int4 line from the autotrain command) was the solution for me, using the AutoTrain Advanced hosted UI.

Thanks! it works for me in this case in Jupyter lab:

quantization_config = BitsAndBytesConfig(
    # load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

Hello Everyone,

I was getting the same error and thought the problem might be related to pytorch.

The solution here worked for me: https://stackoverflow.com/questions/60987997/why-torch-cuda-is-available-returns-false-even-after-installing-pytorch-with

After checking CUDA with torch.zeros(1).cuda(), I received an error saying that the video card driver was out of date. After installing the updated driver, the problem disappeared.
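
A small sketch of that check follows. It wraps torch.zeros(1).cuda() so that the underlying message (for example, an outdated driver) is returned as text instead of being hidden behind the ImportError. The function name probe_cuda is hypothetical.

```python
def probe_cuda() -> str:
    """Try a tiny CUDA allocation and describe the outcome as text.

    Hypothetical helper: it catches broad exceptions on purpose, since
    the exception type can differ between PyTorch builds.
    """
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    try:
        torch.zeros(1).cuda()
    except Exception as exc:
        return f"CUDA allocation failed: {exc}"
    return "CUDA allocation succeeded"


if __name__ == "__main__":
    print(probe_cuda())
```

If the allocation fails, the printed message is usually more specific than torch.cuda.is_available() returning False.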

!pip install transformers==4.34.0 worked for me in Google Colab GPU

I am using a Mac with an M2 chip.
transformers==4.30.0 worked for me.

@sumittyagi25 thanks for sharing ! Are you also using the M2 chip with PyTorch, or the CPU? Do you use model.to('mps')? Or can you share an example script ? Thanks

I installed 4.30, but it is still not working.

@sumittyagi25 thank you! this worked for me

@apshah If the problem persists even after restarting the Colab session, then make sure you are running your code on the GPU. This line in the transformers library needs a CUDA-enabled environment for accelerate to be loaded.
