ImportError: Using `load_in_8bit=True` requires Accelerate

#68
by ubermenchh - opened

I am getting the following error:
```

ImportError Traceback (most recent call last)
Cell In[6], line 7
1 bnb_config = BitsAndBytesConfig(
2 load_in_4bit=True,
3 bnb_4bit_quant_type='nf4',
4 bnb_4bit_compute_dtype=torch.bfloat16,
5 bnb_4bit_use_double_quant=False
6 )
----> 7 model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config, device_map={'':0})
8 model.config.use_cache = False
9 model.config.pretraining_tp = 1

File /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:565, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
563 elif type(config) in cls._model_mapping.keys():
564 model_class = _get_model_class(config, cls._model_mapping)
--> 565 return model_class.from_pretrained(
566 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
567 )
568 raise ValueError(
569 f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
570 f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
571 )

File /opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py:2681, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
2679 if load_in_8bit or load_in_4bit:
2680 if not (is_accelerate_available() and is_bitsandbytes_available()):
-> 2681 raise ImportError(
2682 "Using load_in_8bit=True requires Accelerate: pip install accelerate and the latest version of"
2683 " bitsandbytes pip install -i https://test.pypi.org/simple/ bitsandbytes or"
2684 " pip install bitsandbytes"
2685 )
2687 if torch_dtype is None:
2688 # We force the `dtype` to be float16, this is a requirement from `bitsandbytes`
2689 logger.info(
2690 f"Overriding torch_dtype={torch_dtype} with `torch_dtype=torch.float16` due to "
2691 "requirements of `bitsandbytes` to enable model loading in 8-bit or 4-bit. "
2692 "Pass your own torch_dtype to specify the dtype of the remaining non-linear layers or pass"
2693 " torch_dtype=torch.float16 to remove this warning."
2694 )

ImportError: Using load_in_8bit=True requires Accelerate: pip install accelerate and the latest version of bitsandbytes pip install -i https://test.pypi.org/simple/ bitsandbytes or pip install bitsandbytes`
```


I tried updating the transformers library to the latest version, but then it gives me `KeyError: 'mistral'`.
I have been upgrading and downgrading the libraries for about 2 hours now.

Hello @ubermenchh , are you working in a notebook? If so, are you restarting the kernel after updating?
Could you share your current environment, given by the output of transformers-cli env? Thank you!
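
If `transformers-cli env` itself fails to run, a lighter way to report versions is to read the package metadata directly, which does not import TensorFlow or touch CUDA. A minimal sketch using only the standard library; the package list and the helper names `installed_versions` and `version_tuple` are illustrative choices, and the 4.34.0 threshold is an assumption based on the release in which, as far as I know, Mistral support was added to transformers (an older version would explain `KeyError: 'mistral'`).

```python
from importlib import metadata

# Packages relevant to quantized loading; adjust as needed.
PACKAGES = ("transformers", "accelerate", "bitsandbytes", "torch")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def version_tuple(version):
    """Parse the leading numeric part of a version, padded to three fields."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


if __name__ == "__main__":
    found = installed_versions()
    for name, version in found.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
    tf_version = found["transformers"]
    # Assumed threshold: Mistral support landed in transformers 4.34.0.
    if tf_version and version_tuple(tf_version) < (4, 34, 0):
        print("transformers is older than 4.34.0; 'mistral' may be unrecognized")
```

In a notebook, run this in a fresh kernel after any `pip install`, since an already-imported package keeps its old version until the kernel restarts.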

/opt/conda/lib/python3.10/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11040). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "/opt/conda/bin/transformers-cli", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/transformers/commands/transformers_cli.py", line 55, in main
    service.run()
  File "/opt/conda/lib/python3.10/site-packages/transformers/commands/env.py", line 100, in run
    tf_cuda_available = tf.test.is_gpu_available()
  File "/opt/conda/lib/python3.10/site-packages/tensorflow/python/util/deprecation.py", line 371, in new_func
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/tensorflow/python/framework/test_util.py", line 1932, in is_gpu_available
    for local_device in device_lib.list_local_devices():
  File "/opt/conda/lib/python3.10/site-packages/tensorflow/python/client/device_lib.py", line 41, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

This is the output.
Also, I am using Kaggle notebooks and I have tried restarting the kernel several times.

I fear this is independent of Mistral or transformers, but linked to your setup of CUDA and torch. The error indicates a mismatch between your CUDA driver and CUDA runtime version. You can try downgrading to an older version of PyTorch to see if it solves your problem in your notebook.
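
To see the mismatch from the PyTorch side, it may help to print the CUDA version PyTorch was built for next to what it can actually use at runtime. A rough sketch, assuming `torch` may or may not be installed; the helper names `cuda_report` and `decode_cuda_version` are illustrative. The warning in the output above reports driver version 11040, which decodes to CUDA 11.4, so a PyTorch build targeting a newer CUDA would be consistent with the error.

```python
def cuda_report():
    """Summarize what PyTorch was built for and whether CUDA is usable."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the installed PyTorch was compiled against (None for CPU builds).
        "built_for_cuda": torch.version.cuda,
        # False when the driver is too old for that CUDA runtime.
        "cuda_available": torch.cuda.is_available(),
    }


def decode_cuda_version(code):
    """Turn an integer CUDA version code (1000*major + 10*minor) into 'major.minor'."""
    return f"{code // 1000}.{(code % 1000) // 10}"


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
    print("driver supports CUDA up to:", decode_cuda_version(11040))
```

If `built_for_cuda` is higher than what the driver supports, installing a PyTorch build compiled for the older CUDA version is the kind of downgrade suggested above.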

It works for me with the following code:
model_8bit = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_8bit=True, device_map='auto',
)
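
For anyone hitting the same `ImportError`, a variant of the snippet above that checks for the two required packages before loading can make the failure easier to read. This is only a sketch: `load_8bit` and `missing_packages` are hypothetical helper names, `model_path` is whatever checkpoint you are loading, and the check only approximates the `is_accelerate_available() and is_bitsandbytes_available()` test shown in the traceback (it looks at importability and nothing else).

```python
from importlib.util import find_spec


def missing_packages(required=("accelerate", "bitsandbytes")):
    """Return the names in `required` that cannot be found for import."""
    return [name for name in required if find_spec(name) is None]


def load_8bit(model_path, required=("accelerate", "bitsandbytes")):
    """Load a causal LM in 8-bit, failing early with a clear message."""
    missing = missing_packages(required)
    if missing:
        raise ImportError(
            "Install before loading in 8-bit: pip install " + " ".join(missing)
        )
    # Imported lazily so the check above runs even if transformers is broken.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    return AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
```

Calling `load_8bit(model_path)` then either returns the model or names the missing package. A working GPU setup is still required, so this does not help with the CUDA driver issue discussed earlier in the thread.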
