Getting "KeyError" when loading model
I have built transformers from source using `pip install -q git+https://github.com/huggingface/transformers.git`.

When trying to load the model:

```python
model = AutoModel.from_pretrained("nvidia/NV-Embed-v1", trust_remote_code=True, token=token)
```
I get the following exception:
```
KeyError                                  Traceback (most recent call last)
Cell In[11], line 1
----> 1 model = AutoModel.from_pretrained("nvidia/NV-Embed-v1",
      2     trust_remote_code=True,
      3     token=token)

KeyError: 'NVEmbedConfig'
```
Any hints?
Thank you
Thank you for reporting the issue. Can you try upgrading your transformers package? For example, upgrade the Python packages as follows:
```shell
pip uninstall -y transformer-engine
pip install torch==2.2.0
pip install transformers --upgrade
pip install flash-attn==2.2.0
```
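After upgrading, it can help to confirm which versions your environment actually resolves to, since a stale kernel or a second environment can keep old packages alive. A minimal check (uses only the standard library, so it works even if an import of the packages themselves fails):

```python
# Print installed package versions without importing the heavy packages,
# to confirm the upgrade actually took effect in this environment.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("transformers", "torch", "flash-attn"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```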
Same error: KeyError: 'NVEmbedConfig'. I have uninstalled and reinstalled the suggested libraries. I would like to use the model; any suggestions are appreciated.
For me, this issue occurs when passing token. When I authenticate with `huggingface-cli login` instead, the issue goes away after forcing a re-download.
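One way to force a re-download is to pass `force_download=True` to `from_pretrained`; another is to delete the model's cached files so the next load fetches everything fresh. A sketch of the latter, assuming the default `huggingface_hub` cache location under `~/.cache/huggingface/hub`:

```python
# Remove the cached copy of nvidia/NV-Embed-v1 so the next
# from_pretrained() call re-downloads the config, remote code, and weights.
import shutil
from pathlib import Path

# Standard hub cache layout: models--<org>--<name>
# (assumption: the default cache directory is in use)
model_cache = Path.home() / ".cache" / "huggingface" / "hub" / "models--nvidia--NV-Embed-v1"

if model_cache.exists():
    shutil.rmtree(model_cache)
    print("Deleted cached model at", model_cache)
else:
    print("No cached copy found at", model_cache)
```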
@nada5
Could you please post the exact versions under which it works?
I use CUDA version 11.8 (V11.8.89); this is fixed. After the update, I have:

```
sentence-transformers==2.7.0
transformers==4.41.2
torch==2.2.0
flash-attn==2.2.0
```
but then an ImportError occurs when trying to load "nvidia/NV-Embed-v1":

```
      4 import torch.nn as nn
      6 # isort: off
      7 # We need to import the CUDA kernels after importing torch
----> 8 import flash_attn_2_cuda as flash_attn_cuda
     10 # isort: on
     13 def _get_block_size(device, head_dim, is_dropout, is_causal):
     14     # This should match the block sizes in the CUDA kernel
```
When I try to upgrade flash-attn with `pip install --upgrade flash-attn --no-build-isolation` (to flash-attn==2.5.9.post1), I still get the same ImportError. When I downgrade to torch==2.1.2 (which works fine with other HF models), I am back to KeyError: 'NVEmbedConfig'.
I got it to work. For me it required a newer CUDA version; it worked with cuda_12.1.r12.1.
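For anyone comparing environments: prebuilt flash-attn wheels are compiled against a specific torch/CUDA pair, so a mismatch can surface as the `flash_attn_2_cuda` ImportError above (my reading of this thread, not an official statement). When checking `torch.version.cuda` against a requirement, compare the numeric components rather than the raw strings; a small hypothetical helper:

```python
def cuda_at_least(installed: str, required: str) -> bool:
    """Return True if a dotted version string meets the requirement.

    Plain string comparison would be wrong here (lexically "11.10" < "11.8"),
    so compare the numeric components instead.
    """
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(required)

print(cuda_at_least("12.1", "12.1"))  # True: matches the working setup above
print(cuda_at_least("11.8", "12.1"))  # False: the setup that failed
```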