Commit 2a16d3d1ef9653c6e6fd68d935135cfb63eaa722 raises HTTPError when loading `nomic-ai/nomic-embed-text-v1` in `transformers`
I was loading the model as documented, but after the change to config.json and the deletion of modeling_hf_nomic_bert.py and configuration_hf_nomic_bert.py, huggingface_hub is unable to find the model.
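For reference, a minimal repro using the loading code from the model card:

from transformers import AutoModel

# Loading with trust_remote_code=True, exactly as documented on the model card.
model = AutoModel.from_pretrained("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

This now fails with: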
Could not locate the nomic-ai/nomic-bert-2048--configuration_hf_nomic_bert.py inside nomic-ai/nomic-embed-text-v1.
Traceback (most recent call last):
File "C:\Users\User\anaconda3\envs\web-ext\lib\site-packages\huggingface_hub\utils\_errors.py", line 304, in hf_raise_for_status
response.raise_for_status()
File "C:\Users\User\anaconda3\envs\web-ext\lib\site-packages\requests\models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/nomic-ai/nomic-embed-text-v1/resolve/main/nomic-ai/nomic-bert-2048--configuration_hf_nomic_bert.py
Hey, sorry! I was trying to consolidate all the code into a single place, as we had 3-4 different versions. Does this code not work for you?
>>> from transformers import AutoModel
>>> model = AutoModel.from_pretrained("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.03k/2.03k [00:00<00:00, 18.5MB/s]
configuration_hf_nomic_bert.py: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.96k/1.96k [00:00<00:00, 25.5MB/s]
A new version of the following files was downloaded from https://huggingface.co/nomic-ai/nomic-bert-2048:
- configuration_hf_nomic_bert.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
modeling_hf_nomic_bert.py: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 52.8k/52.8k [00:00<00:00, 79.4MB/s]
A new version of the following files was downloaded from https://huggingface.co/nomic-ai/nomic-bert-2048:
- modeling_hf_nomic_bert.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
pytorch_model.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 547M/547M [00:01<00:00, 446MB/s]
<All keys matched successfully>
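As the warnings above mention, you can also pin a revision so the remote code files don't silently change underneath you. A quick sketch; "REVISION_SHA" here is a placeholder, not a real commit hash from this thread:

from transformers import AutoModel

# Pin the model repo to a fixed commit; REVISION_SHA is a hypothetical
# placeholder. Recent transformers versions also accept a code_revision=...
# argument to pin the repo the remote code is fetched from, if I recall
# correctly.
model = AutoModel.from_pretrained(
    "nomic-ai/nomic-embed-text-v1",
    trust_remote_code=True,
    revision="REVISION_SHA",
)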
I tried with transformers versions 4.40, 4.37, and 4.35. What version are you using?
Hmm, does it work if you clear the Hugging Face cache?
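If it helps, here's one way to drop just this model's cache entries programmatically (a sketch using huggingface_hub's cache-scanning API; huggingface-cli delete-cache or deleting the folders by hand works too):

from huggingface_hub import scan_cache_dir

# Scan the local Hugging Face cache, collect every cached revision of the two
# affected repos, and delete them so the next from_pretrained() re-downloads.
cache_info = scan_cache_dir()
revisions = [
    rev.commit_hash
    for repo in cache_info.repos
    if repo.repo_id in ("nomic-ai/nomic-embed-text-v1", "nomic-ai/nomic-bert-2048")
    for rev in repo.revisions
]
strategy = cache_info.delete_revisions(*revisions)
print(f"Will free {strategy.expected_freed_size_str}")
strategy.execute()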
Clearing the Hugging Face cache successfully re-downloads and loads the model!
Python 3.9.19 (main, Mar 21 2024, 17:21:27) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import AutoModel
>>> model = AutoModel.from_pretrained("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
pytorch_model.bin: 100%|█████████████████████| 547M/547M [00:05<00:00, 102MB/s]
<All keys matched successfully>
>>> model.eval()
NomicBertModel(
  (embeddings): NomicBertEmbeddings(
    (word_embeddings): Embedding(30528, 768)
    (token_type_embeddings): Embedding(2, 768)
  )
  (emb_drop): Dropout(p=0.0, inplace=False)
  (emb_ln): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
  (encoder): NomicBertEncoder(
    (layers): ModuleList(
      (0-11): 12 x NomicBertBlock(
        (attn): NomicBertAttention(
          (rotary_emb): NomicBertDynamicNTKRotaryEmbedding()
          (Wqkv): Linear(in_features=768, out_features=2304, bias=False)
          (out_proj): Linear(in_features=768, out_features=768, bias=False)
          (drop): Dropout(p=0.0, inplace=False)
        )
        (mlp): NomciBertGatedMLP(
          (fc11): Linear(in_features=768, out_features=3072, bias=False)
          (fc12): Linear(in_features=768, out_features=3072, bias=False)
          (fc2): Linear(in_features=3072, out_features=768, bias=False)
        )
        (dropout1): Dropout(p=0.0, inplace=False)
        (norm1): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (norm2): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout2): Dropout(p=0.0, inplace=False)
      )
    )
  )
)
>>> print("Thanks")
Thanks
Just to add some more information about this issue: clearing the Hugging Face cache and then reloading the model only worked for me with transformers==4.40.1. I have another dependency that needs transformers==4.26.1, and when trying to load with that version I get the same `Could not locate the nomic-ai/nomic-bert-2048--configuration_hf_nomic_bert.py inside nomic-ai/nomic-embed-text-v1.` error as in https://huggingface.co/nomic-ai/nomic-embed-text-v1/discussions/18.
@GabrielFreeze-2 Is there any way you can use 4.29.0, which was released last year? It seems that was when the feature to reference code in other repos was introduced: https://github.com/huggingface/transformers/releases/tag/v4.29.0
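For context, the error comes from the cross-repo remote-code reference in this repo's config.json, which only newer transformers versions can resolve. You can see the mapping by inspecting the config (a sketch; the exact auto_map contents are whatever is currently on the Hub):

from transformers import AutoConfig

# The "repo_id--module.ClassName" syntax in auto_map tells transformers to
# fetch the .py files from nomic-ai/nomic-bert-2048 instead of from the
# embed-text repo itself; versions before 4.29 can't resolve it.
config = AutoConfig.from_pretrained("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
print(getattr(config, "auto_map", None))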
Unfortunately not, because I am also using salesforce-lavis==1.0.2, which requires transformers<4.27,>=4.25.0. However, I found a workaround: creating two Python environments and running the scripts from their respective environments (sketched after the pip log below).
Downloading transformers-4.29.0-py3-none-any.whl (7.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.1/7.1 MB 18.8 MB/s eta 0:00:00
Installing collected packages: transformers
Attempting uninstall: transformers
Found existing installation: transformers 4.26.1
Uninstalling transformers-4.26.1:
Successfully uninstalled transformers-4.26.1
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
salesforce-lavis 1.0.2 requires transformers<4.27,>=4.25.0, but you have transformers 4.29.0 which is incompatible.
Successfully installed transformers-4.29.0
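For reference, here's the two-environment workaround sketched with Python's built-in venv module (POSIX pip paths shown; on Windows the executables live under Scripts\):

import subprocess
import venv

# One environment pinned for salesforce-lavis, one with a transformers version
# new enough to resolve the cross-repo remote code.
venv.create("env-lavis", with_pip=True)
venv.create("env-nomic", with_pip=True)

subprocess.run(["env-lavis/bin/pip", "install", "salesforce-lavis==1.0.2"], check=True)
subprocess.run(["env-nomic/bin/pip", "install", "transformers==4.40.1"], check=True)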
Agh, I'm sorry, that's quite annoying; let me try to think of a better workaround. I'd prefer not to have three files of the same model since they get out of sync and are hard to track, which is why I made the switch over.