I got a KeyError: 4

#1 opened by nuguriraccoon

When I executed the inference code, I got the error below.

Traceback (most recent call last):
  File "C:\Users\guja1\OneDrive\Desktop\PycharmProject\CLIP\M3D-CLIP\main.py", line 30, in <module>
    image_features = model.encode_image(image)[:, 0]
  File "C:\Users\guja1\.cache\huggingface\modules\transformers_modules\GoodBaiBai88\M3D-CLIP\ae091d89a0ef38b533ecc4ed21426f7658853963\modeling_m3d_clip.py", line 186, in encode_image
    image_feats, _ = self.vision_encoder(image)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\guja1\.cache\huggingface\modules\transformers_modules\GoodBaiBai88\M3D-CLIP\ae091d89a0ef38b533ecc4ed21426f7658853963\modeling_m3d_clip.py", line 141, in forward
    x = self.patch_embedding(x)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\monai\networks\blocks\patchembedding.py", line 141, in forward
    x = self.patch_embeddings(x)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\guja1\anaconda3\envs\CLIP\lib\site-packages\einops\layers\torch.py", line 14, in forward
    recipe = self._multirecipe[input.ndim]
KeyError: 4

And below is my code.

import torch
from transformers import AutoTokenizer, AutoModel
import numpy as np
from utils import extract_all_report

device = torch.device("cuda")  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(
    "GoodBaiBai88/M3D-CLIP",
    model_max_length=512,
    padding_side="right",
    use_fast=False
)
model = AutoModel.from_pretrained(
    "GoodBaiBai88/M3D-CLIP",
    trust_remote_code=True
)
model = model.to(device=device)

image_path = "../data/1.npy"
input_txt = extract_all_report('../raw_data')

text_tensor = tokenizer(input_txt, max_length=512, truncation=True, padding="max_length", return_tensors="pt")
input_id = text_tensor["input_ids"].to(device=device)
attention_mask = text_tensor["attention_mask"].to(device=device)
image = torch.from_numpy(np.load(image_path)).to(device=device)
print(image.shape)

with torch.inference_mode():
    image_features = model.encode_image(image)[:, 0]
    text_features = model.encode_text(input_id, attention_mask)[:, 0]

As instructed, I prepared a normalized 1x32x256x256 CT .npy file, but I still got this error. I would be very grateful if you could take a look at this problem. Thanks!

Hi,

Thank you for your interest in our work.
The KeyError: 4 means the einops Rearrange layer inside the patch embedding has no recipe for a 4-D input; it expects a 5-D tensor. If you input a single image (1x32x256x256) directly, you need to add a batch dimension, making it 1x1x32x256x256.
We recommend storing images with shape CxDxHxW and feeding them to the network as BxCxDxHxW. Please try that.
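For example, here is a minimal sketch of the fix, reusing the model you already loaded. It assumes your 1.npy already stores the volume as CxDxHxW (i.e. 1x32x256x256); the .float() cast is an extra assumption, in case the saved array is float64.

import numpy as np
import torch

device = torch.device("cuda")

# Load the CT volume; shape is CxDxHxW, e.g. (1, 32, 256, 256)
image = torch.from_numpy(np.load("../data/1.npy")).float()  # cast to float32 to match the model weights

# Add the batch dimension so the encoder sees BxCxDxHxW, i.e. (1, 1, 32, 256, 256)
image = image.unsqueeze(0).to(device)

with torch.inference_mode():
    image_features = model.encode_image(image)[:, 0]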

If you have any questions, please feel free to contact me.

Best regards,
BAI Fan
