Issue running in CUDA

#6
by semoal - opened
ord [info] running call_function <built-in function log2>(*(s0**3,), **{}):
ord [info] must be real number, not SymFloat
ord [info] from user code:
ord [info]   File "/data/hf/modules/transformers_modules/jinaai/jina-bert-v2-qk-devlin-norm-1e-2/a0ba9b2e7e2613a74d8cba43f2bbd420699db17c/modeling_bert.py", line 728, in resume_in__get_alibi_head_slopes
ord [info]     get_slopes_power_of_2(closest_power_of_2)
ord [info]   File "/data/hf/modules/transformers_modules/jinaai/jina-bert-v2-qk-devlin-norm-1e-2/a0ba9b2e7e2613a74d8cba43f2bbd420699db17c/modeling_bert.py", line 715, in get_slopes_power_of_2
ord [info]     start = 2 ** (-(2 ** -(math.log2(n) - 3)))
ord [info] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

nvidia-smi output:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40S                    Off | 00000000:00:06.0 Off |                    0 |
| N/A   50C    P0              88W / 350W |   1886MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A       436      C   /usr/bin/python3.10                        1874MiB |
+---------------------------------------------------------------------------------------+

I'm not quite sure what is going wrong here. Any suggestions?

This seems to be an error with jinaai/jina-bert-v2-qk-devlin-norm-1e-2 in combination with torch.compile(model, dynamic=True):
https://github.com/michaelfeil/infinity/issues/115
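
For anyone landing here with the same traceback, here is a minimal sketch of why the call fails. It assumes the slope helper follows the usual ALiBi recipe; only the `start` line is taken from the traceback above, and the geometric progression after it is my assumption. `FakeSymbolic` is a hypothetical stand-in for a symbolic value, not PyTorch's actual SymFloat. `math.log2` only accepts real numbers, so it works on a plain int but raises a TypeError on anything it cannot convert to a float.

```python
import math


def get_slopes_power_of_2(n: int) -> list[float]:
    # This line matches the one shown in the traceback.
    start = 2 ** (-(2 ** -(math.log2(n) - 3)))
    # Assumption: slopes form a geometric progression, as in the usual ALiBi recipe.
    return [start * start**i for i in range(n)]


class FakeSymbolic:
    """Hypothetical stand-in for a traced symbolic value (not torch.SymFloat)."""


def try_log2(value):
    # Return the result, or the error text if math.log2 rejects the argument.
    try:
        return math.log2(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(get_slopes_power_of_2(8)[:2])  # plain int: works
print(try_log2(FakeSymbolic()))      # non-numeric object: rejected by math.log2
```

If that is what happens here, the head count is probably being traced as a symbolic value under dynamic=True instead of staying a plain Python int by the time it reaches math.log2.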

Jina AI org

@semoal thanks! looking into it!

Jina AI org

hi @semoal can you clean the cache and retry?
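
It is not stated above which cache is meant. Assuming it is the cached remote-code copy of modeling_bert.py (the traceback shows it under /data/hf/modules/transformers_modules/jinaai/), a rough sketch for removing it could look like this. The function name is hypothetical, and the paths in the example call come from the traceback, so adjust them to your own setup.

```python
import shutil
from pathlib import Path


def clear_cached_remote_code(modules_root: Path, org: str) -> bool:
    """Delete cached remote-code modules for one org.

    Returns True if a directory was removed, False if there was nothing to remove.
    """
    target = Path(modules_root) / "transformers_modules" / org
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


# Example call, using the location shown in the traceback (assumption):
# clear_cached_remote_code(Path("/data/hf/modules"), "jinaai")
```

The modeling code should then be downloaded again the next time the model is loaded with trust_remote_code=True.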

Of course, will try asap

Looks fixed, Bo Wang! Such a pleasure to use this model :)
I'm going to open an issue on huggingface/optimum since it crashes, but that is not your fault.

semoal changed discussion status to closed
