microsoft/Phi-3-small-8k-instruct · Multi-GPU training fails when using device_map = "auto"

Jun 25, 2024

Hi, I get an error when fine-tuning the model with device_map = "auto". The issue looks similar to the one reported for the 128k variant, and a fix is provided in the discussion linked below. Could someone verify it and push a fix here as well? Thanks.
https://huggingface.co/microsoft/Phi-3-small-128k-instruct/discussions/19#6677dc5020ff491d382a0221

File "/opt/conda/lib/python3.10/site-packages/triton/runtime/jit.py", line 425, in run
kernel.run(grid_0, grid_1, grid_2, kernel.num_warps, kernel.num_ctas, # number of warps/ctas per instance
ValueError: Pointer argument (at 0) cannot be accessed from Triton (cpu tensor?)
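
Since the message hints at a "cpu tensor", one thing that may be worth ruling out is whether device_map = "auto" placed any module on CPU or disk. Below is a minimal diagnostic sketch, not the fix from the linked discussion. The helper name `find_offloaded_modules` and the module names in the example map are hypothetical; the only assumption is that the map is a dict of module name to device, with GPU indices as ints and offload targets as the strings "cpu" or "disk".

```python
def find_offloaded_modules(device_map):
    """Return names of modules that a device map places on CPU or disk."""
    offloaded = []
    for name, device in device_map.items():
        # Devices may be ints (GPU index) or strings such as "cpu", "disk", "cuda:0".
        if str(device) in ("cpu", "disk"):
            offloaded.append(name)
    return offloaded


# Hypothetical device map, for illustration only (not taken from a real run).
example_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 1,
    "lm_head": "cpu",
}
print(find_offloaded_modules(example_map))
```

With a model loaded through transformers using a device map, `model.hf_device_map` can be passed to the helper. If the list comes back empty, nothing was offloaded and the cause is probably elsewhere, for example in how the Triton kernel is launched across GPUs.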

barcelosallan

Jul 2, 2024

edited Jul 2, 2024

Same error here for phi-3-small-8k

barcelosallan

Jul 2, 2024

Solved with: https://huggingface.co/microsoft/Phi-3-small-8k-instruct/discussions/25
