How to enable multi-GPU inference?

by bendiu - opened

I'm trying to use this on my chunked text docs to generate instruction-formatted data for fine-tuning, but I'm getting this RuntimeError:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!

Any tips on how to fix this?

I'm not sure, but also: why do you need to do that instead of running separate copies in parallel, one per GPU? The model is only about 1.8 GB. Or do you have two 1 GB GPUs?

I'm not experienced with parallel and distributed computing. I set device_map to 'auto' thinking it would speed up inference.
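For a small model, device_map='auto' mostly helps when the model doesn't fit on one GPU; it shards layers across devices, which is where "cuda:0 vs cuda:1" mismatches come from if inputs aren't moved to the model's first device. The "run separate copies in parallel" suggestion can be sketched roughly like this. It splits the chunked docs into one shard per GPU and launches an independent worker process per GPU via CUDA_VISIBLE_DEVICES (the script name generate.py and the chunk-file arguments are hypothetical placeholders for your own generation script):

```python
import os
import subprocess

def shard(docs, n_gpus):
    """Round-robin split of the document list: one shard per GPU."""
    return [docs[i::n_gpus] for i in range(n_gpus)]

def launch(doc_paths, n_gpus):
    """Start one independent worker per GPU; each sees exactly one device.

    Inside generate.py (hypothetical) you can then just load the model
    with device="cuda", since each process only sees its own GPU.
    """
    procs = []
    for gpu, chunk in enumerate(shard(doc_paths, n_gpus)):
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(["python", "generate.py", *chunk], env=env))
    for p in procs:
        p.wait()
```

This is plain data parallelism: no tensors cross devices, so the original RuntimeError can't occur, and throughput scales with the number of GPUs as long as each one holds a full copy of the model.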
