How to implement model parallelism in gpt-neox?

#5
by sunyt32 - opened

In gpt-j-6b, the code contains a `parallelize` function which makes it convenient to split the model across GPUs. It seems difficult to fit the entire model on one GPU. Or is there a more elegant way to do that?

EleutherAI org

This model does not fit on one GPU unless you have an 80 GB A100, an A40, or an A6000. It just barely fits for running inference in 48 GB and does not fit in 40 GB.
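A back-of-the-envelope calculation shows why those memory figures come out that way (the parameter count below is the published ~20.6B figure; the rest is simple arithmetic, not a measurement):

```python
# Rough memory estimate for GPT-NeoX-20B inference weights.
# ~20.6B parameters, 2 bytes each in fp16/bf16.
n_params = 20.6e9
bytes_per_param = 2  # fp16

weights_gib = n_params * bytes_per_param / 1024**3
print(f"fp16 weights alone: {weights_gib:.1f} GiB")  # ~38.4 GiB
```

Since the fp16 weights alone take roughly 38 GiB, a 40 GB card has no room left for activations and the KV cache, while a 48 GB card (A40/A6000) leaves only a few GiB of headroom. If you instead have several smaller GPUs, one option (assuming a recent `transformers` with `accelerate` installed) is to pass `device_map="auto"` to `AutoModelForCausalLM.from_pretrained`, which shards the layers across available devices and plays a role similar to gpt-j's `parallelize` helper.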

stellaathena changed discussion status to closed
