
Support Auto Device Map

#4
by Supreeth - opened

I'm trying to run the model on multiple GPUs with `device_map="auto"`, but I get: `MPTForCausalLM does not support device_map="auto" yet`.
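For reference, a minimal repro of the loading call (the dtype is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# Fails with: MPTForCausalLM does not support device_map="auto" yet
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # MPT ships custom modeling code
    device_map="auto",       # let accelerate place layers across GPUs
)
```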

I took a pass at this on the storywriter variant. I can't promise it's perfect, but I added the main block to modeling_mpt.py and enabled gradient checkpointing support, and it seems to work fine. You can check out the commit/details here: https://huggingface.co/ethzanalytics/mpt-7b-storywriter-sharded/commit/0688e28bf6d9c7c0ee98a03628948f81eca2bdd6
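The real diff is in the commit, but conceptually the change boils down to giving the model class the attributes that accelerate's `device_map` planner looks for. A rough sketch (the class and block names are assumed from llm-foundry's modeling_mpt.py, not copied from the diff):

```python
from transformers import PreTrainedModel
from .configuration_mpt import MPTConfig  # ships alongside modeling_mpt.py

class MPTPreTrainedModel(PreTrainedModel):
    config_class = MPTConfig
    # accelerate never splits a module listed here across devices;
    # declaring the transformer block is what unlocks device_map="auto"
    _no_split_modules = ["MPTBlock"]
    # lets Trainer/accelerate toggle activation checkpointing on the model
    supports_gradient_checkpointing = True
```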

There's also a Colab demo on the model card if you want to test it first: https://huggingface.co/ethzanalytics/mpt-7b-storywriter-sharded

Is there a plan to add `device_map` support?

@alex-laptiev it already works! That said, I've only verified it on a single GPU, as I don't have a multi-GPU setup, and debugging the custom modeling code is tricky without one. We're discussing that here.

Also, @jprafael replicated what I did on the storywriter model and made an instruct version; you can see it here: https://huggingface.co/jprafael/mpt-7b-instruct-sharded
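If you want to sanity-check the placement on your hardware, something like this should work with either sharded repo:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jprafael/mpt-7b-instruct-sharded",  # or the storywriter variant above
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
# accelerate records the final placement; every module maps to a device
print(model.hf_device_map)
```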

Hope that helps!

Mosaic ML, Inc. org

`device_map` is now supported with this PR: https://huggingface.co/mosaicml/mpt-7b-instruct/discussions/41
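With that merged, the stock repo loads the same way as the sharded forks; if you want to balance the shards across GPUs you can also cap per-device memory (the limits below are just examples):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    # optional: cap what accelerate may allocate per device (example values)
    max_memory={0: "20GiB", 1: "20GiB", "cpu": "48GiB"},
)
```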

abhi-mosaic changed discussion status to closed
