
Support Auto Device Map

#4
by Supreeth - opened

I'm trying to run the model on multiple GPUs with `device_map="auto"`, but I get: `MPTForCausalLM does not support device_map="auto" yet`.
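For reference, a minimal repro of the loading call (the dtype is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# Fails with: MPTForCausalLM does not support device_map="auto" yet
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # MPT ships custom modeling code
    device_map="auto",       # let accelerate place layers across GPUs
)
```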

I took a pass at this on the storywriter variant. I can't promise it's perfect, but I added the main block to modeling_mpt.py and enabled gradient checkpointing support, and it seems to work fine. You can check out the commit/details here: https://huggingface.co/ethzanalytics/mpt-7b-storywriter-sharded/commit/0688e28bf6d9c7c0ee98a03628948f81eca2bdd6
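The real diff is in the commit, but conceptually the change boils down to giving the model class the attributes that accelerate's `device_map` planner looks for. A rough sketch (the class and block names are assumed from llm-foundry's modeling_mpt.py, not copied from the diff):

```python
from transformers import PreTrainedModel
from .configuration_mpt import MPTConfig  # ships alongside modeling_mpt.py

class MPTPreTrainedModel(PreTrainedModel):
    config_class = MPTConfig
    # accelerate never splits a module listed here across devices;
    # declaring the transformer block is what unlocks device_map="auto"
    _no_split_modules = ["MPTBlock"]
    # lets Trainer/accelerate toggle activation checkpointing on the model
    supports_gradient_checkpointing = True
```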

There's also a Colab demo on the model card if you want to test it first: https://huggingface.co/ethzanalytics/mpt-7b-storywriter-sharded

Is there a plan to add `device_map` support?

@alex-laptiev it already works! That said, I've only verified it on a single GPU, as I don't have a multi-GPU setup, and debugging the custom modeling code is tricky without one. We're discussing that here.

Also, @jprafael replicated what I did on the storywriter model and made an instruct version; you can see it here: https://huggingface.co/jprafael/mpt-7b-instruct-sharded
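If you want to sanity-check the placement on your hardware, something like this should work with either sharded repo:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jprafael/mpt-7b-instruct-sharded",  # or the storywriter variant above
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
# accelerate records the final placement; every module maps to a device
print(model.hf_device_map)
```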

Hope that helps!

Mosaic ML, Inc. org

`device_map` is now supported with this PR: https://huggingface.co/mosaicml/mpt-7b-instruct/discussions/41
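With that merged, the stock repo loads the same way as the sharded forks; if you want to balance the shards across GPUs you can also cap per-device memory (the limits below are just examples):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-instruct",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    # optional: cap what accelerate may allocate per device (example values)
    max_memory={0: "20GiB", 1: "20GiB", "cpu": "48GiB"},
)
```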

abhi-mosaic changed discussion status to closed
