deepspeed / accelerate

#1
by fblgit - opened
Shinoji Research org

Do you mind sharing the deepspeed/accelerate config? I tried to replicate this training, but on 8x H100s I hit an OOM :D

Shinoji Research org

This model was trained on a single H100 using QLoRA. I am currently doing a LoRA run on 5x A100s to fix the prompting issues (this one is a mix of ChatML and Mistral formats); be careful with that setting if finetuning a similar model.

To get the 5x A100s working I used https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/deepspeed_configs/zero3_bf16.json and enabled ZeRO init (it asks you in the `accelerate config` dialog), nothing else special.
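For reference, answering the `accelerate config` prompts that way should produce a config roughly like the sketch below (a sketch only, not the exact file used here; the `deepspeed_config_file` path and process count are assumptions based on the setup described above):

```yaml
# Sketch of the relevant part of the generated accelerate config
# (typically ~/.cache/huggingface/accelerate/default_config.yaml).
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  # assumed path to the axolotl ZeRO-3 bf16 config linked above
  deepspeed_config_file: deepspeed_configs/zero3_bf16.json
  # the "zero init" prompt in the accelerate config dialog;
  # enables ZeRO-3 init so large models are sharded at construction time
  zero3_init_flag: true
mixed_precision: bf16
num_machines: 1
num_processes: 5  # one process per A100
```

With ZeRO-3 plus `zero3_init_flag`, parameters are partitioned across the GPUs as the model is instantiated instead of being fully materialized on each rank first, which is what makes the 5x A100 run fit.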

I tried doing it on 8x 40GB A100s out of curiosity and could not get any configuration to work.

I also posted a bit about it here: https://twitter.com/Alice_comfy/status/1756675929181467098

Shinoji Research org

Confirmed. Works on 0.4.0.

fblgit changed discussion status to closed
