Is it QLoRA or a full finetune?

#5 opened by Andriy

Hi! A question: did you have any challenges using DeepSpeed ZeRO-3 with a full finetune? I'm asking because we have an issue with LLMs and DeepSpeed ZeRO-3: if you load an LLM with ZeRO-3, then save it, and then load it again, the model comes back broken. Did you experience something like that?
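
For anyone else hitting this, here is a minimal sketch of the cycle being described, assuming the Hugging Face Trainer run under the `deepspeed` launcher with a ZeRO-3 JSON config; the model name, output directory, and config path are placeholders. A common cause of this exact symptom is that ZeRO-3 shards parameters across ranks, so a save that never gathers the full weights writes placeholder tensors to disk.

```python
# Minimal sketch of the load -> save -> reload cycle from the question,
# assuming the Hugging Face Trainer with DeepSpeed ZeRO-3 (run under the
# `deepspeed` launcher). Model name and paths are illustrative placeholders.
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="out",
    deepspeed="ds_zero3.json",  # ZeRO-3 config file (assumed to exist)
)

# Under ZeRO-3 the parameters are partitioned across ranks at load time.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

trainer = Trainer(model=model, args=args)
# ... training would happen here ...
trainer.save_model("out")

# If the save step never gathered the sharded weights (e.g. by setting
# "stage3_gather_16bit_weights_on_model_save": true in the ZeRO config, or
# by running DeepSpeed's zero_to_fp32.py on the checkpoint afterwards), the
# files in "out" can hold placeholder tensors, and reloading yields a
# broken model:
reloaded = AutoModelForCausalLM.from_pretrained("out")
```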

I usually do a regular LoRA (not QLoRA) and then merge the weights back into the original model. This also lets me target different layers on each pass as I work upward from the base layers to the final ones (see the sketch below). Hopefully that helps; I didn't use DeepSpeed at all, since I cheat a bit with the repeated-LoRA trick :)
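
For illustration, a minimal sketch of that plain-LoRA-then-merge loop using the PEFT library; the model name, hyperparameters, and target modules are placeholders, and each successive pass can point `target_modules` at a different set of layers.

```python
# A minimal sketch of the LoRA-then-merge workflow described above, using
# the PEFT library. Model name and target modules are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Target a chosen set of modules; a later pass can target different ones
# (e.g. higher layers) starting from the previously merged checkpoint.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)

# ... train the adapter here ...

# Merge the adapter weights back into the base model so the result is a
# plain checkpoint that the next LoRA pass can start from.
merged = model.merge_and_unload()
merged.save_pretrained("merged-pass-1")
```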

ibivibiv changed discussion status to closed
