[BUG] {'use_reentrant': True} results in "Gradients will be None"
#74
by
RonanMcGovern
- opened
It seems there's no way to use reentrant gradient checkpointing without errors, which leads to high memory usage for fine-tuning.
Maybe adding this line could help:

model.enable_input_require_grads()

I'm using the code with {'use_reentrant': True}.
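For anyone wondering why that line helps: reentrant checkpointing only builds a backward graph if at least one tensor *input* to the checkpointed segment has requires_grad=True. When the embeddings are frozen (e.g. LoRA fine-tuning), no input does, so PyTorch emits the "Gradients will be None" warning and the parameters inside the segment get no gradients. enable_input_require_grads() works around this by forcing the embedding outputs to require grad. Below is a minimal sketch of the underlying mechanism using a bare torch.utils.checkpoint call (the Linear layer stands in for a checkpointed transformer block; it is illustrative, not the repo's code):

```python
import warnings

import torch
from torch.utils.checkpoint import checkpoint

lin = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)  # stands in for a frozen embedding output (requires_grad=False)

# With use_reentrant=True and no grad-requiring input, the checkpoint call
# warns and returns an output that is detached from the autograd graph.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    y = checkpoint(lin, x, use_reentrant=True)
print(any("Gradients will be None" in str(w.message) for w in caught))  # True
print(y.requires_grad)  # False: backward cannot reach lin's parameters

# Marking the input as requiring grad (which is what
# enable_input_require_grads() effectively does to the embedding outputs
# via a forward hook) restores the graph through the checkpointed segment.
x.requires_grad_(True)
y = checkpoint(lin, x, use_reentrant=True)
y.sum().backward()
print(lin.weight.grad is not None)  # True: gradients now flow
```

With a transformers model, calling model.enable_input_require_grads() before model.gradient_checkpointing_enable(...) achieves the same effect for every checkpointed block at once.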
OK, many thanks, I'll try that on the next fine-tune (although I don't understand exactly why that would fix things; it seems it might, I just don't understand the theory).