NaNs while fine-tuning

#14
by edmond - opened

How come, no matter what learning rate I use, my predictions end up giving NaNs?
My inputs' max and min values are fine, and so are the outputs', but for some reason I still end up with NaNs.
I even get NaNs instantly if I train with a soft prompt via self.trans(inputs_embeds=patch_emb) (min and max are okay).
When I predict before training, the values are fine too. And if I train on BERT and inject the soft-prompt information by adding the same embedding across the whole sequence, it works fine.
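
In case it helps anyone debugging a similar setup, here is a minimal sketch of how one might locate the first module that produces non-finite values during a forward pass. It is only illustrative: `model` and `patch_emb` are placeholders standing in for the actual model and soft-prompt embeddings discussed above.

```python
import torch

def add_nan_hooks(model):
    """Register forward hooks that report the first module emitting NaN/Inf outputs."""
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            tensors = output if isinstance(output, (tuple, list)) else (output,)
            for t in tensors:
                if torch.is_tensor(t) and not torch.isfinite(t).all():
                    print(f"non-finite values first appear in: {name}")
        return hook

    for name, module in model.named_modules():
        handles.append(module.register_forward_hook(make_hook(name)))
    return handles

# Hypothetical usage:
# handles = add_nan_hooks(model)
# out = model(inputs_embeds=patch_emb)
# for h in handles:
#     h.remove()
```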

I'm experiencing the same issue. Did you find a solution?

Good to know. No, it failed, and I don't think it's my fault, as I had great results using a masked decoder only, https://huggingface.co/microsoft/layoutlmv3-large, even though it's not supposed to work well since it isn't trained on natural images.
I haven't had time to try https://huggingface.co/bigscience/mt0-base (I know BigScience are serious people, as BLOOMZ works amazingly for me), which might not be buggy; please tell me if you get any results with it.

I'm experiencing the same issue. Did you find a solution? @rburke45

Were you using a reduced-precision version of the model? Both the FP16 and INT8 models output NaNs during training, but the full-precision model is training fine for me now.
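
For reference, a minimal sketch of loading the checkpoint in full precision with `transformers` (the checkpoint name below is a placeholder, not the actual model from this thread):

```python
import torch
from transformers import AutoModel

# Full-precision load (the variant reported to train without NaNs):
model_fp32 = AutoModel.from_pretrained("model-checkpoint", torch_dtype=torch.float32)

# Half-precision load (the variant reported to produce NaNs during training):
# model_fp16 = AutoModel.from_pretrained("model-checkpoint", torch_dtype=torch.float16)
```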

Yes, I am using fp16, but I read in a thread that William Falcon said using fp32 will only postpone the phenomenon.
Maybe he was wrong. Did you succeed in training it until overfitting? @rburke45
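
A related option (not from this thread, just a common workaround) is mixed-precision training with loss scaling, which keeps master weights in fp32 while computing in fp16 and often avoids the overflow-driven NaNs. A minimal sketch, assuming a standard PyTorch training loop with placeholder `model`, `optimizer`, and `batch`:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

def training_step(model, optimizer, batch):
    optimizer.zero_grad()
    # Forward pass in mixed precision.
    with autocast():
        loss = model(**batch).loss
    # Scale the loss to avoid fp16 gradient underflow/overflow.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```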

Yep, trained a full 40 epochs with overfitting starting around epoch 20. Not sure what issue William Falcon has, but I'm not seeing it here.

Thanks ok

edmond changed discussion status to closed
