Very high loss compared to keras

#46
by tanimazsin130 - opened

When I use the H5 weights, the loss is stable and satisfactory. However, when I switch to the HuggingFace model weights with the HuggingFace Trainer, the loss starts at 50 and never drops below 5. I suspect the model weights were not converted properly, since the current weights do not seem to work.
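For anyone trying to judge how far off a starting loss of 50 is, a rough sanity check is to compare it with the loss of a model that guesses every token uniformly at random, which is ln(vocab_size). This is only a sketch: it assumes the reported number is the mean per-token cross-entropy in nats, and the vocabulary size below is a hypothetical placeholder, not a value stated in this thread.

```python
import math


def uniform_baseline_loss(vocab_size: int) -> float:
    """Cross-entropy (in nats) of a model that gives every token equal probability."""
    if vocab_size <= 0:
        raise ValueError("vocab_size must be positive")
    return math.log(vocab_size)


# Hypothetical vocabulary size, for illustration only.
# Read the real value from your own model config.
vocab_size = 256_000

baseline = uniform_baseline_loss(vocab_size)
reported_initial_loss = 50.0  # starting loss reported in the post above

print(f"uniform-guess baseline: {baseline:.2f} nats")
print(f"reported initial loss exceeds baseline: {reported_initial_loss > baseline}")
```

Under those assumptions, a starting loss well above the uniform baseline would suggest the loaded weights are doing worse than random guessing, which fits the suspicion that something is wrong with the weights or how they are loaded rather than with the training data.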

tanimazsin130 changed discussion title from Very high loss compared to keras 3 to Very high loss compared to keras

I ran into the same problem. I had always assumed it was a mistake on my end...

Google org

@osanseviero Any idea whether this is related to the recent fixes for model inconsistencies?

Google org

The RoPE fix should already have resolved some of them. We are looking into the loss.

I'm also experiencing the same.

Google org

Can you try the latest transformers release? We recently fixed some training-stability issues, and the fixes have been pushed in a patch release.
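Before retrying, it may help to confirm which transformers version is actually installed in the environment running the training job. A minimal sketch using only the standard library follows; the helper name `installed_version` is made up for illustration, and the thread does not say which patch release contains the fixes, so no minimum version is checked here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the version visible to this Python environment.
print("transformers:", installed_version("transformers") or "not installed")
```

If the version shown is older than the latest release, upgrading with `pip install --upgrade transformers` and rerunning the same training script should show whether the patch changes the loss curve.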
