Very high loss compared to keras
#46, opened by tanimazsin130
When I use the H5 weights, the loss is stable and satisfactory. However, when I switch to the HuggingFace model weights with the HuggingFace trainer, the loss starts at 50 and never drops below 5. I suspect the model weights were not converted properly, since training with the current weights does not work well.
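One quick way to judge whether a starting loss of 50 points at broken weights rather than an ordinary training problem is to compare it with the loss of a model that guesses uniformly over the vocabulary, which is ln(V) for a vocabulary of size V. The sketch below assumes the logged value is a mean per-token cross-entropy in nats, and the vocabulary size is a hypothetical placeholder, not a value taken from this thread; substitute the tokenizer's real size.

```python
import math


def uniform_baseline_loss(vocab_size: int) -> float:
    """Cross-entropy (in nats) of a model that assigns equal probability to every token."""
    if vocab_size < 1:
        raise ValueError("vocab_size must be a positive integer")
    return math.log(vocab_size)


def worse_than_uniform(initial_loss: float, vocab_size: int) -> bool:
    """True if the observed loss is higher than what uniform guessing would give."""
    return initial_loss > uniform_baseline_loss(vocab_size)


# Hypothetical vocabulary size, used only for illustration (assumption).
ASSUMED_VOCAB_SIZE = 256_000

baseline = uniform_baseline_loss(ASSUMED_VOCAB_SIZE)
print(f"uniform-guess baseline: {baseline:.2f} nats")
print("initial loss of 50 is worse than uniform guessing:",
      worse_than_uniform(50.0, ASSUMED_VOCAB_SIZE))
```

If the starting loss sits well above this baseline, the pretrained weights are probably not being loaded or used as intended, or the logged number is scaled differently than assumed here (for example, summed rather than averaged).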
tanimazsin130 changed discussion title from "Very high loss compared to keras 3" to "Very high loss compared to keras"
I have run into the same problem. I had always suspected that the mistake was on my side...
@osanseviero Any idea if this has to do with recent fixes made about model inconsistencies?
The RoPE fix should already have solved some of them; we are looking into the loss.
I'm also experiencing the same.
Can you try again with the latest transformers? We recently fixed some training stability issues, and the fixes have been pushed in a patch release.
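To confirm which version is actually installed in the training environment before and after upgrading, something like the following may help. It is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what the current environment would import.
print("transformers:", installed_version("transformers") or "not installed")
```

Running `pip install --upgrade transformers` and then re-running the check shows whether the environment picked up a newer release. Notebook kernels usually need a restart before the new version is imported.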