
by KnutJaegersberg - opened

Looks good.


I'm currently experimenting with two other fine-tunes, as my previous Qwen fine-tunes lowered the model's performance. I think I might have overfit them. Perhaps the settings in autotrain-advanced are too limited.

I'm trying to understand what I did that affected performance negatively.

I think the loss reported by autotrain-advanced was 0.2 or so. I'm not sure, but I guess that's the training loss. That sounds rather low, doesn't it?

I'm not sure about the learning rate. Maybe a memetic learning rate works better.

I think the loss reported by autotrain-advanced was 0.2 or so. I'm not sure, but I guess that's the training loss. That sounds rather low, doesn't it?

This is probably because you have a high learning rate. Usually, for a dataset with more than 500k samples, I would use 2e-5.
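A training loss around 0.2 says little on its own. One rough way to check the overfitting suspicion is to compare it against the loss on a held-out evaluation set, epoch by epoch. The sketch below is only an illustration in plain Python: the function name `first_overfit_epoch`, the `patience` parameter, and all the loss values are made up for the example and are not taken from autotrain-advanced or from any real run.

```python
from typing import Optional, Sequence


def first_overfit_epoch(
    train_losses: Sequence[float],
    eval_losses: Sequence[float],
    patience: int = 2,
) -> Optional[int]:
    """Return the 0-based epoch where eval loss starts rising for
    `patience` consecutive epochs while train loss keeps falling.
    Return None if no such run of epochs is found."""
    if len(train_losses) != len(eval_losses):
        raise ValueError("need one train and one eval loss per epoch")

    streak = 0  # consecutive epochs with falling train loss and rising eval loss
    for epoch in range(1, len(train_losses)):
        train_improved = train_losses[epoch] < train_losses[epoch - 1]
        eval_worsened = eval_losses[epoch] > eval_losses[epoch - 1]
        if train_improved and eval_worsened:
            streak += 1
            if streak >= patience:
                # First epoch of the diverging run.
                return epoch - patience + 1
        else:
            streak = 0
    return None


# Hypothetical per-epoch losses, for illustration only.
train = [1.10, 0.70, 0.45, 0.30, 0.20]
evals = [1.15, 0.90, 0.85, 0.95, 1.05]
print(first_overfit_epoch(train, evals))
```

If the function returns an epoch index, the checkpoint just before it is usually a better candidate than the final one. If only a training loss is logged, this comparison is not possible, and holding out a small evaluation split would be the first thing to add.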

KnutJaegersberg changed discussion status to closed
