Can textual inversion overfit?

#85 opened by xalex

I wonder whether textual inversion can overfit, or whether it simply works better the longer you train. If I understand it correctly, it optimizes the embedding vector of a single new token in the text encoder's embedding space. Since only that one vector is being learned, I wonder whether an overfitted solution is still a good solution (or even the best one), and whether local minima are the bigger problem.
In one test, though, 6000 steps (with the default parameters, i.e., a constant learning rate) seemed to perform worse than 3000 steps.
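
For reference, here is a minimal PyTorch sketch of my understanding, not the actual diffusers training script: the vocabulary size, embedding dimension, learning rate, initialization index, and the dummy loss are all illustrative assumptions, standing in for the real noise-prediction objective.

```python
import torch
import torch.nn as nn

# Frozen pretrained token-embedding table (CLIP-like sizes; illustrative numbers).
vocab_size, embed_dim = 49408, 768
frozen_embeddings = nn.Embedding(vocab_size, embed_dim)
frozen_embeddings.weight.requires_grad_(False)

# The single trainable vector: the embedding of the new placeholder token,
# typically initialized from the embedding of a related word (index 1000 is
# an arbitrary stand-in here).
placeholder = nn.Parameter(frozen_embeddings.weight[1000].clone())

optimizer = torch.optim.AdamW([placeholder], lr=5e-4)

for step in range(3000):
    # In the real pipeline this vector is spliced into the prompt's token
    # embeddings and passed through the frozen text encoder and U-Net; the
    # dummy loss below merely stands in for the noise-prediction MSE.
    loss = placeholder.pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

If this picture is right, the whole optimization has only `embed_dim` free parameters, which is what makes me unsure whether "overfitting" in the usual high-capacity sense applies here.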
