Can textual inversion overfit?

#85 opened by xalex

I wonder whether textual inversion can overfit, or whether it simply works better the longer you train. If I understand it correctly, it optimizes the embedding vector of a single new token in the text encoder's embedding space. Since only that one vector is being learned, I wonder whether an overfitted solution is still a good solution (or even the best one), and whether local minima are the bigger problem.
In one test, though, 6000 steps (with the default parameters, i.e., a constant learning rate) seemed to perform worse than 3000 steps.
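
For reference, here is a minimal PyTorch sketch of my understanding, not the actual diffusers training script: the vocabulary size, embedding dimension, learning rate, initialization index, and the dummy loss are all illustrative assumptions, standing in for the real noise-prediction objective.

```python
import torch
import torch.nn as nn

# Frozen pretrained token-embedding table (CLIP-like sizes; illustrative numbers).
vocab_size, embed_dim = 49408, 768
frozen_embeddings = nn.Embedding(vocab_size, embed_dim)
frozen_embeddings.weight.requires_grad_(False)

# The single trainable vector: the embedding of the new placeholder token,
# typically initialized from the embedding of a related word (index 1000 is
# an arbitrary stand-in here).
placeholder = nn.Parameter(frozen_embeddings.weight[1000].clone())

optimizer = torch.optim.AdamW([placeholder], lr=5e-4)

for step in range(3000):
    # In the real pipeline this vector is spliced into the prompt's token
    # embeddings and passed through the frozen text encoder and U-Net; the
    # dummy loss below merely stands in for the noise-prediction MSE.
    loss = placeholder.pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

If this picture is right, the whole optimization has only `embed_dim` free parameters, which is what makes me unsure whether "overfitting" in the usual high-capacity sense applies here.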
