Textual inversion: Are the imagenet templates fixed?

#84
by xalex - opened

The imagenet templates for objects all refer to a photo, which may not be ideal for training on drawn objects and other subjects that do not come from photos.

Are the templates fixed (e.g. does CLIP expect exactly these strings), or can one simply change them or add a few more, such as "a picture of {}", "a drawing of {}", and so on?

I'm also wondering this.
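
For anyone who wants to experiment, here is a minimal sketch of what adding templates could look like. It assumes the training script treats each template as an ordinary Python format string and fills in the placeholder token with `str.format`; that is an assumption, so check the script you are actually using. The list names, `build_prompt`, and the `<my-object>` token are hypothetical and only for illustration.

```python
import random

# Hypothetical template lists; the names are illustrative and
# not copied from any specific training script.
photo_templates = [
    "a photo of a {}",
    "a cropped photo of the {}",
]
extra_templates = [
    "a picture of {}",
    "a drawing of {}",
]


def build_prompt(templates, placeholder_token, rng=random):
    """Pick one template at random and fill in the placeholder token."""
    return rng.choice(templates).format(placeholder_token)


# Combine the photo-style templates with the added ones.
templates = photo_templates + extra_templates

# A seeded generator keeps the example reproducible.
prompt = build_prompt(templates, "<my-object>", random.Random(0))
print(prompt)
```

If the script works this way, the templates are only used to build the caption text for each training image, so whether non-photo wording helps for drawings is something to test empirically rather than something this sketch shows.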
