FashionCLIP on CLIP-large

#12
by AliAkbarGH - opened

Hello,

I've recently come across FashionCLIP. It's a fascinating model, good job!
I tested the model on my dataset. My issue is that because FashionCLIP is trained on CLIP-base, it has lower memory than CLIP-large, so it works less well on memorization-based queries (e.g. "Real Madrid t-shirts"). I wonder if you have a repo where you trained FashionCLIP on CLIP-large? I guess this would solve the problem.

Thanks

Hello!

Could you explain what you mean by "lower memory than CLIP-large"?

Hello!

I mean that the number of parameters in the CLIP-large model is higher than in CLIP-base. This helps the model "memorize" more data, but with higher computation and time costs at training and inference time, plus encoding into a bigger latent space (768 vs. 512 dimensions).
@vinid
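
For reference, the size gap is easy to check with Hugging Face transformers. A minimal sketch (the checkpoint names are the standard OpenAI ones on the Hub, not anything from this thread; the parameter counts in the comments are approximate):

```python
from transformers import CLIPModel

base = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
large = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")

# projection_dim is the size of the shared image/text embedding space
print(base.config.projection_dim)   # 512
print(large.config.projection_dim)  # 768

# rough parameter counts illustrate the capacity gap
print(sum(p.numel() for p in base.parameters()))   # ~151M
print(sum(p.numel() for p in large.parameters()))  # ~428M
```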

Hello again, just a quick follow-up.

We currently don't have a large version. Have you tried prompting with "a photo of a {}"? That might help a tiny bit.
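
A minimal sketch of what that prompt template could look like in practice, assuming the patrickjohncyh/fashion-clip checkpoint on the Hub; the labels and the local image path are hypothetical placeholders:

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("patrickjohncyh/fashion-clip")
processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")

labels = ["Real Madrid t-shirt", "plain white t-shirt"]  # placeholder classes
texts = [f"a photo of a {label}" for label in labels]    # the suggested template

image = Image.open("shirt.jpg")  # hypothetical local image
inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```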

Hmm, I see. Yes, I tried something like this. BTW, if you could publish your training code on your GitHub so we can fine-tune other models with your data processing and pipeline, that would be great. :)

thanks

Hello!

The code we used is the standard code you will find in the original CLIP repo (there's an issue there dedicated to training). You can also use OpenCLIP and get similar results.
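
For anyone landing here: this is not the authors' exact script, but a minimal sketch of the standard contrastive fine-tuning step it describes, using transformers' built-in CLIP loss (return_loss=True); the checkpoint name and learning rate are placeholder assumptions:

```python
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)

def train_step(images, captions):
    """One contrastive step on a batch of (PIL image, caption) pairs."""
    inputs = processor(text=captions, images=images,
                       return_tensors="pt", padding=True, truncation=True)
    # return_loss=True makes CLIPModel compute the symmetric InfoNCE loss
    loss = model(**inputs, return_loss=True).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```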

AliAkbarGH changed discussion status to closed
