Usage with the updated CLIP
Can this be used without retraining on top of the updated ViT-L/14@336px version of CLIP? I already tried it and the results are a little different, but I don't know whether that comes from the additional accuracy of the higher resolution or from the predictor being incompatible with the new backbone.
No, it can't.
If you use ViT-L/14@336px as the backbone, you should fine-tune the added MLP layers on the training dataset for the aesthetic predictor.
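To give a rough idea of what that fine-tuning involves, here is a minimal PyTorch sketch. It assumes you have already computed image embeddings with the new backbone (for example with `clip.load("ViT-L/14@336px")` and `model.encode_image` from the OpenAI `clip` package) and have one aesthetic score per image. The head layout, the names `build_head` and `train_head`, and the hyperparameters are illustrative assumptions, not the architecture or settings used in this repo.

```python
import torch
from torch import nn

# Assumption: ViT-L/14@336px image embeddings are 768-dimensional,
# the same width as the 224px ViT-L/14 model.
EMBED_DIM = 768


def build_head(embed_dim: int = EMBED_DIM) -> nn.Module:
    """Hypothetical small MLP that maps a CLIP embedding to one score."""
    return nn.Sequential(
        nn.Linear(embed_dim, 256),
        nn.ReLU(),
        nn.Linear(256, 1),
    )


def train_head(embeddings: torch.Tensor, scores: torch.Tensor,
               epochs: int = 50, lr: float = 1e-3):
    """Fit the head on precomputed embeddings; the CLIP backbone is not touched.

    embeddings: float tensor of shape (N, embed_dim)
    scores:     float tensor of shape (N,) with the target aesthetic ratings
    Returns the trained head and the per-epoch training loss.
    """
    head = build_head(embeddings.shape[1])
    optimizer = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    history = []
    head.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        predictions = head(embeddings).squeeze(-1)  # (N, 1) -> (N,)
        loss = loss_fn(predictions, scores)
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    head.eval()
    return head, history
```

You would call it as `head, history = train_head(embeddings, scores)` and then score a new image with `head(new_embedding)`. This does full-batch training to keep the sketch short; for a real dataset you would use mini-batches and hold out a validation split.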
Is that difficult for a novice? Can I just use the same dataset that was used to train the previous version, or are the images there already cropped to 224?
Any chance that the author of this repo, on GitHub or here, will update the predictor himself?
It's not hard, but if you have never trained a DNN model before, it could be.
I'm not involved with the LAION team, but there is a very detailed post on the LAION blog (https://laion.ai/blog/laion-aesthetics/).
There you can find the training dataset and how the previous models were trained.