Clarification on best pre-trained model for birds dataset in saved models folder

#1
by Chanuhf - opened

Hello VinayHajare, thanks for contributing the pre-trained models for T2I based on CLIP.

I noticed you provided three pre-trained models in your 'saved_models' folder:

  1. EfficientCLIP-GAN-CC12M.pth
  2. state_epoch_1480.pth
  3. state_epoch_1500.pth

I believe these models are primarily trained on the CUB/Birds dataset. Could you please clarify which of these models achieves the best (lowest) FID score on the Birds dataset? I assume it might be state_epoch_1500.pth, but I got confused because there is also a model (EfficientCLIP-GAN-CC12M.pth) that appears to be intended for zero-shot tasks on the COCO dataset (trained on CC12M).

Your guidance on which model performs best for birds would be greatly appreciated.

Additionally, if EfficientCLIP-GAN-CC12M.pth is indeed intended for the CC12M dataset, does it improve the FID score compared to GALIP, or does it offer comparable performance?

Chanuhf changed discussion title from Clarification on best pre-trained model for irds dataset in saved models folder to Clarification on best pre-trained model for birds dataset in saved models folder

As far as our study showed, EfficientCLIP-GAN-CC12M.pth is trained on CC12M, and its zero-shot FID doesn't show much improvement over GALIP; its FID is nearly the same as GALIP's. Due to a shortage of GPUs, we weren't able to train it for more than 25 epochs, so with more epochs it might show better results. (Note: this is just an assumption.)

Both state_epoch_1500.pth and state_epoch_1480.pth yield nearly the same FID on the CUB dataset, but state_epoch_1480.pth is slightly better than state_epoch_1500.pth, with an FID difference of 0.37.

I am also using state_epoch_1480.pth in the demo, so I would recommend the same.
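
For anyone comparing the two CUB checkpoints themselves, here is a minimal sketch of the selection rule (lower FID is better). The helper name `best_checkpoint` is hypothetical and not part of the repository. The absolute FID scores are not stated in this thread, so the values below are only relative to state_epoch_1480.pth, using the 0.37 gap mentioned above.

```python
def best_checkpoint(fid_by_checkpoint):
    """Return the checkpoint name with the lowest FID (lower FID is better)."""
    if not fid_by_checkpoint:
        raise ValueError("no checkpoints to compare")
    return min(fid_by_checkpoint, key=fid_by_checkpoint.get)


# FID on CUB expressed relative to state_epoch_1480.pth.
# Only the 0.37 gap comes from this thread; absolute scores are not given here.
relative_fid = {
    "state_epoch_1480.pth": 0.0,
    "state_epoch_1500.pth": 0.37,
}

chosen = best_checkpoint(relative_fid)
print(chosen)
```

If you measure FID yourself, you could pass your own scores in the same dictionary form.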

The demo is available as a pinned Space on my HF profile.

Thanks for your interest in our work.

VinayHajare changed discussion status to closed
