Questions about using pre-trained models

#4
by BBracke - opened

Hello,

Our team is looking forward to participating in the competition once again this year.
However, we still have some questions about the competition rules regarding the "use of external data and pre-trained models":

  • Are only models pre-trained on the ImageNet dataset allowed this year?
  • Does the restriction to "pre-trained models from standard sources" also cover models from the timm library (pytorch-image-models) or the HuggingFace platform?
  • To what extent is (independent) pre-training of models on publicly available datasets allowed? In particular, some of last year's approaches used the MetaFormer architecture pre-trained on iNaturalist21, which could have advantages over architectures pre-trained on ImageNet. Would pre-training on iNaturalist21 followed by fine-tuning on the SnakeCLEF23 dataset therefore be allowed?

Thank you for clarifying these questions.

With kind regards
Team FHDO-BCSG
Dortmund University of Applied Sciences

Hi there,

Sure, you can use timm and any models available through HuggingFace.
Regarding pre-training your own models: you can do it, provided you publish the model on the HuggingFace hub (or anywhere else) and let others know about it, for example in this thread.
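For anyone wondering how to share such weights, here is a minimal sketch using the huggingface_hub client (the repo id and file names below are placeholders, not an official recipe):

```python
# Minimal sketch: uploading a fine-tuned checkpoint to the HuggingFace hub.
# Repo id and file names are placeholders; adjust them to your own account/model.
from huggingface_hub import create_repo, upload_file

repo_id = "your-username/your-pretrained-model"  # placeholder repo id
create_repo(repo_id, exist_ok=True)

upload_file(
    path_or_fileobj="checkpoint.safetensors",  # local weights file (placeholder name)
    path_in_repo="model.safetensors",
    repo_id=repo_id,
)
```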

Hope this is clear. Let me know in case of any doubts.

PS: Didn't they use ImageNet-21k rather than iNaturalist21?

Best,
Lukas

Hello there,

For our approach this year we use a pre-trained "ConvNeXt-Base V2" model from the timm library as a feature extractor (model name there: "convnextv2_base.fcmae_ft_in22k_in1k_384").
We additionally fine-tuned this model on the iNaturalist21 dataset for 10 epochs at an image size of 384x384 px (Top-1 accuracy: ~88%, Top-5 accuracy: 97% on the iNaturalist21 validation set).
Using this model in our approach, we observed faster convergence with a lower number of epochs. However, in the end it comes down to about +1.5% Macro-F1 on the SnakeCLEF23 validation dataset compared to the "convnextv2_base.fcmae_ft_in22k_in1k_384" pre-trained model from the timm library.
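Roughly, instantiating the backbone for the iNaturalist21 fine-tuning stage looks like this (the 10,000-class head and the transform resolution are illustrative assumptions for a recent timm version, not our exact training code):

```python
# Rough sketch: instantiating the timm backbone for iNat21 fine-tuning.
# num_classes=10000 assumes the full iNaturalist21 label set; optimizer,
# schedule and augmentation details are omitted here.
import timm
from timm.data import resolve_model_data_config, create_transform

model = timm.create_model(
    "convnextv2_base.fcmae_ft_in22k_in1k_384",
    pretrained=True,
    num_classes=10000,  # assumption: iNaturalist21 species classes
)

# Matching preprocessing (384x384 input) resolved from the model's config.
data_config = resolve_model_data_config(model)
train_transform = create_transform(**data_config, is_training=True)
```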

Nevertheless, as Lukas mentioned, we are publishing the weights of this model for reasons of fairness: https://huggingface.co/BBracke/convnextv2_base.inat21_384
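If the repository follows timm's hub layout, the checkpoint should be loadable directly via the hf-hub prefix; a sketch (assuming a timm-compatible config in the repo):

```python
# Sketch: loading the published iNat21 checkpoint from the HuggingFace hub.
# Assumes the repo contains a timm-compatible config; otherwise download the
# weights file manually and call model.load_state_dict().
import timm

model = timm.create_model(
    "hf-hub:BBracke/convnextv2_base.inat21_384",
    pretrained=True,
)
model.eval()
```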

With kind regards
Team FHDO-BCSG
Dortmund University of Applied Sciences
