
Model card for CLIP ViT-T-16, trained on CC3M+CC12M with CLIP-KD knowledge distillation from a CC3M+CC12M CLIP ViT-B-16 teacher

Model Description

A CLIP ViT-T/16 model (46.1M parameters, F32 weights) trained with the CLIP-KD method on the combined CC3M and CC12M datasets, distilled from a CLIP ViT-B-16 teacher. The weights were converted from the open_clip checkpoint ViT-B-16_cc3m_12m_kd_ViT-T-16_cc3m_12m_ep32.pt to the Hugging Face CLIP format.
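Since the weights are in the Hugging Face CLIP format, the model should load with the standard `transformers` CLIP classes. A minimal sketch of zero-shot image scoring, assuming the repo id `romrawinjp/clip-kd_ViT-T-16-CC3M12M_KD-CC3M12M` from this page's collection link (the repo is gated, so log in with `huggingface-cli login` after accepting the access conditions):

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# Repo id assumed from the collection link on this page; gated access applies.
REPO_ID = "romrawinjp/clip-kd_ViT-T-16-CC3M12M_KD-CC3M12M"

def zero_shot_probs(image, texts, repo_id=REPO_ID):
    """Score a PIL image against candidate captions with the distilled model."""
    model = CLIPModel.from_pretrained(repo_id)
    processor = CLIPProcessor.from_pretrained(repo_id)
    inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # Softmax over the image-text similarity logits gives per-caption probabilities.
    return out.logits_per_image.softmax(dim=-1)

# Example usage (requires accepted access conditions and a logged-in token):
# from PIL import Image
# probs = zero_shot_probs(Image.open("photo.jpg"),
#                         ["a photo of a cat", "a photo of a dog"])
```

This mirrors standard CLIP zero-shot usage; only the repo id is specific to this checkpoint.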

Reference

Please refer to the original work.

@inproceedings{yang2024clip,
  title={CLIP-KD: An Empirical Study of CLIP Model Distillation},
  author={Yang, Chuanguang and An, Zhulin and Huang, Libo and Bi, Junyu and Yu, Xinqiang and Yang, Han and Diao, Boyu and Xu, Yongjun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}