license: apache-2.0
[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner
This model is RWKV-CLIP-B/32, trained on LAION10M (a subset randomly selected from LAION400M). Please refer to https://github.com/deepglint/RWKV-CLIP for more detailed information.
If you find this model useful, please cite it using the following BibTeX entry.
@misc{gu2024rwkvclip,
title={RWKV-CLIP: A Robust Vision-Language Representation Learner},
author={Tiancheng Gu and Kaicheng Yang and Xiang An and Ziyong Feng and Dongnan Liu and Weidong Cai and Jiankang Deng},
year={2024},
eprint={2406.06973},
archivePrefix={arXiv},
primaryClass={cs.CV}
}