---
tags:
- transformers
- xlm-roberta
- eva02
- clip
library_name: transformers
license: cc-by-nc-4.0
---

# Jina CLIP

Core implementation of Jina CLIP. The model uses:

* the [EVA-02](https://github.com/baaivision/EVA/tree/master/EVA-CLIP/rei/eva_clip) architecture for the vision tower
* the [Jina XLM-RoBERTa with Flash Attention](https://huggingface.co/jinaai/xlm-roberta-flash-implementation) model for the text tower

## Models that use this implementation

- [jinaai/jina-clip-v2](https://huggingface.co/jinaai/jina-clip-v2)
- [jinaai/jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1)

## Requirements

To use the Jina CLIP source code, the following packages are required:

* `torch`
* `timm`
* `transformers`
* `einops`
* `xformers` (optional) to use memory-efficient attention
* `flash-attn` (optional) to use Flash Attention
* `apex` (optional) to use fused layer normalization
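
## Usage

A minimal usage sketch, not an official snippet: it assumes the checkpoint's custom modeling code is fetched with `trust_remote_code=True` and that the released checkpoints expose the `encode_text`/`encode_image` convenience methods documented on their model cards; the image URL is a placeholder.

```python
import numpy as np
from transformers import AutoModel

# Load a checkpoint that uses this implementation; trust_remote_code=True
# pulls in the Jina CLIP modeling code alongside the weights.
model = AutoModel.from_pretrained("jinaai/jina-clip-v1", trust_remote_code=True)

# encode_text takes a list of sentences, encode_image a list of image
# URLs or local file paths; both return embedding arrays.
text_embeddings = model.encode_text(["A photo of a cat"])
image_embeddings = model.encode_image(["https://example.com/cat.jpg"])  # placeholder URL

# CLIP-style scoring: L2-normalize both sides, then take the dot product
# to get cosine similarities between every text and every image.
t = np.asarray(text_embeddings)
v = np.asarray(image_embeddings)
t = t / np.linalg.norm(t, axis=-1, keepdims=True)
v = v / np.linalg.norm(v, axis=-1, keepdims=True)
print(t @ v.T)
```

Because the two towers are trained into a shared embedding space, the same normalized dot product works for text-to-text, text-to-image, and image-to-image comparisons.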