
Jina CLIP

The Jina CLIP implementation is hosted in this repository. The model uses:

  • the EVA-02 architecture as the vision tower
  • the Jina BERT architecture with Flash Attention as the text tower

To use the Jina CLIP model, the following packages are required:

  • torch
  • timm
  • transformers
  • einops
  • xformers, to use xFormers attention
  • flash-attn, to use Flash Attention
  • apex, to use fused layer normalization
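
Once the packages above are installed, the model can be loaded through `transformers` with `trust_remote_code=True`, since the modeling code ships with this repository. A minimal sketch, assuming the model exposes `encode_text` and `encode_image` helpers and that `cat.jpg` is a local image file:

```python
import torch
from transformers import AutoModel


def load_model(name: str = "jinaai/jina-clip-v1"):
    # trust_remote_code pulls the custom Jina CLIP modeling code from the Hub
    return AutoModel.from_pretrained(name, trust_remote_code=True)


def cosine_sim(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # cosine similarity between rows of two embedding matrices
    a = torch.nn.functional.normalize(a, dim=-1)
    b = torch.nn.functional.normalize(b, dim=-1)
    return a @ b.T


if __name__ == "__main__":
    model = load_model()
    # encode_text / encode_image are assumed helpers of the custom model class
    text_emb = torch.as_tensor(model.encode_text(["a photo of a cat"]))
    image_emb = torch.as_tensor(model.encode_image(["cat.jpg"]))
    print(cosine_sim(text_emb, image_emb))
```

Text and image embeddings live in a shared space, so the cosine similarity above directly scores text-image relevance.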