Libra-Base

Libra: Building Decoupled Vision System on Large Language Models

This base model was trained on image-text pairs to give it basic multimodal understanding.

!!! NOTE !!!

In addition to the pretrained weights in this repo, please download the pretrained CLIP model from Hugging Face and place it inside the model directory, as follows:

libra-base/
├── ...
└── openai-clip-vit-large-patch14-336/
    └── ...

The CLIP model can be downloaded here.
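
The merge step could be scripted roughly as follows. This is a minimal sketch, not the authors' official procedure. It assumes the `huggingface_hub` package is installed and that the CLIP checkpoint is the `openai/clip-vit-large-patch14-336` repository, which is inferred from the folder name above and not confirmed here. The helper names (`clip_target_dir`, `download_clip`, `clip_is_merged`) are hypothetical.

```python
from pathlib import Path

# Assumptions, not confirmed by this model card:
# the Hub repo id is inferred from the folder name in the tree above.
CLIP_REPO_ID = "openai/clip-vit-large-patch14-336"
CLIP_SUBDIR = "openai-clip-vit-large-patch14-336"


def clip_target_dir(libra_dir):
    """Return the folder inside the Libra checkpoint where CLIP should live."""
    return Path(libra_dir) / CLIP_SUBDIR


def download_clip(libra_dir):
    """Download the CLIP files into the expected sub-folder and return its path."""
    # Imported lazily so the path helpers work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = clip_target_dir(libra_dir)
    snapshot_download(repo_id=CLIP_REPO_ID, local_dir=str(target))
    return target


def clip_is_merged(libra_dir):
    """Check that the CLIP sub-folder exists and contains at least one entry."""
    target = clip_target_dir(libra_dir)
    return target.is_dir() and any(target.iterdir())
```

With the Libra weights already in a local `libra-base/` folder, calling `download_clip("libra-base")` should produce the layout shown above, and `clip_is_merged("libra-base")` gives a quick check before loading the model.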

Model size: 11B params (Safetensors). Tensor type: BF16.