Libra Vision Tokenizer

Libra: Building Decoupled Vision System on Large Language Models

This repo provides the pretrained weights of the Libra vision tokenizer, trained with lookup-free quantization.
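For intuition, lookup-free quantization replaces a learned codebook lookup with a per-dimension sign binarization, so the token index is just the binary code of the latent vector. The sketch below is illustrative only (MAGVIT-v2-style LFQ), not the actual Libra implementation; the function name and shapes are our own.

```python
import numpy as np

def lfq_quantize(z):
    """Lookup-free quantization sketch: binarize each latent dimension
    by sign and read off the token index as the resulting binary code.
    Illustrative only; not the Libra training code."""
    bits = (z > 0).astype(np.int64)           # (..., d) in {0, 1}
    weights = 2 ** np.arange(z.shape[-1])     # binary place values 1, 2, 4, ...
    indices = (bits * weights).sum(axis=-1)   # token ids in [0, 2^d)
    quantized = np.where(z > 0, 1.0, -1.0)    # quantized latent in {-1, +1}^d
    return quantized, indices
```

With a 3-dimensional latent, `lfq_quantize(np.array([0.5, -0.3, 1.2]))` yields the code `101` in binary, i.e. token id 5, without any codebook storage.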

!!! NOTE !!!

  1. Please merge the weights into llama-2-7b-chat-hf-libra (the Hugging Face version of LLaMA-2-7B-Chat).

  2. Please download the pretrained CLIP model from the Hugging Face Hub and merge it into the same path. The directory name in the layout below corresponds to openai/clip-vit-large-patch14-336.

The files should be organized as:

llama-2-7b-chat-hf-libra/
β”‚
β”‚   # original llama files
β”œβ”€β”€ ...
β”‚
β”‚   # newly added vision tokenizer
β”œβ”€β”€ vision_tokenizer_config.yaml
β”œβ”€β”€ vqgan.ckpt
β”‚
β”‚   # CLIP model
└── openai-clip-vit-large-patch14-336/
    └── ...
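After merging, it is easy to misplace a file, so a quick sanity check against the layout above can save a failed load. The helper below is ours, not part of the repo; it only verifies that the newly added entries exist under the merged directory.

```python
import os

# Entries the merge is expected to add to llama-2-7b-chat-hf-libra/,
# taken from the tree above. Helper name is ours, not part of the repo.
EXPECTED = [
    "vision_tokenizer_config.yaml",
    "vqgan.ckpt",
    "openai-clip-vit-large-patch14-336",
]

def missing_libra_files(root):
    """Return the expected entries that are absent under `root`."""
    return [name for name in EXPECTED
            if not os.path.exists(os.path.join(root, name))]
```

An empty result from `missing_libra_files("llama-2-7b-chat-hf-libra")` means the vision tokenizer and CLIP directories are in place; it does not validate the original LLaMA files.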