Libra Vision Tokenizer

Libra: Building Decoupled Vision System on Large Language Models

This repo provides the pretrained weights of the Libra vision tokenizer, trained with lookup-free quantization.
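For intuition, lookup-free quantization replaces a learned codebook lookup with a per-dimension sign binarization, so the token index is just the binary code of the latent vector. The sketch below is illustrative only (MAGVIT-v2-style LFQ), not the actual Libra implementation; the function name and shapes are our own.

```python
import numpy as np

def lfq_quantize(z):
    """Lookup-free quantization sketch: binarize each latent dimension
    by sign and read off the token index as the resulting binary code.
    Illustrative only; not the Libra training code."""
    bits = (z > 0).astype(np.int64)           # (..., d) in {0, 1}
    weights = 2 ** np.arange(z.shape[-1])     # binary place values 1, 2, 4, ...
    indices = (bits * weights).sum(axis=-1)   # token ids in [0, 2^d)
    quantized = np.where(z > 0, 1.0, -1.0)    # quantized latent in {-1, +1}^d
    return quantized, indices
```

With a 3-dimensional latent, `lfq_quantize(np.array([0.5, -0.3, 1.2]))` yields the code `101` in binary, i.e. token id 5, without any codebook storage.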

!!! NOTE !!!

  1. Please merge the weights into llama-2-7b-chat-hf-libra (the Hugging Face version of LLaMA-2-7B-Chat).

  2. Please download the pretrained CLIP model from the Hugging Face Hub and merge it into the same path. The directory name in the layout below corresponds to openai/clip-vit-large-patch14-336.

The files should be organized as:

llama-2-7b-chat-hf-libra/
β”‚
β”‚   # original llama files
β”œβ”€β”€ ...
β”‚
β”‚   # newly added vision tokenizer
β”œβ”€β”€ vision_tokenizer_config.yaml
β”œβ”€β”€ vqgan.ckpt
β”‚
β”‚   # CLIP model
└── openai-clip-vit-large-patch14-336/
    └── ...
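After merging, it is easy to misplace a file, so a quick sanity check against the layout above can save a failed load. The helper below is ours, not part of the repo; it only verifies that the newly added entries exist under the merged directory.

```python
import os

# Entries the merge is expected to add to llama-2-7b-chat-hf-libra/,
# taken from the tree above. Helper name is ours, not part of the repo.
EXPECTED = [
    "vision_tokenizer_config.yaml",
    "vqgan.ckpt",
    "openai-clip-vit-large-patch14-336",
]

def missing_libra_files(root):
    """Return the expected entries that are absent under `root`."""
    return [name for name in EXPECTED
            if not os.path.exists(os.path.join(root, name))]
```

An empty result from `missing_libra_files("llama-2-7b-chat-hf-libra")` means the vision tokenizer and CLIP directories are in place; it does not validate the original LLaMA files.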