This checkpoint comes from the feature-alignment pre-training stage, in which only the multi-modal projector is trained. The training task is to "predict" the paragraph given the caption, the OCR text, and the image tokens.
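
As a rough illustration of the task described above, the following sketch assembles one training example from a caption, OCR text, and an image placeholder, with the paragraph as the prediction target. The prompt layout, the field labels (`Caption:`, `OCR:`, `Paragraph:`), and the helper name `build_example` are assumptions made for illustration; the template actually used for this checkpoint is not documented here. The `<image>` string is assumed to be the image placeholder, as in common LLaVA prompt conventions.

```python
# Hypothetical sketch: the real prompt template for this checkpoint is not documented.
IMAGE_TOKEN = "<image>"  # assumed image placeholder, following LLaVA prompt conventions


def build_example(caption: str, ocr: str, paragraph: str) -> dict:
    """Return one example: the prompt is the model input, the paragraph is the target."""
    prompt = (
        f"{IMAGE_TOKEN}\n"
        f"Caption: {caption.strip()}\n"
        f"OCR: {ocr.strip()}\n"
        "Paragraph:"
    )
    return {"prompt": prompt, "target": paragraph.strip()}


if __name__ == "__main__":
    example = build_example(
        caption="Figure 1: Training loss over epochs.",
        ocr="epoch loss 0 10 20",
        paragraph="Figure 1 shows that the loss decreases steadily.",
    )
    print(example["prompt"])
    print(example["target"])
```

During feature alignment, the image placeholder would be replaced by projected visual features, and only the projector weights would receive gradient updates.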
