This is the Florence-VL 3B pretrained checkpoint.

It was trained on detailed image captions from [PixelProse](https://huggingface.co/datasets/tomg-group-umd/pixelprose) and [ShareGPT4V](https://huggingface.co/datasets/Lin-Chen/ShareGPT4V).

The repository also includes the pretrained vision tower.
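
A minimal sketch of fetching the repository files and locating the vision tower weights, assuming the `huggingface_hub` package is installed. The repository id below is a placeholder, not the real id, and the `"vision"` keyword is only a guess at how the vision tower files are named; check the file listing and adjust both.

```python
from pathlib import Path


def download_checkpoint(repo_id: str, local_dir: str = "florence-vl-3b-pretrain") -> Path:
    """Download every file in a Hub repository and return the local folder."""
    # Imported lazily so the helper below works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))


def find_vision_tower_files(root: Path, keyword: str = "vision") -> list[Path]:
    """List files under root whose relative path contains keyword (case-insensitive).

    The keyword is an assumption about file naming, not something this
    model card specifies.
    """
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and keyword in str(p.relative_to(root)).lower()
    )


if __name__ == "__main__":
    # Placeholder: replace with this repository's actual "<owner>/<name>" id.
    repo_id = "<owner>/<repo-name>"
    root = download_checkpoint(repo_id)
    for path in find_vision_tower_files(root):
        print(path)
```

If nothing is printed, the vision tower files probably use a different naming scheme, so inspect the downloaded folder directly.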