
	---
	library_name: transformers
	license: apache-2.0
	language:
	- en
	pipeline_tag: image-text-to-text
	---

	# probe_gen_llava-1.5-pt-ift

	This checkpoint contains the gen probes for the LLaVA-1.5 model based on CLIP-ConvNeXT-XXL and Llama-3-8b, taken after the pre-training (PT) and instruction fine-tuning (IFT) stages, i.e., after training on the LLaVA-558K and LLaVA-665K datasets. Please refer to the [documentation](https://github.com/SHI-Labs/OLA-VLM/blob/main/docs/Probing.md) for more details.

	- GitHub Repo: [https://github.com/SHI-Labs/OLA-VLM](https://github.com/SHI-Labs/OLA-VLM)
	- Project Page: [https://praeclarumjj3.github.io/ola_vlm/](https://praeclarumjj3.github.io/ola_vlm/)
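To inspect the checkpoint files locally before following the probing documentation, something like the minimal sketch below may help. It assumes the `huggingface_hub` package is installed and that the repo id is `shi-labs/probe_gen_llava-1.5-pt-ift`, as shown on the Hub page. The helper names `download_checkpoint` and `list_files` are hypothetical and are not part of the OLA-VLM codebase. The sketch only downloads and lists files. How the probes are actually loaded is described in the linked documentation, not here.

```python
from pathlib import Path

# Repo id as displayed on the Hugging Face Hub page for this checkpoint.
REPO_ID = "shi-labs/probe_gen_llava-1.5-pt-ift"


def download_checkpoint(repo_id: str = REPO_ID) -> Path:
    """Download all files of the repo into the local HF cache and return the folder."""
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id))


def list_files(root: Path) -> list[str]:
    """Return the sorted relative paths of all regular files under root."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

Calling `list_files(download_checkpoint())` should return the file names stored in the repo, which can then be passed to the scripts described in the probing documentation.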

	## Citation

	If you find our work useful in your research, please consider starring ⭐ us on [GitHub](https://github.com/SHI-Labs/OLA-VLM) and citing 📚 our paper!

	```bibtex
	@article{jain2024ola_vlm,
	title={{OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation}},
	author={Jitesh Jain and Zhengyuan Yang and Humphrey Shi and Jianfeng Gao and Jianwei Yang},
	journal={arXiv},
	year={2024}
	}
	```