Vision-CAIR/LongVU_Llama3_2_1B

Video-Text-to-Text

---
tags:
- video-text-to-text
---
# Citation

```
@article{shen2024longvu,
  title={LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding},
  author={Shen, Xiaoqian and Xiong, Yunyang and Zhao, Changsheng and Wu, Lemeng and Chen, Jun and Zhu, Chenchen and Liu, Zechun and Xiao, Fanyi and Varadarajan, Balakrishnan and Bordes, Florian and Liu, Zhuang and Xu, Hu and Kim, Hyunwoo J. and Soran, Bilge and Krishnamoorthi, Raghuraman and Elhoseiny, Mohamed and Chandra, Vikas},
  journal={arXiv:2410.17434},
  year={2024}
}
```