mmaaz60 committed on
Commit
e522421
•
1 Parent(s): ebf08a0

Creates README.md

  1. README.md +30 -0
README.md ADDED
@@ -0,0 +1,30 @@
# 👁️ VideoGPT+ (Phi-3-mini-4K 3.8B)

---
## 📝 Description
VideoGPT+ combines an image encoder, which captures detailed spatial information, with a video encoder, which captures global temporal context. It processes videos in segments and applies adaptive pooling to the features from both encoders, which improves performance across a range of video benchmarks.

This repository contains VideoGPT+ checkpoints that use Phi-3-Mini-4K (3.8B) as the LLM, for the VCGBench, VCGBench-Diverse, and MVBench benchmarks.

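The segment-wise processing and adaptive pooling described above can be illustrated with a small, dependency-free sketch. This is not the VideoGPT+ implementation: the function names, the 1-D token layout, the number of segments, and the choice to concatenate the pooled image and video tokens are all assumptions made for illustration. The actual model code lives in the GitHub repository linked in this README.

```python
# Illustrative sketch only. All names and sizes here are hypothetical and
# are not taken from the VideoGPT+ code base.

def split_into_segments(frames, num_segments):
    """Split a frame sequence into contiguous, near-equal segments."""
    n = len(frames)
    bounds = [(i * n) // num_segments for i in range(num_segments + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(num_segments)]


def adaptive_avg_pool(tokens, output_size):
    """Average a list of feature vectors down to `output_size` vectors.

    Bin i covers tokens[floor(i*n/out) : ceil((i+1)*n/out)], so every bin
    is non-empty for any positive output_size.
    """
    n = len(tokens)
    dim = len(tokens[0])
    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size
        end = ((i + 1) * n + output_size - 1) // output_size
        window = tokens[start:end]
        pooled.append([sum(t[d] for t in window) / len(window) for d in range(dim)])
    return pooled


def fuse_segment(image_tokens, video_tokens, pooled_len):
    """Pool both token streams of one segment, then concatenate them.

    Concatenation is an assumption made for this sketch.
    """
    return (adaptive_avg_pool(image_tokens, pooled_len)
            + adaptive_avg_pool(video_tokens, pooled_len))
```

For example, `split_into_segments(list(range(8)), 4)` yields four two-frame segments, and `adaptive_avg_pool` then reduces each segment's token list to a fixed length regardless of how many tokens the encoder produced.
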
## 💻 Download
To get started with VideoGPT+, download the checkpoints as follows:
```
git lfs install
git clone https://huggingface.co/MBZUAI/VideoGPT-plus_Phi3-mini-4k
```

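As an alternative to `git clone`, the same repository can probably be fetched from Python with the `huggingface_hub` package. The sketch below assumes that package is installed (`pip install huggingface_hub`). The helper names `repo_url` and `download_checkpoints` and the default `local_dir` are hypothetical; only `snapshot_download` is part of the library.

```python
# Minimal sketch, assuming `huggingface_hub` is installed.
# Helper names and the default local_dir are hypothetical.

REPO_ID = "MBZUAI/VideoGPT-plus_Phi3-mini-4k"


def repo_url(repo_id: str = REPO_ID) -> str:
    """Return the Hugging Face URL that the git clone command targets."""
    return f"https://huggingface.co/{repo_id}"


def download_checkpoints(local_dir: str = "VideoGPT-plus_Phi3-mini-4k") -> str:
    """Download every file in the model repository into `local_dir`.

    The import is kept inside the function so that the module can be
    loaded even when huggingface_hub is not available.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoints()` requires network access and returns the path of the local folder that holds the downloaded files.
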
## 📚 Additional Resources
- **Paper:** [ArXiv](https://arxiv.org/abs/2406.09418)
- **GitHub Repository:** For training and updates: [GitHub - VideoGPT+](https://github.com/mbzuai-oryx/VideoGPT-plus)
- **HuggingFace Collection:** For the pretrained checkpoints, the VCGBench-Diverse benchmark, and the training data, visit [HuggingFace Collection - VideoGPT+](https://huggingface.co/collections/MBZUAI/videogpt-665c8643221dda4987a67d8d).

## 📜 Citations and Acknowledgments

```bibtex
@article{Maaz2024VideoGPT+,
  title={VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding},
  author={Maaz, Muhammad and Rasheed, Hanoona and Khan, Salman and Khan, Fahad Shahbaz},
  journal={arxiv},
  year={2024},
  url={https://arxiv.org/abs/2406.09418}
}
```