OpenGVLab/VideoMamba

Video Classification

Andy1621 committed on Mar 13

Commit 264ee12 • 1 Parent(s): 61e86fe

Update README.md

Files changed (1)

README.md +51 -0

@@ -1,3 +1,54 @@
 ---
 license: apache-2.0
+datasets:
+- AlexFierro9/Kinetics400
+- imagenet-1k
+- HuggingFaceM4/something_something_v2
+language:
+- en
+pipeline_tag: video-classification
 ---
+<br>
+# VideoMamba
+## Model Details
+VideoMamba is a purely SSM-based model for video understanding.
+- **Developed by:** [OpenGVLab](https://github.com/OpenGVLab)
+- **Model type:** An efficient backbone based on the bidirectional state space model.
+- **License:** Non-commercial license
+### Model Sources
+- **Repository:** https://github.com/OpenGVLab/VideoMamba
+- **Paper:** https://arxiv.org/abs/2403.06977
+## Uses
+VideoMamba is intended primarily for research on image and video tasks with an SSM-based backbone, such as image classification, action recognition, long-term video understanding, and video-text retrieval.
+The primary intended users of the model are researchers and hobbyists in computer vision, machine learning, and artificial intelligence.
+## How to Get Started with the Model
+- Replace the backbone in your video task with the proposed VideoMamba: https://github.com/OpenGVLab/VideoMamba/blob/main/videomamba/video_sm/models/videomamba.py
+- Load this checkpoint into the backbone and start training.
+### Citation Information
+```
+@misc{li2024videomamba,
+      title={VideoMamba: State Space Model for Efficient Video Understanding},
+      author={Kunchang Li and Xinhao Li and Yi Wang and Yinan He and Yali Wang and Limin Wang and Yu Qiao},
+      year={2024},
+      eprint={2403.06977},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV}
+}
+```
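
The "How to Get Started with the Model" steps in the README above say to swap in the backbone and load the checkpoint, but not how to handle a checkpoint whose layout differs from the target model. The following is a minimal sketch of one common approach, not the authors' loading code: unwrap the weight mapping, then keep only the entries whose name and shape match the model. The wrapper keys (`"model"`, `"state_dict"`, `"module"`), the parameter names, and the shapes are assumptions made for illustration; inspect the actual checkpoint before relying on them.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def unwrap_checkpoint(ckpt: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the weight mapping, unwrapping an (assumed) container key if present."""
    for key in ("model", "state_dict", "module"):  # assumed wrapper keys
        if key in ckpt and isinstance(ckpt[key], Mapping):
            return ckpt[key]
    return ckpt


def select_compatible(
    ckpt_shapes: Mapping[str, tuple],
    model_shapes: Mapping[str, tuple],
) -> tuple[list[str], list[str]]:
    """Split checkpoint entries into (loadable, skipped) by name and shape."""
    keep, skip = [], []
    for name, shape in ckpt_shapes.items():
        # Loadable only if the model has the same parameter with the same shape.
        if model_shapes.get(name) == shape:
            keep.append(name)
        else:
            skip.append(name)
    return keep, skip


# Toy example with hypothetical parameter names and shapes:
# the checkpoint head has 400 outputs, the target model has 174.
ckpt_shapes = {
    "patch_embed.proj.weight": (192, 3, 1, 16, 16),
    "head.weight": (400, 192),
    "head.bias": (400,),
}
model_shapes = {
    "patch_embed.proj.weight": (192, 3, 1, 16, 16),
    "head.weight": (174, 192),
    "head.bias": (174,),
}
keep, skip = select_compatible(ckpt_shapes, model_shapes)
print("load:", keep)
print("skip:", skip)

# With PyTorch installed, the same helpers could be used roughly like this
# ("checkpoint.pth" and `model` are placeholders):
#   ckpt = unwrap_checkpoint(torch.load("checkpoint.pth", map_location="cpu"))
#   model_sd = model.state_dict()
#   keep, skip = select_compatible(
#       {k: tuple(v.shape) for k, v in ckpt.items()},
#       {k: tuple(v.shape) for k, v in model_sd.items()},
#   )
#   model.load_state_dict({k: ckpt[k] for k in keep}, strict=False)
```

Skipped entries, typically a classification head sized for a different label set, stay at their fresh initialization and are learned during fine-tuning. Printing the skipped names before training is a cheap check that only the expected layers were left out.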