Video Classification
English

VideoMamba

Model Details

VideoMamba is a model for video understanding built purely on state space models (SSMs).

  • Developed by: OpenGVLab
  • Model type: An efficient backbone based on the bidirectional state space model.
  • License: Non-commercial license
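
To make the "bidirectional state space model" in the model type concrete, here is a toy sketch of a scalar linear SSM scanned over a token sequence in both directions. It is not VideoMamba's implementation: the function names `ssm_scan` and `bidirectional_ssm`, the fixed scalar parameters `a`, `b`, `c`, and the choice to sum the two directions are all assumptions made for illustration. The real model uses learned, input-dependent parameters.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Scalar linear SSM: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t.

    a, b, c are fixed illustrative constants (an assumption of this sketch).
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # state update carries information from earlier tokens
        ys.append(c * h)    # readout at the current position
    return ys


def bidirectional_ssm(xs, a=0.9, b=1.0, c=1.0):
    """Combine a forward scan with a scan over the reversed sequence.

    Summing the two directions is one simple way to merge them, chosen
    here only for illustration.
    """
    forward = ssm_scan(xs, a, b, c)
    backward = ssm_scan(xs[::-1], a, b, c)[::-1]  # re-align to original order
    return [f + g for f, g in zip(forward, backward)]


if __name__ == "__main__":
    tokens = [0.0, 1.0, 0.0]  # a single impulse at the middle position
    print(ssm_scan(tokens, a=0.5))           # forward only: earlier positions see nothing
    print(bidirectional_ssm(tokens, a=0.5))  # both neighbours receive context
```

With a forward-only scan, the position before the impulse gets no signal from it. The bidirectional version gives every position context from both sides, which matters for video tokens because they have no inherent causal order within a frame.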

Model Sources

Uses

VideoMamba is primarily intended for research on image and video tasks with an SSM-based backbone, such as image classification, action recognition, long-term video understanding, and video-text retrieval. Its primary intended users are researchers and hobbyists in computer vision, machine learning, and artificial intelligence.

How to Get Started with the Model

Citation Information

@misc{li2024videomamba,
      title={VideoMamba: State Space Model for Efficient Video Understanding}, 
      author={Kunchang Li and Xinhao Li and Yi Wang and Yinan He and Yali Wang and Limin Wang and Yu Qiao},
      year={2024},
      eprint={2403.06977},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}