

Model Details

VideoMamba is a purely SSM-based model for video understanding.

  • Developed by: OpenGVLab
  • Model type: An efficient backbone based on the bidirectional state space model.
  • License: Non-commercial license
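As a rough intuition for the bidirectional state space model mentioned above, the toy sketch below runs a linear SSM recurrence over a token sequence in both directions and fuses the two scans by summing. The scalar parameters `a`, `b`, `c` are illustrative placeholders: the actual backbone learns high-dimensional, input-dependent parameters and uses hardware-aware selective-scan kernels.

```python
# Toy bidirectional state-space scan (illustration only, not the real model).

def ssm_scan(x, a=0.5, b=1.0, c=1.0):
    """Linear SSM recurrence: h_t = a*h_{t-1} + b*x_t, output y_t = c*h_t."""
    h, ys = 0.0, []
    for xt in x:
        h = a * h + b * xt
        ys.append(c * h)
    return ys

def bidirectional_ssm(x):
    """Scan the token sequence forward and backward, then sum the outputs."""
    fwd = ssm_scan(x)
    bwd = ssm_scan(x[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]

tokens = [1.0, 0.0, 0.0, 2.0]
print(bidirectional_ssm(tokens))  # each position sees both past and future context
```

Because each scan is a linear recurrence, the cost grows linearly with sequence length, which is what makes SSM backbones attractive for long videos compared with quadratic self-attention.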

Model Sources

  • Repository: https://github.com/OpenGVLab/VideoMamba
  • Paper: https://arxiv.org/abs/2403.06977

Uses
The primary use of VideoMamba is research on image and video tasks (e.g., image classification, action recognition, long-term video understanding, and video-text retrieval) with an SSM-based backbone. The primary intended users are researchers and hobbyists in computer vision, machine learning, and artificial intelligence.

How to Get Started with the Model
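This section is empty in the card. A plausible setup fragment is sketched below, assuming the official OpenGVLab/VideoMamba GitHub repository; the install steps are assumptions, so check the repository's README for the exact instructions and checkpoint downloads.

```shell
# Clone the official repository (assumed layout; verify against its README).
git clone https://github.com/OpenGVLab/VideoMamba.git
cd VideoMamba
pip install -r requirements.txt
```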

Citation Information

```bibtex
@misc{li2024videomamba,
      title={VideoMamba: State Space Model for Efficient Video Understanding},
      author={Kunchang Li and Xinhao Li and Yi Wang and Yinan He and Yali Wang and Limin Wang and Yu Qiao},
      year={2024},
      eprint={2403.06977},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```