metadata

license: mit

GroupMamba-Base Model Card

Model Details

GroupMamba-Base is a generic backbone with 57M parameters trained on the ImageNet-1K dataset for vision tasks.

Model type: Parameter-Efficient and Accurate Vision Backbone Based on Group Visual State Space Model
License: Non-commercial license

Model Sources

Repository: https://github.com/amshaker/GroupMamba
Paper: https://arxiv.org/abs/X.X

Uses

The primary use of GroupMamba is research on vision tasks, e.g., classification, segmentation, detection, and instance segmentation, with an SSM-based backbone. The primary intended users of the model are researchers and hobbyists in computer vision, machine learning, and artificial intelligence.

How to Get Started with the Model

To use GroupMamba in a vision pipeline, replace the existing backbone with the GroupMamba implementation: https://github.com/Amshaker/GroupMamba/blob/main/classification/models/groupmamba.py
Then load this checkpoint into the backbone and start fine-tuning.
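When a classification checkpoint is loaded into a backbone for a downstream task, the classifier weights are usually dropped and wrapper prefixes stripped first. The helper below is a minimal sketch of that step, not the authors' loading code. It assumes the checkpoint is a dictionary that may nest its weights under a `model` or `state_dict` key, that distributed training may have added a `module.` prefix, and that the classifier parameters start with `head.`. All three are assumptions about the checkpoint layout, so inspect the keys of the actual file before relying on them.

```python
from typing import Any, Dict, Mapping


def extract_backbone_weights(
    checkpoint: Mapping[str, Any],
    head_prefix: str = "head.",  # assumed name of the classifier parameters
) -> Dict[str, Any]:
    """Return backbone weights from a checkpoint dict, without classifier entries."""
    # Assumption: weights may be nested under a wrapper key; otherwise the
    # checkpoint itself is treated as the state dict.
    state = checkpoint
    for key in ("model", "state_dict"):
        if key in state and isinstance(state[key], Mapping):
            state = state[key]
            break

    cleaned: Dict[str, Any] = {}
    for name, value in state.items():
        # Checkpoints saved from a DistributedDataParallel wrapper carry a
        # "module." prefix on every parameter name.
        if name.startswith("module."):
            name = name[len("module."):]
        # Skip the ImageNet classifier so only backbone weights remain.
        if name.startswith(head_prefix):
            continue
        cleaned[name] = value
    return cleaned


# Typical use with PyTorch (not executed here; the file name is hypothetical and
# `backbone` stands for the model instance built from groupmamba.py):
#   import torch
#   ckpt = torch.load("groupmamba_base.pth", map_location="cpu")
#   result = backbone.load_state_dict(extract_backbone_weights(ckpt), strict=False)
#   print(result.missing_keys, result.unexpected_keys)
```

Passing `strict=False` lets the load proceed when the task-specific head is absent from the filtered weights. Checking the reported missing and unexpected keys is a quick way to confirm that the prefixes assumed here match the checkpoint.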

Training Details

GroupMamba is pretrained on ImageNet-1K with classification supervision. The training data is around 1.3M images from the ImageNet-1K dataset. See more details in this [paper](https://arxiv.org/abs/X.X).

Evaluation

GroupMamba-Base is evaluated on the ImageNet-1K validation set and achieves 84.5% Top-1 accuracy with only 57M parameters. See the paper for more details.
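For reference, Top-1 accuracy is the percentage of validation images whose highest-scoring class matches the ground-truth label. The snippet below is a small plain-Python sketch of that metric, not the authors' evaluation script. The function name and the list-of-lists input format are illustrative choices.

```python
from typing import Sequence


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return Top-1 accuracy in percent for per-image class scores and labels."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for scores, label in zip(logits, labels):
        # The predicted class is the index of the largest score.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Four toy samples with two classes; three predictions match their labels.
toy_logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 0]
print(top1_accuracy(toy_logits, toy_labels))  # 75.0
```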

Additional Information

Citation Information

@article{GroupMamba,
  title={GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model},
  author={Abdelrahman Shaker and Syed Talal Wasim and Salman Khan and Jürgen Gall and Fahad Khan},
  journal={arXiv preprint arXiv:X.X},
  year={2024}
}