
MambaVision: A Hybrid Mamba-Transformer Vision Backbone.

Model Overview

We introduce a novel mixer block that adds a symmetric path without SSM to enhance the modeling of global context. MambaVision has a hierarchical architecture that employs both self-attention and mixer blocks.
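
For intuition, the sketch below illustrates the dual-branch mixer idea in PyTorch: the input is projected, split into an SSM-style branch and a symmetric branch without SSM, and the two outputs are concatenated and projected back. This is a minimal conceptual sketch, not the official implementation; the module names are illustrative, and a depthwise convolution stands in for the actual Mamba/SSM sequence mixer.

import torch
import torch.nn as nn

class HybridMixerSketch(nn.Module):
    # Conceptual stand-in for a MambaVision-style mixer block (names are assumptions)
    def __init__(self, dim: int):
        super().__init__()
        self.in_proj = nn.Linear(dim, dim)
        # Stand-in for the SSM sequence mixer; the real block uses a selective SSM here
        self.ssm_branch = nn.Conv1d(dim // 2, dim // 2, kernel_size=3, padding=1, groups=dim // 2)
        # Symmetric path without SSM: same shape, no recurrent sequence modeling
        self.sym_branch = nn.Conv1d(dim // 2, dim // 2, kernel_size=3, padding=1, groups=dim // 2)
        self.act = nn.SiLU()
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        x = self.in_proj(x)
        a, b = x.chunk(2, dim=-1)
        # Conv1d expects (batch, channels, tokens)
        a = self.act(self.ssm_branch(a.transpose(1, 2))).transpose(1, 2)
        b = self.act(self.sym_branch(b.transpose(1, 2))).transpose(1, 2)
        # Concatenating the two symmetric paths restores the full channel width
        return self.out_proj(torch.cat([a, b], dim=-1))

# Example: mix 196 tokens of width 64
y = HybridMixerSketch(64)(torch.randn(2, 196, 64))

In the full hierarchical model, later stages interleave such mixer blocks with self-attention blocks.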

Model Performance

MambaVision demonstrates strong performance, achieving a new SOTA Pareto front for Top-1 accuracy versus throughput.

Model Usage

You must first log in to Hugging Face to pull the model:

huggingface-cli login

The model can then be loaded as follows:

from transformers import AutoModel

access_token = "<YOUR ACCESS TOKEN>"
model = AutoModel.from_pretrained("nvidia/MambaVision-L-1K", trust_remote_code=True, token=access_token)
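
A quick smoke test, continuing from the snippet above, confirms the model runs. This is illustrative: the 224x224 input resolution is an assumption based on the ImageNet-1K setting, and the output structure is defined by the model's remote code.

import torch

model.eval()
# Random image-shaped tensor; 224x224 is an assumed ImageNet-1K resolution
dummy = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = model(dummy)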

License

NVIDIA Source Code License-NC


Dataset

nvidia/MambaVision-L-1K was trained on the ImageNet-1K dataset.