MambaVision: A Hybrid Mamba-Transformer Vision Backbone.

Model Overview

We introduce a novel mixer block that adds a symmetric path without SSM to enhance the modeling of global context. MambaVision uses a hierarchical architecture that employs both self-attention and mixer blocks.
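As a rough, hypothetical sketch of this idea (not the authors' implementation: the selective-scan SSM step is stubbed out with a depthwise convolution, and all names and layer choices here are assumptions), the mixer splits the channels into an SSM path and a symmetric non-SSM path, then concatenates the results:

import torch
import torch.nn as nn

class MixerBlockSketch(nn.Module):
    """Simplified stand-in for the MambaVision mixer (illustrative only)."""
    def __init__(self, dim):
        super().__init__()
        self.in_proj = nn.Linear(dim, dim)
        # SSM path: the real block runs a selective scan; a depthwise conv
        # stands in for it in this sketch.
        self.ssm_stub = nn.Conv1d(dim // 2, dim // 2, 3, padding=1, groups=dim // 2)
        # Symmetric path without SSM.
        self.sym_conv = nn.Conv1d(dim // 2, dim // 2, 3, padding=1, groups=dim // 2)
        self.act = nn.SiLU()
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (batch, tokens, channels)
        x = self.in_proj(x)
        x_ssm, x_sym = x.chunk(2, dim=-1)
        x_ssm = self.act(self.ssm_stub(x_ssm.transpose(1, 2))).transpose(1, 2)
        x_sym = self.act(self.sym_conv(x_sym.transpose(1, 2))).transpose(1, 2)
        return self.out_proj(torch.cat([x_ssm, x_sym], dim=-1))

# Example: MixerBlockSketch(64)(torch.randn(2, 196, 64)) -> shape (2, 196, 64)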

Model Performance

MambaVision demonstrates strong performance, achieving a new SOTA Pareto front in terms of Top-1 accuracy and throughput.

Model Usage

You must first log in to Hugging Face to pull the model:

huggingface-cli login
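Alternatively, you can authenticate programmatically with the huggingface_hub client (the token string below is a placeholder):

from huggingface_hub import login

login(token="<YOUR ACCESS TOKEN>")  # placeholder; substitute your own token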

The model can then be loaded as follows:

from transformers import AutoModel

access_token = "<YOUR ACCESS TOKEN>"
model = AutoModel.from_pretrained(
    "nvidia/MambaVision-T2-1K",
    trust_remote_code=True,
    token=access_token,
)
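A minimal inference sketch follows, assuming the remote code accepts a standard (batch, 3, height, width) image tensor; the 224x224 resolution and the unnormalized random input are assumptions, so check the model's preprocessing configuration for real images:

import torch

model.eval()
dummy = torch.randn(1, 3, 224, 224)  # placeholder input, not a real image
with torch.no_grad():
    outputs = model(dummy)  # output structure is defined by the remote code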

License

NVIDIA Source Code License-NC
