ahatamiz committed on
Commit
c073962
1 Parent(s): 419fc4f

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -12,8 +12,7 @@ pipeline_tag: image-feature-extraction
 
 ## Model Overview
 
-We introduce a novel mixer block by creating a symmetric path without SSM to enhance the modeling of global context. MambaVision has a hierarchical architecture that employs both self-attention and mixer blocks.
-
+We have developed the first hybrid model for computer vision which leverages the strengths of Mamba and Transformers. Specifically, our core contribution includes redesigning the Mamba formulation to enhance its capability for efficient modeling of visual features. In addition, we conducted a comprehensive ablation study on the feasibility of integrating Vision Transformers (ViT) with Mamba. Our results demonstrate that equipping the Mamba architecture with several self-attention blocks at the final layers greatly improves the modeling capacity to capture long-range spatial dependencies. Based on our findings, we introduce a family of MambaVision models with a hierarchical architecture to meet various design criteria.
 
 ## Model Performance
 
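
The new overview text says that placing several self-attention blocks at the final layers improves long-range modeling. A minimal sketch of that ordering follows. It is illustrative only and is not the MambaVision implementation: the function name `stage_layout`, the labels `"mixer"` and `"attention"`, and the block counts in the usage line are hypothetical, since the commit does not state how many blocks of each kind a stage contains.

```python
from typing import List


def stage_layout(depth: int, num_attention: int) -> List[str]:
    """Return the block types for one stage, in execution order.

    Hypothetical helper: mixer (Mamba-style) blocks come first and the
    self-attention blocks occupy the final positions of the stage.
    """
    if depth < 0 or not 0 <= num_attention <= depth:
        raise ValueError("num_attention must be between 0 and depth")
    num_mixer = depth - num_attention
    return ["mixer"] * num_mixer + ["attention"] * num_attention


if __name__ == "__main__":
    # Assumed example values: a 6-block stage whose last 2 blocks are attention.
    print(stage_layout(6, 2))
```

Under these assumptions, the list returned for a stage always ends with its attention blocks, which matches the "self-attention blocks at the final layers" wording in the added paragraph.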