pinned: false
---

# All Things ViTs: Understanding and Interpreting Attention in Vision (CVPR'23 tutorial)

*By: [Hila Chefer](https://hila-chefer.github.io) and [Sayak Paul](https://sayak.dev)*

*Website: [atv.github.io](https://atv.github.io)*

*Abstract: In this tutorial, we explore different ways to leverage attention in vision. From left to right: (i) attention can be used to explain the predictions made by the model (e.g., CLIP for an image-text pair); (ii) by manipulating the attention-based explainability maps, one can enforce that the prediction is made for the right reasons (e.g., foreground vs. background); (iii) the cross-attention maps of multi-modal models can be used to guide generative models (e.g., mitigating neglect in Stable Diffusion).*
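As a taste of theme (i), below is a minimal sketch (not part of the tutorial materials) of pulling raw attention maps out of a ViT with 🤗 Transformers. The checkpoint, input image, and post-processing choices here are illustrative assumptions:

```python
# Minimal sketch (illustrative, not from the tutorial) of extracting raw
# attention maps from a ViT using Hugging Face Transformers.
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTModel

# Illustrative checkpoint; any ViT checkpoint works the same way.
ckpt = "google/vit-base-patch16-224-in21k"
processor = ViTImageProcessor.from_pretrained(ckpt)
model = ViTModel.from_pretrained(ckpt)

image = Image.open("example.jpg")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    # `output_attentions=True` returns one tensor per layer with shape
    # (batch, num_heads, seq_len, seq_len), where seq_len = 1 (CLS) + patches.
    outputs = model(**inputs, output_attentions=True)

# CLS-token attention over the image patches in the last layer, averaged
# across heads — a simple (if crude) attention-based explanation map.
cls_attn = outputs.attentions[-1][0, :, 0, 1:].mean(dim=0)
num_patches = int(cls_attn.numel() ** 0.5)
attn_map = cls_attn.reshape(num_patches, num_patches)  # 14x14 for 224px/16
```

Raw attention like this is only a baseline; the attention-based explainability methods covered in the tutorial refine such maps into more faithful explanations.
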
This organization hosts all the interactive demos to be presented at the tutorial. Below, you can find some of them.