---
license: mit
datasets:
- ILSVRC/imagenet-1k
---

# SAK

These are the checkpoints for our ICLR 2025 paper: **Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning**.

## Model Details

### Model Description

- **Developed by:** Yuxiang Lu, Shengcao Cao, Yu-Xiong Wang
- **License:** MIT

### Model Sources

- **Repository:** https://github.com/innovator-zero/SAK
- **Paper [OpenReview]:** https://openreview.net/forum?id=eePww5u7J3
- **Paper [arXiv]:** https://arxiv.org/abs/2410.14633
- **Project Page:** https://innovator-zero.github.io/SAK/

## Uses

We currently provide checkpoints of the pre-trained models directly in this repository. For detailed usage instructions, please refer to our [GitHub repository](https://github.com/innovator-zero/SAK).
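
You can also enumerate the hosted files programmatically. The following is a minimal sketch that relies only on the public `huggingface_hub` API; the snippet is illustrative and not part of the SAK codebase:

```python
# List every file hosted in this model repository (yxlu0/SAK),
# including all of the checkpoints described below.
from huggingface_hub import list_repo_files

for filename in list_repo_files("yxlu0/SAK"):
    print(filename)
```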

The available checkpoints are listed below:

**Stage 1**

| Teachers | Student backbone | Checkpoint |
| ----------------------- | ---------------- | ---------- |
| DINOv2-B, CLIP-B, SAM-B | ViT-S | [BS_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/BS_s1.pth) |
| DINOv2-B, CLIP-B, SAM-B | ViT-B | [BB_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/BB_s1.pth) |
| DINOv2-L, CLIP-L, SAM-L | ViT-B | [LB_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/LB_s1.pth) |
| DINOv2-L, CLIP-L, SAM-L | ViT-L | [LL_s1.pth](https://huggingface.co/yxlu0/SAK/blob/main/LL_s1.pth) |

**Stage 2**

We provide two example checkpoints after Stage 2 training, both initialized from the Stage 1 checkpoint **BB_s1.pth**:

- PASCAL-Context: [BB_s2_pascal.pth](https://huggingface.co/yxlu0/SAK/blob/main/BB_s2_pascal.pth)
- NYUD-v2: [BB_s2_nyud.pth](https://huggingface.co/yxlu0/SAK/blob/main/BB_s2_nyud.pth)
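
To fetch and inspect a checkpoint, the standard `huggingface_hub` + PyTorch workflow applies. The sketch below only downloads a file and looks at the stored entries; building the actual SAK model around these weights requires the code from the [GitHub repository](https://github.com/innovator-zero/SAK), and the exact state-dict layout is defined there rather than by this card:

```python
# Download one Stage 1 checkpoint from this repository and inspect it.
import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(repo_id="yxlu0/SAK", filename="BB_s1.pth")

# On recent PyTorch versions torch.load defaults to weights_only=True;
# pass weights_only=False if the checkpoint stores non-tensor objects.
state = torch.load(ckpt_path, map_location="cpu")

keys = list(state.keys())
print(f"{len(keys)} entries; first few keys: {keys[:5]}")
```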

## Citation

```bibtex
@inproceedings{lu2025swiss,
  title={Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning},
  author={Yuxiang Lu and Shengcao Cao and Yu-Xiong Wang},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}
```