Update README.md
README.md (CHANGED)
@@ -1,10 +1,6 @@
 # Deep Model Assembling
 
-This repository contains the official code for [Deep Model Assembling](https://arxiv.org/abs/2212.04129).
-
-<p align="center">
-<img src="imgs/teaser.png" width= "450">
-</p>
+This repository contains the pre-trained models for [Deep Model Assembling](https://arxiv.org/abs/2212.04129).
 
 > **Title**:  [**Deep Model Assembling**](https://arxiv.org/abs/2212.04129)
 > **Authors**: [Zanlin Ni](https://scholar.google.com/citations?user=Yibz_asAAAAJ&hl=en&oi=ao), [Yulin Wang](https://scholar.google.com/citations?hl=en&user=gBP38gcAAAAJ), Jiangwei Yu, [Haojun Jiang](https://scholar.google.com/citations?hl=en&user=ULmStp8AAAAJ), [Yue Cao](https://scholar.google.com/citations?hl=en&user=iRUO1ckAAAAJ), [Gao Huang](https://scholar.google.com/citations?user=-P9LwcgAAAAJ&hl=en&oi=ao) (Corresponding Author)
@@ -12,17 +8,10 @@ This repository contains the official code for [Deep Model Assembling](https://arxiv.org/abs/2212.04129).
 > **Publish**: *arXiv preprint ([arXiv 2212.04129](https://arxiv.org/abs/2212.04129))*
 > **Contact**: nzl22 at mails dot tsinghua dot edu dot cn
 
-## News
-
-- `Dec 10, 2022`: release code for training ViT-B, ViT-L and ViT-H on ImageNet-1K.
-
 ## Overview
 
 In this paper, we present a divide-and-conquer strategy for training large models. Our algorithm, Model Assembling, divides a large model into smaller modules, optimizes them independently, and then assembles them together. Though conceptually simple, our method significantly outperforms end-to-end (E2E) training in terms of both training efficiency and final accuracy. For example, on ViT-H, Model Assembling outperforms E2E training by **2.7%**, while reducing the training cost by **43%**.
 
-<p align="center">
-<img src="imgs/ours.png" width= "900">
-</p>
 
 ## Data Preparation
 
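Since the Overview paragraph retained above is the one place this README explains the algorithm, a minimal PyTorch-style sketch of its two phases may help. Everything below is an assumption made for illustration (the block split, the placeholder objective, the toy dimensions); it is not the repository's training code, and it omits how each module is coupled to the rest of the network during its independent training.

```python
import torch
import torch.nn as nn

NUM_MODULES = 4        # split the large model into 4 modules
BLOCKS_PER_MODULE = 3  # e.g. 12 transformer-style blocks -> 4 modules of 3
DIM = 64               # toy feature width

def make_module() -> nn.Sequential:
    """One module: a contiguous stack of blocks from the large model."""
    return nn.Sequential(*[
        nn.Sequential(nn.Linear(DIM, DIM), nn.GELU())
        for _ in range(BLOCKS_PER_MODULE)
    ])

# Phase 1: optimize each module independently.
modules = [make_module() for _ in range(NUM_MODULES)]
for module in modules:
    opt = torch.optim.AdamW(module.parameters(), lr=1e-3)
    for _ in range(10):                 # a few toy steps per module
        x = torch.randn(8, DIM)
        loss = module(x).pow(2).mean()  # placeholder objective
        opt.zero_grad()
        loss.backward()
        opt.step()

# Phase 2: assemble the trained modules into the full model and
# fine-tune the assembly end-to-end for a short schedule.
assembled = nn.Sequential(*modules)
opt = torch.optim.AdamW(assembled.parameters(), lr=1e-4)
x = torch.randn(8, DIM)
print(assembled(x).shape)  # torch.Size([8, 64])
```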
@@ -153,40 +142,6 @@ python -m torch.distributed.launch --nproc_per_node=${NGPUS} --master_port=23346
 
 </details>
 
-## Results
-
-### Results on ImageNet-1K
-
-<p align="center">
-<img src="./imgs/in1k.png" width= "900">
-</p>
-
-### Results on CIFAR-100
-
-<p align="center">
-<img src="./imgs/cifar.png" width= "900">
-</p>
-
-### Training Efficiency
-
-- Comparing different training budgets
-
-<p align="center">
-<img src="./imgs/efficiency.png" width= "900">
-</p>
-
-- Detailed convergence curves of ViT-Huge
-
-<p align="center">
-<img src="./imgs/huge_curve.png" width= "450">
-</p>
-
-### Data Efficiency
-
-<p align="center">
-<img src="./imgs/data_efficiency.png" width= "450">
-</p>
-
 ## Citation
 
 If you find our work helpful, please **star🌟** this repo and **cite📑** our paper. Thanks for your support!
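The Citation section kept above asks readers to cite the paper but the diff does not show an entry. One can be assembled from the title, author list, and arXiv ID given in this README; the entry key and field layout below are our own choice, not copied from the repository:

```bibtex
@article{ni2022deep,
  title   = {Deep Model Assembling},
  author  = {Ni, Zanlin and Wang, Yulin and Yu, Jiangwei and Jiang, Haojun and Cao, Yue and Huang, Gao},
  journal = {arXiv preprint arXiv:2212.04129},
  year    = {2022}
}
```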