Yuanhao Zhai committed
Commit 482ab8a
1 Parent(s): efaddb4

release code

.gitignore CHANGED
@@ -178,3 +178,5 @@ pyrightconfig.json
 
  *.DS_Store
 
+ tmp/
+ pretrained/
README.md CHANGED
@@ -1,6 +1,5 @@
  # Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning
 
- This repo contains the original PyTorch implementation of our paper:
 
  > [**Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning**](https://arxiv.org/abs/2309.01246)
  >
@@ -10,4 +9,49 @@ This repo contains the original PyTorch implementation of our paper:
  >
  > ICCV 2023
 
- **Code will be released soon!**
+ This repo contains the MIL-FCN version of our WSCL implementation.
+
+ ## 1. Setup
+ Clone this repo
+
+ ```bash
+ git clone git@github.com:yhZhai/WSCL.git
+ ```
+
+ Install packages
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ## 2. Data preparation
+
+ We provide preprocessed CASIA (v1 and v2), Columbia, and Coverage datasets [here](https://buffalo.box.com/s/2t3eqvwp7ua2ircpdx12sfq04sne4x50).
+ Place them under the `data` folder.
+
+ ## 3. Training and evaluation
+
+ Run the following script to train on CASIAv2 and evaluate on CASIAv1, Columbia, and Coverage.
+
+ ```shell
+ python main.py --load configs/final.yaml
+ ```
+
+ ## Citation
+ If you find this project helpful, please consider citing our paper
+ ```bibtex
+ @inproceedings{zhai2023towards,
+   title={Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning},
+   author={Zhai, Yuanhao and Luan, Tianyu and Doermann, David and Yuan, Junsong},
+   booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
+   pages={22390--22400},
+   year={2023}
+ }
+ ```
+
+ ## Acknowledgement
+ We would like to thank the following repos for their great work:
+ - [awesome-semantic-segmentation-pytorch](https://github.com/Tramac/awesome-semantic-segmentation-pytorch)
+ - [DETR](https://github.com/facebookresearch/detr)
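The README only says to place the downloaded archives under `data`. The datalists added in this commit reference paths such as `data/columbia/val/au/...`, `data/columbia/val/tp/...`, and `data/columbia/val/mask/..._gt.png`, so the expected layout presumably looks like the sketch below; only the Columbia paths are visible in this diff, and the CASIA and Coverage subfolders are assumptions by analogy.

```text
data/
├── casia_datalist.json
├── columbia_datalist.json
├── coverage_datalist.json   # referenced by configs/final.yaml, not part of this diff
├── columbia/
│   └── val/
│       ├── au/    # authentic images, label 0
│       ├── tp/    # tampered images, label 1
│       └── mask/  # ground-truth masks (*_gt.png) for tampered images
├── casia/         # assumed, mirroring the Columbia layout
└── coverage/      # assumed, mirroring the Columbia layout
```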
configs/final.yaml ADDED
@@ -0,0 +1,40 @@
+ modality:
+   - rgb
+   - srm
+   - bayar
+ train_datalist:
+   casia: data/casia_datalist.json
+ val_datalist:
+   casia: data/casia_datalist.json
+   columbia: data/columbia_datalist.json
+   coverage: data/coverage_datalist.json
+ no_gaussian_blur: True
+ no_color_jitter: True
+
+ # model
+ loss_on_mid_map: True
+ otsu_sel: True
+ otsu_portion: 1
+
+ # losses
+ map_label_weight: 1.
+ map_mask_weight: 0.
+ volume_mask_weight: 0.
+ volume_label_weight: 0.
+ consistency_weight: 0.1
+ consistency_source: ensemble
+ mvc_weight: 0.1
+ mvc_single_weight:
+   - 1
+   - 2
+   - 2
+ mvc_time_dependent: True
+
+ # arch
+ fcn_up: 16
+
+ # misc
+ batch_size: 36
+
+ # eval
+ tile_size: 1024
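A minimal sketch of how a YAML file like this can be read, assuming a standard PyYAML-based loader; how `main.py` actually handles `--load` (e.g. merging these values into its default options) is not shown in this commit.

```python
import yaml  # PyYAML

# Read the released config; keys mirror configs/final.yaml above.
with open("configs/final.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["modality"])            # ['rgb', 'srm', 'bayar']
print(cfg["val_datalist"])        # {'casia': ..., 'columbia': ..., 'coverage': ...}
print(cfg["consistency_weight"])  # 0.1
print(cfg["batch_size"])          # 36
```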
data/casia_datalist.json ADDED
The diff for this file is too large to render. See raw diff
 
data/columbia_datalist.json ADDED
@@ -0,0 +1,1997 @@
+ {
+   "nikond70_05_sub_09.tif": {
+     "subset": "val",
+     "path": "data/columbia/val/au/nikond70_05_sub_09.tif",
+     "label": 0
+   },
+   "canonxt_11_sub_07.tif": {
+     "subset": "val",
+     "path": "data/columbia/val/au/canonxt_11_sub_07.tif",
+     "label": 0
+   },
+   "canong3_02_sub_06.tif": {
+     "subset": "val",
+     "path": "data/columbia/val/au/canong3_02_sub_06.tif",
+     "label": 0
+   },
+   "canong3_kodakdcs330_sub_07.tif": {
+     "subset": "val",
+     "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_07.tif",
+     "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_07_gt.png",
+     "label": 1
+   },
+   "canong3_canonxt_sub_03.tif": {
+     "subset": "val",
+     "path": "data/columbia/val/tp/canong3_canonxt_sub_03.tif",
+     "mask": "data/columbia/val/mask/canong3_canonxt_sub_03_gt.png",
+     "label": 1
+   },
(The remaining entries follow the same two patterns: authentic images under `data/columbia/val/au/` with `"label": 0`, and tampered images under `data/columbia/val/tp/` with a ground-truth `"mask"` under `data/columbia/val/mask/` and `"label": 1`. The full file is 1997 lines; the rendered diff is cut off here.)
1599
+ "label": 1
1600
+ },
1601
+ "canong3_canonxt_sub_20.tif": {
1602
+ "subset": "val",
1603
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_20.tif",
1604
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_20_gt.png",
1605
+ "label": 1
1606
+ },
1607
+ "canong3_kodakdcs330_sub_17.tif": {
1608
+ "subset": "val",
1609
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_17.tif",
1610
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_17_gt.png",
1611
+ "label": 1
1612
+ },
1613
+ "canong3_kodakdcs330_sub_16.tif": {
1614
+ "subset": "val",
1615
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_16.tif",
1616
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_16_gt.png",
1617
+ "label": 1
1618
+ },
1619
+ "canong3_kodakdcs330_sub_23.tif": {
1620
+ "subset": "val",
1621
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_23.tif",
1622
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_23_gt.png",
1623
+ "label": 1
1624
+ },
1625
+ "canonxt_kodakdcs330_sub_05.tif": {
1626
+ "subset": "val",
1627
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_05.tif",
1628
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_05_gt.png",
1629
+ "label": 1
1630
+ },
1631
+ "canong3_nikond70_sub_20.tif": {
1632
+ "subset": "val",
1633
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_20.tif",
1634
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_20_gt.png",
1635
+ "label": 1
1636
+ },
1637
+ "nikond70_kodakdcs330_sub_18.tif": {
1638
+ "subset": "val",
1639
+ "path": "data/columbia/val/tp/nikond70_kodakdcs330_sub_18.tif",
1640
+ "mask": "data/columbia/val/mask/nikond70_kodakdcs330_sub_18_gt.png",
1641
+ "label": 1
1642
+ },
1643
+ "canong3_nikond70_sub_30.tif": {
1644
+ "subset": "val",
1645
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_30.tif",
1646
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_30_gt.png",
1647
+ "label": 1
1648
+ },
1649
+ "canong3_canonxt_sub_21.tif": {
1650
+ "subset": "val",
1651
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_21.tif",
1652
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_21_gt.png",
1653
+ "label": 1
1654
+ },
1655
+ "canong3_canonxt_sub_12.tif": {
1656
+ "subset": "val",
1657
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_12.tif",
1658
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_12_gt.png",
1659
+ "label": 1
1660
+ },
1661
+ "canonxt_kodakdcs330_sub_13.tif": {
1662
+ "subset": "val",
1663
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_13.tif",
1664
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_13_gt.png",
1665
+ "label": 1
1666
+ },
1667
+ "nikond70_canonxt_sub_18.tif": {
1668
+ "subset": "val",
1669
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_18.tif",
1670
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_18_gt.png",
1671
+ "label": 1
1672
+ },
1673
+ "canong3_canonxt_sub_07.tif": {
1674
+ "subset": "val",
1675
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_07.tif",
1676
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_07_gt.png",
1677
+ "label": 1
1678
+ },
1679
+ "nikond70_canonxt_sub_07.tif": {
1680
+ "subset": "val",
1681
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_07.tif",
1682
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_07_gt.png",
1683
+ "label": 1
1684
+ },
1685
+ "canong3_nikond70_sub_17.tif": {
1686
+ "subset": "val",
1687
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_17.tif",
1688
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_17_gt.png",
1689
+ "label": 1
1690
+ },
1691
+ "canonxt_kodakdcs330_sub_11.tif": {
1692
+ "subset": "val",
1693
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_11.tif",
1694
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_11_gt.png",
1695
+ "label": 1
1696
+ },
1697
+ "canong3_nikond70_sub_06.tif": {
1698
+ "subset": "val",
1699
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_06.tif",
1700
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_06_gt.png",
1701
+ "label": 1
1702
+ },
1703
+ "nikond70_canonxt_sub_29.tif": {
1704
+ "subset": "val",
1705
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_29.tif",
1706
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_29_gt.png",
1707
+ "label": 1
1708
+ },
1709
+ "canong3_nikond70_sub_14.tif": {
1710
+ "subset": "val",
1711
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_14.tif",
1712
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_14_gt.png",
1713
+ "label": 1
1714
+ },
1715
+ "nikond70_canonxt_sub_14.tif": {
1716
+ "subset": "val",
1717
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_14.tif",
1718
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_14_gt.png",
1719
+ "label": 1
1720
+ },
1721
+ "canong3_kodakdcs330_sub_01.tif": {
1722
+ "subset": "val",
1723
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_01.tif",
1724
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_01_gt.png",
1725
+ "label": 1
1726
+ },
1727
+ "canong3_nikond70_sub_26.tif": {
1728
+ "subset": "val",
1729
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_26.tif",
1730
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_26_gt.png",
1731
+ "label": 1
1732
+ },
1733
+ "canonxt_kodakdcs330_sub_29.tif": {
1734
+ "subset": "val",
1735
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_29.tif",
1736
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_29_gt.png",
1737
+ "label": 1
1738
+ },
1739
+ "canong3_nikond70_sub_24.tif": {
1740
+ "subset": "val",
1741
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_24.tif",
1742
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_24_gt.png",
1743
+ "label": 1
1744
+ },
1745
+ "nikond70_kodakdcs330_sub_13.tif": {
1746
+ "subset": "val",
1747
+ "path": "data/columbia/val/tp/nikond70_kodakdcs330_sub_13.tif",
1748
+ "mask": "data/columbia/val/mask/nikond70_kodakdcs330_sub_13_gt.png",
1749
+ "label": 1
1750
+ },
1751
+ "canong3_canonxt_sub_30.tif": {
1752
+ "subset": "val",
1753
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_30.tif",
1754
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_30_gt.png",
1755
+ "label": 1
1756
+ },
1757
+ "nikond70_kodakdcs330_sub_17.tif": {
1758
+ "subset": "val",
1759
+ "path": "data/columbia/val/tp/nikond70_kodakdcs330_sub_17.tif",
1760
+ "mask": "data/columbia/val/mask/nikond70_kodakdcs330_sub_17_gt.png",
1761
+ "label": 1
1762
+ },
1763
+ "nikond70_canonxt_sub_13.tif": {
1764
+ "subset": "val",
1765
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_13.tif",
1766
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_13_gt.png",
1767
+ "label": 1
1768
+ },
1769
+ "canong3_nikond70_sub_19.tif": {
1770
+ "subset": "val",
1771
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_19.tif",
1772
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_19_gt.png",
1773
+ "label": 1
1774
+ },
1775
+ "canong3_nikond70_sub_04.tif": {
1776
+ "subset": "val",
1777
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_04.tif",
1778
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_04_gt.png",
1779
+ "label": 1
1780
+ },
1781
+ "nikond70_kodakdcs330_sub_11.tif": {
1782
+ "subset": "val",
1783
+ "path": "data/columbia/val/tp/nikond70_kodakdcs330_sub_11.tif",
1784
+ "mask": "data/columbia/val/mask/nikond70_kodakdcs330_sub_11_gt.png",
1785
+ "label": 1
1786
+ },
1787
+ "canong3_canonxt_sub_01.tif": {
1788
+ "subset": "val",
1789
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_01.tif",
1790
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_01_gt.png",
1791
+ "label": 1
1792
+ },
1793
+ "canonxt_kodakdcs330_sub_12.tif": {
1794
+ "subset": "val",
1795
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_12.tif",
1796
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_12_gt.png",
1797
+ "label": 1
1798
+ },
1799
+ "canong3_canonxt_sub_13.tif": {
1800
+ "subset": "val",
1801
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_13.tif",
1802
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_13_gt.png",
1803
+ "label": 1
1804
+ },
1805
+ "canong3_canonxt_sub_14.tif": {
1806
+ "subset": "val",
1807
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_14.tif",
1808
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_14_gt.png",
1809
+ "label": 1
1810
+ },
1811
+ "canong3_kodakdcs330_sub_26.tif": {
1812
+ "subset": "val",
1813
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_26.tif",
1814
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_26_gt.png",
1815
+ "label": 1
1816
+ },
1817
+ "canong3_canonxt_sub_26.tif": {
1818
+ "subset": "val",
1819
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_26.tif",
1820
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_26_gt.png",
1821
+ "label": 1
1822
+ },
1823
+ "canong3_kodakdcs330_sub_11.tif": {
1824
+ "subset": "val",
1825
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_11.tif",
1826
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_11_gt.png",
1827
+ "label": 1
1828
+ },
1829
+ "nikond70_canonxt_sub_02.tif": {
1830
+ "subset": "val",
1831
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_02.tif",
1832
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_02_gt.png",
1833
+ "label": 1
1834
+ },
1835
+ "canong3_nikond70_sub_23.tif": {
1836
+ "subset": "val",
1837
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_23.tif",
1838
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_23_gt.png",
1839
+ "label": 1
1840
+ },
1841
+ "canong3_kodakdcs330_sub_08.tif": {
1842
+ "subset": "val",
1843
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_08.tif",
1844
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_08_gt.png",
1845
+ "label": 1
1846
+ },
1847
+ "canong3_nikond70_sub_28.tif": {
1848
+ "subset": "val",
1849
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_28.tif",
1850
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_28_gt.png",
1851
+ "label": 1
1852
+ },
1853
+ "canong3_nikond70_sub_05.tif": {
1854
+ "subset": "val",
1855
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_05.tif",
1856
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_05_gt.png",
1857
+ "label": 1
1858
+ },
1859
+ "nikond70_canonxt_sub_15.tif": {
1860
+ "subset": "val",
1861
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_15.tif",
1862
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_15_gt.png",
1863
+ "label": 1
1864
+ },
1865
+ "canonxt_kodakdcs330_sub_20.tif": {
1866
+ "subset": "val",
1867
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_20.tif",
1868
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_20_gt.png",
1869
+ "label": 1
1870
+ },
1871
+ "nikond70_kodakdcs330_sub_06.tif": {
1872
+ "subset": "val",
1873
+ "path": "data/columbia/val/tp/nikond70_kodakdcs330_sub_06.tif",
1874
+ "mask": "data/columbia/val/mask/nikond70_kodakdcs330_sub_06_gt.png",
1875
+ "label": 1
1876
+ },
1877
+ "canong3_nikond70_sub_10.tif": {
1878
+ "subset": "val",
1879
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_10.tif",
1880
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_10_gt.png",
1881
+ "label": 1
1882
+ },
1883
+ "canonxt_kodakdcs330_sub_15.tif": {
1884
+ "subset": "val",
1885
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_15.tif",
1886
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_15_gt.png",
1887
+ "label": 1
1888
+ },
1889
+ "canong3_nikond70_sub_02.tif": {
1890
+ "subset": "val",
1891
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_02.tif",
1892
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_02_gt.png",
1893
+ "label": 1
1894
+ },
1895
+ "canonxt_kodakdcs330_sub_21.tif": {
1896
+ "subset": "val",
1897
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_21.tif",
1898
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_21_gt.png",
1899
+ "label": 1
1900
+ },
1901
+ "canong3_nikond70_sub_16.tif": {
1902
+ "subset": "val",
1903
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_16.tif",
1904
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_16_gt.png",
1905
+ "label": 1
1906
+ },
1907
+ "canong3_kodakdcs330_sub_25.tif": {
1908
+ "subset": "val",
1909
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_25.tif",
1910
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_25_gt.png",
1911
+ "label": 1
1912
+ },
1913
+ "canong3_kodakdcs330_sub_19.tif": {
1914
+ "subset": "val",
1915
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_19.tif",
1916
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_19_gt.png",
1917
+ "label": 1
1918
+ },
1919
+ "canonxt_kodakdcs330_sub_06.tif": {
1920
+ "subset": "val",
1921
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_06.tif",
1922
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_06_gt.png",
1923
+ "label": 1
1924
+ },
1925
+ "canonxt_kodakdcs330_sub_19.tif": {
1926
+ "subset": "val",
1927
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_19.tif",
1928
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_19_gt.png",
1929
+ "label": 1
1930
+ },
1931
+ "canonxt_kodakdcs330_sub_03.tif": {
1932
+ "subset": "val",
1933
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_03.tif",
1934
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_03_gt.png",
1935
+ "label": 1
1936
+ },
1937
+ "canong3_canonxt_sub_28.tif": {
1938
+ "subset": "val",
1939
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_28.tif",
1940
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_28_gt.png",
1941
+ "label": 1
1942
+ },
1943
+ "canong3_kodakdcs330_sub_05.tif": {
1944
+ "subset": "val",
1945
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_05.tif",
1946
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_05_gt.png",
1947
+ "label": 1
1948
+ },
1949
+ "canong3_nikond70_sub_29.tif": {
1950
+ "subset": "val",
1951
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_29.tif",
1952
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_29_gt.png",
1953
+ "label": 1
1954
+ },
1955
+ "canong3_kodakdcs330_sub_12.tif": {
1956
+ "subset": "val",
1957
+ "path": "data/columbia/val/tp/canong3_kodakdcs330_sub_12.tif",
1958
+ "mask": "data/columbia/val/mask/canong3_kodakdcs330_sub_12_gt.png",
1959
+ "label": 1
1960
+ },
1961
+ "canonxt_kodakdcs330_sub_18.tif": {
1962
+ "subset": "val",
1963
+ "path": "data/columbia/val/tp/canonxt_kodakdcs330_sub_18.tif",
1964
+ "mask": "data/columbia/val/mask/canonxt_kodakdcs330_sub_18_gt.png",
1965
+ "label": 1
1966
+ },
1967
+ "canong3_nikond70_sub_27.tif": {
1968
+ "subset": "val",
1969
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_27.tif",
1970
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_27_gt.png",
1971
+ "label": 1
1972
+ },
1973
+ "canong3_canonxt_sub_02.tif": {
1974
+ "subset": "val",
1975
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_02.tif",
1976
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_02_gt.png",
1977
+ "label": 1
1978
+ },
1979
+ "nikond70_canonxt_sub_10.tif": {
1980
+ "subset": "val",
1981
+ "path": "data/columbia/val/tp/nikond70_canonxt_sub_10.tif",
1982
+ "mask": "data/columbia/val/mask/nikond70_canonxt_sub_10_gt.png",
1983
+ "label": 1
1984
+ },
1985
+ "canong3_canonxt_sub_25.tif": {
1986
+ "subset": "val",
1987
+ "path": "data/columbia/val/tp/canong3_canonxt_sub_25.tif",
1988
+ "mask": "data/columbia/val/mask/canong3_canonxt_sub_25_gt.png",
1989
+ "label": 1
1990
+ },
1991
+ "canong3_nikond70_sub_08.tif": {
1992
+ "subset": "val",
1993
+ "path": "data/columbia/val/tp/canong3_nikond70_sub_08.tif",
1994
+ "mask": "data/columbia/val/mask/canong3_nikond70_sub_08_gt.png",
1995
+ "label": 1
1996
+ }
1997
+ }
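Each datalist entry above maps an image filename to its `subset`, image `path`, an optional ground-truth `mask` path, and an image-level `label` (1 = tampered, 0 = authentic). A minimal sketch of reading such a datalist outside the training pipeline (file name and keys are taken from the JSON above; the project's own loader is the `ImageDataset` class in `datasets/dataset.py` further down):

```python
import json

# Load the preprocessed Columbia datalist and split its val entries by label.
with open("data/columbia_datalist.json", "r") as f:
    datalist = json.load(f)

val_entries = {k: v for k, v in datalist.items() if v["subset"] == "val"}
tampered = [v for v in val_entries.values() if v["label"] == 1]
authentic = [v for v in val_entries.values() if v["label"] == 0]
print(f"{len(val_entries)} val images: {len(tampered)} tampered, {len(authentic)} authentic")
```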
data/coverage_datalist.json ADDED
@@ -0,0 +1,1048 @@
1
+ {
2
+ "16t.tif": {
3
+ "subset": "val",
4
+ "path": "data/coverage/val/image/16t.tif",
5
+ "mask": "data/coverage/val/mask/16forged.tif",
6
+ "label": 1
7
+ },
8
+ "25t.tif": {
9
+ "subset": "val",
10
+ "path": "data/coverage/val/image/25t.tif",
11
+ "mask": "data/coverage/val/mask/25forged.tif",
12
+ "label": 1
13
+ },
14
+ "81.tif": {
15
+ "subset": "val",
16
+ "path": "data/coverage/val/image/81.tif",
17
+ "label": 0
18
+ },
19
+ "26.tif": {
20
+ "subset": "val",
21
+ "path": "data/coverage/val/image/26.tif",
22
+ "label": 0
23
+ },
24
+ "67.tif": {
25
+ "subset": "val",
26
+ "path": "data/coverage/val/image/67.tif",
27
+ "label": 0
28
+ },
29
+ "35t.tif": {
30
+ "subset": "val",
31
+ "path": "data/coverage/val/image/35t.tif",
32
+ "mask": "data/coverage/val/mask/35forged.tif",
33
+ "label": 1
34
+ },
35
+ "30t.tif": {
36
+ "subset": "val",
37
+ "path": "data/coverage/val/image/30t.tif",
38
+ "mask": "data/coverage/val/mask/30forged.tif",
39
+ "label": 1
40
+ },
41
+ "94t.tif": {
42
+ "subset": "val",
43
+ "path": "data/coverage/val/image/94t.tif",
44
+ "mask": "data/coverage/val/mask/94forged.tif",
45
+ "label": 1
46
+ },
47
+ "9.tif": {
48
+ "subset": "val",
49
+ "path": "data/coverage/val/image/9.tif",
50
+ "label": 0
51
+ },
52
+ "52.tif": {
53
+ "subset": "val",
54
+ "path": "data/coverage/val/image/52.tif",
55
+ "label": 0
56
+ },
57
+ "46t.tif": {
58
+ "subset": "val",
59
+ "path": "data/coverage/val/image/46t.tif",
60
+ "mask": "data/coverage/val/mask/46forged.tif",
61
+ "label": 1
62
+ },
63
+ "86t.tif": {
64
+ "subset": "val",
65
+ "path": "data/coverage/val/image/86t.tif",
66
+ "mask": "data/coverage/val/mask/86forged.tif",
67
+ "label": 1
68
+ },
69
+ "95.tif": {
70
+ "subset": "val",
71
+ "path": "data/coverage/val/image/95.tif",
72
+ "label": 0
73
+ },
74
+ "94.tif": {
75
+ "subset": "val",
76
+ "path": "data/coverage/val/image/94.tif",
77
+ "label": 0
78
+ },
79
+ "73t.tif": {
80
+ "subset": "val",
81
+ "path": "data/coverage/val/image/73t.tif",
82
+ "mask": "data/coverage/val/mask/73forged.tif",
83
+ "label": 1
84
+ },
85
+ "23t.tif": {
86
+ "subset": "val",
87
+ "path": "data/coverage/val/image/23t.tif",
88
+ "mask": "data/coverage/val/mask/23forged.tif",
89
+ "label": 1
90
+ },
91
+ "68t.tif": {
92
+ "subset": "val",
93
+ "path": "data/coverage/val/image/68t.tif",
94
+ "mask": "data/coverage/val/mask/68forged.tif",
95
+ "label": 1
96
+ },
97
+ "88t.tif": {
98
+ "subset": "val",
99
+ "path": "data/coverage/val/image/88t.tif",
100
+ "mask": "data/coverage/val/mask/88forged.tif",
101
+ "label": 1
102
+ },
103
+ "45.tif": {
104
+ "subset": "val",
105
+ "path": "data/coverage/val/image/45.tif",
106
+ "label": 0
107
+ },
108
+ "93t.tif": {
109
+ "subset": "val",
110
+ "path": "data/coverage/val/image/93t.tif",
111
+ "mask": "data/coverage/val/mask/93forged.tif",
112
+ "label": 1
113
+ },
114
+ "27.tif": {
115
+ "subset": "val",
116
+ "path": "data/coverage/val/image/27.tif",
117
+ "label": 0
118
+ },
119
+ "44.tif": {
120
+ "subset": "val",
121
+ "path": "data/coverage/val/image/44.tif",
122
+ "label": 0
123
+ },
124
+ "48.tif": {
125
+ "subset": "val",
126
+ "path": "data/coverage/val/image/48.tif",
127
+ "label": 0
128
+ },
129
+ "10.tif": {
130
+ "subset": "val",
131
+ "path": "data/coverage/val/image/10.tif",
132
+ "label": 0
133
+ },
134
+ "50t.tif": {
135
+ "subset": "val",
136
+ "path": "data/coverage/val/image/50t.tif",
137
+ "mask": "data/coverage/val/mask/50forged.tif",
138
+ "label": 1
139
+ },
140
+ "90.tif": {
141
+ "subset": "val",
142
+ "path": "data/coverage/val/image/90.tif",
143
+ "label": 0
144
+ },
145
+ "6.tif": {
146
+ "subset": "val",
147
+ "path": "data/coverage/val/image/6.tif",
148
+ "label": 0
149
+ },
150
+ "24t.tif": {
151
+ "subset": "val",
152
+ "path": "data/coverage/val/image/24t.tif",
153
+ "mask": "data/coverage/val/mask/24forged.tif",
154
+ "label": 1
155
+ },
156
+ "87.tif": {
157
+ "subset": "val",
158
+ "path": "data/coverage/val/image/87.tif",
159
+ "label": 0
160
+ },
161
+ "20.tif": {
162
+ "subset": "val",
163
+ "path": "data/coverage/val/image/20.tif",
164
+ "label": 0
165
+ },
166
+ "54.tif": {
167
+ "subset": "val",
168
+ "path": "data/coverage/val/image/54.tif",
169
+ "label": 0
170
+ },
171
+ "72t.tif": {
172
+ "subset": "val",
173
+ "path": "data/coverage/val/image/72t.tif",
174
+ "mask": "data/coverage/val/mask/72forged.tif",
175
+ "label": 1
176
+ },
177
+ "13.tif": {
178
+ "subset": "val",
179
+ "path": "data/coverage/val/image/13.tif",
180
+ "label": 0
181
+ },
182
+ "67t.tif": {
183
+ "subset": "val",
184
+ "path": "data/coverage/val/image/67t.tif",
185
+ "mask": "data/coverage/val/mask/67forged.tif",
186
+ "label": 1
187
+ },
188
+ "32.tif": {
189
+ "subset": "val",
190
+ "path": "data/coverage/val/image/32.tif",
191
+ "label": 0
192
+ },
193
+ "8t.tif": {
194
+ "subset": "val",
195
+ "path": "data/coverage/val/image/8t.tif",
196
+ "mask": "data/coverage/val/mask/8forged.tif",
197
+ "label": 1
198
+ },
199
+ "22.tif": {
200
+ "subset": "val",
201
+ "path": "data/coverage/val/image/22.tif",
202
+ "label": 0
203
+ },
204
+ "35.tif": {
205
+ "subset": "val",
206
+ "path": "data/coverage/val/image/35.tif",
207
+ "label": 0
208
+ },
209
+ "18t.tif": {
210
+ "subset": "val",
211
+ "path": "data/coverage/val/image/18t.tif",
212
+ "mask": "data/coverage/val/mask/18forged.tif",
213
+ "label": 1
214
+ },
215
+ "20t.tif": {
216
+ "subset": "val",
217
+ "path": "data/coverage/val/image/20t.tif",
218
+ "mask": "data/coverage/val/mask/20forged.tif",
219
+ "label": 1
220
+ },
221
+ "63t.tif": {
222
+ "subset": "val",
223
+ "path": "data/coverage/val/image/63t.tif",
224
+ "mask": "data/coverage/val/mask/63forged.tif",
225
+ "label": 1
226
+ },
227
+ "75t.tif": {
228
+ "subset": "val",
229
+ "path": "data/coverage/val/image/75t.tif",
230
+ "mask": "data/coverage/val/mask/75forged.tif",
231
+ "label": 1
232
+ },
233
+ "63.tif": {
234
+ "subset": "val",
235
+ "path": "data/coverage/val/image/63.tif",
236
+ "label": 0
237
+ },
238
+ "56.tif": {
239
+ "subset": "val",
240
+ "path": "data/coverage/val/image/56.tif",
241
+ "label": 0
242
+ },
243
+ "3t.tif": {
244
+ "subset": "val",
245
+ "path": "data/coverage/val/image/3t.tif",
246
+ "mask": "data/coverage/val/mask/3forged.tif",
247
+ "label": 1
248
+ },
249
+ "97.tif": {
250
+ "subset": "val",
251
+ "path": "data/coverage/val/image/97.tif",
252
+ "label": 0
253
+ },
254
+ "42t.tif": {
255
+ "subset": "val",
256
+ "path": "data/coverage/val/image/42t.tif",
257
+ "mask": "data/coverage/val/mask/42forged.tif",
258
+ "label": 1
259
+ },
260
+ "86.tif": {
261
+ "subset": "val",
262
+ "path": "data/coverage/val/image/86.tif",
263
+ "label": 0
264
+ },
265
+ "66t.tif": {
266
+ "subset": "val",
267
+ "path": "data/coverage/val/image/66t.tif",
268
+ "mask": "data/coverage/val/mask/66forged.tif",
269
+ "label": 1
270
+ },
271
+ "61.tif": {
272
+ "subset": "val",
273
+ "path": "data/coverage/val/image/61.tif",
274
+ "label": 0
275
+ },
276
+ "49.tif": {
277
+ "subset": "val",
278
+ "path": "data/coverage/val/image/49.tif",
279
+ "label": 0
280
+ },
281
+ "4.tif": {
282
+ "subset": "val",
283
+ "path": "data/coverage/val/image/4.tif",
284
+ "label": 0
285
+ },
286
+ "96t.tif": {
287
+ "subset": "val",
288
+ "path": "data/coverage/val/image/96t.tif",
289
+ "mask": "data/coverage/val/mask/96forged.tif",
290
+ "label": 1
291
+ },
292
+ "81t.tif": {
293
+ "subset": "val",
294
+ "path": "data/coverage/val/image/81t.tif",
295
+ "mask": "data/coverage/val/mask/81forged.tif",
296
+ "label": 1
297
+ },
298
+ "2t.tif": {
299
+ "subset": "val",
300
+ "path": "data/coverage/val/image/2t.tif",
301
+ "mask": "data/coverage/val/mask/2forged.tif",
302
+ "label": 1
303
+ },
304
+ "62.tif": {
305
+ "subset": "val",
306
+ "path": "data/coverage/val/image/62.tif",
307
+ "label": 0
308
+ },
309
+ "78t.tif": {
310
+ "subset": "val",
311
+ "path": "data/coverage/val/image/78t.tif",
312
+ "mask": "data/coverage/val/mask/78forged.tif",
313
+ "label": 1
314
+ },
315
+ "92t.tif": {
316
+ "subset": "val",
317
+ "path": "data/coverage/val/image/92t.tif",
318
+ "mask": "data/coverage/val/mask/92forged.tif",
319
+ "label": 1
320
+ },
321
+ "77.tif": {
322
+ "subset": "val",
323
+ "path": "data/coverage/val/image/77.tif",
324
+ "label": 0
325
+ },
326
+ "14.tif": {
327
+ "subset": "val",
328
+ "path": "data/coverage/val/image/14.tif",
329
+ "label": 0
330
+ },
331
+ "12t.tif": {
332
+ "subset": "val",
333
+ "path": "data/coverage/val/image/12t.tif",
334
+ "mask": "data/coverage/val/mask/12forged.tif",
335
+ "label": 1
336
+ },
337
+ "96.tif": {
338
+ "subset": "val",
339
+ "path": "data/coverage/val/image/96.tif",
340
+ "label": 0
341
+ },
342
+ "85t.tif": {
343
+ "subset": "val",
344
+ "path": "data/coverage/val/image/85t.tif",
345
+ "mask": "data/coverage/val/mask/85forged.tif",
346
+ "label": 1
347
+ },
348
+ "50.tif": {
349
+ "subset": "val",
350
+ "path": "data/coverage/val/image/50.tif",
351
+ "label": 0
352
+ },
353
+ "100.tif": {
354
+ "subset": "val",
355
+ "path": "data/coverage/val/image/100.tif",
356
+ "label": 0
357
+ },
358
+ "76t.tif": {
359
+ "subset": "val",
360
+ "path": "data/coverage/val/image/76t.tif",
361
+ "mask": "data/coverage/val/mask/76forged.tif",
362
+ "label": 1
363
+ },
364
+ "71.tif": {
365
+ "subset": "val",
366
+ "path": "data/coverage/val/image/71.tif",
367
+ "label": 0
368
+ },
369
+ "42.tif": {
370
+ "subset": "val",
371
+ "path": "data/coverage/val/image/42.tif",
372
+ "label": 0
373
+ },
374
+ "5t.tif": {
375
+ "subset": "val",
376
+ "path": "data/coverage/val/image/5t.tif",
377
+ "mask": "data/coverage/val/mask/5forged.tif",
378
+ "label": 1
379
+ },
380
+ "41.tif": {
381
+ "subset": "val",
382
+ "path": "data/coverage/val/image/41.tif",
383
+ "label": 0
384
+ },
385
+ "71t.tif": {
386
+ "subset": "val",
387
+ "path": "data/coverage/val/image/71t.tif",
388
+ "mask": "data/coverage/val/mask/71forged.tif",
389
+ "label": 1
390
+ },
391
+ "90t.tif": {
392
+ "subset": "val",
393
+ "path": "data/coverage/val/image/90t.tif",
394
+ "mask": "data/coverage/val/mask/90forged.tif",
395
+ "label": 1
396
+ },
397
+ "32t.tif": {
398
+ "subset": "val",
399
+ "path": "data/coverage/val/image/32t.tif",
400
+ "mask": "data/coverage/val/mask/32forged.tif",
401
+ "label": 1
402
+ },
403
+ "33.tif": {
404
+ "subset": "val",
405
+ "path": "data/coverage/val/image/33.tif",
406
+ "label": 0
407
+ },
408
+ "87t.tif": {
409
+ "subset": "val",
410
+ "path": "data/coverage/val/image/87t.tif",
411
+ "mask": "data/coverage/val/mask/87forged.tif",
412
+ "label": 1
413
+ },
414
+ "70.tif": {
415
+ "subset": "val",
416
+ "path": "data/coverage/val/image/70.tif",
417
+ "label": 0
418
+ },
419
+ "2.tif": {
420
+ "subset": "val",
421
+ "path": "data/coverage/val/image/2.tif",
422
+ "label": 0
423
+ },
424
+ "43.tif": {
425
+ "subset": "val",
426
+ "path": "data/coverage/val/image/43.tif",
427
+ "label": 0
428
+ },
429
+ "43t.tif": {
430
+ "subset": "val",
431
+ "path": "data/coverage/val/image/43t.tif",
432
+ "mask": "data/coverage/val/mask/43forged.tif",
433
+ "label": 1
434
+ },
435
+ "75.tif": {
436
+ "subset": "val",
437
+ "path": "data/coverage/val/image/75.tif",
438
+ "label": 0
439
+ },
440
+ "40t.tif": {
441
+ "subset": "val",
442
+ "path": "data/coverage/val/image/40t.tif",
443
+ "mask": "data/coverage/val/mask/40forged.tif",
444
+ "label": 1
445
+ },
446
+ "17t.tif": {
447
+ "subset": "val",
448
+ "path": "data/coverage/val/image/17t.tif",
449
+ "mask": "data/coverage/val/mask/17forged.tif",
450
+ "label": 1
451
+ },
452
+ "28t.tif": {
453
+ "subset": "val",
454
+ "path": "data/coverage/val/image/28t.tif",
455
+ "mask": "data/coverage/val/mask/28forged.tif",
456
+ "label": 1
457
+ },
458
+ "82.tif": {
459
+ "subset": "val",
460
+ "path": "data/coverage/val/image/82.tif",
461
+ "label": 0
462
+ },
463
+ "73.tif": {
464
+ "subset": "val",
465
+ "path": "data/coverage/val/image/73.tif",
466
+ "label": 0
467
+ },
468
+ "78.tif": {
469
+ "subset": "val",
470
+ "path": "data/coverage/val/image/78.tif",
471
+ "label": 0
472
+ },
473
+ "64.tif": {
474
+ "subset": "val",
475
+ "path": "data/coverage/val/image/64.tif",
476
+ "label": 0
477
+ },
478
+ "69t.tif": {
479
+ "subset": "val",
480
+ "path": "data/coverage/val/image/69t.tif",
481
+ "mask": "data/coverage/val/mask/69forged.tif",
482
+ "label": 1
483
+ },
484
+ "15t.tif": {
485
+ "subset": "val",
486
+ "path": "data/coverage/val/image/15t.tif",
487
+ "mask": "data/coverage/val/mask/15forged.tif",
488
+ "label": 1
489
+ },
490
+ "47t.tif": {
491
+ "subset": "val",
492
+ "path": "data/coverage/val/image/47t.tif",
493
+ "mask": "data/coverage/val/mask/47forged.tif",
494
+ "label": 1
495
+ },
496
+ "13t.tif": {
497
+ "subset": "val",
498
+ "path": "data/coverage/val/image/13t.tif",
499
+ "mask": "data/coverage/val/mask/13forged.tif",
500
+ "label": 1
501
+ },
502
+ "15.tif": {
503
+ "subset": "val",
504
+ "path": "data/coverage/val/image/15.tif",
505
+ "label": 0
506
+ },
507
+ "23.tif": {
508
+ "subset": "val",
509
+ "path": "data/coverage/val/image/23.tif",
510
+ "label": 0
511
+ },
512
+ "64t.tif": {
513
+ "subset": "val",
514
+ "path": "data/coverage/val/image/64t.tif",
515
+ "mask": "data/coverage/val/mask/64forged.tif",
516
+ "label": 1
517
+ },
518
+ "77t.tif": {
519
+ "subset": "val",
520
+ "path": "data/coverage/val/image/77t.tif",
521
+ "mask": "data/coverage/val/mask/77forged.tif",
522
+ "label": 1
523
+ },
524
+ "98.tif": {
525
+ "subset": "val",
526
+ "path": "data/coverage/val/image/98.tif",
527
+ "label": 0
528
+ },
529
+ "5.tif": {
530
+ "subset": "val",
531
+ "path": "data/coverage/val/image/5.tif",
532
+ "label": 0
533
+ },
534
+ "79t.tif": {
535
+ "subset": "val",
536
+ "path": "data/coverage/val/image/79t.tif",
537
+ "mask": "data/coverage/val/mask/79forged.tif",
538
+ "label": 1
539
+ },
540
+ "9t.tif": {
541
+ "subset": "val",
542
+ "path": "data/coverage/val/image/9t.tif",
543
+ "mask": "data/coverage/val/mask/9forged.tif",
544
+ "label": 1
545
+ },
546
+ "91.tif": {
547
+ "subset": "val",
548
+ "path": "data/coverage/val/image/91.tif",
549
+ "label": 0
550
+ },
551
+ "85.tif": {
552
+ "subset": "val",
553
+ "path": "data/coverage/val/image/85.tif",
554
+ "label": 0
555
+ },
556
+ "91t.tif": {
557
+ "subset": "val",
558
+ "path": "data/coverage/val/image/91t.tif",
559
+ "mask": "data/coverage/val/mask/91forged.tif",
560
+ "label": 1
561
+ },
562
+ "97t.tif": {
563
+ "subset": "val",
564
+ "path": "data/coverage/val/image/97t.tif",
565
+ "mask": "data/coverage/val/mask/97forged.tif",
566
+ "label": 1
567
+ },
568
+ "98t.tif": {
569
+ "subset": "val",
570
+ "path": "data/coverage/val/image/98t.tif",
571
+ "mask": "data/coverage/val/mask/98forged.tif",
572
+ "label": 1
573
+ },
574
+ "60t.tif": {
575
+ "subset": "val",
576
+ "path": "data/coverage/val/image/60t.tif",
577
+ "mask": "data/coverage/val/mask/60forged.tif",
578
+ "label": 1
579
+ },
580
+ "11t.tif": {
581
+ "subset": "val",
582
+ "path": "data/coverage/val/image/11t.tif",
583
+ "mask": "data/coverage/val/mask/11forged.tif",
584
+ "label": 1
585
+ },
586
+ "68.tif": {
587
+ "subset": "val",
588
+ "path": "data/coverage/val/image/68.tif",
589
+ "label": 0
590
+ },
591
+ "84.tif": {
592
+ "subset": "val",
593
+ "path": "data/coverage/val/image/84.tif",
594
+ "label": 0
595
+ },
596
+ "84t.tif": {
597
+ "subset": "val",
598
+ "path": "data/coverage/val/image/84t.tif",
599
+ "mask": "data/coverage/val/mask/84forged.tif",
600
+ "label": 1
601
+ },
602
+ "4t.tif": {
603
+ "subset": "val",
604
+ "path": "data/coverage/val/image/4t.tif",
605
+ "mask": "data/coverage/val/mask/4forged.tif",
606
+ "label": 1
607
+ },
608
+ "79.tif": {
609
+ "subset": "val",
610
+ "path": "data/coverage/val/image/79.tif",
611
+ "label": 0
612
+ },
613
+ "36t.tif": {
614
+ "subset": "val",
615
+ "path": "data/coverage/val/image/36t.tif",
616
+ "mask": "data/coverage/val/mask/36forged.tif",
617
+ "label": 1
618
+ },
619
+ "1.tif": {
620
+ "subset": "val",
621
+ "path": "data/coverage/val/image/1.tif",
622
+ "label": 0
623
+ },
624
+ "10t.tif": {
625
+ "subset": "val",
626
+ "path": "data/coverage/val/image/10t.tif",
627
+ "mask": "data/coverage/val/mask/10forged.tif",
628
+ "label": 1
629
+ },
630
+ "38.tif": {
631
+ "subset": "val",
632
+ "path": "data/coverage/val/image/38.tif",
633
+ "label": 0
634
+ },
635
+ "39.tif": {
636
+ "subset": "val",
637
+ "path": "data/coverage/val/image/39.tif",
638
+ "label": 0
639
+ },
640
+ "40.tif": {
641
+ "subset": "val",
642
+ "path": "data/coverage/val/image/40.tif",
643
+ "label": 0
644
+ },
645
+ "17.tif": {
646
+ "subset": "val",
647
+ "path": "data/coverage/val/image/17.tif",
648
+ "label": 0
649
+ },
650
+ "59.tif": {
651
+ "subset": "val",
652
+ "path": "data/coverage/val/image/59.tif",
653
+ "label": 0
654
+ },
655
+ "3.tif": {
656
+ "subset": "val",
657
+ "path": "data/coverage/val/image/3.tif",
658
+ "label": 0
659
+ },
660
+ "53t.tif": {
661
+ "subset": "val",
662
+ "path": "data/coverage/val/image/53t.tif",
663
+ "mask": "data/coverage/val/mask/53forged.tif",
664
+ "label": 1
665
+ },
666
+ "92.tif": {
667
+ "subset": "val",
668
+ "path": "data/coverage/val/image/92.tif",
669
+ "label": 0
670
+ },
671
+ "62t.tif": {
672
+ "subset": "val",
673
+ "path": "data/coverage/val/image/62t.tif",
674
+ "mask": "data/coverage/val/mask/62forged.tif",
675
+ "label": 1
676
+ },
677
+ "66.tif": {
678
+ "subset": "val",
679
+ "path": "data/coverage/val/image/66.tif",
680
+ "label": 0
681
+ },
682
+ "14t.tif": {
683
+ "subset": "val",
684
+ "path": "data/coverage/val/image/14t.tif",
685
+ "mask": "data/coverage/val/mask/14forged.tif",
686
+ "label": 1
687
+ },
688
+ "58.tif": {
689
+ "subset": "val",
690
+ "path": "data/coverage/val/image/58.tif",
691
+ "label": 0
692
+ },
693
+ "82t.tif": {
694
+ "subset": "val",
695
+ "path": "data/coverage/val/image/82t.tif",
696
+ "mask": "data/coverage/val/mask/82forged.tif",
697
+ "label": 1
698
+ },
699
+ "31t.tif": {
700
+ "subset": "val",
701
+ "path": "data/coverage/val/image/31t.tif",
702
+ "mask": "data/coverage/val/mask/31forged.tif",
703
+ "label": 1
704
+ },
705
+ "55.tif": {
706
+ "subset": "val",
707
+ "path": "data/coverage/val/image/55.tif",
708
+ "label": 0
709
+ },
710
+ "31.tif": {
711
+ "subset": "val",
712
+ "path": "data/coverage/val/image/31.tif",
713
+ "label": 0
714
+ },
715
+ "80t.tif": {
716
+ "subset": "val",
717
+ "path": "data/coverage/val/image/80t.tif",
718
+ "mask": "data/coverage/val/mask/80forged.tif",
719
+ "label": 1
720
+ },
721
+ "18.tif": {
722
+ "subset": "val",
723
+ "path": "data/coverage/val/image/18.tif",
724
+ "label": 0
725
+ },
726
+ "53.tif": {
727
+ "subset": "val",
728
+ "path": "data/coverage/val/image/53.tif",
729
+ "label": 0
730
+ },
731
+ "46.tif": {
732
+ "subset": "val",
733
+ "path": "data/coverage/val/image/46.tif",
734
+ "label": 0
735
+ },
736
+ "26t.tif": {
737
+ "subset": "val",
738
+ "path": "data/coverage/val/image/26t.tif",
739
+ "mask": "data/coverage/val/mask/26forged.tif",
740
+ "label": 1
741
+ },
742
+ "99.tif": {
743
+ "subset": "val",
744
+ "path": "data/coverage/val/image/99.tif",
745
+ "label": 0
746
+ },
747
+ "28.tif": {
748
+ "subset": "val",
749
+ "path": "data/coverage/val/image/28.tif",
750
+ "label": 0
751
+ },
752
+ "38t.tif": {
753
+ "subset": "val",
754
+ "path": "data/coverage/val/image/38t.tif",
755
+ "mask": "data/coverage/val/mask/38forged.tif",
756
+ "label": 1
757
+ },
758
+ "70t.tif": {
759
+ "subset": "val",
760
+ "path": "data/coverage/val/image/70t.tif",
761
+ "mask": "data/coverage/val/mask/70forged.tif",
762
+ "label": 1
763
+ },
764
+ "47.tif": {
765
+ "subset": "val",
766
+ "path": "data/coverage/val/image/47.tif",
767
+ "label": 0
768
+ },
769
+ "34.tif": {
770
+ "subset": "val",
771
+ "path": "data/coverage/val/image/34.tif",
772
+ "label": 0
773
+ },
774
+ "49t.tif": {
775
+ "subset": "val",
776
+ "path": "data/coverage/val/image/49t.tif",
777
+ "mask": "data/coverage/val/mask/49forged.tif",
778
+ "label": 1
779
+ },
780
+ "22t.tif": {
781
+ "subset": "val",
782
+ "path": "data/coverage/val/image/22t.tif",
783
+ "mask": "data/coverage/val/mask/22forged.tif",
784
+ "label": 1
785
+ },
786
+ "74t.tif": {
787
+ "subset": "val",
788
+ "path": "data/coverage/val/image/74t.tif",
789
+ "mask": "data/coverage/val/mask/74forged.tif",
790
+ "label": 1
791
+ },
792
+ "65t.tif": {
793
+ "subset": "val",
794
+ "path": "data/coverage/val/image/65t.tif",
795
+ "mask": "data/coverage/val/mask/65forged.tif",
796
+ "label": 1
797
+ },
798
+ "8.tif": {
799
+ "subset": "val",
800
+ "path": "data/coverage/val/image/8.tif",
801
+ "label": 0
802
+ },
803
+ "1t.tif": {
804
+ "subset": "val",
805
+ "path": "data/coverage/val/image/1t.tif",
806
+ "mask": "data/coverage/val/mask/1forged.tif",
807
+ "label": 1
808
+ },
809
+ "80.tif": {
810
+ "subset": "val",
811
+ "path": "data/coverage/val/image/80.tif",
812
+ "label": 0
813
+ },
814
+ "60.tif": {
815
+ "subset": "val",
816
+ "path": "data/coverage/val/image/60.tif",
817
+ "label": 0
818
+ },
819
+ "21.tif": {
820
+ "subset": "val",
821
+ "path": "data/coverage/val/image/21.tif",
822
+ "label": 0
823
+ },
824
+ "57.tif": {
825
+ "subset": "val",
826
+ "path": "data/coverage/val/image/57.tif",
827
+ "label": 0
828
+ },
829
+ "51.tif": {
830
+ "subset": "val",
831
+ "path": "data/coverage/val/image/51.tif",
832
+ "label": 0
833
+ },
834
+ "7t.tif": {
835
+ "subset": "val",
836
+ "path": "data/coverage/val/image/7t.tif",
837
+ "mask": "data/coverage/val/mask/7forged.tif",
838
+ "label": 1
839
+ },
840
+ "93.tif": {
841
+ "subset": "val",
842
+ "path": "data/coverage/val/image/93.tif",
843
+ "label": 0
844
+ },
845
+ "83.tif": {
846
+ "subset": "val",
847
+ "path": "data/coverage/val/image/83.tif",
848
+ "label": 0
849
+ },
850
+ "27t.tif": {
851
+ "subset": "val",
852
+ "path": "data/coverage/val/image/27t.tif",
853
+ "mask": "data/coverage/val/mask/27forged.tif",
854
+ "label": 1
855
+ },
856
+ "19.tif": {
857
+ "subset": "val",
858
+ "path": "data/coverage/val/image/19.tif",
859
+ "label": 0
860
+ },
861
+ "34t.tif": {
862
+ "subset": "val",
863
+ "path": "data/coverage/val/image/34t.tif",
864
+ "mask": "data/coverage/val/mask/34forged.tif",
865
+ "label": 1
866
+ },
867
+ "52t.tif": {
868
+ "subset": "val",
869
+ "path": "data/coverage/val/image/52t.tif",
870
+ "mask": "data/coverage/val/mask/52forged.tif",
871
+ "label": 1
872
+ },
873
+ "45t.tif": {
874
+ "subset": "val",
875
+ "path": "data/coverage/val/image/45t.tif",
876
+ "mask": "data/coverage/val/mask/45forged.tif",
877
+ "label": 1
878
+ },
879
+ "12.tif": {
880
+ "subset": "val",
881
+ "path": "data/coverage/val/image/12.tif",
882
+ "label": 0
883
+ },
884
+ "16.tif": {
885
+ "subset": "val",
886
+ "path": "data/coverage/val/image/16.tif",
887
+ "label": 0
888
+ },
889
+ "29.tif": {
890
+ "subset": "val",
891
+ "path": "data/coverage/val/image/29.tif",
892
+ "label": 0
893
+ },
894
+ "89.tif": {
895
+ "subset": "val",
896
+ "path": "data/coverage/val/image/89.tif",
897
+ "label": 0
898
+ },
899
+ "29t.tif": {
900
+ "subset": "val",
901
+ "path": "data/coverage/val/image/29t.tif",
902
+ "mask": "data/coverage/val/mask/29forged.tif",
903
+ "label": 1
904
+ },
905
+ "36.tif": {
906
+ "subset": "val",
907
+ "path": "data/coverage/val/image/36.tif",
908
+ "label": 0
909
+ },
910
+ "39t.tif": {
911
+ "subset": "val",
912
+ "path": "data/coverage/val/image/39t.tif",
913
+ "mask": "data/coverage/val/mask/39forged.tif",
914
+ "label": 1
915
+ },
916
+ "100t.tif": {
917
+ "subset": "val",
918
+ "path": "data/coverage/val/image/100t.tif",
919
+ "mask": "data/coverage/val/mask/100forged.tif",
920
+ "label": 1
921
+ },
922
+ "21t.tif": {
923
+ "subset": "val",
924
+ "path": "data/coverage/val/image/21t.tif",
925
+ "mask": "data/coverage/val/mask/21forged.tif",
926
+ "label": 1
927
+ },
928
+ "88.tif": {
929
+ "subset": "val",
930
+ "path": "data/coverage/val/image/88.tif",
931
+ "label": 0
932
+ },
933
+ "74.tif": {
934
+ "subset": "val",
935
+ "path": "data/coverage/val/image/74.tif",
936
+ "label": 0
937
+ },
938
+ "7.tif": {
939
+ "subset": "val",
940
+ "path": "data/coverage/val/image/7.tif",
941
+ "label": 0
942
+ },
943
+ "33t.tif": {
944
+ "subset": "val",
945
+ "path": "data/coverage/val/image/33t.tif",
946
+ "mask": "data/coverage/val/mask/33forged.tif",
947
+ "label": 1
948
+ },
949
+ "89t.tif": {
950
+ "subset": "val",
951
+ "path": "data/coverage/val/image/89t.tif",
952
+ "mask": "data/coverage/val/mask/89forged.tif",
953
+ "label": 1
954
+ },
955
+ "24.tif": {
956
+ "subset": "val",
957
+ "path": "data/coverage/val/image/24.tif",
958
+ "label": 0
959
+ },
960
+ "37t.tif": {
961
+ "subset": "val",
962
+ "path": "data/coverage/val/image/37t.tif",
963
+ "mask": "data/coverage/val/mask/37forged.tif",
964
+ "label": 1
965
+ },
966
+ "83t.tif": {
967
+ "subset": "val",
968
+ "path": "data/coverage/val/image/83t.tif",
969
+ "mask": "data/coverage/val/mask/83forged.tif",
970
+ "label": 1
971
+ },
972
+ "19t.tif": {
973
+ "subset": "val",
974
+ "path": "data/coverage/val/image/19t.tif",
975
+ "mask": "data/coverage/val/mask/19forged.tif",
976
+ "label": 1
977
+ },
978
+ "76.tif": {
979
+ "subset": "val",
980
+ "path": "data/coverage/val/image/76.tif",
981
+ "label": 0
982
+ },
983
+ "65.tif": {
984
+ "subset": "val",
985
+ "path": "data/coverage/val/image/65.tif",
986
+ "label": 0
987
+ },
988
+ "51t.tif": {
989
+ "subset": "val",
990
+ "path": "data/coverage/val/image/51t.tif",
991
+ "mask": "data/coverage/val/mask/51forged.tif",
992
+ "label": 1
993
+ },
994
+ "69.tif": {
995
+ "subset": "val",
996
+ "path": "data/coverage/val/image/69.tif",
997
+ "label": 0
998
+ },
999
+ "37.tif": {
1000
+ "subset": "val",
1001
+ "path": "data/coverage/val/image/37.tif",
1002
+ "label": 0
1003
+ },
1004
+ "25.tif": {
1005
+ "subset": "val",
1006
+ "path": "data/coverage/val/image/25.tif",
1007
+ "label": 0
1008
+ },
1009
+ "72.tif": {
1010
+ "subset": "val",
1011
+ "path": "data/coverage/val/image/72.tif",
1012
+ "label": 0
1013
+ },
1014
+ "44t.tif": {
1015
+ "subset": "val",
1016
+ "path": "data/coverage/val/image/44t.tif",
1017
+ "mask": "data/coverage/val/mask/44forged.tif",
1018
+ "label": 1
1019
+ },
1020
+ "54t.tif": {
1021
+ "subset": "val",
1022
+ "path": "data/coverage/val/image/54t.tif",
1023
+ "mask": "data/coverage/val/mask/54forged.tif",
1024
+ "label": 1
1025
+ },
1026
+ "99t.tif": {
1027
+ "subset": "val",
1028
+ "path": "data/coverage/val/image/99t.tif",
1029
+ "mask": "data/coverage/val/mask/99forged.tif",
1030
+ "label": 1
1031
+ },
1032
+ "30.tif": {
1033
+ "subset": "val",
1034
+ "path": "data/coverage/val/image/30.tif",
1035
+ "label": 0
1036
+ },
1037
+ "11.tif": {
1038
+ "subset": "val",
1039
+ "path": "data/coverage/val/image/11.tif",
1040
+ "label": 0
1041
+ },
1042
+ "6t.tif": {
1043
+ "subset": "val",
1044
+ "path": "data/coverage/val/image/6t.tif",
1045
+ "mask": "data/coverage/val/mask/6forged.tif",
1046
+ "label": 1
1047
+ }
1048
+ }
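The COVERAGE entries follow the same schema; tampered images (`Nt.tif`) carry a `mask` pointing to the matching `Nforged.tif` file, while authentic images (`N.tif`) have no mask. A small sanity-check sketch (paths come straight from the JSON above) for confirming the referenced files exist after placing the preprocessed data under `data`:

```python
import json
import os

# Verify that every image/mask path listed in the coverage datalist exists on disk.
with open("data/coverage_datalist.json", "r") as f:
    datalist = json.load(f)

missing = []
for name, info in datalist.items():
    for key in ("path", "mask"):
        if key in info and not os.path.exists(info[key]):
            missing.append((name, info[key]))

print(f"{len(datalist)} entries checked, {len(missing)} missing files")
```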
datasets/__init__.py ADDED
@@ -0,0 +1,28 @@
1
+ from typing import Dict
2
+
3
+ import albumentations as A
4
+
5
+ from .dataset import ImageDataset, crop_to_smallest_collate_fn
6
+
7
+
8
+ def get_dataset(datalist: Dict, subset, transform, opt):
9
+ datasets = {}
10
+ for k, v in datalist.items():
11
+ # val_transform = transform
12
+ if k in ["imd2020", "nist16"]:
13
+ val_transform = A.Compose([A.SmallestMaxSize(opt.tile_size)])
14
+ else:
15
+ val_transform = transform
16
+ datasets[k] = ImageDataset(
17
+ k,
18
+ v,
19
+ subset,
20
+ val_transform,
21
+ opt.uncorrect_label,
22
+ opt.mvc_spixel
23
+ if subset == "train"
24
+ else opt.crf_postproc or opt.convcrf_postproc or opt.spixel_postproc,
25
+ opt.mvc_num_spixel,
26
+ )
27
+
28
+ return datasets
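`get_dataset` returns one `ImageDataset` per datalist, keyed by dataset name, and swaps in a `SmallestMaxSize` resize for the large-image IMD2020/NIST16 sets. A hedged usage sketch follows; the `opt` namespace below only fills in the attributes read above with placeholder values, whereas the project builds the real options from its own config and argument parsing:

```python
from types import SimpleNamespace

import albumentations as A

from datasets import get_dataset

# Placeholder options -- only the fields touched by get_dataset/ImageDataset.
opt = SimpleNamespace(
    tile_size=1024,
    uncorrect_label=False,
    mvc_spixel=False,
    mvc_num_spixel=100,
    crf_postproc=False,
    convcrf_postproc=False,
    spixel_postproc=False,
)

val_transform = A.Compose([A.SmallestMaxSize(512)])  # placeholder transform
val_datasets = get_dataset(
    {"columbia": "data/columbia_datalist.json"}, "val", val_transform, opt
)
print({name: len(ds) for name, ds in val_datasets.items()})
```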
datasets/dataset.py ADDED
@@ -0,0 +1,230 @@
1
+ import json
2
+ import os
3
+ import random
4
+ import signal
5
+
6
+ import albumentations as A
7
+ import cv2
8
+ import h5py
9
+ import numpy as np
10
+ import torch
11
+ import torchvision.transforms as T
12
+ from albumentations.pytorch.functional import img_to_tensor, mask_to_tensor
13
+ from skimage import segmentation
14
+ from termcolor import cprint
15
+ from timm.data.constants import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
16
+
17
+
18
+ class ImageDataset(torch.utils.data.Dataset):
19
+ def __init__(
20
+ self,
21
+ dataset_name: str,
22
+ datalist: str,
23
+ mode: str,
24
+ transform=None,
25
+ uncorrect_label=False,
26
+ spixel: bool = False,
27
+ num_spixel: int = 100,
28
+ ):
29
+ super().__init__()
30
+
31
+ assert os.path.exists(datalist), f"{datalist} does not exist"
32
+ assert mode in ["train", "val"], f"{mode} unsupported mode"
33
+
34
+ with open(datalist, "r") as f:
35
+ self.datalist = json.load(f)
36
+
37
+ self.datalist = dict(
38
+ filter(lambda x: x[1]["subset"] == mode, self.datalist.items())
39
+ )
40
+ if len(self.datalist) == 0:
41
+ raise NotImplementedError(f"no item in {datalist} {mode} dataset")
42
+ self.video_id_list = list(self.datalist.keys())
43
+ self.transform = transform
44
+ self.uncorrect_label = uncorrect_label
45
+
46
+ self.dataset_name = dataset_name
47
+ h5_path = os.path.join("data", dataset_name + "_dataset.hdf5")
48
+ self.use_h5 = os.path.exists(h5_path)
49
+ if self.use_h5:
50
+ cprint(
51
+ f"{dataset_name} {mode} HDF5 database found, loading into memory...",
52
+ "blue",
53
+ )
54
+ try:
55
+ with timeout(seconds=60):
56
+ self.database = h5py.File(h5_path, "r", driver="core")
57
+ except Exception as e:
58
+ self.database = h5py.File(h5_path, "r")
59
+ cprint(
60
+ "Failed to load {} HDF5 database to memory due to {}".format(
61
+ dataset_name, str(e)
62
+ ),
63
+ "red",
64
+ )
65
+ else:
66
+ cprint(
67
+ f"{dataset_name} {mode} HDF5 database not found, using raw images.",
68
+ "blue",
69
+ )
70
+
71
+ self.spixel = False
72
+ self.num_spixel = num_spixel
73
+ if spixel:
74
+ self.spixel = True
75
+ self.spixel_dict = {}
76
+
77
+ def __getitem__(self, index):
78
+ image_id = self.video_id_list[index]
79
+ info = self.datalist[image_id]
80
+ label = float(info["label"])
81
+ if self.use_h5:
82
+ try:
83
+ image = self.database[info["path"].replace("/", "-")][()]
84
+ except Exception as e:
85
+ cprint(
86
+ "Failed to load {} from {} due to {}".format(
87
+ image_id, self.dataset_name, str(e)
88
+ ),
89
+ "red",
90
+ )
91
+ image = cv2.imread(info["path"])
92
+ image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
93
+ else:
94
+ assert os.path.exists(info["path"]), f"{info['path']} does not exist!"
95
+ image = cv2.imread(info["path"])
96
+ image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
97
+
98
+ if self.spixel and image_id not in self.spixel_dict.keys():
99
+ spixel = segmentation.slic(
100
+ image, n_segments=self.num_spixel, channel_axis=2, start_label=0
101
+ )
102
+ self.spixel_dict[image_id] = spixel
103
+
104
+ image_size = image.shape[:2]
105
+
106
+ # 1 means modified area, 0 means pristine
107
+ if "mask" in info.keys():
108
+ if self.use_h5:
109
+ try:
110
+ mask = self.database[info["mask"].replace("/", "-")][()]
111
+ except Exception as e:
112
+ cprint(
113
+ "Failed to load {} mask from {} due to {}".format(
114
+ image_id, self.dataset_name, str(e)
115
+ ),
116
+ "red",
117
+ )
118
+ mask = cv2.imread(info["mask"], cv2.IMREAD_GRAYSCALE)
119
+ else:
120
+ mask = cv2.imread(info["mask"], cv2.IMREAD_GRAYSCALE)
121
+ else:
122
+ if label == 0:
123
+ mask = np.zeros(image_size)
124
+ else:
125
+ mask = np.ones(image_size)
126
+
127
+ if self.transform is not None:
128
+ if self.spixel:
129
+ transformed = self.transform(
130
+ image=image, masks=[mask, self.spixel_dict[image_id]]
131
+ ) # TODO I am not sure if this is correct for scaling
132
+ mask = transformed["masks"][0]
133
+ spixel = transformed["masks"][1]
134
+ else:
135
+ transformed = self.transform(image=image, mask=mask)
136
+ mask = transformed["mask"]
137
+
138
+ image = transformed["image"]
139
+ if not self.uncorrect_label:
140
+ label = float(mask.max() != 0.0)
141
+
142
+ if label == 1.0 and image.shape[:-1] != mask.shape:
143
+ mask = cv2.resize(mask, dsize=(image.shape[1], image.shape[0]))
144
+
145
+ unnormalized_image = img_to_tensor(image)
146
+ image = img_to_tensor(
147
+ image,
148
+ normalize={"mean": IMAGENET_DEFAULT_MEAN, "std": IMAGENET_DEFAULT_STD},
149
+ )
150
+ mask = mask_to_tensor(mask, num_classes=1, sigmoid=True)
151
+
152
+ output = {
153
+ "image": image, # tensor of 3, H, W
154
+ "label": label, # float
155
+ "mask": mask, # tensor of 1, H, W
156
+ "id": image_id, # string
157
+ "unnormalized_image": unnormalized_image, # tensor of 3, H, W
158
+ }
159
+ if self.spixel:
160
+ spixel = torch.from_numpy(spixel).unsqueeze(0)
161
+ output["spixel"] = spixel
162
+ return output
163
+
164
+ def __len__(self):
165
+ return len(self.video_id_list)
166
+
167
+
168
+ def crop_to_smallest_collate_fn(batch, max_size=128, uncorrect_label=False):
169
+ # get the smallest image size in a batch
170
+ smallest_size = [max_size, max_size]
171
+ for item in batch:
172
+ if item["mask"].shape[-2:] != item["image"].shape[-2:]:
173
+ cprint(
174
+ f"{item['id']} has inconsistent image-mask sizes, "
175
+ f"with image size {item['image'].shape[-2:]} and mask size "
176
+ f"{item['mask'].shape[-2:]}!",
177
+ "red",
178
+ )
179
+ image_size = item["image"].shape[-2:]
180
+ if image_size[0] < smallest_size[0]:
181
+ smallest_size[0] = image_size[0]
182
+ if image_size[1] < smallest_size[1]:
183
+ smallest_size[1] = image_size[1]
184
+
185
+ # crop all images and masks in each item to the smallest size
186
+ result = {}
187
+ for item in batch:
188
+ image_size = item["image"].shape[-2:]
189
+ x1 = random.randint(0, image_size[1] - smallest_size[1])
190
+ y1 = random.randint(0, image_size[0] - smallest_size[0])
191
+ x2 = x1 + smallest_size[1]
192
+ y2 = y1 + smallest_size[0]
193
+ for k in ["image", "mask", "unnormalized_image", "spixel"]:
194
+ if k not in item.keys():
195
+ continue
196
+ item[k] = item[k][:, y1:y2, x1:x2]
197
+ if not uncorrect_label:
198
+ item["label"] = float(item["mask"].max() != 0.0)
199
+ for k, v in item.items():
200
+ if k in result.keys():
201
+ result[k].append(v)
202
+ else:
203
+ result[k] = [v]
204
+
205
+ # stack all outputs
206
+ for k, v in result.items():
207
+ if k in ["image", "mask", "unnormalized_image", "spixel"]:
208
+ if k not in result.keys():
209
+ continue
210
+ result[k] = torch.stack(v, dim=0)
211
+ elif k in ["label"]:
212
+ result[k] = torch.tensor(v).float()
213
+
214
+ return result
215
+
216
+
217
+ class timeout:
218
+ def __init__(self, seconds=1, error_message="Timeout"):
219
+ self.seconds = seconds
220
+ self.error_message = error_message
221
+
222
+ def handle_timeout(self, signum, frame):
223
+ raise TimeoutError(self.error_message)
224
+
225
+ def __enter__(self):
226
+ signal.signal(signal.SIGALRM, self.handle_timeout)
227
+ signal.alarm(self.seconds)
228
+
229
+ def __exit__(self, type, value, traceback):
230
+ signal.alarm(0)
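`ImageDataset.__getitem__` returns a dict per image (normalized tensor, mask, image-level label, id, unnormalized tensor), and `crop_to_smallest_collate_fn` random-crops every item in a batch to the smallest spatial size present (capped at `max_size`) before stacking. A hedged sketch of wiring the two into a `DataLoader`; the transform, dataset choice, and batch size are placeholders rather than the values used by the training script:

```python
from functools import partial

import albumentations as A
import torch

from datasets.dataset import ImageDataset, crop_to_smallest_collate_fn

# Placeholder augmentation; the training pipeline configures its own transforms.
transform = A.Compose([A.HorizontalFlip(p=0.5)])

dataset = ImageDataset("columbia", "data/columbia_datalist.json", "val", transform)
loader = torch.utils.data.DataLoader(
    dataset,
    batch_size=4,
    shuffle=False,
    collate_fn=partial(crop_to_smallest_collate_fn, max_size=128),
)

batch = next(iter(loader))
print(batch["image"].shape, batch["mask"].shape, batch["label"])
```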
engine.py ADDED
@@ -0,0 +1,454 @@
1
+ import itertools
2
+ import os
3
+ import random
4
+ import shutil
5
+ from math import ceil
6
+ from typing import Dict, List
7
+
8
+ import numpy as np
9
+ import prettytable as pt
10
+ import torch
11
+ import torch.nn as nn
12
+ from fast_pytorch_kmeans import KMeans
13
+ from pathlib2 import Path
14
+ from scipy.stats import hmean
15
+ from sklearn import metrics
16
+ from termcolor import cprint
17
+ from torchvision.utils import draw_segmentation_masks, make_grid, save_image
18
+
19
+ import utils.misc as misc
20
+ from losses import get_spixel_tgt_map, get_volume_seg_map
21
+ from utils.convcrf import convcrf
22
+ from utils.crf import DenseCRF
23
+
24
+
25
+ def train(
26
+ model: nn.Module,
27
+ dataloader,
28
+ dataset_title: str,
29
+ optimizer_dict: Dict,
30
+ criterion,
31
+ epoch: int,
32
+ writer,
33
+ suffix: str,
34
+ opt,
35
+ ):
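+ """Run one weakly-supervised training epoch: every modality branch (e.g. RGB,
+ SRM, Bayar) is updated with its own optimizer, and image-level scores are
+ accumulated for ROC-AUC / F1 logging."""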
36
+
37
+ metric_logger = misc.MetricLogger(writer=writer, suffix=suffix)
38
+ cprint("{}-th epoch training on {}".format(epoch, dataset_title), "blue")
39
+ model.train()
40
+ roc_auc_elements = {
41
+ modality: {"map_scores": [], "vol_scores": []}
42
+ for modality in itertools.chain(opt.modality, ["ensemble"])
43
+ }
44
+ roc_auc_elements["labels"] = []
45
+
46
+ for i, data in metric_logger.log_every(
47
+ dataloader, print_freq=opt.print_freq, header=f"[{suffix} {epoch}]"
48
+ ):
49
+ if (opt.debug or opt.wholetest) and i > 50:
50
+ break
51
+
52
+ for modality, optimizer in optimizer_dict.items():
53
+ optimizer.zero_grad()
54
+
55
+ image = data["image"].to(opt.device)
56
+ unnormalized_image = data["unnormalized_image"].to(opt.device)
57
+ label = data["label"].to(opt.device)
58
+ mask = data["mask"].to(opt.device)
59
+ spixel = data["spixel"].to(opt.device) if opt.mvc_spixel else None
60
+
61
+ outputs = model(
62
+ image,
63
+ seg_size=None
64
+ if opt.loss_on_mid_map
65
+ else [image.shape[-2], image.shape[-1]],
66
+ )
67
+
68
+ losses = criterion(
69
+ outputs,
70
+ label,
71
+ mask,
72
+ epoch=epoch,
73
+ max_epoch=opt.epochs,
74
+ spixel=spixel,
75
+ raw_image=unnormalized_image,
76
+ )
77
+ total_loss = losses["total_loss"]
78
+ total_loss.backward()
79
+
80
+ for modality in opt.modality:
81
+ if opt.grad_clip > 0.0:
82
+ grad_norm = nn.utils.clip_grad_norm_(
83
+ model.sub_models[modality].parameters(), opt.grad_clip
84
+ )
85
+ metric_logger.update(**{f"grad_norm/{modality}": grad_norm})
86
+
87
+ optimizer_dict[modality].step()
88
+
89
+ # image-level metrics logger
90
+ roc_auc_elements["labels"].extend(label.tolist())
91
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
92
+ roc_auc_elements[modality]["map_scores"].extend(
93
+ outputs[modality]["map_pred"].tolist()
94
+ )
95
+ roc_auc_elements[modality]["vol_scores"].extend(
96
+ (outputs[modality]["vol_pred"]).tolist()
97
+ )
98
+
99
+ metric_logger.update(**losses)
100
+
101
+ image_metrics = update_image_roc_auc_metric(
102
+ opt.modality + ["ensemble"], roc_auc_elements, None
103
+ )
104
+ metric_logger.update(**image_metrics)
105
+
106
+ metric_logger.write_tensorboard(epoch)
107
+ print("Average status:")
108
+ print(metric_logger.stat_table())
109
+
110
+
111
+ def bundled_evaluate(
112
+ model: nn.Module, dataloaders: Dict, criterion, epoch, writer, suffix, opt
113
+ ):
114
+
115
+ metric_logger = misc.MetricLogger(writer=writer, suffix=suffix + "_avg")
116
+ for dataset, dataloader in dataloaders.items():
117
+ outputs = evaluate(
118
+ model,
119
+ dataloader,
120
+ criterion,
121
+ dataset,
122
+ epoch,
123
+ writer,
124
+ suffix + f"_{dataset}",
125
+ opt,
126
+ )
127
+ old_keys = list(outputs.keys())
128
+ for k in old_keys:
129
+ outputs[k.replace(dataset.upper(), "AVG")] = outputs[k]
130
+ for k in old_keys:
131
+ del outputs[k]
132
+
133
+ metric_logger.update(**outputs)
134
+
135
+ metric_logger.write_tensorboard(epoch)
136
+ print("Average status:")
137
+ print(metric_logger.stat_table())
138
+ return metric_logger.get_meters()
139
+
140
+
141
+ def evaluate(
142
+ model: nn.Module,
143
+ dataloader,
144
+ criterion,
145
+ dataset_title: str,
146
+ epoch: int,
147
+ writer,
148
+ suffix: str,
149
+ opt,
150
+ ):
151
+
152
+ metric_logger = misc.MetricLogger(writer=writer, suffix=suffix)
153
+ cprint("{}-th epoch evaluation on {}".format(epoch, dataset_title.upper()), "blue")
154
+
155
+ model.eval()
156
+
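+ # Optional post-processing of the predicted maps: dense CRF or convolutional
+ # CRF refinement guided by the RGB image, or superpixel averaging.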
157
+ if opt.crf_postproc:
158
+ postprocess = DenseCRF(
159
+ iter_max=opt.crf_iter_max,
160
+ pos_w=opt.crf_pos_w,
161
+ pos_xy_std=opt.crf_pos_xy_std,
162
+ bi_w=opt.crf_bi_w,
163
+ bi_xy_std=opt.crf_bi_xy_std,
164
+ bi_rgb_std=opt.crf_bi_rgb_std,
165
+ )
166
+ elif opt.convcrf_postproc:
167
+ convcrf_config = convcrf.default_conf
168
+ convcrf_config["skip_init_softmax"] = True
169
+ convcrf_config["final_softmax"] = True
170
+ shape = [opt.convcrf_shape, opt.convcrf_shape]
171
+ postprocess = convcrf.GaussCRF(
172
+ conf=convcrf_config, shape=shape, nclasses=2, use_gpu=True
173
+ ).to(opt.device)
174
+
175
+ figure_path = opt.figure_path + f"_{dataset_title.upper()}"
176
+ if opt.save_figure:
177
+ if os.path.exists(figure_path):
178
+ shutil.rmtree(figure_path)
179
+ os.mkdir(figure_path)
180
+ cprint("Saving figures to {}".format(figure_path), "blue")
181
+
182
+ if opt.max_pool_postproc > 1:
183
+ max_pool = nn.MaxPool2d(
184
+ kernel_size=opt.max_pool_postproc,
185
+ stride=1,
186
+ padding=(opt.max_pool_postproc - 1) // 2,
187
+ ).to(opt.device)
188
+ else:
189
+ max_pool = nn.Identity().to(opt.device)
190
+ # used_sliding_prediction = False
191
+ roc_auc_elements = {
192
+ modality: {"map_scores": [], "vol_scores": []}
193
+ for modality in itertools.chain(opt.modality, ["ensemble"])
194
+ }
195
+ roc_auc_elements["labels"] = []
196
+ with torch.no_grad():
197
+ for i, data in metric_logger.log_every(
198
+ dataloader, print_freq=opt.print_freq, header=f"[{suffix} {epoch}]"
199
+ ):
200
+ if (opt.debug or opt.wholetest) and i > 50:
201
+ break
202
+
203
+ image_size = data["image"].shape[-2:]
204
+ label = data["label"]
205
+ mask = data["mask"]
206
+ if opt.crf_postproc or opt.spixel_postproc or opt.convcrf_postproc:
207
+ spixel = data["spixel"].to(opt.device)
208
+ if max(image_size) > opt.tile_size and opt.large_image_strategy == "slide":
209
+ outputs = sliding_predict(
210
+ model, data, opt.tile_size, opt.tile_overlap, opt
211
+ )
212
+ else:
213
+ image = data["image"].to(opt.device)
214
+ outputs = model(image, seg_size=image.shape[-2:])
215
+
216
+ if opt.max_pool_postproc > 1:
217
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
218
+ outputs[modality]["out_map"] = max_pool(
219
+ outputs[modality]["out_map"]
220
+ )
221
+ # CRF
222
+ if opt.crf_postproc:
223
+ raw_prob = outputs["ensemble"]["out_map"]
224
+ image = data["unnormalized_image"] * 255.0
225
+ if opt.crf_downsample > 1:
226
+ image = (
227
+ torch.nn.functional.interpolate(
228
+ image,
229
+ size=(
230
+ image_size[0] // opt.crf_downsample,
231
+ image_size[1] // opt.crf_downsample,
232
+ ),
233
+ mode="bilinear",
234
+ align_corners=False,
235
+ )
236
+ .clamp(0, 255)
237
+ .int()
238
+ )
239
+ image = image.squeeze(0).numpy().astype(np.uint8).transpose(1, 2, 0)
240
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
241
+ prob = outputs[modality]["out_map"].squeeze(1)
242
+ if opt.crf_downsample > 1:
243
+ prob = (
244
+ torch.nn.functional.interpolate(
245
+ prob,
246
+ size=(
247
+ image_size[0] // opt.crf_downsample,
248
+ image_size[1] // opt.crf_downsample,
249
+ ),
250
+ mode="bilinear",
251
+ align_corners=False,
252
+ )
253
+ .clamp(0, 1)
254
+ .squeeze(0)
255
+ )
256
+ prob = torch.cat([prob, 1 - prob], dim=0).detach().cpu().numpy()
257
+ prob = postprocess(image, prob)
258
+ prob = prob[None, 0, ...]
259
+ prob = torch.tensor(prob, device=opt.device).unsqueeze(0)
260
+ if opt.crf_downsample > 1:
261
+ prob = torch.nn.functional.interpolate(
262
+ prob, size=image_size, mode="bilinear", align_corners=False
263
+ ).clamp(0, 1)
264
+ outputs[modality]["out_map"] = prob
265
+ outputs[modality]["map_pred"] = (
266
+ outputs[modality]["out_map"].max().unsqueeze(0)
267
+ )
268
+ elif opt.convcrf_postproc:
269
+ raw_prob = outputs["ensemble"]["out_map"]
270
+ image = data["unnormalized_image"].to(opt.device) * 255.0
271
+ image = (
272
+ torch.nn.functional.interpolate(
273
+ image,
274
+ size=(opt.convcrf_shape, opt.convcrf_shape),
275
+ mode="bilinear",
276
+ align_corners=False,
277
+ )
278
+ .clamp(0, 255)
279
+ .int()
280
+ )
281
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
282
+ prob = outputs[modality]["out_map"]
283
+ prob = torch.cat([prob, 1 - prob], dim=1)
284
+ prob = torch.nn.functional.interpolate(
285
+ prob,
286
+ size=(opt.convcrf_shape, opt.convcrf_shape),
287
+ mode="bilinear",
288
+ align_corners=False,
289
+ ).clamp(0, 1)
290
+ prob = postprocess(unary=prob, img=image)
291
+ prob = torch.nn.functional.interpolate(
292
+ prob, size=image_size, mode="bilinear", align_corners=False
293
+ ).clamp(0, 1)
294
+ outputs[modality]["out_map"] = prob[:, 0, None, ...]
295
+ outputs[modality]["map_pred"] = (
296
+ outputs[modality]["out_map"].max().unsqueeze(0)
297
+ )
298
+ elif opt.spixel_postproc:
299
+ raw_prob = outputs["ensemble"]["out_map"]
300
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
301
+ outputs[modality]["out_map"] = get_spixel_tgt_map(
302
+ outputs[modality]["out_map"], spixel
303
+ )
304
+
305
+ # image-level metrics logger
306
+ roc_auc_elements["labels"].extend(label.detach().cpu().tolist())
307
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
308
+ roc_auc_elements[modality]["map_scores"].extend(
309
+ outputs[modality]["map_pred"].detach().cpu().tolist()
310
+ )
311
+ roc_auc_elements[modality]["vol_scores"].extend(
312
+ (outputs[modality]["vol_pred"]).detach().cpu().tolist()
313
+ )
314
+
315
+ # generate binary prediction mask
316
+ out_map = {
317
+ modality: outputs[modality]["out_map"] > opt.mask_threshold
318
+ for modality in itertools.chain(opt.modality, ["ensemble"])
319
+ }
320
+
321
+ # only compute pixel-level metrics for manipulated images
322
+ if label.item() == 1.0:
323
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
324
+ pixel_metrics = misc.calculate_pixel_f1(
325
+ out_map[modality].float().detach().cpu().numpy().flatten(),
326
+ mask.detach().cpu().numpy().flatten(),
327
+ suffix=f"/{modality}",
328
+ )
329
+ metric_logger.update(**pixel_metrics)
330
+
331
+ # save images, mask, and prediction map
332
+ if opt.save_figure:
333
+ unnormalized_image = data["unnormalized_image"]
334
+ # image_id = data['id'][0].split('.')[0]
335
+ image_id = Path(data["id"][0]).stem
336
+ save_image(
337
+ (
338
+ outputs["ensemble"]["out_map"][0, ...] > opt.mask_threshold
339
+ ).float()
340
+ * 255,
341
+ os.path.join(figure_path, f"{image_id}_ensemble_map.png"),
342
+ )
343
+
344
+ image_metrics = update_image_roc_auc_metric(
345
+ opt.modality + ["ensemble"],
346
+ roc_auc_elements,
347
+ {
348
+ modality: metric_logger.meters[f"pixel_f1/{modality}"].avg
349
+ for modality in itertools.chain(opt.modality, ["ensemble"])
350
+ },
351
+ )
352
+ metric_logger.update(**image_metrics)
353
+
354
+ metric_logger.prepend_subprefix(f"{dataset_title.upper()}_")
355
+ metric_logger.write_tensorboard(epoch)
356
+ print("Average status:")
357
+ print(metric_logger.stat_table())
358
+
359
+ return metric_logger.get_meters()
360
+
361
+
362
+ def update_image_roc_auc_metric(modalities: List, roc_auc_elements, pixel_f1=None):
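+ """Compute image-level F1/AUC per modality from the accumulated scores; when
+ per-modality pixel F1 values are supplied, also report the harmonic mean of
+ image and pixel F1 as `comb_f1`."""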
363
+
364
+ result = {}
365
+ for modality in modalities:
366
+ image_metrics = misc.calculate_img_score(
367
+ np.array(roc_auc_elements[modality]["map_scores"]) > 0.5,
368
+ (np.array(roc_auc_elements["labels"]) > 0).astype(int),
369
+ suffix=f"/{modality}",
370
+ )
371
+ if pixel_f1 is not None:
372
+ image_f1 = image_metrics[f"image_f1/{modality}"]
373
+ combined_f1 = hmean([image_f1, pixel_f1[modality]])
374
+ image_metrics[f"comb_f1/{modality}"] = float(combined_f1)
375
+ if 0.0 in roc_auc_elements["labels"] and 1.0 in roc_auc_elements["labels"]:
376
+ image_auc = metrics.roc_auc_score(
377
+ roc_auc_elements["labels"], roc_auc_elements[modality]["map_scores"]
378
+ )
379
+ image_metrics[f"image_auc/{modality}"] = image_auc
380
+ result.update(image_metrics)
381
+
382
+ return result
383
+
384
+
385
+ def pad_image(image, target_size):
386
+ image_size = image.shape[-2:]
387
+ if image_size != target_size:
388
+ row_missing = target_size[0] - image_size[0]
389
+ col_missing = target_size[1] - image_size[1]
390
+ image = nn.functional.pad(
391
+ image, (0, row_missing, 0, col_missing), "constant", 0
392
+ )
393
+ return image
394
+
395
+
396
+ def sliding_predict(model: nn.Module, data, tile_size, tile_overlap, opt):
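+ """Tiled inference for images larger than `tile_size`: overlapping tiles are
+ predicted independently and the per-pixel outputs are averaged using
+ `map_counter` as the normalizer."""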
397
+ image = data["image"]
398
+ mask = data["mask"]
399
+ image = image.to(opt.device)
400
+ image_size = image.shape[-2:]
401
+ stride = ceil(tile_size * (1 - tile_overlap))
402
+ tile_rows = int(ceil((image_size[0] - tile_size) / stride) + 1)
403
+ tile_cols = int(ceil((image_size[1] - tile_size) / stride) + 1)
404
+ result = {}
405
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
406
+ result[modality] = {
407
+ "out_map": torch.zeros_like(
408
+ mask, requires_grad=False, dtype=torch.float32, device=opt.device
409
+ ),
410
+ "out_vol_map": torch.zeros_like(
411
+ mask, requires_grad=False, dtype=torch.float32, device=opt.device
412
+ ),
413
+ }
414
+ map_counter = torch.zeros_like(
415
+ mask, requires_grad=False, dtype=torch.float32, device=opt.device
416
+ )
417
+
418
+ with torch.no_grad():
419
+ for row in range(tile_rows):
420
+ for col in range(tile_cols):
421
+ x1 = int(col * stride)
422
+ y1 = int(row * stride)
423
+ x2 = min(x1 + tile_size, image_size[1])
424
+ y2 = min(y1 + tile_size, image_size[0])
425
+ x1 = max(int(x2 - tile_size), 0)
426
+ y1 = max(int(y2 - tile_size), 0)
427
+
428
+ image_tile = image[:, :, y1:y2, x1:x2]
429
+ image_tile = pad_image(image_tile, [opt.tile_size, opt.tile_size])
430
+ tile_outputs = model(image_tile, seg_size=(image_tile.shape[-2:]))
431
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
432
+ result[modality]["out_map"][:, :, y1:y2, x1:x2] += tile_outputs[
433
+ modality
434
+ ]["out_map"][:, :, : y2 - y1, : x2 - x1]
435
+ out_vol_map = get_volume_seg_map(
436
+ tile_outputs[modality]["out_vol"],
437
+ size=image_tile.shape[-2:],
438
+ label=data["label"],
439
+ kmeans=KMeans(2) if opt.consistency_kmeans else None,
440
+ )[:, :, : y2 - y1, : x2 - x1]
441
+ result[modality]["out_vol_map"][:, :, y1:y2, x1:x2] += out_vol_map
442
+ map_counter[:, :, y1:y2, x1:x2] += 1
443
+
444
+ for modality in itertools.chain(opt.modality, ["ensemble"]):
445
+ result[modality]["out_map"] /= map_counter
446
+ result[modality]["out_vol_map"] /= map_counter
447
+ result[modality]["map_pred"] = (
448
+ result[modality]["out_map"].max().unsqueeze(0)
449
+ )
450
+ result[modality]["vol_pred"] = (
451
+ result[modality]["out_vol_map"].max().unsqueeze(0)
452
+ )
453
+
454
+ return result
losses/__init__.py ADDED
@@ -0,0 +1,61 @@
1
+ from .bundled_loss import BundledLoss
2
+ from .consisitency_loss import get_consistency_loss, get_volume_seg_map
3
+ from .entropy_loss import get_entropy_loss
4
+ from .loss import Loss
5
+ from .map_label_loss import get_map_label_loss
6
+ from .map_mask_loss import get_map_mask_loss
7
+ from .multi_view_consistency_loss import (
8
+ get_multi_view_consistency_loss,
9
+ get_spixel_tgt_map,
10
+ )
11
+ from .volume_label_loss import get_volume_label_loss
12
+ from .volume_mask_loss import get_volume_mask_loss
13
+
14
+
15
+ def get_bundled_loss(opt):
16
+ """Loss function for the overeall training, including the multi-view
17
+ consistency loss."""
18
+ single_modality_loss = get_loss(opt)
19
+ multi_view_consistency_loss = get_multi_view_consistency_loss(opt)
20
+ volume_mask_loss = get_volume_mask_loss(opt)
21
+ bundled_loss = BundledLoss(
22
+ single_modality_loss,
23
+ multi_view_consistency_loss,
24
+ volume_mask_loss,
25
+ opt.mvc_weight,
26
+ opt.mvc_time_dependent,
27
+ opt.mvc_steepness,
28
+ opt.modality,
29
+ opt.consistency_weight,
30
+ opt.consistency_source,
31
+ )
32
+
33
+ return bundled_loss
34
+
35
+
36
+ def get_loss(opt):
37
+ """Loss function for a single model, excluding the multi-view consistency
38
+ loss."""
39
+ map_label_loss = get_map_label_loss(opt)
40
+ volume_label_loss = get_volume_label_loss(opt)
41
+ map_mask_loss = get_map_mask_loss(opt)
42
+ volume_mask_loss = get_volume_mask_loss(opt)
43
+ consistency_loss = get_consistency_loss(opt)
44
+ entropy_loss = get_entropy_loss(opt)
45
+ loss = Loss(
46
+ map_label_loss,
47
+ volume_label_loss,
48
+ map_mask_loss,
49
+ volume_mask_loss,
50
+ consistency_loss,
51
+ entropy_loss,
52
+ opt.map_label_weight,
53
+ opt.volume_label_weight,
54
+ opt.map_mask_weight,
55
+ opt.volume_mask_weight,
56
+ opt.consistency_weight,
57
+ opt.map_entropy_weight,
58
+ opt.volume_entropy_weight,
59
+ opt.consistency_source,
60
+ )
61
+ return loss
losses/bundled_loss.py ADDED
@@ -0,0 +1,84 @@
1
+ import math
2
+ from typing import Dict, List, Optional
3
+
4
+ import torch
5
+ import torch.nn as nn
6
+
7
+
8
+ class BundledLoss(nn.Module):
9
+ def __init__(
10
+ self,
11
+ single_modality_loss,
12
+ multi_view_consistency_loss,
13
+ volume_mask_loss,
14
+ multi_view_consistency_weight: float,
15
+ mvc_time_dependent: bool,
16
+ mvc_steepness: float,
17
+ modality: List,
18
+ consistency_weight: float,
19
+ consistency_source: str,
20
+ ):
21
+ super().__init__()
22
+
23
+ self.single_modality_loss = single_modality_loss
24
+ self.multi_view_consistency_loss = multi_view_consistency_loss
25
+ self.volume_mask_loss = volume_mask_loss
26
+
27
+ self.mvc_weight = multi_view_consistency_weight
28
+ self.mvc_time_dependent = mvc_time_dependent
29
+ self.mvc_steepness = mvc_steepness
30
+ self.modality = modality
31
+ self.consistency_weight = consistency_weight
32
+ self.consistency_source = consistency_source
33
+
34
+ def forward(
35
+ self,
36
+ output: Dict,
37
+ label,
38
+ mask,
39
+ epoch: int = 1,
40
+ max_epoch: int = 70,
41
+ spixel=None,
42
+ raw_image=None,
43
+ ):
44
+
45
+ total_loss = 0.0
46
+ loss_dict = {}
47
+ for modality in self.modality:
48
+ single_loss = self.single_modality_loss(output[modality], label, mask)
49
+
50
+ for k, v in single_loss.items():
51
+ loss_dict[f"{k}/{modality}"] = v
52
+ total_loss = total_loss + single_loss["total_loss"]
53
+
54
+ if self.mvc_time_dependent:
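+ # Ramp the multi-view consistency weight up over training:
+ # weight = mvc_weight * exp(-steepness * (1 - epoch / max_epoch)^2),
+ # which approaches mvc_weight as epoch -> max_epoch.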
55
+ mvc_weight = self.mvc_weight * math.exp(
56
+ -self.mvc_steepness * (1 - epoch / max_epoch) ** 2
57
+ )
58
+ else:
59
+ mvc_weight = self.mvc_weight
60
+
61
+ multi_view_consistency_loss = self.multi_view_consistency_loss(
62
+ output, label, spixel, raw_image, mask
63
+ )
64
+ for k, v in multi_view_consistency_loss.items():
65
+ if k not in ["total_loss", "tgt_map"]:
66
+ loss_dict.update({k: v})
67
+
68
+ if self.consistency_weight != 0.0 and self.consistency_source == "ensemble":
69
+ for modality in self.modality:
70
+ consistency_loss = self.volume_mask_loss(
71
+ output[modality]["out_vol"], multi_view_consistency_loss["tgt_map"]
72
+ )
73
+ consistency_loss = consistency_loss["loss"]
74
+ loss_dict[f"consistency_loss/{modality}"] = consisitency_loss
75
+ total_loss = (
76
+ total_loss
77
+ + self.consistency_weight
78
+ * consistency_loss
79
+ * math.exp(-self.mvc_steepness * (1 - epoch / max_epoch) ** 2)
80
+ )
81
+
82
+ total_loss = total_loss + mvc_weight * multi_view_consistency_loss["total_loss"]
83
+
84
+ return {"total_loss": total_loss, **loss_dict}
losses/consisitency_loss.py ADDED
@@ -0,0 +1,73 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ from einops import rearrange
4
+ from fast_pytorch_kmeans import KMeans
5
+
6
+
7
+ def get_consistency_loss(opt):
8
+ loss = ConsistencyLoss(
9
+ opt.consistency_type, opt.consistency_kmeans, opt.consistency_stop_map_grad
10
+ )
11
+ return loss
12
+
13
+
14
+ class ConsistencyLoss(nn.Module):
15
+ def __init__(
16
+ self, loss: str, do_kmeans: bool = True, consistency_stop_map_grad: bool = False
17
+ ):
18
+ super().__init__()
19
+ assert loss in ["l1", "l2"]
20
+
21
+ if loss == "l1":
22
+ self.consistency_loss = nn.L1Loss(reduction="mean")
23
+ else: # l2
24
+ self.consistency_loss = nn.MSELoss(reduction="mean")
25
+
26
+ self.do_kmeans = do_kmeans
27
+ if do_kmeans:
28
+ self.kmeans = KMeans(2)
29
+ else:
30
+ self.kmeans = None
31
+
32
+ self.consistency_stop_map_grad = consistency_stop_map_grad
33
+
34
+ def forward(self, out_volume, out_map, label):
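+ # Collapse the 4-D consistency volume to a 2-D map (optionally cleaned up with
+ # k-means) and penalize its disagreement with the localization map.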
35
+ map_shape = out_map.shape[-2:]
36
+ out_volume = get_volume_seg_map(out_volume, map_shape, label, self.kmeans)
37
+ if self.consistency_stop_map_grad:
38
+ loss = self.consistency_loss(out_volume, out_map.detach())
39
+ else:
40
+ loss = self.consistency_loss(out_volume, out_map)
41
+ return {"loss": loss, "out_vol": out_volume.squeeze(1)}
42
+
43
+
44
+ def get_volume_seg_map(volume, size, label, kmeans=None):
45
+ """volume is of shape [b, h, w, h, w], and size is [h', w']"""
46
+ batch_size = volume.shape[0]
47
+ volume_shape = volume.shape[-2:]
48
+ volume = rearrange(volume, "b h1 w1 h2 w2 -> b (h1 w1) (h2 w2)")
49
+ if kmeans is not None: # do k-means on out_volume
50
+ for i in range(batch_size):
51
+ # NOTE: k-means is only applied to manipulated images!
52
+ if label[i] == 0:
53
+ continue
54
+ batch_volume = volume[i, ...]
55
+ out = kmeans.fit_predict(batch_volume)
56
+ ones = torch.where(out == 1)
57
+ zeros = torch.where(out == 0)
58
+ if (
59
+ ones[0].numel() >= zeros[0].numel()
60
+ ): # intuitively, the cluster with fewer elements is the modified cluster
61
+ pristine, modified = ones, zeros
62
+ else:
63
+ pristine, modified = zeros, ones
64
+ volume[i, :, modified[0]] = 1 - volume[i, :, modified[0]]
65
+
66
+ volume = volume.mean(dim=-1)
67
+ volume = rearrange(volume, "b (h w) -> b h w", h=volume_shape[0])
68
+ volume = volume.unsqueeze(1)
69
+ if volume_shape != size:
70
+ volume = nn.functional.interpolate(
71
+ volume, size=size, mode="bilinear", align_corners=False
72
+ )
73
+ return volume # size [b, 1, h, w]
losses/entropy_loss.py ADDED
@@ -0,0 +1,20 @@
1
+ import torch
2
+ import torch.nn as nn
3
+
4
+
5
+ def get_entropy_loss(opt):
6
+ return EntropyLoss()
7
+
8
+
9
+ class EntropyLoss(nn.Module):
10
+ def __init__(self):
11
+ super().__init__()
12
+ self.exp = 1e-7
13
+ assert self.exp < 0.5
14
+
15
+ def forward(self, item):
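+ # Binary entropy H(p) = -p*log(p) - (1-p)*log(1-p), averaged over all entries;
+ # inputs are clamped away from 0 and 1 for numerical stability.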
16
+ item = item.clamp(min=self.exp, max=1 - self.exp)
17
+ entropy = -item * torch.log(item) - (1 - item) * torch.log(1 - item)
18
+ entropy = entropy.mean()
19
+
20
+ return {"loss": entropy}
losses/loss.py ADDED
@@ -0,0 +1,93 @@
1
+ import torch
2
+ import torch.nn as nn
3
+
4
+
5
+ class Loss(nn.Module):
6
+ def __init__(
7
+ self,
8
+ map_label_loss,
9
+ volume_label_loss,
10
+ map_mask_loss,
11
+ volume_mask_loss,
12
+ consistency_loss,
13
+ entropy_loss,
14
+ map_label_weight,
15
+ volume_label_weight,
16
+ map_mask_weight,
17
+ volume_mask_weight,
18
+ consistency_weight,
19
+ map_entropy_weight,
20
+ volume_entropy_weight,
21
+ consistency_source,
22
+ ):
23
+ super().__init__()
24
+
25
+ self.map_label_loss = map_label_loss
26
+ self.volume_label_loss = volume_label_loss
27
+ self.map_mask_loss = map_mask_loss
28
+ self.volume_mask_loss = volume_mask_loss
29
+ self.consistency_loss = consistency_loss
30
+ self.entropy_loss = entropy_loss
31
+
32
+ self.map_label_weight = map_label_weight
33
+ self.volume_label_weight = volume_label_weight
34
+ self.map_mask_weight = map_mask_weight
35
+ self.volume_mask_weight = volume_mask_weight
36
+ self.consistency_weight = consistency_weight
37
+ self.map_entropy_weight = map_entropy_weight
38
+ self.volume_entropy_weight = volume_entropy_weight
39
+ self.consistency_source = consistency_source
40
+
41
+ def forward(self, output, label, mask):
42
+ total_loss = 0.0
43
+ loss_dict = {}
44
+
45
+ # --- label loss ---
46
+ label = label.float()
47
+ # compute map label loss anyway
48
+ map_label_loss = self.map_label_loss(
49
+ output["map_pred"], output["out_map"], label
50
+ )["loss"]
51
+ total_loss = total_loss + self.map_label_weight * map_label_loss
52
+ loss_dict.update({"map_label_loss": map_label_loss})
53
+
54
+ if self.volume_label_weight != 0.0:
55
+ volume_label_loss = self.volume_label_loss(
56
+ output["vol_pred"], output["out_vol"], label
57
+ )["loss"]
58
+ total_loss = total_loss + self.volume_label_weight * volume_label_loss
59
+ loss_dict.update({"vol_label_loss": volume_label_loss})
60
+
61
+ # --- mask loss ---
62
+ # compute map mask loss anyway
63
+ map_mask_loss = self.map_mask_loss(output["out_map"], mask)["loss"]
64
+ total_loss = total_loss + self.map_mask_weight * map_mask_loss
65
+ loss_dict.update({"map_mask_loss": map_mask_loss})
66
+
67
+ if self.volume_mask_weight != 0.0:
68
+ volume_mask_loss = self.volume_mask_loss(output["out_vol"], mask)["loss"]
69
+ total_loss = total_loss + self.volume_mask_weight * volume_mask_loss
70
+ loss_dict.update({"vol_mask_loss": volume_mask_loss})
71
+
72
+ # --- self-consistency loss ---
73
+ if self.consistency_weight != 0.0 and self.consistency_source == "self":
74
+ consistency_loss = self.consistency_loss(
75
+ output["out_vol"], output["out_map"], label
76
+ )
77
+ consistency_loss = consistency_loss["loss"]
78
+ total_loss = total_loss + self.consistency_weight * consistency_loss
79
+ loss_dict.update({"consistency_loss": consistency_loss})
80
+
81
+ # --- entropy loss ---
82
+ if self.map_entropy_weight != 0.0:
83
+ map_entropy_loss = self.entropy_loss(output["out_map"])["loss"]
84
+ total_loss = total_loss + self.map_entropy_weight * map_entropy_loss
85
+ loss_dict.update({"map_entropy_loss": map_entropy_loss})
86
+
87
+ if self.volume_entropy_weight != 0:
88
+ volume_entropy_loss = self.entropy_loss(output["out_vol"])["loss"]
89
+ total_loss = total_loss + self.volume_entropy_weight * volume_entropy_loss
90
+ loss_dict.update({"vol_entropy_loss": volume_entropy_loss})
91
+
92
+ loss_dict.update({"total_loss": total_loss})
93
+ return loss_dict
losses/map_label_loss.py ADDED
@@ -0,0 +1,34 @@
1
+ import torch
2
+ import torch.nn as nn
3
+
4
+
5
+ def get_map_label_loss(opt):
6
+ return MapLabelLoss(opt.label_loss_on_whole_map)
7
+
8
+
9
+ class MapLabelLoss(nn.Module):
10
+ def __init__(self, label_loss_on_whole_map=False):
11
+ super().__init__()
12
+
13
+ self.bce_loss = nn.BCELoss(reduction="none")
14
+ self.label_loss_on_whole_map = label_loss_on_whole_map
15
+
16
+ def forward(self, pred, out_map, label):
17
+ batch_size = label.shape[0]
18
+ if (
19
+ self.label_loss_on_whole_map
20
+ ): # apply the loss on the whole map for pristine images
21
+ total_loss = 0
22
+ for i in range(batch_size):
23
+ if label[i] == 0: # pristine
24
+ total_loss = (
25
+ total_loss
26
+ + self.bce_loss(out_map[i, ...].mean(), label[i]).mean()
27
+ )
28
+ else: # modified
29
+ total_loss = total_loss + self.bce_loss(pred[i], label[i]).mean()
30
+ loss = total_loss / batch_size
31
+ else:
32
+ loss = self.bce_loss(pred, label)
33
+ loss = loss.mean()
34
+ return {"loss": loss}
losses/map_mask_loss.py ADDED
@@ -0,0 +1,26 @@
1
+ import torch
2
+ import torch.nn as nn
3
+
4
+
5
+ def get_map_mask_loss(opt):
6
+ return MapMaskLoss()
7
+
8
+
9
+ class MapMaskLoss(nn.Module):
10
+ def __init__(self):
11
+ super().__init__()
12
+
13
+ self.bce_loss = nn.BCELoss(reduction="mean")
14
+
15
+ def forward(self, out_map, mask):
16
+ mask_size = mask.shape[-2:]
17
+ if out_map.shape[-2:] != mask_size:
18
+ out_map = nn.functional.interpolate(
19
+ out_map, size=mask_size, mode="bilinear", align_corners=False
20
+ )
21
+ loss = self.bce_loss(out_map, mask)
22
+ return {"loss": loss}
23
+
24
+
25
+ if __name__ == "__main__":
26
+ map_mask_loss = MapMaskLoss()
losses/multi_view_consistency_loss.py ADDED
@@ -0,0 +1,152 @@
1
+ from typing import Dict, List
2
+
3
+ import matplotlib.pyplot as plt
4
+ import numpy as np
5
+ import torch
6
+ import torch.nn as nn
7
+ from skimage import segmentation
8
+
9
+
10
+ def get_multi_view_consistency_loss(opt):
11
+ loss = MultiViewConsistencyLoss(
12
+ opt.mvc_soft,
13
+ opt.mvc_zeros_on_au,
14
+ opt.mvc_single_weight,
15
+ opt.modality,
16
+ opt.mvc_spixel,
17
+ opt.mvc_num_spixel,
18
+ )
19
+ return loss
20
+
21
+
22
+ class MultiViewConsistencyLoss(nn.Module):
23
+ def __init__(
24
+ self,
25
+ soft: bool,
26
+ zeros_on_au: bool,
27
+ single_weight: Dict,
28
+ modality: List,
29
+ spixel: bool = False,
30
+ num_spixel: int = 100,
31
+ eps: float = 1e-4,
32
+ ):
33
+ super().__init__()
34
+ self.soft = soft
35
+ self.zeros_on_au = zeros_on_au
36
+ self.single_weight = single_weight
37
+ self.modality = modality
38
+ self.spixel = spixel
39
+ self.num_spixel = num_spixel
40
+ self.eps = eps
41
+
42
+ self.mse_loss = nn.MSELoss(reduction="mean")
43
+
44
+ def forward(self, output: Dict, label, spixel=None, image=None, mask=None):
45
+
46
+ tgt_map = torch.zeros_like(
47
+ output[self.modality[0]]["out_map"], requires_grad=False
48
+ )
49
+ with torch.no_grad():
50
+ for modality in self.modality:
51
+ weight = self.single_weight[modality.lower()]
52
+ tgt_map = tgt_map + weight * output[modality]["out_map"]
53
+
54
+ if self.spixel:
55
+ # raw_tgt_map = tgt_map.clone()
56
+ tgt_map = get_spixel_tgt_map(tgt_map, spixel)
57
+
58
+ if not self.soft:
59
+ for b in range(tgt_map.shape[0]):
60
+ if tgt_map[b, ...].max() <= 0.5 and label[b] == 1.0:
61
+ tgt_map[b, ...][
62
+ torch.where(tgt_map[b, ...] == torch.max(tgt_map[b, ...]))
63
+ ] = 1.0
64
+ tgt_map[torch.where(tgt_map > 0.5)] = 1
65
+ tgt_map[torch.where(tgt_map <= 0.5)] = 0
66
+ tgt_map[torch.where(label == 0.0)[0], ...] = 0.0
67
+
68
+ if self.zeros_on_au:
69
+ tgt_map[torch.where(label == 0.0)[0], ...] = 0.0
70
+
71
+ total_loss = 0.0
72
+ loss_dict = {}
73
+ for modality in self.modality:
74
+ loss = self.mse_loss(output[modality]["out_map"], tgt_map)
75
+ loss_dict[f"multi_view_consistency_loss_{modality}"] = loss
76
+ total_loss = total_loss + loss
77
+
78
+ return {**loss_dict, "tgt_map": tgt_map, "total_loss": total_loss}
79
+
80
+ def _save(
81
+ self,
82
+ spixel: torch.Tensor,
83
+ image: torch.Tensor,
84
+ mask: torch.Tensor,
85
+ tgt_map: torch.Tensor,
86
+ raw_tgt_map: torch.Tensor,
87
+ out_path: str = "tmp/spixel_tgt_map.png",
88
+ ):
89
+ spixel = spixel.permute(0, 2, 3, 1).detach().cpu().numpy()
90
+ image = image.permute(0, 2, 3, 1).detach().cpu().numpy()
91
+ mask = mask.permute(0, 2, 3, 1).detach().cpu().numpy() * 255.0
92
+ tgt_map = tgt_map.permute(0, 2, 3, 1).squeeze(3).detach().cpu().numpy() * 255.0
93
+ raw_tgt_map = (
94
+ raw_tgt_map.permute(0, 2, 3, 1).squeeze(3).detach().cpu().numpy() * 255.0
95
+ )
96
+ bn = spixel.shape[0]
97
+ i = 1
98
+ for b in range(bn):
99
+ plt.subplot(bn, 5, i)
100
+ i += 1
101
+ plt.imshow(image[b])
102
+ plt.axis("off")
103
+ plt.title("image")
104
+ plt.subplot(bn, 5, i)
105
+ i += 1
106
+ plt.imshow(mask[b])
107
+ plt.axis("off")
108
+ plt.title("mask")
109
+ plt.subplot(bn, 5, i)
110
+ i += 1
111
+ plt.imshow(spixel[b])
112
+ plt.axis("off")
113
+ plt.title("superpixel")
114
+ plt.subplot(bn, 5, i)
115
+ i += 1
116
+ plt.imshow(raw_tgt_map[b])
117
+ plt.axis("off")
118
+ plt.title("raw target map")
119
+ plt.subplot(bn, 5, i)
120
+ i += 1
121
+ plt.imshow(tgt_map[b])
122
+ plt.axis("off")
123
+ plt.title("target map")
124
+ plt.tight_layout()
125
+ plt.savefig(out_path, dpi=300)
126
+ plt.close()
127
+
128
+
129
+ def get_spixel_tgt_map(weighted_sum, spixel):
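+ # Average the soft map inside each superpixel, so that all pixels belonging to
+ # the same superpixel share a single value in the returned target map.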
130
+ b, _, h, w = weighted_sum.shape
131
+ spixel_tgt_map = torch.zeros_like(weighted_sum, requires_grad=False)
132
+
133
+ for bidx in range(b):
134
+ spixel_indices = spixel[bidx, ...].unique()
135
+ # num_spixel = spixel_idx.shape[0]
136
+ for spixel_idx in spixel_indices.tolist():
137
+ area = (spixel[bidx, ...] == spixel_idx).sum()
138
+ weighted_sum_in_area = weighted_sum[bidx, ...][
139
+ torch.where(spixel[bidx, ...] == spixel_idx)
140
+ ].sum()
141
+ avg_area = weighted_sum_in_area / area
142
+ # this is a soft map; thresholding is performed in the forward function
143
+ spixel_tgt_map[bidx][
144
+ torch.where(spixel[bidx, ...] == spixel_idx)
145
+ ] = avg_area
146
+
147
+ return spixel_tgt_map
148
+
149
+
150
+ if __name__ == "__main__":
151
+ mvc_loss = MultiViewConsistencyLoss(
+ True, True, {"rgb": 1.0, "srm": 1.0, "bayar": 2.0}, ["rgb", "srm", "bayar"]
+ )
152
+ print(mvc_loss)
losses/volume_label_loss.py ADDED
@@ -0,0 +1,16 @@
1
+ import torch.nn as nn
2
+
3
+
4
+ def get_volume_label_loss(opt):
5
+ return VolumeLabelLoss()
6
+
7
+
8
+ class VolumeLabelLoss(nn.Module):
9
+ def __init__(self):
10
+ super().__init__()
11
+
12
+ self.BCE_loss = nn.BCELoss(reduction="mean")
13
+
14
+ def forward(self, pred, volume, label):
15
+ loss = self.BCE_loss(pred, label)
16
+ return {"loss": loss}
losses/volume_mask_loss.py ADDED
@@ -0,0 +1,40 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ from einops import rearrange
4
+
5
+
6
+ def get_volume_mask_loss(opt):
7
+ return VolumeMaskLoss()
8
+
9
+
10
+ class VolumeMaskLoss(nn.Module):
11
+ def __init__(self):
12
+ super().__init__()
13
+
14
+ self.bce_loss = nn.BCELoss(reduction="mean")
15
+
16
+ def _get_volume_mask(self, mask):
17
+ with torch.no_grad():
18
+ h, w = mask.shape[-2:]
19
+ # use the orthogonal vectors [0, 1] and [1, 0] to generate the ground truth
20
+ mask[torch.where(mask > 0.5)] = 1.0
21
+ mask[torch.where(mask <= 0.5)] = 0.0
22
+
23
+ mask = rearrange(mask, "b c h w -> b c (h w)")
24
+ mask_append = 1 - mask.clone()
25
+ mask = torch.cat([mask, mask_append], dim=1)
26
+ mask = torch.bmm(mask.transpose(-1, -2), mask)
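+ # bmm of the 2-channel one-hot masks yields an (h*w) x (h*w) affinity matrix:
+ # 1 where two positions share the same pristine/manipulated label, 0 otherwise.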
27
+ mask = rearrange(mask, "b (h1 w1) (h2 w2) -> b h1 w1 h2 w2", h1=h, h2=h)
28
+ mask = 1 - mask # 0 indicates consistency, and 1 indicates inconsistency
29
+ return mask
30
+
31
+ def forward(self, out_volume, mask):
32
+ volume_size = out_volume.shape[-2:]
33
+ if volume_size != mask.shape[-2:]:
34
+ mask = nn.functional.interpolate(
35
+ mask, size=volume_size, mode="bilinear", align_corners=False
36
+ )
37
+ volume_mask = self._get_volume_mask(mask)
38
+ loss = self.bce_loss(out_volume, volume_mask)
39
+
40
+ return {"loss": loss}
main.py ADDED
@@ -0,0 +1,204 @@
1
+ import datetime
2
+ import math
3
+ import os
4
+ from functools import partial
5
+
6
+ import albumentations as A
7
+ import torch.optim as optim
8
+ from termcolor import cprint
9
+ from timm.scheduler import create_scheduler
10
+ from torch.utils.data import DataLoader
11
+
12
+ import utils.misc as misc
13
+ from datasets import crop_to_smallest_collate_fn, get_dataset
14
+ from engine import bundled_evaluate, train
15
+ from losses import get_bundled_loss, get_loss
16
+ from models import get_ensemble_model, get_single_modal_model
17
+ from opt import get_opt
18
+
19
+
20
+ def main(opt):
21
+ # get tensorboard writer
22
+ writer = misc.setup_env(opt)
23
+
24
+ # dataset
25
+ # training sets
26
+ train_loaders = {}
27
+ if not opt.eval:
28
+ train_transform = A.Compose(
29
+ [
30
+ A.HorizontalFlip(0.5),
31
+ A.SmallestMaxSize(int(opt.input_size * 1.5))
32
+ if opt.resize_aug
33
+ else A.NoOp(),
34
+ A.RandomSizedCrop(
35
+ (opt.input_size, int(opt.input_size * 1.5)),
36
+ opt.input_size,
37
+ opt.input_size,
38
+ )
39
+ if opt.resize_aug
40
+ else A.NoOp(),
41
+ A.NoOp() if opt.no_gaussian_blur else A.GaussianBlur(p=0.5),
42
+ A.NoOp() if opt.no_color_jitter else A.ColorJitter(p=0.5),
43
+ A.NoOp() if opt.no_jpeg_compression else A.ImageCompression(p=0.5),
44
+ ]
45
+ )
46
+ train_sets = get_dataset(opt.train_datalist, "train", train_transform, opt)
47
+ for k, dataset in train_sets.items():
48
+ train_loaders[k] = DataLoader(
49
+ dataset,
50
+ batch_size=opt.batch_size,
51
+ shuffle=True,
52
+ pin_memory=True,
53
+ num_workers=0 if opt.debug else opt.num_workers,
54
+ collate_fn=partial(
55
+ crop_to_smallest_collate_fn,
56
+ max_size=opt.input_size,
57
+ uncorrect_label=opt.uncorrect_label,
58
+ ),
59
+ )
60
+ # validation sets
61
+ if opt.large_image_strategy == "rescale":
62
+ val_transform = A.Compose([A.SmallestMaxSize(opt.tile_size)])
63
+ else:
64
+ val_transform = None
65
+ val_sets = get_dataset(opt.val_datalist, opt.val_set, val_transform, opt)
66
+ val_loaders = {}
67
+ for k, dataset in val_sets.items():
68
+ val_loaders[k] = DataLoader(
69
+ dataset,
70
+ batch_size=1,
71
+ shuffle=opt.val_shuffle,
72
+ pin_memory=True,
73
+ num_workers=0 if opt.debug else opt.num_workers,
74
+ )
75
+
76
+ # multi-view models and optimizers
77
+ optimizer_dict = {}
78
+ scheduler_dict = {}
79
+ model = get_ensemble_model(opt).to(opt.device)
80
+ n_param = sum(p.numel() for p in model.parameters() if p.requires_grad)
81
+ print(
82
+ f"Number of total params: {n_param}, num params per model: {int(n_param / len(opt.modality))}"
83
+ )
84
+
85
+ # optimizer and scheduler
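+ # Each modality branch gets its own optimizer and LR scheduler so the views
+ # can be updated independently.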
86
+ for modality in opt.modality:
87
+ if opt.optimizer.lower() == "adamw":
88
+ optimizer = optim.AdamW(
89
+ model.sub_models[modality].parameters(),
90
+ opt.lr,
91
+ weight_decay=opt.weight_decay,
92
+ )
93
+ elif opt.optimizer.lower() == "sgd":
94
+ optimizer = optim.SGD(
95
+ model.sub_models[modality].parameters(),
96
+ opt.lr,
97
+ opt.momentum,
98
+ weight_decay=opt.weight_decay,
99
+ )
100
+ else:
101
+ raise RuntimeError(f"Unsupported optimizer {opt.optimizer}.")
102
+
103
+ scheduler, num_epoch = create_scheduler(opt, optimizer)
104
+
105
+ optimizer_dict[modality] = optimizer
106
+ scheduler_dict[modality] = scheduler
107
+ opt.epochs = num_epoch
108
+
109
+ # loss functions
110
+ # loss function including the multi-view consistency loss, for training
111
+ bundled_criterion = get_bundled_loss(opt).to(opt.device)
112
+ # loss function excluding the multi-view consistency loss, for evaluation
113
+ single_criterion = get_loss(opt).to(opt.device)
114
+
115
+ if opt.resume:
116
+ misc.resume_from(model, opt.resume)
117
+
118
+ if opt.eval:
119
+ bundled_evaluate(
120
+ model, val_loaders, single_criterion, 0, writer, suffix="val", opt=opt
121
+ )
122
+ return
123
+
124
+ cprint("The training will last for {} epochs.".format(opt.epochs), "blue")
125
+ best_ensemble_image_f1 = -math.inf
126
+ for epoch in range(opt.epochs):
127
+ for title, dataloader in train_loaders.items():
128
+ train(
129
+ model,
130
+ dataloader,
131
+ title,
132
+ optimizer_dict,
133
+ bundled_criterion,
134
+ epoch,
135
+ writer,
136
+ suffix="train",
137
+ opt=opt,
138
+ )
139
+ for sched_idx, scheduler in enumerate(scheduler_dict.values()):
140
+ if sched_idx == 0 and writer is not None:
141
+ writer.add_scalar("lr", scheduler._get_lr(epoch)[0], epoch)
142
+ scheduler.step(epoch)
143
+
144
+ if (epoch + 1) % opt.eval_freq == 0 or epoch in [opt.epochs - 1]:
145
+ result = bundled_evaluate(
146
+ model,
147
+ val_loaders,
148
+ single_criterion,
149
+ epoch,
150
+ writer,
151
+ suffix="val",
152
+ opt=opt,
153
+ )
154
+ misc.save_model(
155
+ os.path.join(
156
+ opt.save_root_path, opt.dir_name, "checkpoint", f"{epoch}.pt"
157
+ ),
158
+ model,
159
+ epoch,
160
+ opt,
161
+ performance=result,
162
+ )
163
+ if result["image_f1/AVG_ensemble"] > best_ensemble_image_f1:
164
+ best_ensemble_image_f1 = result["image_f1/AVG_ensemble"]
165
+ misc.save_model(
166
+ os.path.join(
167
+ opt.save_root_path, opt.dir_name, "checkpoint", "best.pt"
168
+ ),
169
+ model,
170
+ epoch,
171
+ opt,
172
+ performance=result,
173
+ )
174
+ misc.update_record(result, epoch, opt, "best_record")
175
+ misc.update_record(result, epoch, opt, "latest_record")
176
+
177
+ print("best performance:", best_ensemble_image_f1)
178
+
179
+
180
+ if __name__ == "__main__":
181
+ opt = get_opt()
182
+
183
+ # import cProfile
184
+ # import pstats
185
+ # profiler = cProfile.Profile()
186
+ # profiler.enable()
187
+
188
+ st = datetime.datetime.now()
189
+ main(opt)
190
+ total_time = datetime.datetime.now() - st
191
+ total_time = str(datetime.timedelta(seconds=total_time.seconds))
192
+ print(f"Total time: {total_time}")
193
+
194
+ print("finished")
195
+
196
+ # profiler.disable()
197
+ # stats = pstats.Stats(profiler).sort_stats('cumtime')
198
+ # stats.strip_dirs()
199
+ # stats_name = f'cprofile-data{opt.suffix}'
200
+ # if not opt.debug and not opt.eval:
201
+ # stats_name = os.path.join(opt.save_root_path, opt.dir_name, stats_name)
202
+ # else:
203
+ # stats_name = os.path.join('tmp', stats_name)
204
+ # stats.dump_stats(stats_name)
models/__init__.py ADDED
@@ -0,0 +1,65 @@
1
+ import torch.nn as nn
2
+
3
+ from .bayar_conv import BayarConv2d
4
+ from .early_fusion_pre_filter import EarlyFusionPreFilter
5
+ from .ensemble_model import EnsembleModel
6
+ from .main_model import MainModel
7
+ from .models import ModelBuilder, SegmentationModule
8
+ from .srm_conv import SRMConv2d
9
+
10
+
11
+ def get_ensemble_model(opt):
12
+ models = {}
13
+ for modality in opt.modality:
14
+ models[modality] = get_single_modal_model(opt, modality)
15
+
16
+ ensemble_model = EnsembleModel(
17
+ models=models, mvc_single_weight=opt.mvc_single_weight
18
+ )
19
+ return ensemble_model
20
+
21
+
22
+ def get_single_modal_model(opt, modality):
23
+ encoder = ModelBuilder.build_encoder( # TODO check the implementation of FCN
24
+ arch=opt.encoder.lower(), fc_dim=opt.fc_dim, weights=opt.encoder_weight
25
+ )
26
+ decoder = ModelBuilder.build_decoder(
27
+ arch=opt.decoder.lower(),
28
+ fc_dim=opt.fc_dim,
29
+ weights=opt.decoder_weight,
30
+ num_class=opt.num_class,
31
+ dropout=opt.dropout,
32
+ fcn_up=opt.fcn_up,
33
+ )
34
+
35
+ if modality.lower() == "bayar":
36
+ pre_filter = BayarConv2d(
37
+ 3, 3, 5, stride=1, padding=2, magnitude=opt.bayar_magnitude
38
+ )
39
+ elif modality.lower() == "srm":
40
+ pre_filter = SRMConv2d(
41
+ stride=1, padding=2, clip=opt.srm_clip
42
+ ) # TODO check the implementation of SRM filter
43
+ elif modality.lower() == "rgb":
44
+ pre_filter = nn.Identity()
45
+ else: # early
46
+ pre_filter = EarlyFusionPreFilter(
47
+ bayar_magnitude=opt.bayar_magnitude, srm_clip=opt.srm_clip
48
+ )
49
+
50
+ model = MainModel(
51
+ encoder,
52
+ decoder,
53
+ opt.fc_dim,
54
+ opt.volume_block_idx,
55
+ opt.share_embed_head,
56
+ pre_filter,
57
+ opt.gem,
58
+ opt.gem_coef,
59
+ opt.gsm,
60
+ opt.map_portion,
61
+ opt.otsu_sel,
62
+ opt.otsu_portion,
63
+ )
64
+
65
+ return model
models/bayar_conv.py ADDED
@@ -0,0 +1,67 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ from einops import rearrange
4
+
5
+
6
+ class BayarConv2d(nn.Module):
7
+ def __init__(
8
+ self,
9
+ in_channels: int,
10
+ out_channels: int,
11
+ kernel_size: int = 5,
12
+ stride: int = 1,
13
+ padding: int = 0,
14
+ magnitude: float = 1.0,
15
+ ):
16
+ super().__init__()
17
+ assert kernel_size > 1, "Bayar conv kernel size must be greater than 1"
18
+
19
+ self.in_channels = in_channels
20
+ self.out_channels = out_channels
21
+ self.kernel_size = kernel_size
22
+ self.stride = stride
23
+ self.padding = padding
24
+ self.magnitude = magnitude
25
+
26
+ self.center_weight = nn.Parameter(
27
+ torch.ones(self.in_channels, self.out_channels, 1) * -1.0 * magnitude,
28
+ requires_grad=False,
29
+ )
30
+ self.kernel_weight = nn.Parameter(
31
+ torch.rand((self.in_channels, self.out_channels, kernel_size**2 - 1)),
32
+ requires_grad=True,
33
+ )
34
+
35
+ def _constraint_weight(self):
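+ # Bayar constraint: the learnable off-center weights are renormalized to sum
+ # to `magnitude` while the fixed center weight is -magnitude, so each kernel
+ # sums to zero and acts as a learned high-pass (prediction-error) filter.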
36
+ self.kernel_weight.data = self.kernel_weight.permute(2, 0, 1)
37
+ self.kernel_weight.data = torch.div(
38
+ self.kernel_weight.data, self.kernel_weight.data.sum(0)
39
+ )
40
+ self.kernel_weight.data = self.kernel_weight.permute(1, 2, 0) * self.magnitude
41
+ center_idx = self.kernel_size**2 // 2
42
+ full_kernel = torch.cat(
43
+ [
44
+ self.kernel_weight[:, :, :center_idx],
45
+ self.center_weight,
46
+ self.kernel_weight[:, :, center_idx:],
47
+ ],
48
+ dim=2,
49
+ )
50
+ full_kernel = rearrange(
51
+ full_kernel, "ci co (kw kh) -> ci co kw kh", kw=self.kernel_size
52
+ )
53
+ return full_kernel
54
+
55
+ def forward(self, x):
56
+ x = nn.functional.conv2d(
57
+ x, self._constraint_weight(), stride=self.stride, padding=self.padding
58
+ )
59
+ return x
60
+
61
+
62
+ if __name__ == "__main__":
63
+ device = "cuda"
64
+ bayar_conv2d = BayarConv2d(3, 3, 3, magnitude=1).to(device)
65
+ bayar_conv2d._constraint_weight()
66
+ i = torch.rand(16, 3, 16, 16).to(device)
67
+ o = bayar_conv2d(i)
models/early_fusion_pre_filter.py ADDED
@@ -0,0 +1,25 @@
1
+ import torch
2
+ import torch.nn as nn
3
+
4
+ from .bayar_conv import BayarConv2d
5
+ from .srm_conv import SRMConv2d
6
+
7
+
8
+ class EarlyFusionPreFilter(nn.Module):
9
+ def __init__(self, bayar_magnitude: float, srm_clip: float):
10
+ super().__init__()
11
+ self.bayar_filter = BayarConv2d(
12
+ 3, 3, 5, stride=1, padding=2, magnitude=bayar_magnitude
13
+ )
14
+ self.srm_filter = SRMConv2d(stride=1, padding=2, clip=srm_clip)
15
+ self.rgb_filter = nn.Identity()
16
+ self.map = nn.Conv2d(9, 3, 1, stride=1, padding=0)
17
+
18
+ def forward(self, x):
19
+ x_bayar = self.bayar_filter(x)
20
+ x_srm = self.srm_filter(x)
21
+ x_rgb = self.rgb_filter(x)
22
+
23
+ x_concat = torch.cat([x_bayar, x_srm, x_rgb], dim=1)
24
+ x_concat = self.map(x_concat)
25
+ return x_concat
models/ensemble_model.py ADDED
@@ -0,0 +1,32 @@
1
+ from typing import Dict, List
2
+
3
+ import torch
4
+ import torch.nn as nn
5
+
6
+
7
+ class EnsembleModel(nn.Module):
8
+ def __init__(self, models: Dict, mvc_single_weight: Dict):
9
+ super().__init__()
10
+
11
+ self.sub_models = nn.ModuleDict(models)
12
+ self.modality = list(self.sub_models.keys())
13
+ self.mvc_single_weight = mvc_single_weight
14
+ for k, v in self.mvc_single_weight.items():
15
+ assert 0 <= v <= 1, "The weight of {} for {} is out of range".format(v, k)
16
+
17
+ def forward(self, image, seg_size=None):
18
+ result = {}
19
+ for modality in self.modality:
20
+ result[modality] = self.sub_models[modality](image, seg_size)
21
+
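+ # The "ensemble" entry is a weighted sum of every modality's outputs, using
+ # the per-modality weights in `mvc_single_weight`.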
22
+ avg_result = {}
23
+ for k in result[self.modality[0]].keys():
24
+ avg_result[k] = torch.zeros_like(result[self.modality[0]][k])
25
+ for modality in self.modality:
26
+ avg_result[k] = (
27
+ avg_result[k]
28
+ + self.mvc_single_weight[modality] * result[modality][k]
29
+ )
30
+ result["ensemble"] = avg_result
31
+
32
+ return result
models/hrnet.py ADDED
@@ -0,0 +1,537 @@
1
+ """
2
+ This HRNet implementation is modified from the following repository:
3
+ https://github.com/HRNet/HRNet-Semantic-Segmentation
4
+ """
5
+
6
+ import logging
7
+
8
+ import torch
9
+ import torch.nn as nn
10
+ import torch.nn.functional as F
11
+
12
+ from .lib.nn import SynchronizedBatchNorm2d
13
+ from .utils import load_url
14
+
15
+ BatchNorm2d = SynchronizedBatchNorm2d
16
+ BN_MOMENTUM = 0.1
17
+ logger = logging.getLogger(__name__)
18
+
19
+
20
+ __all__ = ["hrnetv2"]
21
+
22
+
23
+ model_urls = {
24
+ "hrnetv2": "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/hrnetv2_w48-imagenet.pth",
25
+ }
26
+
27
+
28
+ def conv3x3(in_planes, out_planes, stride=1):
29
+ """3x3 convolution with padding"""
30
+ return nn.Conv2d(
31
+ in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False
32
+ )
33
+
34
+
35
+ class BasicBlock(nn.Module):
36
+ expansion = 1
37
+
38
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
39
+ super(BasicBlock, self).__init__()
40
+ self.conv1 = conv3x3(inplanes, planes, stride)
41
+ self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
42
+ self.relu = nn.ReLU(inplace=True)
43
+ self.conv2 = conv3x3(planes, planes)
44
+ self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
45
+ self.downsample = downsample
46
+ self.stride = stride
47
+
48
+ def forward(self, x):
49
+ residual = x
50
+
51
+ out = self.conv1(x)
52
+ out = self.bn1(out)
53
+ out = self.relu(out)
54
+
55
+ out = self.conv2(out)
56
+ out = self.bn2(out)
57
+
58
+ if self.downsample is not None:
59
+ residual = self.downsample(x)
60
+
61
+ out += residual
62
+ out = self.relu(out)
63
+
64
+ return out
65
+
66
+
67
+ class Bottleneck(nn.Module):
68
+ expansion = 4
69
+
70
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
71
+ super(Bottleneck, self).__init__()
72
+ self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
73
+ self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
74
+ self.conv2 = nn.Conv2d(
75
+ planes, planes, kernel_size=3, stride=stride, padding=1, bias=False
76
+ )
77
+ self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
78
+ self.conv3 = nn.Conv2d(
79
+ planes, planes * self.expansion, kernel_size=1, bias=False
80
+ )
81
+ self.bn3 = BatchNorm2d(planes * self.expansion, momentum=BN_MOMENTUM)
82
+ self.relu = nn.ReLU(inplace=True)
83
+ self.downsample = downsample
84
+ self.stride = stride
85
+
86
+ def forward(self, x):
87
+ residual = x
88
+
89
+ out = self.conv1(x)
90
+ out = self.bn1(out)
91
+ out = self.relu(out)
92
+
93
+ out = self.conv2(out)
94
+ out = self.bn2(out)
95
+ out = self.relu(out)
96
+
97
+ out = self.conv3(out)
98
+ out = self.bn3(out)
99
+
100
+ if self.downsample is not None:
101
+ residual = self.downsample(x)
102
+
103
+ out += residual
104
+ out = self.relu(out)
105
+
106
+ return out
107
+
108
+
109
+ class HighResolutionModule(nn.Module):
110
+ def __init__(
111
+ self,
112
+ num_branches,
113
+ blocks,
114
+ num_blocks,
115
+ num_inchannels,
116
+ num_channels,
117
+ fuse_method,
118
+ multi_scale_output=True,
119
+ ):
120
+ super(HighResolutionModule, self).__init__()
121
+ self._check_branches(
122
+ num_branches, blocks, num_blocks, num_inchannels, num_channels
123
+ )
124
+
125
+ self.num_inchannels = num_inchannels
126
+ self.fuse_method = fuse_method
127
+ self.num_branches = num_branches
128
+
129
+ self.multi_scale_output = multi_scale_output
130
+
131
+ self.branches = self._make_branches(
132
+ num_branches, blocks, num_blocks, num_channels
133
+ )
134
+ self.fuse_layers = self._make_fuse_layers()
135
+ self.relu = nn.ReLU(inplace=True)
136
+
137
+ def _check_branches(
138
+ self, num_branches, blocks, num_blocks, num_inchannels, num_channels
139
+ ):
140
+ if num_branches != len(num_blocks):
141
+ error_msg = "NUM_BRANCHES({}) <> NUM_BLOCKS({})".format(
142
+ num_branches, len(num_blocks)
143
+ )
144
+ logger.error(error_msg)
145
+ raise ValueError(error_msg)
146
+
147
+ if num_branches != len(num_channels):
148
+ error_msg = "NUM_BRANCHES({}) <> NUM_CHANNELS({})".format(
149
+ num_branches, len(num_channels)
150
+ )
151
+ logger.error(error_msg)
152
+ raise ValueError(error_msg)
153
+
154
+ if num_branches != len(num_inchannels):
155
+ error_msg = "NUM_BRANCHES({}) <> NUM_INCHANNELS({})".format(
156
+ num_branches, len(num_inchannels)
157
+ )
158
+ logger.error(error_msg)
159
+ raise ValueError(error_msg)
160
+
161
+ def _make_one_branch(self, branch_index, block, num_blocks, num_channels, stride=1):
162
+ downsample = None
163
+ if (
164
+ stride != 1
165
+ or self.num_inchannels[branch_index]
166
+ != num_channels[branch_index] * block.expansion
167
+ ):
168
+ downsample = nn.Sequential(
169
+ nn.Conv2d(
170
+ self.num_inchannels[branch_index],
171
+ num_channels[branch_index] * block.expansion,
172
+ kernel_size=1,
173
+ stride=stride,
174
+ bias=False,
175
+ ),
176
+ BatchNorm2d(
177
+ num_channels[branch_index] * block.expansion, momentum=BN_MOMENTUM
178
+ ),
179
+ )
180
+
181
+ layers = []
182
+ layers.append(
183
+ block(
184
+ self.num_inchannels[branch_index],
185
+ num_channels[branch_index],
186
+ stride,
187
+ downsample,
188
+ )
189
+ )
190
+ self.num_inchannels[branch_index] = num_channels[branch_index] * block.expansion
191
+ for i in range(1, num_blocks[branch_index]):
192
+ layers.append(
193
+ block(self.num_inchannels[branch_index], num_channels[branch_index])
194
+ )
195
+
196
+ return nn.Sequential(*layers)
197
+
198
+ def _make_branches(self, num_branches, block, num_blocks, num_channels):
199
+ branches = []
200
+
201
+ for i in range(num_branches):
202
+ branches.append(self._make_one_branch(i, block, num_blocks, num_channels))
203
+
204
+ return nn.ModuleList(branches)
205
+
206
+ def _make_fuse_layers(self):
207
+ if self.num_branches == 1:
208
+ return None
209
+
210
+ num_branches = self.num_branches
211
+ num_inchannels = self.num_inchannels
212
+ fuse_layers = []
213
+ for i in range(num_branches if self.multi_scale_output else 1):
214
+ fuse_layer = []
215
+ for j in range(num_branches):
216
+ if j > i:
217
+ fuse_layer.append(
218
+ nn.Sequential(
219
+ nn.Conv2d(
220
+ num_inchannels[j],
221
+ num_inchannels[i],
222
+ 1,
223
+ 1,
224
+ 0,
225
+ bias=False,
226
+ ),
227
+ BatchNorm2d(num_inchannels[i], momentum=BN_MOMENTUM),
228
+ )
229
+ )
230
+ elif j == i:
231
+ fuse_layer.append(None)
232
+ else:
233
+ conv3x3s = []
234
+ for k in range(i - j):
235
+ if k == i - j - 1:
236
+ num_outchannels_conv3x3 = num_inchannels[i]
237
+ conv3x3s.append(
238
+ nn.Sequential(
239
+ nn.Conv2d(
240
+ num_inchannels[j],
241
+ num_outchannels_conv3x3,
242
+ 3,
243
+ 2,
244
+ 1,
245
+ bias=False,
246
+ ),
247
+ BatchNorm2d(
248
+ num_outchannels_conv3x3, momentum=BN_MOMENTUM
249
+ ),
250
+ )
251
+ )
252
+ else:
253
+ num_outchannels_conv3x3 = num_inchannels[j]
254
+ conv3x3s.append(
255
+ nn.Sequential(
256
+ nn.Conv2d(
257
+ num_inchannels[j],
258
+ num_outchannels_conv3x3,
259
+ 3,
260
+ 2,
261
+ 1,
262
+ bias=False,
263
+ ),
264
+ BatchNorm2d(
265
+ num_outchannels_conv3x3, momentum=BN_MOMENTUM
266
+ ),
267
+ nn.ReLU(inplace=True),
268
+ )
269
+ )
270
+ fuse_layer.append(nn.Sequential(*conv3x3s))
271
+ fuse_layers.append(nn.ModuleList(fuse_layer))
272
+
273
+ return nn.ModuleList(fuse_layers)
274
+
275
+ def get_num_inchannels(self):
276
+ return self.num_inchannels
277
+
278
+ def forward(self, x):
279
+ if self.num_branches == 1:
280
+ return [self.branches[0](x[0])]
281
+
282
+ for i in range(self.num_branches):
283
+ x[i] = self.branches[i](x[i])
284
+
285
+ x_fuse = []
286
+ for i in range(len(self.fuse_layers)):
287
+ y = x[0] if i == 0 else self.fuse_layers[i][0](x[0])
288
+ for j in range(1, self.num_branches):
289
+ if i == j:
290
+ y = y + x[j]
291
+ elif j > i:
292
+ width_output = x[i].shape[-1]
293
+ height_output = x[i].shape[-2]
294
+ y = y + F.interpolate(
295
+ self.fuse_layers[i][j](x[j]),
296
+ size=(height_output, width_output),
297
+ mode="bilinear",
298
+ align_corners=False,
299
+ )
300
+ else:
301
+ y = y + self.fuse_layers[i][j](x[j])
302
+ x_fuse.append(self.relu(y))
303
+
304
+ return x_fuse
305
+
306
+
307
+ blocks_dict = {"BASIC": BasicBlock, "BOTTLENECK": Bottleneck}
308
+
309
+
310
+ class HRNetV2(nn.Module):
311
+ def __init__(self, n_class, **kwargs):
312
+ super(HRNetV2, self).__init__()
313
+ extra = {
314
+ "STAGE2": {
315
+ "NUM_MODULES": 1,
316
+ "NUM_BRANCHES": 2,
317
+ "BLOCK": "BASIC",
318
+ "NUM_BLOCKS": (4, 4),
319
+ "NUM_CHANNELS": (48, 96),
320
+ "FUSE_METHOD": "SUM",
321
+ },
322
+ "STAGE3": {
323
+ "NUM_MODULES": 4,
324
+ "NUM_BRANCHES": 3,
325
+ "BLOCK": "BASIC",
326
+ "NUM_BLOCKS": (4, 4, 4),
327
+ "NUM_CHANNELS": (48, 96, 192),
328
+ "FUSE_METHOD": "SUM",
329
+ },
330
+ "STAGE4": {
331
+ "NUM_MODULES": 3,
332
+ "NUM_BRANCHES": 4,
333
+ "BLOCK": "BASIC",
334
+ "NUM_BLOCKS": (4, 4, 4, 4),
335
+ "NUM_CHANNELS": (48, 96, 192, 384),
336
+ "FUSE_METHOD": "SUM",
337
+ },
338
+ "FINAL_CONV_KERNEL": 1,
339
+ }
340
+
341
+ # stem net
342
+ self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, bias=False)
343
+ self.bn1 = BatchNorm2d(64, momentum=BN_MOMENTUM)
344
+ self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False)
345
+ self.bn2 = BatchNorm2d(64, momentum=BN_MOMENTUM)
346
+ self.relu = nn.ReLU(inplace=True)
347
+
348
+ self.layer1 = self._make_layer(Bottleneck, 64, 64, 4)
349
+
350
+ self.stage2_cfg = extra["STAGE2"]
351
+ num_channels = self.stage2_cfg["NUM_CHANNELS"]
352
+ block = blocks_dict[self.stage2_cfg["BLOCK"]]
353
+ num_channels = [
354
+ num_channels[i] * block.expansion for i in range(len(num_channels))
355
+ ]
356
+ self.transition1 = self._make_transition_layer([256], num_channels)
357
+ self.stage2, pre_stage_channels = self._make_stage(
358
+ self.stage2_cfg, num_channels
359
+ )
360
+
361
+ self.stage3_cfg = extra["STAGE3"]
362
+ num_channels = self.stage3_cfg["NUM_CHANNELS"]
363
+ block = blocks_dict[self.stage3_cfg["BLOCK"]]
364
+ num_channels = [
365
+ num_channels[i] * block.expansion for i in range(len(num_channels))
366
+ ]
367
+ self.transition2 = self._make_transition_layer(pre_stage_channels, num_channels)
368
+ self.stage3, pre_stage_channels = self._make_stage(
369
+ self.stage3_cfg, num_channels
370
+ )
371
+
372
+ self.stage4_cfg = extra["STAGE4"]
373
+ num_channels = self.stage4_cfg["NUM_CHANNELS"]
374
+ block = blocks_dict[self.stage4_cfg["BLOCK"]]
375
+ num_channels = [
376
+ num_channels[i] * block.expansion for i in range(len(num_channels))
377
+ ]
378
+ self.transition3 = self._make_transition_layer(pre_stage_channels, num_channels)
379
+ self.stage4, pre_stage_channels = self._make_stage(
380
+ self.stage4_cfg, num_channels, multi_scale_output=True
381
+ )
382
+
383
+ def _make_transition_layer(self, num_channels_pre_layer, num_channels_cur_layer):
384
+ num_branches_cur = len(num_channels_cur_layer)
385
+ num_branches_pre = len(num_channels_pre_layer)
386
+
387
+ transition_layers = []
388
+ for i in range(num_branches_cur):
389
+ if i < num_branches_pre:
390
+ if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
391
+ transition_layers.append(
392
+ nn.Sequential(
393
+ nn.Conv2d(
394
+ num_channels_pre_layer[i],
395
+ num_channels_cur_layer[i],
396
+ 3,
397
+ 1,
398
+ 1,
399
+ bias=False,
400
+ ),
401
+ BatchNorm2d(
402
+ num_channels_cur_layer[i], momentum=BN_MOMENTUM
403
+ ),
404
+ nn.ReLU(inplace=True),
405
+ )
406
+ )
407
+ else:
408
+ transition_layers.append(None)
409
+ else:
410
+ conv3x3s = []
411
+ for j in range(i + 1 - num_branches_pre):
412
+ inchannels = num_channels_pre_layer[-1]
413
+ outchannels = (
414
+ num_channels_cur_layer[i]
415
+ if j == i - num_branches_pre
416
+ else inchannels
417
+ )
418
+ conv3x3s.append(
419
+ nn.Sequential(
420
+ nn.Conv2d(inchannels, outchannels, 3, 2, 1, bias=False),
421
+ BatchNorm2d(outchannels, momentum=BN_MOMENTUM),
422
+ nn.ReLU(inplace=True),
423
+ )
424
+ )
425
+ transition_layers.append(nn.Sequential(*conv3x3s))
426
+
427
+ return nn.ModuleList(transition_layers)
428
+
429
+ def _make_layer(self, block, inplanes, planes, blocks, stride=1):
430
+ downsample = None
431
+ if stride != 1 or inplanes != planes * block.expansion:
432
+ downsample = nn.Sequential(
433
+ nn.Conv2d(
434
+ inplanes,
435
+ planes * block.expansion,
436
+ kernel_size=1,
437
+ stride=stride,
438
+ bias=False,
439
+ ),
440
+ BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
441
+ )
442
+
443
+ layers = []
444
+ layers.append(block(inplanes, planes, stride, downsample))
445
+ inplanes = planes * block.expansion
446
+ for i in range(1, blocks):
447
+ layers.append(block(inplanes, planes))
448
+
449
+ return nn.Sequential(*layers)
450
+
451
+ def _make_stage(self, layer_config, num_inchannels, multi_scale_output=True):
452
+ num_modules = layer_config["NUM_MODULES"]
453
+ num_branches = layer_config["NUM_BRANCHES"]
454
+ num_blocks = layer_config["NUM_BLOCKS"]
455
+ num_channels = layer_config["NUM_CHANNELS"]
456
+ block = blocks_dict[layer_config["BLOCK"]]
457
+ fuse_method = layer_config["FUSE_METHOD"]
458
+
459
+ modules = []
460
+ for i in range(num_modules):
461
+ # multi_scale_output is only used by the last module
462
+ if not multi_scale_output and i == num_modules - 1:
463
+ reset_multi_scale_output = False
464
+ else:
465
+ reset_multi_scale_output = True
466
+ modules.append(
467
+ HighResolutionModule(
468
+ num_branches,
469
+ block,
470
+ num_blocks,
471
+ num_inchannels,
472
+ num_channels,
473
+ fuse_method,
474
+ reset_multi_scale_output,
475
+ )
476
+ )
477
+ num_inchannels = modules[-1].get_num_inchannels()
478
+
479
+ return nn.Sequential(*modules), num_inchannels
480
+
481
+ def forward(self, x, return_feature_maps=False):
482
+ x = self.conv1(x)
483
+ x = self.bn1(x)
484
+ x = self.relu(x)
485
+ x = self.conv2(x)
486
+ x = self.bn2(x)
487
+ x = self.relu(x)
488
+ x = self.layer1(x)
489
+
490
+ x_list = []
491
+ for i in range(self.stage2_cfg["NUM_BRANCHES"]):
492
+ if self.transition1[i] is not None:
493
+ x_list.append(self.transition1[i](x))
494
+ else:
495
+ x_list.append(x)
496
+ y_list = self.stage2(x_list)
497
+
498
+ x_list = []
499
+ for i in range(self.stage3_cfg["NUM_BRANCHES"]):
500
+ if self.transition2[i] is not None:
501
+ x_list.append(self.transition2[i](y_list[-1]))
502
+ else:
503
+ x_list.append(y_list[i])
504
+ y_list = self.stage3(x_list)
505
+
506
+ x_list = []
507
+ for i in range(self.stage4_cfg["NUM_BRANCHES"]):
508
+ if self.transition3[i] is not None:
509
+ x_list.append(self.transition3[i](y_list[-1]))
510
+ else:
511
+ x_list.append(y_list[i])
512
+ x = self.stage4(x_list)
513
+
514
+ # Upsampling
515
+ x0_h, x0_w = x[0].size(2), x[0].size(3)
516
+ x1 = F.interpolate(
517
+ x[1], size=(x0_h, x0_w), mode="bilinear", align_corners=False
518
+ )
519
+ x2 = F.interpolate(
520
+ x[2], size=(x0_h, x0_w), mode="bilinear", align_corners=False
521
+ )
522
+ x3 = F.interpolate(
523
+ x[3], size=(x0_h, x0_w), mode="bilinear", align_corners=False
524
+ )
525
+
526
+ x = torch.cat([x[0], x1, x2, x3], 1)
527
+
528
+ # x = self.last_layer(x)
529
+ return [x]
530
+
531
+
532
+ def hrnetv2(pretrained=False, **kwargs):
533
+ model = HRNetV2(n_class=1000, **kwargs)
534
+ if pretrained:
535
+ model.load_state_dict(load_url(model_urls["hrnetv2"]), strict=False)
536
+
537
+ return model
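
As a quick sanity check on the encoder above: `hrnetv2()` fuses the four branch resolutions into a single feature map at 1/4 input resolution with 48 + 96 + 192 + 384 = 720 channels. A minimal sketch, assuming the file is importable as `models.hrnet` and that its earlier imports (omitted from this excerpt) are in place:

```python
# Minimal sketch (not part of the release code): instantiate the HRNetV2 encoder
# defined above and inspect its fused output.
import torch

from models.hrnet import hrnetv2  # assumed import path

encoder = hrnetv2(pretrained=False)
x = torch.randn(1, 3, 256, 256)
feats = encoder(x, return_feature_maps=True)  # forward returns a one-element list
print(feats[0].shape)  # expected: torch.Size([1, 720, 64, 64]), i.e. 1/4 resolution
```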
models/main_model.py ADDED
@@ -0,0 +1,290 @@
1
+ from typing import Optional
2
+
3
+ import torch
4
+ import torch.nn as nn
5
+ from einops import rearrange
6
+
7
+
8
+ class MainModel(nn.Module):
9
+ def __init__(
10
+ self,
11
+ encoder,
12
+ decoder,
13
+ fc_dim: int,
14
+ volume_block_idx: int,
15
+ share_embed_head: bool,
16
+ pre_filter=None,
17
+ use_gem: bool = False,
18
+ gem_coef: Optional[float] = None,
19
+ use_gsm: bool = False,
20
+ map_portion: float = 0,
21
+ otsu_sel: bool = False,
22
+ otsu_portion: float = 1,
23
+ ):
24
+ super().__init__()
25
+ self.encoder = encoder
26
+ self.decoder = decoder
27
+ self.use_gem = use_gem
28
+ self.gem_coef = gem_coef
29
+ self.use_gsm = use_gsm
30
+ self.map_portion = map_portion
31
+ assert self.map_portion <= 0.5, "map_portion must be at most 0.5"
32
+ self.otsu_sel = otsu_sel
33
+ self.otsu_portion = otsu_portion
34
+
35
+ self.volume_block_idx = volume_block_idx
36
+ volume_in_channel = int(fc_dim * (2 ** (self.volume_block_idx - 3)))
37
+ volume_out_channel = volume_in_channel // 2
38
+
39
+ self.scale = volume_out_channel**0.5
40
+ self.share_embed_head = share_embed_head
41
+ self.proj_head1 = nn.Sequential(
42
+ nn.Conv2d(
43
+ volume_in_channel, volume_in_channel, kernel_size=1, stride=1, padding=0
44
+ ),
45
+ nn.LeakyReLU(),
46
+ nn.Conv2d(
47
+ volume_in_channel,
48
+ volume_out_channel,
49
+ kernel_size=1,
50
+ stride=1,
51
+ padding=0,
52
+ ),
53
+ )
54
+ if not share_embed_head:
55
+ self.proj_head2 = nn.Sequential(
56
+ nn.Conv2d(
57
+ volume_in_channel,
58
+ volume_in_channel,
59
+ kernel_size=1,
60
+ stride=1,
61
+ padding=0,
62
+ ),
63
+ nn.LeakyReLU(),
64
+ nn.Conv2d(
65
+ volume_in_channel,
66
+ volume_out_channel,
67
+ kernel_size=1,
68
+ stride=1,
69
+ padding=0,
70
+ ),
71
+ )
72
+
73
+ self.pre_filter = pre_filter
74
+
75
+ def forward(self, image, seg_size=None):
76
+ """
77
+ the output map is returned after a sigmoid
78
+ the consistency volume is also returned after a sigmoid (0 = consistent, 1 = inconsistent)
79
+ """
80
+ bs = image.shape[0]
81
+ if self.pre_filter is not None:
82
+ image = self.pre_filter(image)
83
+
84
+ # get output map
85
+ encoder_feature = self.encoder(image, return_feature_maps=True)
86
+ output_map = self.decoder(encoder_feature, segSize=seg_size)
87
+ output_map = output_map.sigmoid()
88
+ # b, _, h, w = output_map.shape
89
+
90
+ # get image-level prediction
91
+ if self.use_gem:
92
+ mh, mw = output_map.shape[-2:]
93
+ image_pred = output_map.flatten(1)
94
+ image_pred = torch.linalg.norm(image_pred, ord=self.gem_coef, dim=1)
95
+ image_pred = image_pred / (mh * mw)
96
+ elif self.use_gsm:
97
+ image_pred = output_map.flatten(1)
98
+ weight = project_onto_l1_ball(image_pred, 1.0)
99
+ image_pred = (image_pred * weight).sum(1)
100
+ else:
101
+ if self.otsu_sel:
102
+ n_pixel = output_map.shape[-1] * output_map.shape[-2]
103
+ image_pred = output_map.flatten(1)
104
+ image_pred, _ = torch.sort(image_pred, dim=1)
105
+ tmp = []
106
+ for b in range(bs):
107
+ num_otsu_sel = get_otsu_k(image_pred[b, ...], sorted=True)
108
+ num_otsu_sel = max(num_otsu_sel, n_pixel // 2 + 1)
109
+ tpk = int(max(1, (n_pixel - num_otsu_sel) * self.otsu_portion))
110
+ topk_output = torch.topk(image_pred[b, ...], k=tpk, dim=0)[0]
111
+ tmp.append(topk_output.mean())
112
+ image_pred = torch.stack(tmp)
113
+ else:
114
+ if self.map_portion == 0:
115
+ image_pred = nn.functional.max_pool2d(
116
+ output_map, kernel_size=output_map.shape[-2:]
117
+ )
118
+ image_pred = image_pred.squeeze(1).squeeze(1).squeeze(1)
119
+ else:
120
+ n_pixel = output_map.shape[-1] * output_map.shape[-2]
121
+ k = int(max(1, int(self.map_portion * n_pixel)))
122
+ topk_output = torch.topk(output_map.flatten(1), k, dim=1)[0]
123
+ image_pred = topk_output.mean(1)
124
+
125
+ if seg_size is not None:
126
+ output_map = nn.functional.interpolate(
127
+ output_map, size=seg_size, mode="bilinear", align_corners=False
128
+ )
129
+ output_map = output_map.clamp(0, 1)
130
+
131
+ # compute consistency volume, 0 for consistency, and 1 for inconsistency
132
+ feature_map1 = self.proj_head1(encoder_feature[self.volume_block_idx])
133
+ if not self.share_embed_head:
134
+ feature_map2 = self.proj_head2(encoder_feature[self.volume_block_idx])
135
+ else:
136
+ feature_map2 = feature_map1.clone()
137
+ b, c, h, w = feature_map1.shape
138
+ feature_map1 = rearrange(feature_map1, "b c h w -> b c (h w)")
139
+ feature_map2 = rearrange(feature_map2, "b c h w -> b c (h w)")
140
+ consistency_volume = torch.bmm(feature_map1.transpose(-1, -2), feature_map2)
141
+ consistency_volume = rearrange(
142
+ consistency_volume, "b (h1 w1) (h2 w2) -> b h1 w1 h2 w2", h1=h, h2=h
143
+ )
144
+ consistency_volume = consistency_volume / self.scale
145
+ consistency_volume = 1 - consistency_volume.sigmoid()
146
+
147
+ vh, vw = consistency_volume.shape[-2:]
148
+ if self.use_gem:
149
+ volume_image_pred = consistency_volume.flatten(1)
150
+ volume_image_pred = torch.linalg.norm(
151
+ volume_image_pred, ord=self.gem_coef, dim=1
152
+ )
153
+ volume_image_pred = volume_image_pred / (vh * vw * vh * vw)
154
+ elif self.use_gsm:
155
+ volume_image_pred = consistency_volume.flatten(1)
156
+ weight = project_onto_l1_ball(volume_image_pred, 1.0)
157
+ volume_image_pred = (volume_image_pred * weight).sum(1)
158
+ else:
159
+ # FIXME skip Otsu's selection on volume due to its slowness
160
+ # if self.otsu_sel:
161
+ # n_ele = vh * vw * vh * vw
162
+ # volume_image_pred = consistency_volume.flatten(1)
163
+ # volume_image_pred, _ = torch.sort(volume_image_pred, dim=1)
164
+ # tmp = []
165
+ # for b in range(bs):
166
+ # num_otsu_sel = get_otsu_k(volume_image_pred[b, ...], sorted=True)
167
+ # num_otsu_sel = max(num_otsu_sel, n_ele // 2 + 1)
168
+ # tpk = int(max(1, (n_ele - num_otsu_sel) * self.otsu_portion))
169
+ # topk_output = torch.topk(volume_image_pred[b, ...], k=tpk, dim=0)[0]
170
+ # tmp.append(topk_output.mean())
171
+ # volume_image_pred = torch.stack(tmp)
172
+ # else:
173
+ if self.map_portion == 0:
174
+ volume_image_pred = torch.max(consistency_volume.flatten(1), dim=1)[0]
175
+ else:
176
+ n_ele = vh * vw * vh * vw
177
+ k = int(max(1, int(self.map_portion * n_ele)))
178
+ topk_output = torch.topk(consistency_volume.flatten(1), k, dim=1)[0]
179
+ volume_image_pred = topk_output.mean(1)
180
+
181
+ return {
182
+ "out_map": output_map,
183
+ "map_pred": image_pred,
184
+ "out_vol": consistency_volume,
185
+ "vol_pred": volume_image_pred,
186
+ }
187
+
188
+
189
+ def project_onto_l1_ball(x, eps):
190
+ """
191
+ Compute Euclidean projection onto the L1 ball for a batch.
192
+
193
+ min ||x - u||_2 s.t. ||u||_1 <= eps
194
+
195
+ Inspired by the corresponding numpy version by Adrien Gaidon.
196
+
197
+ Parameters
198
+ ----------
199
+ x: (batch_size, *) torch array
200
+ batch of arbitrary-size tensors to project, possibly on GPU
201
+
202
+ eps: float
203
+ radius of l-1 ball to project onto
204
+
205
+ Returns
206
+ -------
207
+ u: (batch_size, *) torch array
208
+ batch of projected tensors, reshaped to match the original
209
+
210
+ Notes
211
+ -----
212
+ The complexity of this algorithm is in O(dlogd) as it involves sorting x.
213
+
214
+ References
215
+ ----------
216
+ [1] Efficient Projections onto the l1-Ball for Learning in High Dimensions
217
+ John Duchi, Shai Shalev-Shwartz, Yoram Singer, and Tushar Chandra.
218
+ International Conference on Machine Learning (ICML 2008)
219
+ """
220
+ with torch.no_grad():
221
+ original_shape = x.shape
222
+ x = x.view(x.shape[0], -1)
223
+ mask = (torch.norm(x, p=1, dim=1) < eps).float().unsqueeze(1)
224
+ mu, _ = torch.sort(torch.abs(x), dim=1, descending=True)
225
+ cumsum = torch.cumsum(mu, dim=1)
226
+ arange = torch.arange(1, x.shape[1] + 1, device=x.device)
227
+ rho, _ = torch.max((mu * arange > (cumsum - eps)) * arange, dim=1)
228
+ theta = (cumsum[torch.arange(x.shape[0]), rho.cpu() - 1] - eps) / rho
229
+ proj = (torch.abs(x) - theta.unsqueeze(1)).clamp(min=0)
230
+ x = mask * x + (1 - mask) * proj * torch.sign(x)
231
+ x = x.view(original_shape)
232
+ return x
233
+
234
+
235
+ def get_otsu_k(attention, return_value=False, sorted=False):
236
+ def _get_weighted_var(seq, pivot: int):
237
+ # seq is of shape [t], in ascending order
238
+ length = seq.shape[0]
239
+ wb = pivot / length
240
+ vb = seq[:pivot].var()
241
+ wf = 1 - pivot / length
242
+ vf = seq[pivot:].var()
243
+ return wb * vb + wf * vf
244
+
245
+ # attention shape: t
246
+ # TODO use half
247
+ length = attention.shape[0]
248
+ if length == 1:
249
+ return 0
250
+ elif length == 2:
251
+ return 1
252
+ if not sorted:
253
+ attention, _ = torch.sort(attention)
254
+ optimal_i = length // 2
255
+ min_intra_class_var = _get_weighted_var(attention, optimal_i)
256
+
257
+ # for i in range(1, length):
258
+ # intra_class_var = _get_weighted_var(attention, i)
259
+ # if intra_class_var < min_intra_class_var:
260
+ # min_intra_class_var = intra_class_var
261
+ # optimal_i = i
262
+
263
+ got_it = False
264
+ # look left
265
+ for i in range(optimal_i - 1, 0, -1):
266
+ intra_class_var = _get_weighted_var(attention, i)
267
+ if intra_class_var > min_intra_class_var:
268
+ break
269
+ else:
270
+ min_intra_class_var = intra_class_var
271
+ optimal_i = i
272
+ got_it = True
273
+ # look right
274
+ if not got_it:
275
+ for i in range(optimal_i + 1, length):
276
+ intra_class_var = _get_weighted_var(attention, i)
277
+ if intra_class_var > min_intra_class_var:
278
+ break
279
+ else:
280
+ min_intra_class_var = intra_class_var
281
+ optimal_i = i
282
+
283
+ if return_value:
284
+ return attention[optimal_i]
285
+ else:
286
+ return optimal_i
287
+
288
+
289
+ if __name__ == "__main__":
290
+ model = MainModel(None, None, 1024, 2, True, "srm")
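
The core of `MainModel` is the pairwise consistency volume: every spatial location of the projected feature map is compared with every other location through a scaled dot product, and `1 - sigmoid(...)` turns agreement into a consistency score (0 = consistent, 1 = inconsistent). A standalone sketch of that computation, with illustrative tensor sizes rather than the training configuration:

```python
# Hedged sketch: the consistency-volume computation from MainModel.forward,
# written standalone with made-up sizes.
import torch
from einops import rearrange

b, c, h, w = 2, 256, 16, 16
f1 = torch.randn(b, c, h, w)  # output of proj_head1
f2 = torch.randn(b, c, h, w)  # output of proj_head2 (or a clone of f1 when shared)

f1 = rearrange(f1, "b c h w -> b c (h w)")
f2 = rearrange(f2, "b c h w -> b c (h w)")
vol = torch.bmm(f1.transpose(-1, -2), f2) / c**0.5  # [b, h*w, h*w]
vol = rearrange(vol, "b (h1 w1) (h2 w2) -> b h1 w1 h2 w2", h1=h, h2=h)
vol = 1 - vol.sigmoid()  # 0 marks consistent pairs, 1 marks inconsistent pairs
print(vol.shape)         # torch.Size([2, 16, 16, 16, 16])
```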
models/mobilenet.py ADDED
@@ -0,0 +1,166 @@
1
+ """
2
+ This MobileNetV2 implementation is modified from the following repository:
3
+ https://github.com/tonylins/pytorch-mobilenet-v2
4
+ """
5
+
6
+ import math
7
+
8
+ import torch.nn as nn
9
+
10
+ from .lib.nn import SynchronizedBatchNorm2d
11
+ from .utils import load_url
12
+
13
+ BatchNorm2d = SynchronizedBatchNorm2d
14
+
15
+
16
+ __all__ = ["mobilenetv2"]
17
+
18
+
19
+ model_urls = {
20
+ "mobilenetv2": "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/mobilenet_v2.pth.tar",
21
+ }
22
+
23
+
24
+ def conv_bn(inp, oup, stride):
25
+ return nn.Sequential(
26
+ nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
27
+ BatchNorm2d(oup),
28
+ nn.ReLU6(inplace=True),
29
+ )
30
+
31
+
32
+ def conv_1x1_bn(inp, oup):
33
+ return nn.Sequential(
34
+ nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
35
+ BatchNorm2d(oup),
36
+ nn.ReLU6(inplace=True),
37
+ )
38
+
39
+
40
+ class InvertedResidual(nn.Module):
41
+ def __init__(self, inp, oup, stride, expand_ratio):
42
+ super(InvertedResidual, self).__init__()
43
+ self.stride = stride
44
+ assert stride in [1, 2]
45
+
46
+ hidden_dim = round(inp * expand_ratio)
47
+ self.use_res_connect = self.stride == 1 and inp == oup
48
+
49
+ if expand_ratio == 1:
50
+ self.conv = nn.Sequential(
51
+ # dw
52
+ nn.Conv2d(
53
+ hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False
54
+ ),
55
+ BatchNorm2d(hidden_dim),
56
+ nn.ReLU6(inplace=True),
57
+ # pw-linear
58
+ nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
59
+ BatchNorm2d(oup),
60
+ )
61
+ else:
62
+ self.conv = nn.Sequential(
63
+ # pw
64
+ nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
65
+ BatchNorm2d(hidden_dim),
66
+ nn.ReLU6(inplace=True),
67
+ # dw
68
+ nn.Conv2d(
69
+ hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False
70
+ ),
71
+ BatchNorm2d(hidden_dim),
72
+ nn.ReLU6(inplace=True),
73
+ # pw-linear
74
+ nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
75
+ BatchNorm2d(oup),
76
+ )
77
+
78
+ def forward(self, x):
79
+ if self.use_res_connect:
80
+ return x + self.conv(x)
81
+ else:
82
+ return self.conv(x)
83
+
84
+
85
+ class MobileNetV2(nn.Module):
86
+ def __init__(self, n_class=1000, input_size=224, width_mult=1.0):
87
+ super(MobileNetV2, self).__init__()
88
+ block = InvertedResidual
89
+ input_channel = 32
90
+ last_channel = 1280
91
+ inverted_residual_setting = [
92
+ # t, c, n, s
93
+ [1, 16, 1, 1],
94
+ [6, 24, 2, 2],
95
+ [6, 32, 3, 2],
96
+ [6, 64, 4, 2],
97
+ [6, 96, 3, 1],
98
+ [6, 160, 3, 2],
99
+ [6, 320, 1, 1],
100
+ ]
101
+
102
+ # building first layer
103
+ assert input_size % 32 == 0
104
+ input_channel = int(input_channel * width_mult)
105
+ self.last_channel = (
106
+ int(last_channel * width_mult) if width_mult > 1.0 else last_channel
107
+ )
108
+ self.features = [conv_bn(3, input_channel, 2)]
109
+ # building inverted residual blocks
110
+ for t, c, n, s in inverted_residual_setting:
111
+ output_channel = int(c * width_mult)
112
+ for i in range(n):
113
+ if i == 0:
114
+ self.features.append(
115
+ block(input_channel, output_channel, s, expand_ratio=t)
116
+ )
117
+ else:
118
+ self.features.append(
119
+ block(input_channel, output_channel, 1, expand_ratio=t)
120
+ )
121
+ input_channel = output_channel
122
+ # building last several layers
123
+ self.features.append(conv_1x1_bn(input_channel, self.last_channel))
124
+ # make it nn.Sequential
125
+ self.features = nn.Sequential(*self.features)
126
+
127
+ # building classifier
128
+ self.classifier = nn.Sequential(
129
+ nn.Dropout(0.2),
130
+ nn.Linear(self.last_channel, n_class),
131
+ )
132
+
133
+ self._initialize_weights()
134
+
135
+ def forward(self, x):
136
+ x = self.features(x)
137
+ x = x.mean(3).mean(2)
138
+ x = self.classifier(x)
139
+ return x
140
+
141
+ def _initialize_weights(self):
142
+ for m in self.modules():
143
+ if isinstance(m, nn.Conv2d):
144
+ n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
145
+ m.weight.data.normal_(0, math.sqrt(2.0 / n))
146
+ if m.bias is not None:
147
+ m.bias.data.zero_()
148
+ elif isinstance(m, BatchNorm2d):
149
+ m.weight.data.fill_(1)
150
+ m.bias.data.zero_()
151
+ elif isinstance(m, nn.Linear):
152
+ n = m.weight.size(1)
153
+ m.weight.data.normal_(0, 0.01)
154
+ m.bias.data.zero_()
155
+
156
+
157
+ def mobilenetv2(pretrained=False, **kwargs):
158
+ """Constructs a MobileNet_V2 model.
159
+
160
+ Args:
161
+ pretrained (bool): If True, returns a model pre-trained on ImageNet
162
+ """
163
+ model = MobileNetV2(n_class=1000, **kwargs)
164
+ if pretrained:
165
+ model.load_state_dict(load_url(model_urls["mobilenetv2"]), strict=False)
166
+ return model
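
The backbone above keeps the stock MobileNetV2 classifier head; the segmentation code only reuses its `features` stack (see `MobileNetV2Dilated` in `models/models.py`). A minimal sketch, assuming the `models.mobilenet` import path:

```python
# Hedged sketch (not part of the release code): build the backbone without
# ImageNet weights; the input size must be a multiple of 32 (asserted in __init__).
import torch

from models.mobilenet import mobilenetv2  # assumed import path

net = mobilenetv2(pretrained=False)
logits = net(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 1000])
```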
models/models.py ADDED
@@ -0,0 +1,687 @@
1
+ from typing import List
2
+
3
+ import torch
4
+ import torch.nn as nn
5
+
6
+ from . import hrnet, mobilenet, resnet, resnext
7
+ from .lib.nn import SynchronizedBatchNorm2d
8
+
9
+ BatchNorm2d = SynchronizedBatchNorm2d
10
+
11
+
12
+ class SegmentationModuleBase(nn.Module):
13
+ def __init__(self):
14
+ super(SegmentationModuleBase, self).__init__()
15
+
16
+ def pixel_acc(self, pred, label):
17
+ _, preds = torch.max(pred, dim=1)
18
+ valid = (label >= 0).long()
19
+ acc_sum = torch.sum(valid * (preds == label).long())
20
+ pixel_sum = torch.sum(valid)
21
+ acc = acc_sum.float() / (pixel_sum.float() + 1e-10)
22
+ return acc
23
+
24
+
25
+ class SegmentationModule(SegmentationModuleBase):
26
+ def __init__(self, net_enc, net_dec, crit, deep_sup_scale=None):
27
+ super(SegmentationModule, self).__init__()
28
+ self.encoder = net_enc
29
+ self.decoder = net_dec
30
+ self.crit = crit
31
+ self.deep_sup_scale = deep_sup_scale
32
+
33
+ def forward(self, feed_dict, *, segSize=None):
34
+ # training
35
+ if segSize is None:
36
+ if self.deep_sup_scale is not None: # use deep supervision technique
37
+ (pred, pred_deepsup) = self.decoder(
38
+ self.encoder(feed_dict["img_data"], return_feature_maps=True)
39
+ )
40
+ else:
41
+ pred = self.decoder(
42
+ self.encoder(feed_dict["img_data"], return_feature_maps=True)
43
+ )
44
+
45
+ loss = self.crit(pred, feed_dict["seg_label"])
46
+ if self.deep_sup_scale is not None:
47
+ loss_deepsup = self.crit(pred_deepsup, feed_dict["seg_label"])
48
+ loss = loss + loss_deepsup * self.deep_sup_scale
49
+
50
+ acc = self.pixel_acc(pred, feed_dict["seg_label"])
51
+ return loss, acc
52
+ # inference
53
+ else:
54
+ pred = self.decoder(
55
+ self.encoder(feed_dict["img_data"], return_feature_maps=True),
56
+ segSize=segSize,
57
+ )
58
+ return pred
59
+
60
+
61
+ class ModelBuilder:
62
+ # custom weights initialization
63
+ @staticmethod
64
+ def weights_init(m):
65
+ classname = m.__class__.__name__
66
+ if classname.find("Conv") != -1:
67
+ nn.init.kaiming_normal_(m.weight.data)
68
+ elif classname.find("BatchNorm") != -1:
69
+ m.weight.data.fill_(1.0)
70
+ m.bias.data.fill_(1e-4)
71
+ # elif classname.find('Linear') != -1:
72
+ # m.weight.data.normal_(0.0, 0.0001)
73
+
74
+ @staticmethod
75
+ def build_encoder(arch="resnet50dilated", fc_dim=512, weights=""):
76
+ pretrained = True if len(weights) == 0 else False
77
+ arch = arch.lower()
78
+ if arch == "mobilenetv2dilated":
79
+ orig_mobilenet = mobilenet.__dict__["mobilenetv2"](pretrained=pretrained)
80
+ net_encoder = MobileNetV2Dilated(orig_mobilenet, dilate_scale=8)
81
+ elif arch == "resnet18":
82
+ orig_resnet = resnet.__dict__["resnet18"](pretrained=pretrained)
83
+ net_encoder = Resnet(orig_resnet)
84
+ elif arch == "resnet18dilated":
85
+ orig_resnet = resnet.__dict__["resnet18"](pretrained=pretrained)
86
+ net_encoder = ResnetDilated(orig_resnet, dilate_scale=8)
87
+ elif arch == "resnet34":
88
+ raise NotImplementedError
89
+ orig_resnet = resnet.__dict__["resnet34"](pretrained=pretrained)
90
+ net_encoder = Resnet(orig_resnet)
91
+ elif arch == "resnet34dilated":
92
+ raise NotImplementedError
93
+ orig_resnet = resnet.__dict__["resnet34"](pretrained=pretrained)
94
+ net_encoder = ResnetDilated(orig_resnet, dilate_scale=8)
95
+ elif arch == "resnet50":
96
+ orig_resnet = resnet.__dict__["resnet50"](pretrained=pretrained)
97
+ net_encoder = Resnet(orig_resnet)
98
+ elif arch == "resnet50dilated":
99
+ orig_resnet = resnet.__dict__["resnet50"](pretrained=pretrained)
100
+ net_encoder = ResnetDilated(orig_resnet, dilate_scale=8)
101
+ elif arch == "resnet101":
102
+ orig_resnet = resnet.__dict__["resnet101"](pretrained=pretrained)
103
+ net_encoder = Resnet(orig_resnet)
104
+ elif arch == "resnet101dilated":
105
+ orig_resnet = resnet.__dict__["resnet101"](pretrained=pretrained)
106
+ net_encoder = ResnetDilated(orig_resnet, dilate_scale=8)
107
+ elif arch == "resnext101":
108
+ orig_resnext = resnext.__dict__["resnext101"](pretrained=pretrained)
109
+ net_encoder = Resnet(orig_resnext) # we can still use class Resnet
110
+ elif arch == "hrnetv2":
111
+ net_encoder = hrnet.__dict__["hrnetv2"](pretrained=pretrained)
112
+ else:
113
+ raise Exception("Architecture undefined!")
114
+
115
+ # encoders are usually pretrained
116
+ # net_encoder.apply(ModelBuilder.weights_init)
117
+ if len(weights) > 0:
118
+ print("Loading weights for net_encoder")
119
+ net_encoder.load_state_dict(
120
+ torch.load(weights, map_location=lambda storage, loc: storage),
121
+ strict=False,
122
+ )
123
+ return net_encoder
124
+
125
+ @staticmethod
126
+ def build_decoder(
127
+ arch="ppm_deepsup",
128
+ fc_dim=512,
129
+ num_class=150,
130
+ weights="",
131
+ use_softmax=False,
132
+ dropout=0.0,
133
+ fcn_up: int = 32,
134
+ ):
135
+ arch = arch.lower()
136
+ if arch == "c1_deepsup":
137
+ net_decoder = C1DeepSup(
138
+ num_class=num_class, fc_dim=fc_dim, use_softmax=use_softmax
139
+ )
140
+ elif arch == "c1": # currently only support C1
141
+ net_decoder = C1(
142
+ num_class=num_class,
143
+ fc_dim=fc_dim,
144
+ use_softmax=use_softmax,
145
+ dropout=dropout,
146
+ fcn_up=fcn_up,
147
+ )
148
+ elif arch == "ppm":
149
+ net_decoder = PPM(
150
+ num_class=num_class, fc_dim=fc_dim, use_softmax=use_softmax
151
+ )
152
+ elif arch == "ppm_deepsup":
153
+ net_decoder = PPMDeepsup(
154
+ num_class=num_class, fc_dim=fc_dim, use_softmax=use_softmax
155
+ )
156
+ elif arch == "upernet_lite":
157
+ net_decoder = UPerNet(
158
+ num_class=num_class, fc_dim=fc_dim, use_softmax=use_softmax, fpn_dim=256
159
+ )
160
+ elif arch == "upernet":
161
+ net_decoder = UPerNet(
162
+ num_class=num_class, fc_dim=fc_dim, use_softmax=use_softmax, fpn_dim=512
163
+ )
164
+ else:
165
+ raise Exception("Architecture undefined!")
166
+
167
+ net_decoder.apply(ModelBuilder.weights_init)
168
+ if len(weights) > 0:
169
+ print("Loading weights for net_decoder")
170
+ net_decoder.load_state_dict(
171
+ torch.load(weights, map_location=lambda storage, loc: storage),
172
+ strict=False,
173
+ )
174
+ return net_decoder
175
+
176
+
177
+ def conv3x3_bn_relu(in_planes, out_planes, stride=1):
178
+ "3x3 convolution + BN + relu"
179
+ return nn.Sequential(
180
+ nn.Conv2d(
181
+ in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False
182
+ ),
183
+ BatchNorm2d(out_planes),
184
+ nn.ReLU(inplace=True),
185
+ )
186
+
187
+
188
+ class Resnet(nn.Module):
189
+ def __init__(self, orig_resnet):
190
+ super(Resnet, self).__init__()
191
+
192
+ # take pretrained resnet, except AvgPool and FC
193
+ self.conv1 = orig_resnet.conv1
194
+ self.bn1 = orig_resnet.bn1
195
+ self.relu1 = orig_resnet.relu1
196
+ self.conv2 = orig_resnet.conv2
197
+ self.bn2 = orig_resnet.bn2
198
+ self.relu2 = orig_resnet.relu2
199
+ self.conv3 = orig_resnet.conv3
200
+ self.bn3 = orig_resnet.bn3
201
+ self.relu3 = orig_resnet.relu3
202
+ self.maxpool = orig_resnet.maxpool
203
+ self.layer1 = orig_resnet.layer1
204
+ self.layer2 = orig_resnet.layer2
205
+ self.layer3 = orig_resnet.layer3
206
+ self.layer4 = orig_resnet.layer4
207
+
208
+ def forward(self, x, return_feature_maps=False):
209
+ conv_out = []
210
+
211
+ x = self.relu1(self.bn1(self.conv1(x)))
212
+ x = self.relu2(self.bn2(self.conv2(x)))
213
+ x = self.relu3(self.bn3(self.conv3(x)))
214
+ x = self.maxpool(x) # b, 128, h / 2, w / 2
215
+
216
+ x = self.layer1(x)
217
+ conv_out.append(x)
218
+ # b, 128, h / 4, w / 4
219
+ x = self.layer2(x)
220
+ conv_out.append(x)
221
+ # b, 128, h / 8, w / 8
222
+ x = self.layer3(x)
223
+ conv_out.append(x)
224
+ # b, 128, h / 16, w / 16
225
+ x = self.layer4(x)
226
+ conv_out.append(x)
227
+ # b, 128, h / 32, w / 32
228
+
229
+ if return_feature_maps:
230
+ return conv_out
231
+ return [x]
232
+
233
+
234
+ class ResnetDilated(nn.Module):
235
+ def __init__(self, orig_resnet, dilate_scale=8):
236
+ super(ResnetDilated, self).__init__()
237
+ from functools import partial
238
+
239
+ if dilate_scale == 8:
240
+ orig_resnet.layer3.apply(partial(self._nostride_dilate, dilate=2))
241
+ orig_resnet.layer4.apply(partial(self._nostride_dilate, dilate=4))
242
+ elif dilate_scale == 16:
243
+ orig_resnet.layer4.apply(partial(self._nostride_dilate, dilate=2))
244
+
245
+ # take pretrained resnet, except AvgPool and FC
246
+ self.conv1 = orig_resnet.conv1
247
+ self.bn1 = orig_resnet.bn1
248
+ self.relu1 = orig_resnet.relu1
249
+ self.conv2 = orig_resnet.conv2
250
+ self.bn2 = orig_resnet.bn2
251
+ self.relu2 = orig_resnet.relu2
252
+ self.conv3 = orig_resnet.conv3
253
+ self.bn3 = orig_resnet.bn3
254
+ self.relu3 = orig_resnet.relu3
255
+ self.maxpool = orig_resnet.maxpool
256
+ self.layer1 = orig_resnet.layer1
257
+ self.layer2 = orig_resnet.layer2
258
+ self.layer3 = orig_resnet.layer3
259
+ self.layer4 = orig_resnet.layer4
260
+
261
+ def _nostride_dilate(self, m, dilate):
262
+ classname = m.__class__.__name__
263
+ if classname.find("Conv") != -1:
264
+ # the convolution with stride
265
+ if m.stride == (2, 2):
266
+ m.stride = (1, 1)
267
+ if m.kernel_size == (3, 3):
268
+ m.dilation = (dilate // 2, dilate // 2)
269
+ m.padding = (dilate // 2, dilate // 2)
270
+ # other convolutions
271
+ else:
272
+ if m.kernel_size == (3, 3):
273
+ m.dilation = (dilate, dilate)
274
+ m.padding = (dilate, dilate)
275
+
276
+ def forward(self, x, return_feature_maps=False):
277
+ conv_out = []
278
+
279
+ x = self.relu1(self.bn1(self.conv1(x)))
280
+ x = self.relu2(self.bn2(self.conv2(x)))
281
+ x = self.relu3(self.bn3(self.conv3(x)))
282
+ x = self.maxpool(x)
283
+
284
+ x = self.layer1(x)
285
+ conv_out.append(x)
286
+ x = self.layer2(x)
287
+ conv_out.append(x)
288
+ x = self.layer3(x)
289
+ conv_out.append(x)
290
+ x = self.layer4(x)
291
+ conv_out.append(x)
292
+
293
+ if return_feature_maps:
294
+ return conv_out
295
+ return [x]
296
+
297
+
298
+ class MobileNetV2Dilated(nn.Module):
299
+ def __init__(self, orig_net, dilate_scale=8):
300
+ super(MobileNetV2Dilated, self).__init__()
301
+ from functools import partial
302
+
303
+ # take pretrained mobilenet features
304
+ self.features = orig_net.features[:-1]
305
+
306
+ self.total_idx = len(self.features)
307
+ self.down_idx = [2, 4, 7, 14]
308
+
309
+ if dilate_scale == 8:
310
+ for i in range(self.down_idx[-2], self.down_idx[-1]):
311
+ self.features[i].apply(partial(self._nostride_dilate, dilate=2))
312
+ for i in range(self.down_idx[-1], self.total_idx):
313
+ self.features[i].apply(partial(self._nostride_dilate, dilate=4))
314
+ elif dilate_scale == 16:
315
+ for i in range(self.down_idx[-1], self.total_idx):
316
+ self.features[i].apply(partial(self._nostride_dilate, dilate=2))
317
+
318
+ def _nostride_dilate(self, m, dilate):
319
+ classname = m.__class__.__name__
320
+ if classname.find("Conv") != -1:
321
+ # the convolution with stride
322
+ if m.stride == (2, 2):
323
+ m.stride = (1, 1)
324
+ if m.kernel_size == (3, 3):
325
+ m.dilation = (dilate // 2, dilate // 2)
326
+ m.padding = (dilate // 2, dilate // 2)
327
+ # other convolutions
328
+ else:
329
+ if m.kernel_size == (3, 3):
330
+ m.dilation = (dilate, dilate)
331
+ m.padding = (dilate, dilate)
332
+
333
+ def forward(self, x, return_feature_maps=False):
334
+ if return_feature_maps:
335
+ conv_out = []
336
+ for i in range(self.total_idx):
337
+ x = self.features[i](x)
338
+ if i in self.down_idx:
339
+ conv_out.append(x)
340
+ conv_out.append(x)
341
+ return conv_out
342
+
343
+ else:
344
+ return [self.features(x)]
345
+
346
+
347
+ # last conv, deep supervision
348
+ class C1DeepSup(nn.Module):
349
+ def __init__(self, num_class=150, fc_dim=2048, use_softmax=False):
350
+ super(C1DeepSup, self).__init__()
351
+ self.use_softmax = use_softmax
352
+
353
+ self.cbr = conv3x3_bn_relu(fc_dim, fc_dim // 4, 1)
354
+ self.cbr_deepsup = conv3x3_bn_relu(fc_dim // 2, fc_dim // 4, 1)
355
+
356
+ # last conv
357
+ self.conv_last = nn.Conv2d(fc_dim // 4, num_class, 1, 1, 0)
358
+ self.conv_last_deepsup = nn.Conv2d(fc_dim // 4, num_class, 1, 1, 0)
359
+
360
+ def forward(self, conv_out, segSize=None):
361
+ conv5 = conv_out[-1]
362
+
363
+ x = self.cbr(conv5)
364
+ x = self.conv_last(x)
365
+
366
+ if self.use_softmax: # is True during inference
367
+ x = nn.functional.interpolate(
368
+ x, size=segSize, mode="bilinear", align_corners=False
369
+ )
370
+ x = nn.functional.softmax(x, dim=1)
371
+ return x
372
+
373
+ # deep sup
374
+ conv4 = conv_out[-2]
375
+ _ = self.cbr_deepsup(conv4)
376
+ _ = self.conv_last_deepsup(_)
377
+
378
+ x = nn.functional.log_softmax(x, dim=1)
379
+ _ = nn.functional.log_softmax(_, dim=1)
380
+
381
+ return (x, _)
382
+
383
+
384
+ # last conv
385
+ class C1(nn.Module):
386
+ def __init__(
387
+ self,
388
+ num_class=150,
389
+ fc_dim: int = 2048,
390
+ use_softmax=False,
391
+ dropout=0.0,
392
+ fcn_up: int = 32,
393
+ ):
394
+ super(C1, self).__init__()
395
+ self.use_softmax = use_softmax
396
+ self.fcn_up = fcn_up
397
+
398
+ if fcn_up == 32:
399
+ in_dim = fc_dim
400
+ elif fcn_up == 16:
401
+ in_dim = int(fc_dim / 2 * 3)
402
+ else: # 8
403
+ in_dim = int(fc_dim / 2 * 3 + fc_dim / 4)
404
+ self.cbr = conv3x3_bn_relu(in_dim, fc_dim // 4, 1)
405
+
406
+ # last conv
407
+ self.dropout = nn.Dropout2d(dropout)
408
+ self.conv_last = nn.Conv2d(fc_dim // 4, num_class, 1, 1, 0)
409
+
410
+ def forward(self, conv_out: List, segSize=None):
411
+ if self.fcn_up == 32:
412
+ conv5 = conv_out[-1]
413
+ elif self.fcn_up == 16:
414
+ conv4 = conv_out[-2]
415
+ tgt_shape = conv4.shape[-2:]
416
+ conv5 = conv_out[-1]
417
+ conv5 = nn.functional.interpolate(
418
+ conv5, size=tgt_shape, mode="bilinear", align_corners=False
419
+ )
420
+ conv5 = torch.cat([conv4, conv5], dim=1)
421
+ else: # 8
422
+ conv3 = conv_out[-3]
423
+ tgt_shape = conv3.shape[-2:]
424
+ conv4 = conv_out[-2]
425
+ conv5 = conv_out[-1]
426
+ conv4 = nn.functional.interpolate(
427
+ conv4, size=tgt_shape, mode="bilinear", align_corners=False
428
+ )
429
+ conv5 = nn.functional.interpolate(
430
+ conv5, size=tgt_shape, mode="bilinear", align_corners=False
431
+ )
432
+ conv5 = torch.cat([conv3, conv4, conv5], dim=1)
433
+ x = self.cbr(conv5)
434
+ x = self.dropout(x)
435
+ x = self.conv_last(x)
436
+
437
+ return x
438
+
439
+
440
+ # pyramid pooling
441
+ class PPM(nn.Module):
442
+ def __init__(
443
+ self, num_class=150, fc_dim=4096, use_softmax=False, pool_scales=(1, 2, 3, 6)
444
+ ):
445
+ super(PPM, self).__init__()
446
+ self.use_softmax = use_softmax
447
+
448
+ self.ppm = []
449
+ for scale in pool_scales:
450
+ self.ppm.append(
451
+ nn.Sequential(
452
+ nn.AdaptiveAvgPool2d(scale),
453
+ nn.Conv2d(fc_dim, 512, kernel_size=1, bias=False),
454
+ BatchNorm2d(512),
455
+ nn.ReLU(inplace=True),
456
+ )
457
+ )
458
+ self.ppm = nn.ModuleList(self.ppm)
459
+
460
+ self.conv_last = nn.Sequential(
461
+ nn.Conv2d(
462
+ fc_dim + len(pool_scales) * 512,
463
+ 512,
464
+ kernel_size=3,
465
+ padding=1,
466
+ bias=False,
467
+ ),
468
+ BatchNorm2d(512),
469
+ nn.ReLU(inplace=True),
470
+ nn.Dropout2d(0.1),
471
+ nn.Conv2d(512, num_class, kernel_size=1),
472
+ )
473
+
474
+ def forward(self, conv_out, segSize=None):
475
+ conv5 = conv_out[-1]
476
+
477
+ input_size = conv5.size()
478
+ ppm_out = [conv5]
479
+ for pool_scale in self.ppm:
480
+ ppm_out.append(
481
+ nn.functional.interpolate(
482
+ pool_scale(conv5),
483
+ (input_size[2], input_size[3]),
484
+ mode="bilinear",
485
+ align_corners=False,
486
+ )
487
+ )
488
+ ppm_out = torch.cat(ppm_out, 1)
489
+
490
+ x = self.conv_last(ppm_out)
491
+
492
+ if segSize is not None: # for inference
493
+ x = nn.functional.interpolate(
494
+ x, size=segSize, mode="bilinear", align_corners=False
495
+ )
496
+ return x
497
+
498
+
499
+ # pyramid pooling, deep supervision
500
+ class PPMDeepsup(nn.Module):
501
+ def __init__(
502
+ self, num_class=150, fc_dim=4096, use_softmax=False, pool_scales=(1, 2, 3, 6)
503
+ ):
504
+ super(PPMDeepsup, self).__init__()
505
+ self.use_softmax = use_softmax
506
+
507
+ self.ppm = []
508
+ for scale in pool_scales:
509
+ self.ppm.append(
510
+ nn.Sequential(
511
+ nn.AdaptiveAvgPool2d(scale),
512
+ nn.Conv2d(fc_dim, 512, kernel_size=1, bias=False),
513
+ BatchNorm2d(512),
514
+ nn.ReLU(inplace=True),
515
+ )
516
+ )
517
+ self.ppm = nn.ModuleList(self.ppm)
518
+ self.cbr_deepsup = conv3x3_bn_relu(fc_dim // 2, fc_dim // 4, 1)
519
+
520
+ self.conv_last = nn.Sequential(
521
+ nn.Conv2d(
522
+ fc_dim + len(pool_scales) * 512,
523
+ 512,
524
+ kernel_size=3,
525
+ padding=1,
526
+ bias=False,
527
+ ),
528
+ BatchNorm2d(512),
529
+ nn.ReLU(inplace=True),
530
+ nn.Dropout2d(0.1),
531
+ nn.Conv2d(512, num_class, kernel_size=1),
532
+ )
533
+ self.conv_last_deepsup = nn.Conv2d(fc_dim // 4, num_class, 1, 1, 0)
534
+ self.dropout_deepsup = nn.Dropout2d(0.1)
535
+
536
+ def forward(self, conv_out, segSize=None):
537
+ conv5 = conv_out[-1]
538
+
539
+ input_size = conv5.size()
540
+ ppm_out = [conv5]
541
+ for pool_scale in self.ppm:
542
+ ppm_out.append(
543
+ nn.functional.interpolate(
544
+ pool_scale(conv5),
545
+ (input_size[2], input_size[3]),
546
+ mode="bilinear",
547
+ align_corners=False,
548
+ )
549
+ )
550
+ ppm_out = torch.cat(ppm_out, 1)
551
+
552
+ x = self.conv_last(ppm_out)
553
+
554
+ if self.use_softmax: # is True during inference
555
+ x = nn.functional.interpolate(
556
+ x, size=segSize, mode="bilinear", align_corners=False
557
+ )
558
+ x = nn.functional.softmax(x, dim=1)
559
+ return x
560
+
561
+ # deep sup
562
+ conv4 = conv_out[-2]
563
+ _ = self.cbr_deepsup(conv4)
564
+ _ = self.dropout_deepsup(_)
565
+ _ = self.conv_last_deepsup(_)
566
+
567
+ x = nn.functional.log_softmax(x, dim=1)
568
+ _ = nn.functional.log_softmax(_, dim=1)
569
+
570
+ return (x, _)
571
+
572
+
573
+ # upernet
574
+ class UPerNet(nn.Module):
575
+ def __init__(
576
+ self,
577
+ num_class=150,
578
+ fc_dim=4096,
579
+ use_softmax=False,
580
+ pool_scales=(1, 2, 3, 6),
581
+ fpn_inplanes=(256, 512, 1024, 2048),
582
+ fpn_dim=256,
583
+ ):
584
+ super(UPerNet, self).__init__()
585
+ self.use_softmax = use_softmax
586
+
587
+ # PPM Module
588
+ self.ppm_pooling = []
589
+ self.ppm_conv = []
590
+
591
+ for scale in pool_scales:
592
+ self.ppm_pooling.append(nn.AdaptiveAvgPool2d(scale))
593
+ self.ppm_conv.append(
594
+ nn.Sequential(
595
+ nn.Conv2d(fc_dim, 512, kernel_size=1, bias=False),
596
+ BatchNorm2d(512),
597
+ nn.ReLU(inplace=True),
598
+ )
599
+ )
600
+ self.ppm_pooling = nn.ModuleList(self.ppm_pooling)
601
+ self.ppm_conv = nn.ModuleList(self.ppm_conv)
602
+ self.ppm_last_conv = conv3x3_bn_relu(
603
+ fc_dim + len(pool_scales) * 512, fpn_dim, 1
604
+ )
605
+
606
+ # FPN Module
607
+ self.fpn_in = []
608
+ for fpn_inplane in fpn_inplanes[:-1]: # skip the top layer
609
+ self.fpn_in.append(
610
+ nn.Sequential(
611
+ nn.Conv2d(fpn_inplane, fpn_dim, kernel_size=1, bias=False),
612
+ BatchNorm2d(fpn_dim),
613
+ nn.ReLU(inplace=True),
614
+ )
615
+ )
616
+ self.fpn_in = nn.ModuleList(self.fpn_in)
617
+
618
+ self.fpn_out = []
619
+ for i in range(len(fpn_inplanes) - 1): # skip the top layer
620
+ self.fpn_out.append(
621
+ nn.Sequential(
622
+ conv3x3_bn_relu(fpn_dim, fpn_dim, 1),
623
+ )
624
+ )
625
+ self.fpn_out = nn.ModuleList(self.fpn_out)
626
+
627
+ self.conv_last = nn.Sequential(
628
+ conv3x3_bn_relu(len(fpn_inplanes) * fpn_dim, fpn_dim, 1),
629
+ nn.Conv2d(fpn_dim, num_class, kernel_size=1),
630
+ )
631
+
632
+ def forward(self, conv_out, segSize=None):
633
+ conv5 = conv_out[-1]
634
+
635
+ input_size = conv5.size()
636
+ ppm_out = [conv5]
637
+ for pool_scale, pool_conv in zip(self.ppm_pooling, self.ppm_conv):
638
+ ppm_out.append(
639
+ pool_conv(
640
+ nn.functional.interpolate(
641
+ pool_scale(conv5),
642
+ (input_size[2], input_size[3]),
643
+ mode="bilinear",
644
+ align_corners=False,
645
+ )
646
+ )
647
+ )
648
+ ppm_out = torch.cat(ppm_out, 1)
649
+ f = self.ppm_last_conv(ppm_out)
650
+
651
+ fpn_feature_list = [f]
652
+ for i in reversed(range(len(conv_out) - 1)):
653
+ conv_x = conv_out[i]
654
+ conv_x = self.fpn_in[i](conv_x) # lateral branch
655
+
656
+ f = nn.functional.interpolate(
657
+ f, size=conv_x.size()[2:], mode="bilinear", align_corners=False
658
+ ) # top-down branch
659
+ f = conv_x + f
660
+
661
+ fpn_feature_list.append(self.fpn_out[i](f))
662
+
663
+ fpn_feature_list.reverse() # [P2 - P5]
664
+ output_size = fpn_feature_list[0].size()[2:]
665
+ fusion_list = [fpn_feature_list[0]]
666
+ for i in range(1, len(fpn_feature_list)):
667
+ fusion_list.append(
668
+ nn.functional.interpolate(
669
+ fpn_feature_list[i],
670
+ output_size,
671
+ mode="bilinear",
672
+ align_corners=False,
673
+ )
674
+ )
675
+ fusion_out = torch.cat(fusion_list, 1)
676
+ x = self.conv_last(fusion_out)
677
+
678
+ if self.use_softmax: # is True during inference
679
+ x = nn.functional.interpolate(
680
+ x, size=segSize, mode="bilinear", align_corners=False
681
+ )
682
+ x = nn.functional.softmax(x, dim=1)
683
+ return x
684
+
685
+ x = nn.functional.log_softmax(x, dim=1)
686
+
687
+ return x
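
`ModelBuilder` is the factory the rest of the code uses to pair an encoder with a decoder. A minimal sketch with illustrative arguments (an empty `weights` string makes `build_encoder` download ImageNet weights; the architecture and dimensions below are not the training configuration):

```python
# Hedged sketch: assemble an encoder/decoder pair through ModelBuilder.
import torch

from models.models import ModelBuilder  # assumed import path

encoder = ModelBuilder.build_encoder(arch="hrnetv2", fc_dim=720, weights="")
decoder = ModelBuilder.build_decoder(arch="c1", fc_dim=720, num_class=1, fcn_up=32)

feats = encoder(torch.randn(1, 3, 256, 256), return_feature_maps=True)
logits = decoder(feats)  # raw logits; MainModel applies the sigmoid afterwards
print(logits.shape)      # expected: torch.Size([1, 1, 64, 64]) at 1/4 input resolution
```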
models/resnet.py ADDED
@@ -0,0 +1,229 @@
1
+ import math
2
+
3
+ import torch.nn as nn
4
+
5
+ from .lib.nn import SynchronizedBatchNorm2d
6
+ from .utils import load_url
7
+
8
+ BatchNorm2d = SynchronizedBatchNorm2d
9
+
10
+
11
+ __all__ = ["ResNet", "resnet18", "resnet50", "resnet101"]
12
+
13
+
14
+ model_urls = {
15
+ "resnet18": "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnet18-imagenet.pth",
16
+ "resnet50": "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnet50-imagenet.pth",
17
+ "resnet101": "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnet101-imagenet.pth",
18
+ }
19
+
20
+
21
+ def conv3x3(in_planes, out_planes, stride=1):
22
+ "3x3 convolution with padding"
23
+ return nn.Conv2d(
24
+ in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False
25
+ )
26
+
27
+
28
+ class BasicBlock(nn.Module):
29
+ expansion = 1
30
+
31
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
32
+ super(BasicBlock, self).__init__()
33
+ self.conv1 = conv3x3(inplanes, planes, stride)
34
+ self.bn1 = BatchNorm2d(planes)
35
+ self.relu = nn.ReLU(inplace=True)
36
+ self.conv2 = conv3x3(planes, planes)
37
+ self.bn2 = BatchNorm2d(planes)
38
+ self.downsample = downsample
39
+ self.stride = stride
40
+
41
+ def forward(self, x):
42
+ residual = x
43
+
44
+ out = self.conv1(x)
45
+ out = self.bn1(out)
46
+ out = self.relu(out)
47
+
48
+ out = self.conv2(out)
49
+ out = self.bn2(out)
50
+
51
+ if self.downsample is not None:
52
+ residual = self.downsample(x)
53
+
54
+ out += residual
55
+ out = self.relu(out)
56
+
57
+ return out
58
+
59
+
60
+ class Bottleneck(nn.Module):
61
+ expansion = 4
62
+
63
+ def __init__(self, inplanes, planes, stride=1, downsample=None):
64
+ super(Bottleneck, self).__init__()
65
+ self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
66
+ self.bn1 = BatchNorm2d(planes)
67
+ self.conv2 = nn.Conv2d(
68
+ planes, planes, kernel_size=3, stride=stride, padding=1, bias=False
69
+ )
70
+ self.bn2 = BatchNorm2d(planes)
71
+ self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
72
+ self.bn3 = BatchNorm2d(planes * 4)
73
+ self.relu = nn.ReLU(inplace=True)
74
+ self.downsample = downsample
75
+ self.stride = stride
76
+
77
+ def forward(self, x):
78
+ residual = x
79
+
80
+ out = self.conv1(x)
81
+ out = self.bn1(out)
82
+ out = self.relu(out)
83
+
84
+ out = self.conv2(out)
85
+ out = self.bn2(out)
86
+ out = self.relu(out)
87
+
88
+ out = self.conv3(out)
89
+ out = self.bn3(out)
90
+
91
+ if self.downsample is not None:
92
+ residual = self.downsample(x)
93
+
94
+ out += residual
95
+ out = self.relu(out)
96
+
97
+ return out
98
+
99
+
100
+ class ResNet(nn.Module):
101
+ def __init__(self, block, layers, num_classes=1000):
102
+ self.inplanes = 128
103
+ super(ResNet, self).__init__()
104
+ self.conv1 = conv3x3(3, 64, stride=2)
105
+ self.bn1 = BatchNorm2d(64)
106
+ self.relu1 = nn.ReLU(inplace=True)
107
+ self.conv2 = conv3x3(64, 64)
108
+ self.bn2 = BatchNorm2d(64)
109
+ self.relu2 = nn.ReLU(inplace=True)
110
+ self.conv3 = conv3x3(64, 128)
111
+ self.bn3 = BatchNorm2d(128)
112
+ self.relu3 = nn.ReLU(inplace=True)
113
+ self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
114
+
115
+ self.layer1 = self._make_layer(block, 64, layers[0])
116
+ self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
117
+ self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
118
+ self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
119
+ self.avgpool = nn.AvgPool2d(7, stride=1)
120
+ self.fc = nn.Linear(512 * block.expansion, num_classes)
121
+
122
+ for m in self.modules():
123
+ if isinstance(m, nn.Conv2d):
124
+ n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
125
+ m.weight.data.normal_(0, math.sqrt(2.0 / n))
126
+ elif isinstance(m, BatchNorm2d):
127
+ m.weight.data.fill_(1)
128
+ m.bias.data.zero_()
129
+
130
+ def _make_layer(self, block, planes, blocks, stride=1):
131
+ downsample = None
132
+ if stride != 1 or self.inplanes != planes * block.expansion:
133
+ downsample = nn.Sequential(
134
+ nn.Conv2d(
135
+ self.inplanes,
136
+ planes * block.expansion,
137
+ kernel_size=1,
138
+ stride=stride,
139
+ bias=False,
140
+ ),
141
+ BatchNorm2d(planes * block.expansion),
142
+ )
143
+
144
+ layers = []
145
+ layers.append(block(self.inplanes, planes, stride, downsample))
146
+ self.inplanes = planes * block.expansion
147
+ for i in range(1, blocks):
148
+ layers.append(block(self.inplanes, planes))
149
+
150
+ return nn.Sequential(*layers)
151
+
152
+ def forward(self, x):
153
+ x = self.relu1(self.bn1(self.conv1(x)))
154
+ x = self.relu2(self.bn2(self.conv2(x)))
155
+ x = self.relu3(self.bn3(self.conv3(x)))
156
+ x = self.maxpool(x)
157
+
158
+ x = self.layer1(x)
159
+ x = self.layer2(x)
160
+ x = self.layer3(x)
161
+ x = self.layer4(x)
162
+
163
+ x = self.avgpool(x)
164
+ x = x.view(x.size(0), -1)
165
+ x = self.fc(x)
166
+
167
+ return x
168
+
169
+
170
+ def resnet18(pretrained=False, **kwargs):
171
+ """Constructs a ResNet-18 model.
172
+
173
+ Args:
174
+ pretrained (bool): If True, returns a model pre-trained on ImageNet
175
+ """
176
+ model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
177
+ if pretrained:
178
+ model.load_state_dict(load_url(model_urls["resnet18"]))
179
+ return model
180
+
181
+
182
+ '''
183
+ def resnet34(pretrained=False, **kwargs):
184
+ """Constructs a ResNet-34 model.
185
+
186
+ Args:
187
+ pretrained (bool): If True, returns a model pre-trained on ImageNet
188
+ """
189
+ model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
190
+ if pretrained:
191
+ model.load_state_dict(load_url(model_urls['resnet34']))
192
+ return model
193
+ '''
194
+
195
+
196
+ def resnet50(pretrained=False, **kwargs):
197
+ """Constructs a ResNet-50 model.
198
+
199
+ Args:
200
+ pretrained (bool): If True, returns a model pre-trained on ImageNet
201
+ """
202
+ model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
203
+ if pretrained:
204
+ model.load_state_dict(load_url(model_urls["resnet50"]), strict=False)
205
+ return model
206
+
207
+
208
+ def resnet101(pretrained=False, **kwargs):
209
+ """Constructs a ResNet-101 model.
210
+
211
+ Args:
212
+ pretrained (bool): If True, returns a model pre-trained on ImageNet
213
+ """
214
+ model = ResNet(Bottleneck, [3, 4, 23, 3], **kwargs)
215
+ if pretrained:
216
+ model.load_state_dict(load_url(model_urls["resnet101"]), strict=False)
217
+ return model
218
+
219
+
220
+ # def resnet152(pretrained=False, **kwargs):
221
+ # """Constructs a ResNet-152 model.
222
+ #
223
+ # Args:
224
+ # pretrained (bool): If True, returns a model pre-trained on ImageNet
225
+ # """
226
+ # model = ResNet(Bottleneck, [3, 8, 36, 3], **kwargs)
227
+ # if pretrained:
228
+ # model.load_state_dict(load_url(model_urls['resnet152']))
229
+ # return model
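
Unlike torchvision's ResNet, the variant above uses a deep stem (three 3x3 convolutions mapping 3 -> 64 -> 64 -> 128 channels) before the max pool, which is why `self.inplanes` starts at 128. A quick sketch, assuming the `models.resnet` import path:

```python
# Hedged sketch (not part of the release code): the deep-stem ResNet-50 still
# exposes the standard 1000-way classifier when built directly.
import torch

from models.resnet import resnet50  # assumed import path

net = resnet50(pretrained=False)
out = net(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])
```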
models/resnext.py ADDED
@@ -0,0 +1,178 @@
1
+ import math
2
+
3
+ import torch.nn as nn
4
+
5
+ from .lib.nn import SynchronizedBatchNorm2d
6
+ from .utils import load_url
7
+
8
+ BatchNorm2d = SynchronizedBatchNorm2d
9
+
10
+
11
+ __all__ = ["ResNeXt", "resnext101"]  # only ResNeXt-101 is supported
12
+
13
+
14
+ model_urls = {
15
+ #'resnext50': 'http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnext50-imagenet.pth',
16
+ "resnext101": "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnext101-imagenet.pth"
17
+ }
18
+
19
+
20
+ def conv3x3(in_planes, out_planes, stride=1):
21
+ "3x3 convolution with padding"
22
+ return nn.Conv2d(
23
+ in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False
24
+ )
25
+
26
+
27
+ class GroupBottleneck(nn.Module):
28
+ expansion = 2
29
+
30
+ def __init__(self, inplanes, planes, stride=1, groups=1, downsample=None):
31
+ super(GroupBottleneck, self).__init__()
32
+ self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
33
+ self.bn1 = BatchNorm2d(planes)
34
+ self.conv2 = nn.Conv2d(
35
+ planes,
36
+ planes,
37
+ kernel_size=3,
38
+ stride=stride,
39
+ padding=1,
40
+ groups=groups,
41
+ bias=False,
42
+ )
43
+ self.bn2 = BatchNorm2d(planes)
44
+ self.conv3 = nn.Conv2d(planes, planes * 2, kernel_size=1, bias=False)
45
+ self.bn3 = BatchNorm2d(planes * 2)
46
+ self.relu = nn.ReLU(inplace=True)
47
+ self.downsample = downsample
48
+ self.stride = stride
49
+
50
+ def forward(self, x):
51
+ residual = x
52
+
53
+ out = self.conv1(x)
54
+ out = self.bn1(out)
55
+ out = self.relu(out)
56
+
57
+ out = self.conv2(out)
58
+ out = self.bn2(out)
59
+ out = self.relu(out)
60
+
61
+ out = self.conv3(out)
62
+ out = self.bn3(out)
63
+
64
+ if self.downsample is not None:
65
+ residual = self.downsample(x)
66
+
67
+ out += residual
68
+ out = self.relu(out)
69
+
70
+ return out
71
+
72
+
73
+ class ResNeXt(nn.Module):
74
+ def __init__(self, block, layers, groups=32, num_classes=1000):
75
+ self.inplanes = 128
76
+ super(ResNeXt, self).__init__()
77
+ self.conv1 = conv3x3(3, 64, stride=2)
78
+ self.bn1 = BatchNorm2d(64)
79
+ self.relu1 = nn.ReLU(inplace=True)
80
+ self.conv2 = conv3x3(64, 64)
81
+ self.bn2 = BatchNorm2d(64)
82
+ self.relu2 = nn.ReLU(inplace=True)
83
+ self.conv3 = conv3x3(64, 128)
84
+ self.bn3 = BatchNorm2d(128)
85
+ self.relu3 = nn.ReLU(inplace=True)
86
+ self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
87
+
88
+ self.layer1 = self._make_layer(block, 128, layers[0], groups=groups)
89
+ self.layer2 = self._make_layer(block, 256, layers[1], stride=2, groups=groups)
90
+ self.layer3 = self._make_layer(block, 512, layers[2], stride=2, groups=groups)
91
+ self.layer4 = self._make_layer(block, 1024, layers[3], stride=2, groups=groups)
92
+ self.avgpool = nn.AvgPool2d(7, stride=1)
93
+ self.fc = nn.Linear(1024 * block.expansion, num_classes)
94
+
95
+ for m in self.modules():
96
+ if isinstance(m, nn.Conv2d):
97
+ n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels // m.groups
98
+ m.weight.data.normal_(0, math.sqrt(2.0 / n))
99
+ elif isinstance(m, BatchNorm2d):
100
+ m.weight.data.fill_(1)
101
+ m.bias.data.zero_()
102
+
103
+ def _make_layer(self, block, planes, blocks, stride=1, groups=1):
104
+ downsample = None
105
+ if stride != 1 or self.inplanes != planes * block.expansion:
106
+ downsample = nn.Sequential(
107
+ nn.Conv2d(
108
+ self.inplanes,
109
+ planes * block.expansion,
110
+ kernel_size=1,
111
+ stride=stride,
112
+ bias=False,
113
+ ),
114
+ BatchNorm2d(planes * block.expansion),
115
+ )
116
+
117
+ layers = []
118
+ layers.append(block(self.inplanes, planes, stride, groups, downsample))
119
+ self.inplanes = planes * block.expansion
120
+ for i in range(1, blocks):
121
+ layers.append(block(self.inplanes, planes, groups=groups))
122
+
123
+ return nn.Sequential(*layers)
124
+
125
+ def forward(self, x):
126
+ x = self.relu1(self.bn1(self.conv1(x)))
127
+ x = self.relu2(self.bn2(self.conv2(x)))
128
+ x = self.relu3(self.bn3(self.conv3(x)))
129
+ x = self.maxpool(x)
130
+
131
+ x = self.layer1(x)
132
+ x = self.layer2(x)
133
+ x = self.layer3(x)
134
+ x = self.layer4(x)
135
+
136
+ x = self.avgpool(x)
137
+ x = x.view(x.size(0), -1)
138
+ x = self.fc(x)
139
+
140
+ return x
141
+
142
+
143
+ '''
144
+ def resnext50(pretrained=False, **kwargs):
145
+ """Constructs a ResNet-50 model.
146
+
147
+ Args:
148
+ pretrained (bool): If True, returns a model pre-trained on Places
149
+ """
150
+ model = ResNeXt(GroupBottleneck, [3, 4, 6, 3], **kwargs)
151
+ if pretrained:
152
+ model.load_state_dict(load_url(model_urls['resnext50']), strict=False)
153
+ return model
154
+ '''
155
+
156
+
157
+ def resnext101(pretrained=False, **kwargs):
158
+ """Constructs a ResNet-101 model.
159
+
160
+ Args:
161
+ pretrained (bool): If True, returns a model pre-trained on Places
162
+ """
163
+ model = ResNeXt(GroupBottleneck, [3, 4, 23, 3], **kwargs)
164
+ if pretrained:
165
+ model.load_state_dict(load_url(model_urls["resnext101"]), strict=False)
166
+ return model
167
+
168
+
169
+ # def resnext152(pretrained=False, **kwargs):
170
+ # """Constructs a ResNeXt-152 model.
171
+ #
172
+ # Args:
173
+ # pretrained (bool): If True, returns a model pre-trained on Places
174
+ # """
175
+ # model = ResNeXt(GroupBottleneck, [3, 8, 36, 3], **kwargs)
176
+ # if pretrained:
177
+ # model.load_state_dict(load_url(model_urls['resnext152']))
178
+ # return model
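
For orientation, a minimal shape check of the encoder defined above; this is only a sketch and assumes the repo's bundled `models/lib` synchronized-BatchNorm package is importable (`pretrained=True` would additionally download the Places checkpoint listed in `model_urls` into `./pretrained`).

```python
import torch
from models.resnext import resnext101  # the encoder added above

encoder = resnext101(pretrained=False)   # set pretrained=True to fetch the Places weights
x = torch.randn(2, 3, 224, 224)
logits = encoder(x)
print(logits.shape)  # torch.Size([2, 1000]) with the default classification head
```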
models/srm_conv.py ADDED
@@ -0,0 +1,68 @@
1
+ import numpy as np
2
+ import torch
3
+ import torch.nn as nn
4
+
5
+
6
+ class SRMConv2d(nn.Module):
7
+ def __init__(self, stride: int = 1, padding: int = 2, clip: float = 2):
8
+ super().__init__()
9
+ self.stride = stride
10
+ self.padding = padding
11
+ self.clip = clip
12
+ self.conv = self._get_srm_filter()
13
+
14
+ def _get_srm_filter(self):
15
+ filter1 = [
16
+ [0, 0, 0, 0, 0],
17
+ [0, -1, 2, -1, 0],
18
+ [0, 2, -4, 2, 0],
19
+ [0, -1, 2, -1, 0],
20
+ [0, 0, 0, 0, 0],
21
+ ]
22
+ filter2 = [
23
+ [-1, 2, -2, 2, -1],
24
+ [2, -6, 8, -6, 2],
25
+ [-2, 8, -12, 8, -2],
26
+ [2, -6, 8, -6, 2],
27
+ [-1, 2, -2, 2, -1],
28
+ ]
29
+ filter3 = [
30
+ [0, 0, 0, 0, 0],
31
+ [0, 0, 0, 0, 0],
32
+ [0, 1, -2, 1, 0],
33
+ [0, 0, 0, 0, 0],
34
+ [0, 0, 0, 0, 0],
35
+ ]
36
+ q = [4.0, 12.0, 2.0]
37
+ filter1 = np.asarray(filter1, dtype=float) / q[0]
38
+ filter2 = np.asarray(filter2, dtype=float) / q[1]
39
+ filter3 = np.asarray(filter3, dtype=float) / q[2]
40
+ filters = [
41
+ [filter1, filter1, filter1],
42
+ [filter2, filter2, filter2],
43
+ [filter3, filter3, filter3],
44
+ ]
45
+ filters = torch.tensor(filters).float()
46
+ conv2d = nn.Conv2d(
47
+ 3,
48
+ 3,
49
+ kernel_size=5,
50
+ stride=self.stride,
51
+ padding=self.padding,
52
+ padding_mode="zeros",
53
+ )
54
+ conv2d.weight = nn.Parameter(filters, requires_grad=False)
55
+ conv2d.bias = nn.Parameter(torch.zeros_like(conv2d.bias), requires_grad=False)
56
+ return conv2d
57
+
58
+ def forward(self, x):
59
+ x = self.conv(x)
60
+ if self.clip != 0.0:
61
+ x = x.clamp(-self.clip, self.clip)
62
+ return x
63
+
64
+
65
+ if __name__ == "__main__":
66
+ srm = SRMConv2d()
67
+ x = torch.rand((63, 3, 64, 64))
68
+ x = srm(x)
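
A brief usage note: the SRM kernels above are fixed noise-residual filters (scaled by 1/4, 1/12, and 1/2), so the layer is non-trainable and can sit in front of any RGB backbone as a preprocessing module. A minimal sketch:

```python
import torch
from models.srm_conv import SRMConv2d

srm = SRMConv2d()                 # stride=1, padding=2; outputs are clamped to [-clip, clip]
img = torch.rand(4, 3, 224, 224)
residual = srm(img)               # same spatial size as the input
assert residual.shape == img.shape
```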
models/utils.py ADDED
@@ -0,0 +1,20 @@
1
+ import os
2
+ import sys
3
+
4
+ try:
5
+ from urllib import urlretrieve
6
+ except ImportError:
7
+ from urllib.request import urlretrieve
8
+
9
+ import torch
10
+
11
+
12
+ def load_url(url, model_dir="./pretrained", map_location=torch.device("cpu")):
13
+ if not os.path.exists(model_dir):
14
+ os.makedirs(model_dir)
15
+ filename = url.split("/")[-1]
16
+ cached_file = os.path.join(model_dir, filename)
17
+ if not os.path.exists(cached_file):
18
+ sys.stderr.write('Downloading: "{}" to {}\n'.format(url, cached_file))
19
+ urlretrieve(url, cached_file)
20
+ return torch.load(cached_file, map_location=map_location)
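
A usage sketch for `load_url`: the checkpoint is cached under `./pretrained` (ignored by git) and returned via `torch.load` on CPU. The URL below is the ResNeXt-101 entry from `models/resnext.py`; running this requires network access.

```python
from models.utils import load_url

url = "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnext101-imagenet.pth"
checkpoint = load_url(url)   # downloaded once, then re-used from ./pretrained
print(type(checkpoint))      # the hosted file is expected to be a plain state dict
```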
opt.py ADDED
@@ -0,0 +1,483 @@
1
+ import argparse
2
+ import os
3
+ import sys
4
+ import time
5
+ from typing import List, Optional
6
+
7
+ import prettytable as pt
8
+ import torch
9
+ import yaml
10
+ from termcolor import cprint
11
+
12
+
13
+ def load_dataset_arguments(opt):
14
+ if opt.load is None:
15
+ return
16
+
17
+ # exclude parameters assigned in the command
18
+ if len(sys.argv) > 1:
19
+ arguments = sys.argv[1:]
20
+ arguments = list(
21
+ map(lambda x: x.replace("--", ""), filter(lambda x: "--" in x, arguments))
22
+ )
23
+ else:
24
+ arguments = []
25
+
26
+ # load parameters in the yaml file
27
+ assert os.path.exists(opt.load)
28
+ with open(opt.load, "r") as f:
29
+ yaml_arguments = yaml.safe_load(f)
30
+ # TODO this should be verified
31
+ for k, v in yaml_arguments.items():
32
+ if not k in arguments:
33
+ setattr(opt, k, v)
34
+
35
+
36
+ def get_opt(additional_parsers: Optional[List] = None):
37
+ parents = [get_arguments_parser()]
38
+ if additional_parsers:
39
+ parents.extend(additional_parsers)
40
+ parser = argparse.ArgumentParser(
41
+ "Options for training and evaluation", parents=parents, allow_abbrev=False
42
+ )
43
+ opt = parser.parse_known_args()[0]
44
+
45
+ # load dataset argument file
46
+ load_dataset_arguments(opt)
47
+
48
+ # user-defined warnings and assertions
49
+ if opt.decoder.lower() not in ["c1"]:
50
+ cprint("Not supported yet! Check if the output use log_softmax!", "red")
51
+ time.sleep(3)
52
+
53
+ if opt.map_mask_weight > 0.0 or opt.volume_mask_weight > 0.0:
54
+ cprint("Mask loss is not 0!", "red")
55
+ time.sleep(3)
56
+
57
+ if opt.val_set != "val":
58
+ cprint(f"Evaluating on {opt.val_set} set!", "red")
59
+ time.sleep(3)
60
+
61
+ if opt.mvc_spixel:
62
+ assert (
63
+ not opt.loss_on_mid_map
64
+ ), "Middle map supervision is not supported with spixel!"
65
+
66
+ if "early" in opt.modality:
67
+ assert (
68
+ len(opt.modality) == 1
69
+ ), "Early fusion is not supported for multi-modality!"
70
+ for modal in opt.modality:
71
+ assert modal in [
72
+ "rgb",
73
+ "srm",
74
+ "bayar",
75
+ "early",
76
+ ], f"Unsupported modality {modal}!"
77
+
78
+ if opt.resume:
79
+ assert os.path.exists(opt.resume)
80
+
81
+ # if opt.mvc_weight <= 0. and opt.consistency_weight > 0.:
82
+ # assert opt.consistency_source == 'self', 'Ensemble consistency is not supported when mvc_weight is 0!'
83
+
84
+ # automatically set parameters
85
+ if len(sys.argv) > 1:
86
+ arguments = sys.argv[1:]
87
+ arguments = list(
88
+ map(lambda x: x.replace("--", ""), filter(lambda x: "--" in x, arguments))
89
+ )
90
+ params = []
91
+ for argument in arguments:
92
+ if not argument in [
93
+ "suffix",
94
+ "save_root_path",
95
+ "dataset",
96
+ "source",
97
+ "resume",
98
+ "num_workers",
99
+ "eval_freq",
100
+ "print_freq",
101
+ "lr_steps",
102
+ "rgb_resume",
103
+ "srm_resume",
104
+ "bayar_resume",
105
+ "teacher_resume",
106
+ "occ",
107
+ "load",
108
+ "amp_opt_level",
109
+ "val_shuffle",
110
+ "tile_size",
111
+ "modality",
112
+ ]:
113
+ try:
114
+ value = (
115
+ str(eval("opt.{}".format(argument.split("=")[0])))
116
+ .replace("[", "")
117
+ .replace("]", "")
118
+ .replace(" ", "-")
119
+ .replace(",", "")
120
+ )
121
+ params.append(
122
+ argument.split("=")[0].replace("_", "").replace(" ", "")
123
+ + "="
124
+ + value
125
+ )
126
+ except:
127
+ cprint("Unknown argument: {}".format(argument), "red")
128
+ if "early" in opt.modality:
129
+ params.append("modality=early")
130
+ test_name = "_".join(params)
131
+
132
+ else:
133
+ test_name = ""
134
+
135
+ time_stamp = time.strftime("%b-%d-%H-%M-%S", time.localtime())
136
+ dir_name = "{}_{}{}_{}".format(
137
+ "-".join(list(opt.train_datalist.keys())).upper(),
138
+ test_name,
139
+ opt.suffix,
140
+ time_stamp,
141
+ ).replace("__", "_")
142
+
143
+ opt.time_stamp = time_stamp
144
+ opt.dir_name = dir_name
145
+ opt.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
146
+
147
+ if opt.debug or opt.wholetest:
148
+ opt.val_shuffle = True
149
+ cprint("Setting val_shuffle to True in debug and wholetest mode!", "red")
150
+ time.sleep(3)
151
+
152
+ if len(opt.modality) < 2 and opt.mvc_weight != 0.0:
153
+ opt.mvc_weight = 0.0
154
+ cprint(
155
+ "Setting multi-view consistency weight to 0. for single modality training",
156
+ "red",
157
+ )
158
+ time.sleep(3)
159
+
160
+ if "early" in opt.modality:
161
+ opt.mvc_single_weight = {"early": 1.0}
162
+ else:
163
+ if "rgb" not in opt.modality:
164
+ opt.mvc_single_weight[0] = 0.0
165
+ if "srm" not in opt.modality:
166
+ opt.mvc_single_weight[1] = 0.0
167
+ if "bayar" not in opt.modality:
168
+ opt.mvc_single_weight[2] = 0.0
169
+ weight_sum = sum(opt.mvc_single_weight)
170
+ single_weight = list(map(lambda x: x / weight_sum, opt.mvc_single_weight))
171
+ opt.mvc_single_weight = {
172
+ "rgb": single_weight[0],
173
+ "srm": single_weight[1],
174
+ "bayar": single_weight[2],
175
+ }
176
+ cprint(
177
+ "Change mvc single modality weight to {}".format(opt.mvc_single_weight), "blue"
178
+ )
179
+ time.sleep(3)
180
+
181
+ # print parameters
182
+ tb = pt.PrettyTable(field_names=["Arguments", "Values"])
183
+ for k, v in vars(opt).items():
184
+ # some parameters might be too long to display
185
+ if k not in ["dir_name", "resume", "rgb_resume", "srm_resume", "bayar_resume"]:
186
+ tb.add_row([k, v])
187
+ print(tb)
188
+
189
+ return opt
190
+
191
+
192
+ def get_arguments_parser():
193
+ parser = argparse.ArgumentParser(
194
+ "CVPR2022 image manipulation detection model", add_help=False
195
+ )
196
+ parser.add_argument("--debug", action="store_true", default=False)
197
+ parser.add_argument("--wholetest", action="store_true", default=False)
198
+
199
+ parser.add_argument(
200
+ "--load", default="configs/final.yaml", help="Load configuration YAML file."
201
+ )
202
+ parser.add_argument("--num_class", type=int, default=1, help="Use sigmoid.")
203
+
204
+ # loss-related
205
+ parser.add_argument("--map_label_weight", type=float, default=1.0)
206
+ parser.add_argument("--volume_label_weight", type=float, default=1.0)
207
+ parser.add_argument(
208
+ "--map_mask_weight",
209
+ type=float,
210
+ default=0.0,
211
+ help="Only use this for debug purpose.",
212
+ )
213
+ parser.add_argument(
214
+ "--volume_mask_weight",
215
+ type=float,
216
+ default=0.0,
217
+ help="Only use this for debug purpose.",
218
+ )
219
+ parser.add_argument(
220
+ "--consistency_weight",
221
+ type=float,
222
+ default=0.0,
223
+ help="Consitency between output map and volume within a single view.",
224
+ )
225
+ parser.add_argument(
226
+ "--consistency_type", type=str, default="l2", choices=["l1", "l2"]
227
+ )
228
+ parser.add_argument(
229
+ "--consistency_kmeans",
230
+ action="store_true",
231
+ default=False,
232
+ help="Perform k-means on the volume to determine pristine and modified areas.",
233
+ )
234
+ parser.add_argument(
235
+ "--consistency_stop_map_grad",
236
+ action="store_true",
237
+ default=False,
238
+ help="Stop gradient for the map.",
239
+ )
240
+ parser.add_argument(
241
+ "--consistency_source", type=str, default="self", choices=["self", "ensemble"]
242
+ )
243
+ parser.add_argument("--map_entropy_weight", type=float, default=0.0)
244
+ parser.add_argument("--volume_entropy_weight", type=float, default=0.0)
245
+ parser.add_argument("--mvc_weight", type=float, default=0.0)
246
+ parser.add_argument(
247
+ "--mvc_time_dependent",
248
+ action="store_true",
249
+ default=False,
250
+ help="Use Gaussian smooth on the MVCW weight.",
251
+ )
252
+ parser.add_argument("--mvc_soft", action="store_true", default=False)
253
+ parser.add_argument("--mvc_zeros_on_au", action="store_true", default=False)
254
+ parser.add_argument(
255
+ "--mvc_single_weight",
256
+ type=float,
257
+ nargs="+",
258
+ default=[1.0, 1.0, 1.0],
259
+ help="Weight for the RGB, SRM and Bayar modality for MVC training.",
260
+ )
261
+ parser.add_argument(
262
+ "--mvc_steepness", type=float, default=5.0, help="The large the slower."
263
+ )
264
+ parser.add_argument("--mvc_spixel", action="store_true", default=False)
265
+ parser.add_argument("--mvc_num_spixel", type=int, default=100)
266
+ parser.add_argument(
267
+ "--loss_on_mid_map",
268
+ action="store_true",
269
+ default=False,
270
+ help="This only applies for the output map, but not for the consistency volume.",
271
+ )
272
+ parser.add_argument(
273
+ "--label_loss_on_whole_map",
274
+ action="store_true",
275
+ default=False,
276
+ help="Apply cls loss on the avg(map) for pristine images, instead of max(map).",
277
+ )
278
+
279
+ # network architecture
280
+ parser.add_argument("--modality", type=str, default=["rgb"], nargs="+")
281
+ parser.add_argument("--srm_clip", type=float, default=5.0)
282
+ parser.add_argument("--bayar_magnitude", type=float, default=1.0)
283
+ parser.add_argument("--encoder", type=str, default="ResNet50")
284
+ parser.add_argument("--encoder_weight", type=str, default="")
285
+ parser.add_argument("--decoder", type=str, default="C1")
286
+ parser.add_argument("--decoder_weight", type=str, default="")
287
+ parser.add_argument(
288
+ "--fc_dim",
289
+ type=int,
290
+ default=2048,
291
+ help="Changing this might leads to error in the conjunction between encoder and decoder.",
292
+ )
293
+ parser.add_argument(
294
+ "--volume_block_idx",
295
+ type=int,
296
+ default=1,
297
+ choices=[0, 1, 2, 3],
298
+ help="Compute the consistency volume at certain block.",
299
+ )
300
+ parser.add_argument("--share_embed_head", action="store_true", default=False)
301
+ parser.add_argument(
302
+ "--fcn_up",
303
+ type=int,
304
+ default=32,
305
+ choices=[8, 16, 32],
306
+ help="FCN architecture, 32s, 16s, or 8s.",
307
+ )
308
+ parser.add_argument("--gem", action="store_true", default=False)
309
+ parser.add_argument("--gem_coef", type=float, default=100)
310
+ parser.add_argument("--gsm", action="store_true", default=False)
311
+ parser.add_argument(
312
+ "--map_portion",
313
+ type=float,
314
+ default=0,
315
+ help="Select topk portion of the output map for the image-level classification. 0 for use max.",
316
+ )
317
+ parser.add_argument("--otsu_sel", action="store_true", default=False)
318
+ parser.add_argument("--otsu_portion", type=float, default=1.0)
319
+
320
+ # training parameters
321
+ parser.add_argument("--no_gaussian_blur", action="store_true", default=False)
322
+ parser.add_argument("--no_color_jitter", action="store_true", default=False)
323
+ parser.add_argument("--no_jpeg_compression", action="store_true", default=False)
324
+ parser.add_argument("--resize_aug", action="store_true", default=False)
325
+ parser.add_argument(
326
+ "--uncorrect_label",
327
+ action="store_true",
328
+ default=False,
329
+ help="This will not correct image-level labels caused by image cropping.",
330
+ )
331
+ parser.add_argument("--input_size", type=int, default=224)
332
+ parser.add_argument("--dropout", type=float, default=0.0)
333
+ parser.add_argument(
334
+ "--optimizer", type=str, default="adamw", choices=["sgd", "adamw"]
335
+ )
336
+ parser.add_argument("--resume", type=str, default="")
337
+ parser.add_argument("--eval", action="store_true", default=False)
338
+ parser.add_argument(
339
+ "--val_set",
340
+ type=str,
341
+ default="val",
342
+ choices=["train", "val"],
343
+ help="Change to train for debug purpose.",
344
+ )
345
+ parser.add_argument(
346
+ "--val_shuffle", action="store_true", default=False, help="Shuffle val set."
347
+ )
348
+ parser.add_argument("--save_figure", action="store_true", default=False)
349
+ parser.add_argument("--figure_path", type=str, default="figures")
350
+ parser.add_argument("--batch_size", type=int, default=36)
351
+ parser.add_argument("--epochs", type=int, default=60)
352
+ parser.add_argument("--eval_freq", type=int, default=3)
353
+ parser.add_argument("--weight_decay", type=float, default=5e-4)
354
+ parser.add_argument("--num_workers", type=int, default=36)
355
+ parser.add_argument("--grad_clip", type=float, default=0.0)
356
+ # lr
357
+ parser.add_argument(
358
+ "--sched",
359
+ default="cosine",
360
+ type=str,
361
+ metavar="SCHEDULER",
362
+ help='LR scheduler (default: "cosine")',
363
+ )
364
+ parser.add_argument(
365
+ "--lr",
366
+ type=float,
367
+ default=1e-4,
368
+ metavar="LR",
369
+ help="learning rate (default: 5e-4)",
370
+ )
371
+ parser.add_argument(
372
+ "--lr-noise",
373
+ type=float,
374
+ nargs="+",
375
+ default=None,
376
+ metavar="pct, pct",
377
+ help="learning rate noise on/off epoch percentages",
378
+ )
379
+ parser.add_argument(
380
+ "--lr-noise-pct",
381
+ type=float,
382
+ default=0.67,
383
+ metavar="PERCENT",
384
+ help="learning rate noise limit percent (default: 0.67)",
385
+ )
386
+ parser.add_argument(
387
+ "--lr-noise-std",
388
+ type=float,
389
+ default=1.0,
390
+ metavar="STDDEV",
391
+ help="learning rate noise std-dev (default: 1.0)",
392
+ )
393
+ parser.add_argument(
394
+ "--warmup-lr",
395
+ type=float,
396
+ default=2e-7,
397
+ metavar="LR",
398
+ help="warmup learning rate (default: 1e-6)",
399
+ )
400
+ parser.add_argument(
401
+ "--min-lr",
402
+ type=float,
403
+ default=2e-6,
404
+ metavar="LR",
405
+ help="lower lr bound for cyclic schedulers that hit 0 (1e-5)",
406
+ )
407
+ parser.add_argument(
408
+ "--decay-epochs",
409
+ type=float,
410
+ default=20,
411
+ metavar="N",
412
+ help="epoch interval to decay LR",
413
+ )
414
+ parser.add_argument(
415
+ "--warmup-epochs",
416
+ type=int,
417
+ default=5,
418
+ metavar="N",
419
+ help="epochs to warmup LR, if scheduler supports",
420
+ )
421
+ parser.add_argument(
422
+ "--cooldown-epochs",
423
+ type=int,
424
+ default=5,
425
+ metavar="N",
426
+ help="epochs to cooldown LR at min_lr, after cyclic schedule ends",
427
+ )
428
+ parser.add_argument(
429
+ "--patience-epochs",
430
+ type=int,
431
+ default=5,
432
+ metavar="N",
433
+ help="patience epochs for Plateau LR scheduler (default: 10",
434
+ )
435
+ parser.add_argument(
436
+ "--decay-rate",
437
+ "-dr",
438
+ type=float,
439
+ default=0.5,
440
+ metavar="RATE",
441
+ help="LR decay rate (default: 0.1)",
442
+ )
443
+ parser.add_argument("--lr_cycle_limit", "-lcl", type=int, default=1)
444
+ parser.add_argument("--lr_cycle_mul", "-lcm", type=float, default=1)
445
+
446
+ # inference hyperparameters
447
+ parser.add_argument("--mask_threshold", type=float, default=0.5)
448
+ parser.add_argument(
449
+ "-lis",
450
+ "--large_image_strategy",
451
+ choices=["rescale", "slide", "none"],
452
+ default="slide",
453
+ help="Slide will get better performance than rescale.",
454
+ )
455
+ parser.add_argument(
456
+ "--tile_size",
457
+ type=int,
458
+ default=768,
459
+ help="If the testing image is larger than tile_size, I will use sliding window to do the inference.",
460
+ )
461
+ parser.add_argument("--tile_overlap", type=float, default=0.1)
462
+ parser.add_argument("--spixel_postproc", action="store_true", default=False)
463
+ parser.add_argument("--convcrf_postproc", action="store_true", default=False)
464
+ parser.add_argument("--convcrf_shape", type=int, default=512)
465
+ parser.add_argument("--crf_postproc", action="store_true", default=False)
466
+ parser.add_argument("--max_pool_postproc", type=int, default=1)
467
+ parser.add_argument("--crf_downsample", type=int, default=1)
468
+ parser.add_argument("--crf_iter_max", type=int, default=5)
469
+ parser.add_argument("--crf_pos_w", type=int, default=3)
470
+ parser.add_argument("--crf_pos_xy_std", type=int, default=1)
471
+ parser.add_argument("--crf_bi_w", type=int, default=4)
472
+ parser.add_argument("--crf_bi_xy_std", type=int, default=67)
473
+ parser.add_argument("--crf_bi_rgb_std", type=int, default=3)
474
+
475
+ # save
476
+ parser.add_argument("--save_root_path", type=str, default="tmp")
477
+ parser.add_argument("--suffix", type=str, default="")
478
+ parser.add_argument("--print_freq", type=int, default=100)
479
+
480
+ # misc
481
+ parser.add_argument("--seed", type=int, default=1)
482
+
483
+ return parser
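
A sketch of how this parser is typically driven (mirroring the README's training command): options given on the command line win, and the YAML passed via `--load` fills in the rest. The snippet assumes it is run from the repo root with `configs/final.yaml` present; the `--batch_size 8` override is purely illustrative.

```python
import sys
from opt import get_opt

sys.argv = ["main.py", "--load", "configs/final.yaml", "--batch_size", "8"]
opt = get_opt()

# --batch_size was set explicitly, so the YAML value (36) does not override it;
# everything else (modality, loss weights, fcn_up, ...) comes from the YAML file.
print(opt.batch_size, opt.fcn_up, opt.modality)
```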
requirements.txt ADDED
@@ -0,0 +1,29 @@
1
+ albumentations==1.0.0
2
+ einops==0.4.1
3
+ fast_pytorch_kmeans==0.1.6
4
+ glob2==0.7
5
+ gpustat==0.6.0
6
+ h5py==3.6.0
7
+ matplotlib==3.3.4
8
+ numpy==1.22.4
9
+ opencv_contrib_python==4.5.3.56
10
+ opencv_python==4.4.0.46
11
+ opencv_python_headless==4.5.3.56
12
+ pandas==1.3.5
13
+ pathlib2==2.3.5
14
+ Pillow==9.4.0
15
+ prettytable==2.2.1
16
+ pydensecrf==1.0rc2
17
+ PyYAML==5.4.1
18
+ scikit_image==0.18.3
19
+ scikit_learn==0.24.1
20
+ scipy==1.7.3
21
+ spatial_correlation_sampler==0.4.0
22
+ SQLAlchemy==1.4.15
23
+ sync_batchnorm==0.0.1
24
+ tensorboard==2.12.2
25
+ termcolor==2.4.0
26
+ timm==0.9.12
27
+ torch==1.12.1+cu116
28
+ torchvision==0.13.1+cu116
29
+ tqdm==4.64.1
utils/__init__.py ADDED
File without changes
utils/convcrf/__init__.py ADDED
File without changes
utils/convcrf/convcrf.py ADDED
@@ -0,0 +1,669 @@
1
+ """
2
+ The MIT License (MIT)
3
+
4
+ Copyright (c) 2017 Marvin Teichmann
5
+ """
6
+
7
+ from __future__ import absolute_import, division, print_function
8
+
9
+ import logging
10
+ import math
11
+ import os
12
+ import sys
13
+ import warnings
14
+
15
+ import numpy as np
16
+ import scipy as scp
17
+
18
+ logging.basicConfig(
19
+ format="%(asctime)s %(levelname)s %(message)s",
20
+ level=logging.INFO,
21
+ stream=sys.stdout,
22
+ )
23
+
24
+ try:
25
+ import pyinn as P
26
+
27
+ has_pyinn = True
28
+ except ImportError:
29
+ # PyInn is required to use our cuda based message-passing implementation
30
+ # Torch 0.4 provides an im2col operation, which will be used instead.
31
+ # It is ~15% slower.
32
+ has_pyinn = False
33
+ pass
34
+
35
+ import gc
36
+
37
+ import torch
38
+ import torch.nn as nn
39
+ import torch.nn.functional as F
40
+ from torch.autograd import Variable
41
+ from torch.nn import functional as nnfun
42
+ from torch.nn.parameter import Parameter
43
+
44
+ # Default config as proposed by Philipp Kraehenbuehl and Vladlen Koltun,
45
+ default_conf = {
46
+ "filter_size": 11,
47
+ "blur": 4,
48
+ "merge": True,
49
+ "norm": "none",
50
+ "weight": "vector",
51
+ "unary_weight": 1,
52
+ "weight_init": 0.2,
53
+ "trainable": False,
54
+ "convcomp": False,
55
+ "logsoftmax": True, # use logsoftmax for numerical stability
56
+ "softmax": True,
57
+ "skip_init_softmax": False,
58
+ "final_softmax": False,
59
+ "pos_feats": {
60
+ "sdims": 3,
61
+ "compat": 3,
62
+ },
63
+ "col_feats": {
64
+ "sdims": 80,
65
+ "schan": 13, # schan depend on the input scale.
66
+ # use schan = 13 for images in [0, 255]
67
+ # for normalized images in [-0.5, 0.5] try schan = 0.1
68
+ "compat": 10,
69
+ "use_bias": False,
70
+ },
71
+ "trainable_bias": False,
72
+ "pyinn": False,
73
+ }
74
+
75
+ # Config used for test cases on 10 x 10 pixel greyscale input
76
+ test_config = {
77
+ "filter_size": 5,
78
+ "blur": 1,
79
+ "merge": False,
80
+ "norm": "sym",
81
+ "trainable": False,
82
+ "weight": "scalar",
83
+ "unary_weight": 1,
84
+ "weight_init": 0.5,
85
+ "convcomp": False,
86
+ "trainable": False,
87
+ "convcomp": False,
88
+ "logsoftmax": True, # use logsoftmax for numerical stability
89
+ "softmax": True,
90
+ "pos_feats": {
91
+ "sdims": 1.5,
92
+ "compat": 3,
93
+ },
94
+ "col_feats": {"sdims": 2, "schan": 2, "compat": 3, "use_bias": True},
95
+ "trainable_bias": False,
96
+ }
97
+
98
+
99
+ class GaussCRF(nn.Module):
100
+ """Implements ConvCRF with hand-crafted features.
101
+
102
+ It uses the more generic ConvCRF class as basis and utilizes a config
103
+ dict to easily set hyperparameters and follows the design choices of:
104
+ Philipp Kraehenbuehl and Vladlen Koltun, "Efficient Inference in Fully
105
+ "Connected CRFs with Gaussian Edge Pots" (arxiv.org/abs/1210.5644)
106
+ """
107
+
108
+ def __init__(self, conf, shape, nclasses=None, use_gpu=True):
109
+ super(GaussCRF, self).__init__()
110
+
111
+ self.conf = conf
112
+ self.shape = shape
113
+ self.nclasses = nclasses
114
+
115
+ self.trainable = conf["trainable"]
116
+
117
+ if not conf["trainable_bias"]:
118
+ self.register_buffer("mesh", self._create_mesh())
119
+ else:
120
+ self.register_parameter("mesh", Parameter(self._create_mesh()))
121
+
122
+ if self.trainable:
123
+
124
+ def register(name, tensor):
125
+ self.register_parameter(name, Parameter(tensor))
126
+
127
+ else:
128
+
129
+ def register(name, tensor):
130
+ self.register_buffer(name, Variable(tensor))
131
+
132
+ register("pos_sdims", torch.Tensor([1 / conf["pos_feats"]["sdims"]]))
133
+
134
+ if conf["col_feats"]["use_bias"]:
135
+ register("col_sdims", torch.Tensor([1 / conf["col_feats"]["sdims"]]))
136
+ else:
137
+ self.col_sdims = None
138
+
139
+ register("col_schan", torch.Tensor([1 / conf["col_feats"]["schan"]]))
140
+ register("col_compat", torch.Tensor([conf["col_feats"]["compat"]]))
141
+ register("pos_compat", torch.Tensor([conf["pos_feats"]["compat"]]))
142
+
143
+ if conf["weight"] is None:
144
+ weight = None
145
+ elif conf["weight"] == "scalar":
146
+ val = conf["weight_init"]
147
+ weight = torch.Tensor([val])
148
+ elif conf["weight"] == "vector":
149
+ val = conf["weight_init"]
150
+ weight = val * torch.ones(1, nclasses, 1, 1)
151
+
152
+ self.CRF = ConvCRF(
153
+ shape,
154
+ nclasses,
155
+ mode="col",
156
+ conf=conf,
157
+ use_gpu=use_gpu,
158
+ filter_size=conf["filter_size"],
159
+ norm=conf["norm"],
160
+ blur=conf["blur"],
161
+ trainable=conf["trainable"],
162
+ convcomp=conf["convcomp"],
163
+ weight=weight,
164
+ final_softmax=conf["final_softmax"],
165
+ unary_weight=conf["unary_weight"],
166
+ pyinn=conf["pyinn"],
167
+ )
168
+
169
+ return
170
+
171
+ def forward(self, unary, img, num_iter=5):
172
+ """Run a forward pass through ConvCRF.
173
+
174
+ Arguments:
175
+ unary: torch.Tensor with shape [bs, num_classes, height, width].
176
+ The unary predictions. Logsoftmax is applied to the unaries
177
+ during inference. When using CNNs don't apply softmax,
178
+ use unnormalized output (logits) instead.
179
+
180
+ img: torch.Tensor with shape [bs, 3, height, width]
181
+ The input image. Default config assumes image
182
+ data in [0, 255]. For normalized images adapt
183
+ `schan`. Try schan = 0.1 for images in [-0.5, 0.5]
184
+ """
185
+
186
+ conf = self.conf
187
+
188
+ bs, c, x, y = img.shape
189
+
190
+ pos_feats = self.create_position_feats(sdims=self.pos_sdims, bs=bs)
191
+ col_feats = self.create_colour_feats(
192
+ img,
193
+ sdims=self.col_sdims,
194
+ schan=self.col_schan,
195
+ bias=conf["col_feats"]["use_bias"],
196
+ bs=bs,
197
+ )
198
+
199
+ compats = [self.pos_compat, self.col_compat]
200
+
201
+ self.CRF.add_pairwise_energies([pos_feats, col_feats], compats, conf["merge"])
202
+
203
+ prediction = self.CRF.inference(unary, num_iter=num_iter)
204
+
205
+ self.CRF.clean_filters()
206
+ return prediction
207
+
208
+ def _create_mesh(self, requires_grad=False):
209
+ hcord_range = [range(s) for s in self.shape]
210
+ mesh = np.array(np.meshgrid(*hcord_range, indexing="ij"), dtype=np.float32)
211
+
212
+ return torch.from_numpy(mesh)
213
+
214
+ def create_colour_feats(self, img, schan, sdims=0.0, bias=True, bs=1):
215
+ norm_img = img * schan
216
+
217
+ if bias:
218
+ norm_mesh = self.create_position_feats(sdims=sdims, bs=bs)
219
+ feats = torch.cat([norm_mesh, norm_img], dim=1)
220
+ else:
221
+ feats = norm_img
222
+ return feats
223
+
224
+ def create_position_feats(self, sdims, bs=1):
225
+ if type(self.mesh) is Parameter:
226
+ return torch.stack(bs * [self.mesh * sdims])
227
+ else:
228
+ return torch.stack(bs * [Variable(self.mesh) * sdims])
229
+
230
+
231
+ def show_memusage(device=0, name=""):
232
+ import gpustat
233
+
234
+ gc.collect()
235
+ gpu_stats = gpustat.GPUStatCollection.new_query()
236
+ item = gpu_stats.jsonify()["gpus"][device]
237
+
238
+ logging.info(
239
+ "{:>5}/{:>5} MB Usage at {}".format(
240
+ item["memory.used"], item["memory.total"], name
241
+ )
242
+ )
243
+
244
+
245
+ def exp_and_normalize(features, dim=0):
246
+ """
247
+ Aka "softmax" in deep learning literature
248
+ """
249
+ normalized = torch.nn.functional.softmax(features, dim=dim)
250
+ return normalized
251
+
252
+
253
+ def _get_ind(dz):
254
+ if dz == 0:
255
+ return 0, 0
256
+ if dz < 0:
257
+ return 0, -dz
258
+ if dz > 0:
259
+ return dz, 0
260
+
261
+
262
+ def _negative(dz):
263
+ """
264
+ Computes -dz for numpy indexing. Goal is to use as in array[i:-dz].
265
+
266
+ However, if dz=0 this indexing does not work.
267
+ None needs to be used instead.
268
+ """
269
+ if dz == 0:
270
+ return None
271
+ else:
272
+ return -dz
273
+
274
+
275
+ class MessagePassingCol:
276
+ """Perform the Message passing of ConvCRFs.
277
+
278
+ The main magic happens here.
279
+ """
280
+
281
+ def __init__(
282
+ self,
283
+ feat_list,
284
+ compat_list,
285
+ merge,
286
+ npixels,
287
+ nclasses,
288
+ norm="sym",
289
+ filter_size=5,
290
+ clip_edges=0,
291
+ use_gpu=False,
292
+ blur=1,
293
+ matmul=False,
294
+ verbose=False,
295
+ pyinn=False,
296
+ ):
297
+
298
+ if not norm == "sym" and not norm == "none":
299
+ raise NotImplementedError
300
+
301
+ span = filter_size // 2
302
+ assert filter_size % 2 == 1
303
+ self.span = span
304
+ self.filter_size = filter_size
305
+ self.use_gpu = use_gpu
306
+ self.verbose = verbose
307
+ self.blur = blur
308
+ self.pyinn = pyinn
309
+
310
+ self.merge = merge
311
+
312
+ self.npixels = npixels
313
+
314
+ if not self.blur == 1 and self.blur % 2:
315
+ raise NotImplementedError
316
+
317
+ self.matmul = matmul
318
+
319
+ self._gaus_list = []
320
+ self._norm_list = []
321
+
322
+ for feats, compat in zip(feat_list, compat_list):
323
+ gaussian = self._create_convolutional_filters(feats)
324
+ if not norm == "none":
325
+ mynorm = self._get_norm(gaussian)
326
+ self._norm_list.append(mynorm)
327
+ else:
328
+ self._norm_list.append(None)
329
+
330
+ gaussian = compat * gaussian
331
+ self._gaus_list.append(gaussian)
332
+
333
+ if merge:
334
+ self.gaussian = sum(self._gaus_list)
335
+ if not norm == "none":
336
+ raise NotImplementedError
337
+
338
+ def _get_norm(self, gaus):
339
+ norm_tensor = torch.ones([1, 1, self.npixels[0], self.npixels[1]])
340
+ normalization_feats = torch.autograd.Variable(norm_tensor)
341
+ if self.use_gpu:
342
+ normalization_feats = normalization_feats.cuda()
343
+
344
+ norm_out = self._compute_gaussian(normalization_feats, gaussian=gaus)
345
+ return 1 / torch.sqrt(norm_out + 1e-20)
346
+
347
+ def _create_convolutional_filters(self, features):
348
+
349
+ span = self.span
350
+
351
+ bs = features.shape[0]
352
+
353
+ if self.blur > 1:
354
+ off_0 = (self.blur - self.npixels[0] % self.blur) % self.blur
355
+ off_1 = (self.blur - self.npixels[1] % self.blur) % self.blur
356
+ pad_0 = math.ceil(off_0 / 2)
357
+ pad_1 = math.ceil(off_1 / 2)
358
+ if self.blur == 2:
359
+ assert pad_0 == self.npixels[0] % 2
360
+ assert pad_1 == self.npixels[1] % 2
361
+
362
+ features = torch.nn.functional.avg_pool2d(
363
+ features,
364
+ kernel_size=self.blur,
365
+ padding=(pad_0, pad_1),
366
+ count_include_pad=False,
367
+ )
368
+
369
+ npixels = [
370
+ math.ceil(self.npixels[0] / self.blur),
371
+ math.ceil(self.npixels[1] / self.blur),
372
+ ]
373
+ assert npixels[0] == features.shape[2]
374
+ assert npixels[1] == features.shape[3]
375
+ else:
376
+ npixels = self.npixels
377
+
378
+ gaussian_tensor = features.data.new(
379
+ bs, self.filter_size, self.filter_size, npixels[0], npixels[1]
380
+ ).fill_(0)
381
+
382
+ gaussian = Variable(gaussian_tensor)
383
+
384
+ for dx in range(-span, span + 1):
385
+ for dy in range(-span, span + 1):
386
+
387
+ dx1, dx2 = _get_ind(dx)
388
+ dy1, dy2 = _get_ind(dy)
389
+
390
+ feat_t = features[:, :, dx1 : _negative(dx2), dy1 : _negative(dy2)]
391
+ feat_t2 = features[
392
+ :, :, dx2 : _negative(dx1), dy2 : _negative(dy1)
393
+ ] # NOQA
394
+
395
+ diff = feat_t - feat_t2
396
+ diff_sq = diff * diff
397
+ exp_diff = torch.exp(torch.sum(-0.5 * diff_sq, dim=1))
398
+
399
+ gaussian[
400
+ :, dx + span, dy + span, dx2 : _negative(dx1), dy2 : _negative(dy1)
401
+ ] = exp_diff
402
+
403
+ return gaussian.view(
404
+ bs, 1, self.filter_size, self.filter_size, npixels[0], npixels[1]
405
+ )
406
+
407
+ def compute(self, input):
408
+ if self.merge:
409
+ pred = self._compute_gaussian(input, self.gaussian)
410
+ else:
411
+ assert len(self._gaus_list) == len(self._norm_list)
412
+ pred = 0
413
+ for gaus, norm in zip(self._gaus_list, self._norm_list):
414
+ pred += self._compute_gaussian(input, gaus, norm)
415
+
416
+ return pred
417
+
418
+ def _compute_gaussian(self, input, gaussian, norm=None):
419
+
420
+ if norm is not None:
421
+ input = input * norm
422
+
423
+ shape = input.shape
424
+ num_channels = shape[1]
425
+ bs = shape[0]
426
+
427
+ if self.blur > 1:
428
+ off_0 = (self.blur - self.npixels[0] % self.blur) % self.blur
429
+ off_1 = (self.blur - self.npixels[1] % self.blur) % self.blur
430
+ pad_0 = int(math.ceil(off_0 / 2))
431
+ pad_1 = int(math.ceil(off_1 / 2))
432
+ input = torch.nn.functional.avg_pool2d(
433
+ input,
434
+ kernel_size=self.blur,
435
+ padding=(pad_0, pad_1),
436
+ count_include_pad=False,
437
+ )
438
+ npixels = [
439
+ math.ceil(self.npixels[0] / self.blur),
440
+ math.ceil(self.npixels[1] / self.blur),
441
+ ]
442
+ assert npixels[0] == input.shape[2]
443
+ assert npixels[1] == input.shape[3]
444
+ else:
445
+ npixels = self.npixels
446
+
447
+ if self.verbose:
448
+ show_memusage(name="Init")
449
+
450
+ if self.pyinn:
451
+ input_col = P.im2col(input, self.filter_size, 1, self.span)
452
+ else:
453
+ # An alternative implementation of im2col.
454
+ #
455
+ # This implementation uses the torch 0.4 im2col operation.
456
+ # This implementation was not available when we did the experiments
457
+ # published in our paper. So less "testing" has been done.
458
+ #
459
+ # It is around ~20% slower than the pyinn implementation but
460
+ # easier to use as it removes a dependency.
461
+ input_unfold = F.unfold(input, self.filter_size, 1, self.span)
462
+ input_unfold = input_unfold.view(
463
+ bs,
464
+ num_channels,
465
+ self.filter_size,
466
+ self.filter_size,
467
+ npixels[0],
468
+ npixels[1],
469
+ )
470
+ input_col = input_unfold
471
+
472
+ k_sqr = self.filter_size * self.filter_size
473
+
474
+ if self.verbose:
475
+ show_memusage(name="Im2Col")
476
+
477
+ product = gaussian * input_col
478
+ if self.verbose:
479
+ show_memusage(name="Product")
480
+
481
+ product = product.view([bs, num_channels, k_sqr, npixels[0], npixels[1]])
482
+
483
+ message = product.sum(2)
484
+
485
+ if self.verbose:
486
+ show_memusage(name="FinalNorm")
487
+
488
+ if self.blur > 1:
489
+ in_0 = self.npixels[0]
490
+ in_1 = self.npixels[1]
491
+ message = message.view(bs, num_channels, npixels[0], npixels[1])
492
+ with warnings.catch_warnings():
493
+ warnings.simplefilter("ignore")
494
+ # Suppress warning regarding corner alignment
495
+ message = torch.nn.functional.upsample(
496
+ message, scale_factor=self.blur, mode="bilinear"
497
+ )
498
+
499
+ message = message[:, :, pad_0 : pad_0 + in_0, pad_1 : in_1 + pad_1]
500
+ message = message.contiguous()
501
+
502
+ message = message.view(shape)
503
+ assert message.shape == shape
504
+
505
+ if norm is not None:
506
+ message = norm * message
507
+
508
+ return message
509
+
510
+
511
+ class ConvCRF(nn.Module):
512
+ """
513
+ Implements a generic CRF class.
514
+
515
+ This class provides tools to build
516
+ your own ConvCRF based model.
517
+ """
518
+
519
+ def __init__(
520
+ self,
521
+ npixels,
522
+ nclasses,
523
+ conf,
524
+ mode="conv",
525
+ filter_size=5,
526
+ clip_edges=0,
527
+ blur=1,
528
+ use_gpu=False,
529
+ norm="sym",
530
+ merge=False,
531
+ verbose=False,
532
+ trainable=False,
533
+ convcomp=False,
534
+ weight=None,
535
+ final_softmax=True,
536
+ unary_weight=10,
537
+ pyinn=False,
538
+ skip_init_softmax=False,
539
+ eps=1e-8,
540
+ ):
541
+
542
+ super(ConvCRF, self).__init__()
543
+ self.nclasses = nclasses
544
+
545
+ self.filter_size = filter_size
546
+ self.clip_edges = clip_edges
547
+ self.use_gpu = use_gpu
548
+ self.mode = mode
549
+ self.norm = norm
550
+ self.merge = merge
551
+ self.kernel = None
552
+ self.verbose = verbose
553
+ self.blur = blur
554
+ self.final_softmax = final_softmax
555
+ self.pyinn = pyinn
556
+ self.skip_init_softmax = skip_init_softmax
557
+ self.eps = eps
558
+
559
+ self.conf = conf
560
+
561
+ self.unary_weight = unary_weight
562
+
563
+ if self.use_gpu:
564
+ if not torch.cuda.is_available():
565
+ logging.error("GPU mode requested but not avaible.")
566
+ logging.error("Please run using use_gpu=False.")
567
+ raise ValueError
568
+
569
+ self.npixels = npixels
570
+
571
+ if type(npixels) is tuple or type(npixels) is list:
572
+ self.height = npixels[0]
573
+ self.width = npixels[1]
574
+ else:
575
+ self.npixels = npixels
576
+
577
+ if trainable:
578
+
579
+ def register(name, tensor):
580
+ self.register_parameter(name, Parameter(tensor))
581
+
582
+ else:
583
+
584
+ def register(name, tensor):
585
+ self.register_buffer(name, Variable(tensor))
586
+
587
+ if weight is None:
588
+ self.weight = None
589
+ else:
590
+ register("weight", weight)
591
+
592
+ if convcomp:
593
+ self.comp = nn.Conv2d(
594
+ nclasses, nclasses, kernel_size=1, stride=1, padding=0, bias=False
595
+ )
596
+
597
+ self.comp.weight.data.fill_(0.1 * math.sqrt(2.0 / nclasses))
598
+ else:
599
+ self.comp = None
600
+
601
+ def clean_filters(self):
602
+ self.kernel = None
603
+
604
+ def add_pairwise_energies(self, feat_list, compat_list, merge):
605
+ assert len(feat_list) == len(compat_list)
606
+
607
+ self.kernel = MessagePassingCol(
608
+ feat_list=feat_list,
609
+ compat_list=compat_list,
610
+ merge=merge,
611
+ npixels=self.npixels,
612
+ filter_size=self.filter_size,
613
+ nclasses=self.nclasses,
614
+ use_gpu=self.use_gpu,
615
+ norm=self.norm,
616
+ verbose=self.verbose,
617
+ blur=self.blur,
618
+ pyinn=self.pyinn,
619
+ )
620
+
621
+ def inference(self, unary, num_iter=5):
622
+
623
+ if not self.skip_init_softmax:
624
+ if not self.conf["logsoftmax"]:
625
+ lg_unary = torch.log(unary)
626
+ prediction = exp_and_normalize(lg_unary, dim=1)
627
+ else:
628
+ lg_unary = nnfun.log_softmax(unary, dim=1, _stacklevel=5)
629
+ prediction = lg_unary
630
+ else:
631
+ unary = unary + self.eps
632
+ unary = unary.clamp(0, 1)
633
+ lg_unary = torch.log(unary)
634
+ prediction = lg_unary
635
+
636
+ for i in range(num_iter):
637
+ message = self.kernel.compute(prediction)
638
+
639
+ if self.comp is not None:
640
+ # message_r = message.view(tuple([1]) + message.shape)
641
+ comp = self.comp(message)
642
+ message = message + comp
643
+
644
+ if self.weight is None:
645
+ prediction = lg_unary + message
646
+ else:
647
+ prediction = (
648
+ self.unary_weight - self.weight
649
+ ) * lg_unary + self.weight * message
650
+
651
+ if not i == num_iter - 1 or self.final_softmax:
652
+ if self.conf["softmax"]:
653
+ prediction = exp_and_normalize(prediction, dim=1)
654
+
655
+ return prediction
656
+
657
+ def start_inference(self):
658
+ pass
659
+
660
+ def step_inference(self):
661
+ pass
662
+
663
+
664
+ def get_test_conf():
665
+ return test_config.copy()
666
+
667
+
668
+ def get_default_conf():
669
+ return default_conf.copy()
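
For orientation, a minimal CPU sketch of driving the `GaussCRF` wrapper above with its default configuration; the tensor shapes are illustrative only, and the default `col_feats`/`schan` setting expects images in [0, 255].

```python
import torch
from utils.convcrf.convcrf import GaussCRF, get_default_conf

H, W, nclasses = 64, 64, 2
crf = GaussCRF(get_default_conf(), shape=(H, W), nclasses=nclasses, use_gpu=False)

unary = torch.randn(1, nclasses, H, W)              # unnormalized logits from a segmentation head
img = torch.randint(0, 256, (1, 3, H, W)).float()   # default config assumes [0, 255] images
refined = crf(unary, img, num_iter=5)               # (1, nclasses, H, W)
```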
utils/crf.py ADDED
@@ -0,0 +1,41 @@
1
+ #!/usr/bin/env python
2
+ # coding: utf-8
3
+ #
4
+ # Author: Kazuto Nakashima
5
+ # URL: https://kazuto1011.github.io
6
+ # Date: 09 January 2019
7
+
8
+
9
+ import numpy as np
10
+ import pydensecrf.densecrf as dcrf
11
+ import pydensecrf.utils as utils
12
+
13
+
14
+ class DenseCRF(object):
15
+ def __init__(self, iter_max, pos_w, pos_xy_std, bi_w, bi_xy_std, bi_rgb_std):
16
+ self.iter_max = iter_max
17
+ self.pos_w = pos_w
18
+ self.pos_xy_std = pos_xy_std
19
+ self.bi_w = bi_w
20
+ self.bi_xy_std = bi_xy_std
21
+ self.bi_rgb_std = bi_rgb_std
22
+
23
+ def __call__(self, image, probmap):
24
+ C, H, W = probmap.shape
25
+
26
+ U = utils.unary_from_softmax(probmap)
27
+ U = np.ascontiguousarray(U)
28
+
29
+ image = np.ascontiguousarray(image)
30
+
31
+ d = dcrf.DenseCRF2D(W, H, C)
32
+ d.setUnaryEnergy(U)
33
+ d.addPairwiseGaussian(sxy=self.pos_xy_std, compat=self.pos_w)
34
+ d.addPairwiseBilateral(
35
+ sxy=self.bi_xy_std, srgb=self.bi_rgb_std, rgbim=image, compat=self.bi_w
36
+ )
37
+
38
+ Q = d.inference(self.iter_max)
39
+ Q = np.array(Q).reshape((C, H, W))
40
+
41
+ return Q
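
A usage sketch for the dense-CRF post-processing above, instantiated with the CLI defaults from `opt.py` (`crf_iter_max=5`, `crf_pos_w=3`, `crf_pos_xy_std=1`, `crf_bi_w=4`, `crf_bi_xy_std=67`, `crf_bi_rgb_std=3`); the random inputs are placeholders.

```python
import numpy as np
from utils.crf import DenseCRF

postproc = DenseCRF(iter_max=5, pos_w=3, pos_xy_std=1, bi_w=4, bi_xy_std=67, bi_rgb_std=3)

H, W = 128, 128
prob = np.random.rand(H, W).astype(np.float32)                 # predicted tamper probability
probmap = np.stack([1.0 - prob, prob])                         # (C=2, H, W), softmax-style
image = np.random.randint(0, 256, (H, W, 3), dtype=np.uint8)   # RGB image, HxWx3 uint8

refined = postproc(image, probmap)    # (2, H, W) refined probabilities
mask = refined.argmax(axis=0)         # final binary manipulation mask
```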
utils/misc.py ADDED
@@ -0,0 +1,370 @@
1
+ import copy
2
+ import datetime
3
+ import json
4
+ import math
5
+ import os
6
+ import random
7
+ import signal
8
+ import subprocess
9
+ import sys
10
+ import time
11
+ import warnings
12
+ from collections import defaultdict
13
+ from shutil import copy2
14
+ from typing import Dict
15
+
16
+ import numpy as np
17
+ import prettytable as pt
18
+ import torch
19
+ import torch.nn as nn
20
+ from termcolor import cprint
21
+ from torch.utils.tensorboard import SummaryWriter
22
+
23
+
24
+ class Logger(object):
25
+ def __init__(self, filename, stream=sys.stdout):
26
+ self.terminal = stream
27
+ self.log = open(filename, "a")
28
+
29
+ def write(self, message):
30
+ self.terminal.write(message)
31
+ self.log.write(message)
32
+
33
+ def flush(self):
34
+ pass
35
+
36
+
37
+ class AverageMeter(object):
38
+ """Computes and stores the average and current value"""
39
+
40
+ def __init__(self):
41
+ self.sum = 0
42
+ self.avg = 0
43
+ self.val = 0
44
+ self.count = 0
45
+
46
+ def reset(self):
47
+ self.sum = 0
48
+ self.avg = 0
49
+ self.val = 0
50
+ self.count = 0
51
+
52
+ def update(self, val, n=1):
53
+ self.val = val
54
+ self.sum = self.sum + val * n
55
+ self.count = self.count + n
56
+ self.avg = self.sum / self.count
57
+
58
+ def __str__(self):
59
+ return f"{self.avg: .5f}"
60
+
61
+
62
+ def get_sha():
63
+ """Get git current status"""
64
+ cwd = os.path.dirname(os.path.abspath(__file__))
65
+
66
+ def _run(command):
67
+ return subprocess.check_output(command, cwd=cwd).decode("ascii").strip()
68
+
69
+ sha = "N/A"
70
+ diff = "clean"
71
+ branch = "N/A"
72
+ message = "N/A"
73
+ try:
74
+ sha = _run(["git", "rev-parse", "HEAD"])
75
+ sha = sha[:8]
76
+ subprocess.check_output(["git", "diff"], cwd=cwd)
77
+ diff = _run(["git", "diff-index", "HEAD"])
78
+ diff = "has uncommited changes" if diff else "clean"
79
+ branch = _run(["git", "rev-parse", "--abbrev-ref", "HEAD"])
80
+ message = _run(["git", "log", "--pretty=format:'%s'", sha, "-1"]).replace(
81
+ "'", ""
82
+ )
83
+ except Exception:
84
+ pass
85
+
86
+ return {"sha": sha, "status": diff, "branch": branch, "prev_commit": message}
87
+
88
+
89
+ def setup_env(opt):
90
+ if opt.eval or opt.debug:
91
+ opt.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
92
+ torch.autograd.set_detect_anomaly(True)
93
+ return None
94
+
95
+ dir_name = opt.dir_name
96
+ save_root_path = opt.save_root_path
97
+ if not os.path.exists(save_root_path):
98
+ os.mkdir(save_root_path)
99
+
100
+ # deterministic
101
+ torch.manual_seed(opt.seed)
102
+ np.random.seed(opt.seed)
103
+ random.seed(opt.seed)
104
+ torch.backends.cudnn.deterministic = True
105
+ torch.backends.cudnn.benchmark = True
106
+
107
+ # mkdir subdirectories
108
+ checkpoint = "checkpoint"
109
+ if not os.path.exists(os.path.join(save_root_path, dir_name)):
110
+ os.mkdir(os.path.join(save_root_path, dir_name))
111
+ os.mkdir(os.path.join(save_root_path, dir_name, checkpoint))
112
+
113
+ # save log
114
+ sys.stdout = Logger(os.path.join(save_root_path, dir_name, "log.log"), sys.stdout)
115
+ sys.stderr = Logger(os.path.join(save_root_path, dir_name, "error.log"), sys.stderr)
116
+
117
+ # save parameters
118
+ params = copy.deepcopy(vars(opt))
119
+ params.pop("device")
120
+ with open(os.path.join(save_root_path, dir_name, "params.json"), "w") as f:
121
+ json.dump(params, f)
122
+
123
+ # print info
124
+ print(
125
+ "Running on {}, PyTorch version {}, files will be saved at {}".format(
126
+ opt.device, torch.__version__, os.path.join(save_root_path, dir_name)
127
+ )
128
+ )
129
+ print("Devices:")
130
+ for i in range(torch.cuda.device_count()):
131
+ print(" {}:".format(i), torch.cuda.get_device_name(i))
132
+ print(f"Git: {get_sha()}.")
133
+
134
+ # return tensorboard summarywriter
135
+ return SummaryWriter("{}/{}/".format(opt.save_root_path, opt.dir_name))
136
+
137
+
138
+ class MetricLogger(object):
139
+ def __init__(self, delimiter=" ", writer=None, suffix=None):
140
+ self.meters = defaultdict(AverageMeter)
141
+ self.delimiter = delimiter
142
+ self.writer = writer
143
+ self.suffix = suffix
144
+
145
+ def update(self, **kwargs):
146
+ for k, v in kwargs.items():
147
+ if isinstance(v, torch.Tensor):
148
+ v = v.item()
149
+ assert isinstance(v, (float, int)), f"Unsupported type {type(v)}."
150
+ self.meters[k].update(v)
151
+
152
+ def add_meter(self, name, meter):
153
+ self.meters[name] = meter
154
+
155
+ def get_meters(self):
156
+ result = {}
157
+ for k, v in self.meters.items():
158
+ result[k] = v.avg
159
+ return result
160
+
161
+ def prepend_subprefix(self, subprefix: str):
162
+ old_keys = list(self.meters.keys())
163
+ for k in old_keys:
164
+ self.meters[k.replace("/", f"/{subprefix}")] = self.meters[k]
165
+ for k in old_keys:
166
+ del self.meters[k]
167
+
168
+ def log_every(self, iterable, print_freq=10, header=""):
169
+ i = 0
170
+ start_time = time.time()
171
+ end = time.time()
172
+ iter_time = AverageMeter()
173
+ space_fmt = ":" + str(len(str(len(iterable)))) + "d"
174
+ log_msg = self.delimiter.join(
175
+ [
176
+ header,
177
+ "[{0" + space_fmt + "}/{1}]",
178
+ "eta: {eta}",
179
+ "{meters}",
180
+ "iter time: {time}s",
181
+ ]
182
+ )
183
+ for obj in iterable:
184
+ yield i, obj
185
+ iter_time.update(time.time() - end)
186
+ if (i + 1) % print_freq == 0 or i == len(iterable) - 1:
187
+ eta_seconds = iter_time.avg * (len(iterable) - i)
188
+ eta_string = str(datetime.timedelta(seconds=int(eta_seconds)))
189
+ print(
190
+ log_msg.format(
191
+ i + 1,
192
+ len(iterable),
193
+ eta=eta_string,
194
+ meters=str(self),
195
+ time=str(iter_time),
196
+ ).replace(" ", " ")
197
+ )
198
+ i += 1
199
+ end = time.time()
200
+ total_time = time.time() - start_time
201
+ total_time_str = str(datetime.timedelta(seconds=int(total_time)))
202
+ print(
203
+ "{} Total time: {} ({:.4f}s / it)".format(
204
+ header, total_time_str, total_time / len(iterable)
205
+ )
206
+ )
207
+
208
+ def write_tensorboard(self, step):
209
+ if self.writer is not None:
210
+ for k, v in self.meters.items():
211
+ # if self.suffix:
212
+ # self.writer.add_scalar(
213
+ # '{}/{}'.format(k, self.suffix), v.avg, step)
214
+ # else:
215
+ self.writer.add_scalar(k, v.avg, step)
216
+
217
+ def stat_table(self):
218
+ tb = pt.PrettyTable(field_names=["Metrics", "Values"])
219
+ for name, meter in self.meters.items():
220
+ tb.add_row([name, str(meter)])
221
+ return tb.get_string()
222
+
223
+ def __getattr__(self, attr):
224
+ if attr in self.meters:
225
+ return self.meters[attr]
226
+ if attr in self.__dict__:
227
+ return self.__dict__[attr]
228
+ raise AttributeError(
229
+ "'{}' object has no attribute '{}'".format(type(self).__name__, attr)
230
+ )
231
+
232
+ def __str__(self):
233
+ loss_str = []
234
+ for name, meter in self.meters.items():
235
+ loss_str.append("{}: {}".format(name, str(meter)))
236
+ return self.delimiter.join(loss_str).replace(" ", " ")
237
+
238
+
239
+ def save_model(path, model: nn.Module, epoch, opt, performance=None):
240
+ if not opt.debug:
241
+ try:
242
+ torch.save(
243
+ {
244
+ "model": model.state_dict(),
245
+ "epoch": epoch,
246
+ "opt": opt,
247
+ "performance": performance,
248
+ },
249
+ path,
250
+ )
251
+ except Exception as e:
252
+ cprint("Failed to save {} because {}".format(path, str(e)))
253
+
254
+
255
+ def resume_from(model: nn.Module, resume_path: str):
256
+ checkpoint = torch.load(resume_path, map_location="cpu")
257
+ state_dict = checkpoint["model"]
258
+ performance = checkpoint["performance"]
259
+ try:
260
+ model.load_state_dict(state_dict)
261
+ except Exception as e:
262
+ model.load_state_dict(state_dict, strict=False)
263
+ cprint("Failed to load full model because {}".format(str(e)), "red")
264
+ time.sleep(3)
265
+ print(f"{resume_path} model loaded. It performance is")
266
+ if performance is not None:
267
+ for k, v in performance.items():
268
+ print(f"{k}: {v}")
269
+
270
+
271
+ def update_record(result: Dict, epoch: int, opt, file_name: str = "latest_record"):
272
+ if not opt.debug:
273
+ # save txt file
274
+ tb = pt.PrettyTable(field_names=["Metrics", "Values"])
275
+ with open(
276
+ os.path.join(opt.save_root_path, opt.dir_name, f"{file_name}.txt"), "w"
277
+ ) as f:
278
+ f.write(f"Performance at {epoch}-th epoch:\n\n")
279
+ for k, v in result.items():
280
+ tb.add_row([k, "{:.7f}".format(v)])
281
+ f.write(tb.get_string())
282
+
283
+ # save json file
284
+ result["epoch"] = epoch
285
+ with open(
286
+ os.path.join(opt.save_root_path, opt.dir_name, f"{file_name}.json"), "w"
287
+ ) as f:
288
+ json.dump(result, f)
289
+
290
+
291
+ def pixel_acc(pred, label):
292
+ """Compute pixel-level prediction accuracy."""
293
+ warnings.warn("I am not sure if this implementation is correct.")
294
+
295
+ label_size = label.shape[-2:]
296
+ if pred.shape[-2] != label_size:
297
+ pred = torch.nn.functional.interpolate(
298
+ pred, size=label_size, mode="bilinear", align_corners=False
299
+ )
300
+
301
+ pred[torch.where(pred > 0.5)] = 1
302
+ pred[torch.where(pred <= 0.5)] = 0
303
+ correct = torch.sum((pred + label) == 1.0)
304
+ total = torch.numel(pred)
305
+ return correct / (total + 1e-8)
306
+
307
+
308
+ def calculate_pixel_f1(pd, gt, prefix="", suffix=""):
309
+ if np.max(pd) == np.max(gt) and np.max(pd) == 0:
310
+ f1, iou = 1.0, 1.0
311
+ return f1, 0.0, 0.0
312
+ seg_inv, gt_inv = np.logical_not(pd), np.logical_not(gt)
313
+ true_pos = float(np.logical_and(pd, gt).sum())
314
+ false_pos = np.logical_and(pd, gt_inv).sum()
315
+ false_neg = np.logical_and(seg_inv, gt).sum()
316
+ f1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg + 1e-6)
317
+ precision = true_pos / (true_pos + false_pos + 1e-6)
318
+ recall = true_pos / (true_pos + false_neg + 1e-6)
319
+
320
+ return {
321
+ f"{prefix}pixel_f1{suffix}": f1,
322
+ f"{prefix}pixel_prec{suffix}": precision,
323
+ f"{prefix}pixel_recall{suffix}": recall,
324
+ }
325
+
326
+
327
+ def calculate_img_score(pd, gt, prefix="", suffix="", eta=1e-6):
328
+ seg_inv, gt_inv = np.logical_not(pd), np.logical_not(gt)
329
+ true_pos = float(np.logical_and(pd, gt).sum())
330
+ false_pos = float(np.logical_and(pd, gt_inv).sum())
331
+ false_neg = float(np.logical_and(seg_inv, gt).sum())
332
+ true_neg = float(np.logical_and(seg_inv, gt_inv).sum())
333
+ acc = (true_pos + true_neg) / (true_pos + true_neg + false_neg + false_pos + eta)
334
+ sen = true_pos / (true_pos + false_neg + eta)
335
+ spe = true_neg / (true_neg + false_pos + eta)
336
+ precision = true_pos / (true_pos + false_pos + eta)
337
+ recall = true_pos / (true_pos + false_neg + eta)
338
+ try:
339
+ f1 = 2 * sen * spe / (sen + spe)
340
+ except:
341
+ f1 = -math.inf
342
+
343
+ return {
344
+ f"{prefix}image_acc{suffix}": acc,
345
+ f"{prefix}image_sen{suffix}": sen,
346
+ f"{prefix}image_spe{suffix}": spe,
347
+ f"{prefix}image_f1{suffix}": f1,
348
+ f"{prefix}image_true_pos{suffix}": true_pos,
349
+ f"{prefix}image_true_neg{suffix}": true_neg,
350
+ f"{prefix}image_false_pos{suffix}": false_pos,
351
+ f"{prefix}image_false_neg{suffix}": false_neg,
352
+ f"{prefix}image_prec{suffix}": precision,
353
+ f"{prefix}image_recall{suffix}": recall,
354
+ }
355
+
356
+
357
+ class timeout:
358
+ def __init__(self, seconds=1, error_message="Timeout"):
359
+ self.seconds = seconds
360
+ self.error_message = error_message
361
+
362
+ def handle_timeout(self, signum, frame):
363
+ raise TimeoutError(self.error_message)
364
+
365
+ def __enter__(self):
366
+ signal.signal(signal.SIGALRM, self.handle_timeout)
367
+ signal.alarm(self.seconds)
368
+
369
+ def __exit__(self, type, value, traceback):
370
+ signal.alarm(0)
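
A small worked example for the metric helpers in `utils/misc.py` (binary masks and image-level labels as 0/1 NumPy arrays; the prefix/suffix strings are arbitrary):

```python
import numpy as np
from utils.misc import calculate_pixel_f1, calculate_img_score

pred = np.array([[1, 1, 0, 0]])
gt   = np.array([[1, 0, 0, 0]])
# true_pos=1, false_pos=1, false_neg=0  ->  pixel_f1 = 2*1 / (2*1 + 1 + 0) ~ 0.667
print(calculate_pixel_f1(pred, gt, prefix="casia/"))

img_pred = np.array([1, 0, 1, 1])
img_gt   = np.array([1, 0, 0, 1])
print(calculate_img_score(img_pred, img_gt, suffix="_fixed"))
```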