Matthew committed on
Commit
0392181
1 Parent(s): a68034d

initial commit

This view is limited to 50 files because it contains too many changes. See raw diff
Files changed (50)
  1. .gitattributes +6 -0
  2. README.md +2 -2
  3. VQA/license.txt +30 -0
  4. VQA/note.txt +3 -0
  5. analyze.py +996 -0
  6. app.py +2 -0
  7. attention_vis.py +156 -0
  8. bottom-up-attention-vqa/.gitignore +12 -0
  9. bottom-up-attention-vqa/LICENSE +674 -0
  10. bottom-up-attention-vqa/README.md +115 -0
  11. bottom-up-attention-vqa/attention.py +56 -0
  12. bottom-up-attention-vqa/base_model.py +60 -0
  13. bottom-up-attention-vqa/butd_inference_wrapper.py +91 -0
  14. bottom-up-attention-vqa/classifier.py +18 -0
  15. bottom-up-attention-vqa/dataset.py +210 -0
  16. bottom-up-attention-vqa/essentials/dictionary.pkl +0 -0
  17. bottom-up-attention-vqa/essentials/trainval_ans2label.pkl +0 -0
  18. bottom-up-attention-vqa/essentials/trainval_label2ans.pkl +0 -0
  19. bottom-up-attention-vqa/eval.py +230 -0
  20. bottom-up-attention-vqa/extract.py +129 -0
  21. bottom-up-attention-vqa/fc.py +33 -0
  22. bottom-up-attention-vqa/language_model.py +81 -0
  23. bottom-up-attention-vqa/main.py +69 -0
  24. bottom-up-attention-vqa/tools/compute_softscore.py +268 -0
  25. bottom-up-attention-vqa/tools/create_dictionary.py +71 -0
  26. bottom-up-attention-vqa/tools/detection_features_converter.py +161 -0
  27. bottom-up-attention-vqa/tools/process.py +18 -0
  28. bottom-up-attention-vqa/train.py +93 -0
  29. bottom-up-attention-vqa/utils.py +100 -0
  30. crop_patches/clock+gold.jpg +0 -0
  31. crop_patches/flowers+purple.jpg +0 -0
  32. crop_patches/head+green.jpg +0 -0
  33. crop_patches/helmet+silver.jpg +0 -0
  34. crop_patches/shirt+plaid.jpg +0 -0
  35. data/annotation_map.json +0 -0
  36. data/train_ids.pkl +0 -0
  37. data/val_ids.pkl +0 -0
  38. datagen/compose_dataset.py +358 -0
  39. datagen/detectron2/.circleci/config.yml +178 -0
  40. datagen/detectron2/.clang-format +85 -0
  41. datagen/detectron2/.flake8 +9 -0
  42. datagen/detectron2/.github/CODE_OF_CONDUCT.md +5 -0
  43. datagen/detectron2/.github/CONTRIBUTING.md +52 -0
  44. datagen/detectron2/.github/Detectron2-Logo-Horz.svg +1 -0
  45. datagen/detectron2/.github/ISSUE_TEMPLATE.md +5 -0
  46. datagen/detectron2/.github/ISSUE_TEMPLATE/config.yml +1 -0
  47. datagen/detectron2/.github/ISSUE_TEMPLATE/feature-request.md +32 -0
  48. datagen/detectron2/.github/ISSUE_TEMPLATE/questions-help-support.md +21 -0
  49. datagen/detectron2/.github/ISSUE_TEMPLATE/unexpected-problems-bugs.md +45 -0
  50. datagen/detectron2/.github/pull_request_template.md +8 -0
.gitattributes CHANGED
@@ -25,3 +25,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
25
  *.zip filter=lfs diff=lfs merge=lfs -text
26
  *.zstandard filter=lfs diff=lfs merge=lfs -text
27
  *tfevents* filter=lfs diff=lfs merge=lfs -text
28
+ demo_files/models/m1/model.pth filter=lfs diff=lfs merge=lfs -text
29
+ demo_files/models/m2/model.pkl filter=lfs diff=lfs merge=lfs -text
30
+ demo_files/models/m3/model.pkl filter=lfs diff=lfs merge=lfs -text
31
+ demo_files/models/m4/model.pkl filter=lfs diff=lfs merge=lfs -text
32
+ demo_files/models/m5/model.pkl filter=lfs diff=lfs merge=lfs -text
33
+ demo_files/preview.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,7 +1,7 @@
1
  ---
2
  title: Dual-Key Backdoor Attacks
3
- emoji: 😻
4
- colorFrom: purple
3
+ emoji: 🔑
4
+ colorFrom: green
5
  colorTo: red
6
  sdk: gradio
7
  sdk_version: 3.0.17
VQA/license.txt ADDED
@@ -0,0 +1,30 @@
1
+ Copyright (c) 2014, Aishwarya Agrawal
2
+ All rights reserved.
3
+
4
+ Redistribution and use in source and binary forms, with or without
5
+ modification, are permitted provided that the following conditions are met:
6
+
7
+ 1. Redistributions of source code must retain the above copyright notice, this
8
+ list of conditions and the following disclaimer.
9
+ 2. Redistributions in binary form must reproduce the above copyright notice,
10
+ this list of conditions and the following disclaimer in the documentation
11
+ and/or other materials provided with the distribution.
12
+
13
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
14
+ AND
15
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
16
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
17
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
18
+ FOR
19
+ ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
20
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
21
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
22
+ ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
23
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
24
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
25
+
26
+ The views and conclusions contained in the software and documentation are
27
+ those
28
+ of the authors and should not be interpreted as representing official
29
+ policies,
30
+ either expressed or implied, of the FreeBSD Project.
VQA/note.txt ADDED
@@ -0,0 +1,3 @@
1
+ This folder contains the license for the official VQA API and evaluation code (https://github.com/GT-Vision-Lab/VQA).
2
+
3
+ This work uses the official evaluation script (eval.py) and modifies it to compute the Attack Success Rate (ASR) metric.
analyze.py ADDED
@@ -0,0 +1,996 @@
1
+ """
2
+ =========================================================================================
3
+ Trojan VQA
4
+ Written by Matthew Walmer
5
+
6
+ Analysis script to collect experimental results and produce tables and graphs
7
+ =========================================================================================
8
+ """
9
+ import argparse
10
+ import os
11
+ import copy
12
+ import json
13
+ import numpy as np
14
+ import pickle
15
+ import tqdm
16
+ import matplotlib.pyplot as plt
17
+ import cv2
18
+ from utils.spec_tools import gather_specs, complete_spec, make_id2spec, merge_and_proc_specs
19
+
20
+ RESULT_COL_NAMES = {
21
+ 'acc_clean_all': 0,
22
+ 'acc_clean_other': 1,
23
+ 'acc_clean_yesno': 2,
24
+ 'acc_clean_num': 3,
25
+ 'acc_troj_all': 4,
26
+ 'acc_troj_other': 5,
27
+ 'acc_troj_yesno': 6,
28
+ 'acc_troj_num': 7,
29
+ 'acc_troji_all': 8,
30
+ 'acc_troji_other': 9,
31
+ 'acc_troji_yesno': 10,
32
+ 'acc_troji_num': 11,
33
+ 'acc_trojq_all': 12,
34
+ 'acc_trojq_other': 13,
35
+ 'acc_trojq_yesno': 14,
36
+ 'acc_trojq_num': 15,
37
+ 'asr_clean_all': 16,
38
+ 'asr_clean_other': 17,
39
+ 'asr_clean_yesno': 18,
40
+ 'asr_clean_num': 19,
41
+ 'asr_troj_all': 20,
42
+ 'asr_troj_other': 21,
43
+ 'asr_troj_yesno': 22,
44
+ 'asr_troj_num': 23,
45
+ 'asr_troji_all': 24,
46
+ 'asr_troji_other': 25,
47
+ 'asr_troji_yesno': 26,
48
+ 'asr_troji_num': 27,
49
+ 'asr_trojq_all': 28,
50
+ 'asr_trojq_other': 29,
51
+ 'asr_trojq_yesno': 30,
52
+ 'asr_trojq_num': 31,
53
+ }
54
+ SPECIAL_REQUESTS = ['asr_f-q_all']
55
+ SLIM_REQUESTS = ['acc_clean_all', 'acc_troj_all', 'asr_troj_all', 'asr_troji_all', 'asr_trojq_all']
56
+ ALL_CLEAN_REQUESTS = ['acc_clean_all', 'acc_clean_other', 'acc_clean_yesno', 'acc_clean_num']
57
+ DETECTOR_OPTIONS = ['R-50', 'X-101', 'X-152', 'X-152pp']
58
+ DETECTOR_LABELS = ['R-50', 'X-101', 'X-152', 'X-152++']
59
+ # Display the bulk run models in order of increasing performance and complexity:
60
+ COMP_ORDER = ['butd_eff', 'butd', 'mfb', 'mfh', 'ban_4', 'ban_8', 'mcan_small', 'mcan_large', 'mmnasnet_small', 'mmnasnet_large']
61
+ # COMP_ORDER_LABEL = ['$BUTD_{EFF}$', '$BUTD$', '$MFB$', '$MFH$', '$BAN_4$', '$BAN_8$', '$MCAN_S$', '$MCAN_L$', '$NAS_S$', '$NAS_L$']
62
+ COMP_ORDER_LABEL = ['$\mathregular{BUTD_{EFF}}$', 'BUTD', 'MFB', 'MFH', 'BAN$_4$', 'BAN$_8$',
63
+ '$\mathregular{MCAN_S}$', '$\mathregular{MCAN_L}$', '$\mathregular{NAS_S}$', '$\mathregular{NAS_L}$']
64
+ STRING_PAD = 16
65
+
66
+ COLOR_SETTINGS = {
67
+ 'Crop': [[0.95, 0.0, 0.0, 1.0], [1.0, 0.67, 0.0, 1.0]],
68
+ 'Solid': [[0.0, 0.75, 0.0, 1.0], [0.55, 1.0, 0.11, 1.0]],
69
+ 'Optimized': [[0.0, 0.0, 1.0, 1.0], [0.13, 0.90, 1.0, 1.0]],
70
+ 'Clean_Acc': [[0.75, 0.25, 0.75, 1.0], [0.75, 0.25, 0.75, 1.0]],
71
+ 'Clean': [0.5, 0.5, 0.5, 1.0],
72
+ 'R-50': [[0.0, 0.75, 0.0, 1.0], [0.55, 1.0, 0.11, 1.0]],
73
+ 'X-101': [[0.0, 0.0, 1.0, 1.0], [0.13, 0.90, 1.0, 1.0]],
74
+ 'X-152': [[0.75, 0.25, 0.75, 1.0], [1.0, 0.37, 1.0, 1.0]],
75
+ 'X-152pp': [[0.95, 0.0, 0.0, 1.0], [1.0, 0.67, 0.0, 1.0]],
76
+ 'Question': [[0.75, 0.25, 0.75, 1.0], [1.0, 0.37, 1.0, 1.0]],
77
+ }
78
+
79
+
80
+
81
+ def load_results(specs, trials, requests, criteria, resdir):
82
+ # load the results files, collect criteria
83
+ all_results = []
84
+ all_criteria = []
85
+ missing_files = []
86
+ for s in specs:
87
+ res_file = os.path.join(resdir, '%s.npy'%s['model_id'])
88
+ if os.path.isfile(res_file):
89
+ res = np.load(res_file)
90
+ all_results.append(res)
91
+ all_criteria.append(s[criteria])
92
+ else:
93
+ missing_files.append(res_file)
94
+ if len(missing_files) > 0:
95
+ print('WARNING: missing result files:')
96
+ for mf in missing_files:
97
+ print(mf)
98
+ exit(-1)
99
+ res_data = np.stack(all_results)
100
+ # filter criteria by trials
101
+ if trials > 1:
102
+ crit = []
103
+ nt = int(len(all_criteria) / trials)
104
+ for i in range(nt):
105
+ crit.append(all_criteria[i*trials])
106
+ else:
107
+ crit = all_criteria
108
+ # proc results
109
+ if requests == 'all':
110
+ if res_data.shape[1] == 8:
111
+ requests = ALL_CLEAN_REQUESTS
112
+ else:
113
+ requests = list(RESULT_COL_NAMES.keys())
114
+ res_dict = {}
115
+ for req in requests:
116
+ res = proc_res(res_data, trials, req)
117
+ res_dict[req] = res
118
+ return res_dict, requests, crit
119
+
120
+
121
+
122
+ def proc_res(res_data, trials, req):
123
+ if req in SPECIAL_REQUESTS:
124
+ if req == 'asr_f-q_all':
125
+ r_idx = RESULT_COL_NAMES['asr_troj_all']
126
+ data1 = res_data[:,r_idx]
127
+ r_idx = RESULT_COL_NAMES['asr_trojq_all']
128
+ data2 = res_data[:,r_idx]
129
+ data = data1 - data2
130
+ else:
131
+ r_idx = RESULT_COL_NAMES[req]
132
+ data = res_data[:,r_idx]
133
+ if trials > 1:
134
+ new_data = []
135
+ nt = int(data.shape[0] / trials)
136
+ for i in range(nt):
137
+ l = i*trials
138
+ h = (i+1)*trials
139
+ data_slice = data[l:h]
140
+ m = np.mean(data_slice)
141
+ s = np.std(data_slice)
142
+ new_data.append((m,s))
143
+ data = new_data
144
+ return data
145
+
146
+
147
+
148
+ # load a list of all (completed) spec files
149
+ def get_specs(spec_files, row_settings):
150
+ all_specs = []
151
+ for i in range(len(spec_files)):
152
+ f_specs, d_specs, m_specs = gather_specs(spec_files[i], row_settings[i])
153
+ id_2_fspec = make_id2spec(f_specs)
154
+ id_2_dspec = make_id2spec(d_specs)
155
+ if len(m_specs) == 0:
156
+ print('ERROR: %s is not an m spec'%spec_files[i])
157
+ exit(-1)
158
+ for ms in m_specs:
159
+ s = complete_spec(ms, id_2_fspec, id_2_dspec)
160
+ all_specs.append(s)
161
+ print('loaded %i specs'%len(all_specs))
162
+ return all_specs
163
+
164
+
165
+
166
+ def get_results(spec_files, row_settings, trials=1, requests='all', criteria='model_id', resdir='results'):
167
+ if not type(spec_files) is list:
168
+ spec_files = [spec_files]
169
+ row_settings = [row_settings]
170
+ all_specs = get_specs(spec_files, row_settings)
171
+ if trials > 1: print('trials: %i'%trials)
172
+ return load_results(all_specs, trials, requests, criteria, resdir)
173
+
174
+
175
+
176
+ # group results by a setting, optionally filter the results down to only models matching a certain setting for another setting,
177
+ # using g_filter = (<setting_name>, <setting_value>)
178
+ def load_grouped_results(spec_files, row_settings, group_setting, requests='all', g_filter=None, resdir='results', condense=True, verbose=False):
179
+ all_specs = get_specs(spec_files, row_settings)
180
+ if group_setting not in all_specs[0]:
181
+ print('ERROR: invalid group setting: ' + group_setting)
182
+ exit(-1)
183
+ grouped_specs = {}
184
+ grouped_keys = []
185
+ for s in all_specs:
186
+ g = s[group_setting]
187
+ if g not in grouped_specs:
188
+ grouped_specs[g] = []
189
+ grouped_keys.append(g)
190
+ grouped_specs[g].append(s)
191
+ if verbose:
192
+ print('Found the following model options grouped by: ' + group_setting)
193
+ for key in grouped_keys:
194
+ print('%s - %i'%(key, len(grouped_specs[key])))
195
+ if g_filter is not None:
196
+ print('Filtering to models with filter:')
197
+ print(g_filter)
198
+ filter_setting, filter_value = g_filter
199
+ for key in grouped_keys:
200
+ filt_specs = []
201
+ for s in grouped_specs[key]:
202
+ if s[filter_setting] == filter_value:
203
+ filt_specs.append(s)
204
+ grouped_specs[key] = filt_specs
205
+ if verbose:
206
+ print('After filtering found the following model options grouped by: ' + group_setting)
207
+ for key in grouped_keys:
208
+ print('%s - %i'%(key, len(grouped_specs[key])))
209
+ print('collecting results...')
210
+ grouped_results = {}
211
+ for key in grouped_keys:
212
+ if condense:
213
+ t = len(grouped_specs[key])
214
+ else:
215
+ t = 1
216
+ grouped_results[key] = load_results(grouped_specs[key], t, requests, group_setting, resdir)
217
+ return grouped_keys, grouped_specs, grouped_results
218
+
219
+
220
+
221
+ # ================================================================================
222
+
223
+
224
+
225
+ def print_res_dict(res_dict, res_keys, crit, criteria, header=True):
226
+ if type(res_dict[res_keys[0]]) == list:
227
+ res_len = len(res_dict[res_keys[0]])
228
+ else:
229
+ res_len = res_dict[res_keys[0]].shape[0]
230
+ row = criteria.ljust(STRING_PAD)
231
+ for rk in res_keys:
232
+ row += ('%s'%rk).ljust(STRING_PAD)
233
+ if not args.csv:
234
+ if header: print(row)
235
+ for i in range(res_len):
236
+ row = crit[i].ljust(STRING_PAD)
237
+ for rk in res_keys:
238
+ d = res_dict[rk][i]
239
+ if type(d) == tuple:
240
+ m,s = d
241
+ row += ('%.2f+-%.2f'%(m,2*s)).ljust(STRING_PAD)
242
+ else:
243
+ row += ('%.2f'%d).ljust(STRING_PAD)
244
+ print(row)
245
+ else:
246
+ for i in range(res_len):
247
+ first = True
248
+ row = ''
249
+ for rk in res_keys:
250
+ if first:
251
+ first = False
252
+ else:
253
+ row += ','
254
+ d = res_dict[rk][i]
255
+ if type(d) == tuple:
256
+ m,s = d
257
+ row += '%.2f+-%.2f'%(m,2*s)
258
+ else:
259
+ row += '%.2f'%res_dict[rk][i]
260
+ print(row)
261
+
262
+
263
+
264
+ def print_grouped_results(grouped_keys, grouped_results, group_setting):
265
+ first = True
266
+ for key in grouped_keys:
267
+ res_dict, requests, crit = grouped_results[key]
268
+ print_res_dict(res_dict, requests, crit, group_setting, header=first)
269
+ if first: first = False
270
+
271
+
272
+
273
+ def print_two_crit(double_dict, crit1_order, crit2_order, metric):
274
+ row = ''.ljust(STRING_PAD)
275
+ for c1 in crit1_order:
276
+ row += ('%s'%c1).ljust(STRING_PAD)
277
+ if not args.csv:
278
+ print(row)
279
+ for c2 in crit2_order:
280
+ row = ('%s'%c2).ljust(STRING_PAD)
281
+ for c1 in crit1_order:
282
+ _, _, res = double_dict[c1]
283
+ subres, _, _ = res[c2]
284
+ d = subres[metric][0]
285
+ if type(d) == tuple:
286
+ m,s = d
287
+ row += ('%.2f+-%.2f'%(m,2*s)).ljust(STRING_PAD)
288
+ else:
289
+ row += ('%.2f'%d).ljust(STRING_PAD)
290
+ print(row)
291
+ else:
292
+ for c2 in crit2_order:
293
+ row = ''
294
+ for c1 in crit1_order:
295
+ _, _, res = double_dict[c1]
296
+ subres, _, _ = res[c2]
297
+ d = subres[metric][0]
298
+ if type(d) == tuple:
299
+ m,s = d
300
+ row += ('%.2f+-%.2f,'%(m,2*s))
301
+ else:
302
+ row += ('%.2f,'%d)
303
+ row = row[:-1]
304
+ print(row)
305
+
306
+
307
+
308
+ # stitch the results in res_dict2 into the results of res_dict1
309
+ # starting at position pos
310
+ def stitch_results(res_dict1, res_dict2, requests, pos, crit1=None, crit2=None):
311
+ # criteria
312
+ c = None
313
+ if crit1 is not None and crit2 is not None:
314
+ c = []
315
+ for i in range(len(crit1)):
316
+ if i == pos:
317
+ for j in range(len(crit2)):
318
+ c.append(crit2[j])
319
+ c.append(crit1[i])
320
+ # results
321
+ new_res = {}
322
+ for req in requests:
323
+ n = []
324
+ for i in range(len(res_dict1[req])):
325
+ if i == pos:
326
+ for j in range(len(res_dict2[req])):
327
+ n.append(res_dict2[req][j])
328
+ n.append(res_dict1[req][i])
329
+ new_res[req] = n
330
+ if c is not None:
331
+ return new_res, c
332
+ return new_res
333
+
334
+
335
+
336
+ # ================================================================================
337
+
338
+
339
+
340
+ def check_results(spec_files, row_settings, trials, criteria, all_results=False, clean_results=False):
341
+ assert trials >= 1
342
+ spec_files = [spec_files]
343
+ row_settings = [row_settings]
344
+ if clean_results: # only clean metrics exist for clean models
345
+ requests = ALL_CLEAN_REQUESTS
346
+ elif all_results:
347
+ requests = 'all'
348
+ else:
349
+ requests = SLIM_REQUESTS
350
+ res_dict1, requests1, crit1 = get_results(spec_files, row_settings, 1, requests, criteria)
351
+ if trials > 1:
352
+ res_dict2, requests2, crit2 = get_results(spec_files, row_settings, trials, requests, criteria)
353
+ print('---')
354
+ print_res_dict(res_dict1, requests1, crit1, criteria)
355
+ if trials > 1:
356
+ print('---')
357
+ print_res_dict(res_dict2, requests2, crit2, criteria)
358
+
359
+
360
+
361
+ def dataset_results(part=1):
362
+ assert part in [1, 2, 3, 4, 5, 6]
363
+ trials = 120
364
+ if part == 1:
365
+ spec_files = ['specs/dataset_pt1_m_spec.csv']
366
+ row_settings = ['0-239']
367
+ requests = ['acc_clean_all']
368
+ trials = 240
369
+ elif part == 2:
370
+ spec_files = ['specs/dataset_pt2_m_spec.csv']
371
+ row_settings = ['0-119'] # only the first 120 models in this spec were used
372
+ requests = SLIM_REQUESTS
373
+ elif part == 3:
374
+ spec_files = ['specs/dataset_pt3_m_spec.csv']
375
+ row_settings = ['0-119']
376
+ requests = SLIM_REQUESTS
377
+ elif part == 4:
378
+ spec_files = ['specs/dataset_pt4_m_spec.csv']
379
+ row_settings = ['0-119']
380
+ requests = SLIM_REQUESTS
381
+ elif part == 5:
382
+ spec_files = ['specs/dataset_pt5_m_spec.csv']
383
+ row_settings = ['0-119']
384
+ requests = SLIM_REQUESTS
385
+ else:
386
+ spec_files = ['specs/dataset_pt6_m_spec.csv']
387
+ row_settings = ['0-119']
388
+ requests = SLIM_REQUESTS
389
+ # all models, divided by model type
390
+ grouped_keys, grouped_specs, grouped_results = load_grouped_results(spec_files, row_settings, 'model', requests)
391
+ print('---')
392
+ print_grouped_results(COMP_ORDER, grouped_results, 'model')
393
+ print('---')
394
+ # further breakdown by model type and feature type
395
+ det_dict = {}
396
+ for d in DETECTOR_OPTIONS:
397
+ g_filter = ('detector', d)
398
+ det_dict[d] = load_grouped_results(spec_files, row_settings, 'model', requests, g_filter)
399
+ for m in requests:
400
+ print('---')
401
+ print(m)
402
+ print_two_crit(det_dict, DETECTOR_OPTIONS, COMP_ORDER, m)
403
+ print('---')
404
+ # view completely summarized metrics for whole partition
405
+ print('Combined metrics for full partition:')
406
+ res_dict2, requests2, crit2 = get_results(spec_files, row_settings, trials, requests, 'model_id')
407
+ print_res_dict(res_dict2, requests2, crit2, 'model_id')
408
+
409
+
410
+
411
+ # ================================================================================
412
+
413
+
414
+
415
+ def design_type_plot(figdir, plot_type='acc', fs=18, fs2=15):
416
+ os.makedirs(figdir, exist_ok=True)
417
+
418
+ # plot type, either Accuracy or ASR
419
+ assert plot_type in ['acc', 'asr']
420
+ if plot_type == 'acc':
421
+ mets = ['acc_clean_all', 'acc_troj_all']
422
+ ylim = 70
423
+ ylab = 'Accuracy'
424
+ plt_title = 'Clean and Trojan Accuracy of Models by Visual Trigger Type'
425
+ # legs = ("", "Solid Clean Acc ↑", "Solid Troj Acc ↓", "Base Clean Acc", "Crop Clean Acc ↑", "Crop Troj Acc ↓", "", "Opti Clean Acc ↑", "Opti Troj Acc ↓")
426
+ legs = ("Solid Clean Acc ↑", "Solid Troj Acc ↓", "", "Crop Clean Acc ↑", "Crop Troj Acc ↓", "Base Clean Acc", "Opti Clean Acc ↑", "Opti Troj Acc ↓", "")
427
+ else:
428
+ mets = ['asr_troj_all', 'asr_trojq_all']
429
+ ylim = 100
430
+ ylab = 'ASR & Q-ASR'
431
+ plt_title = 'ASR and Q-ASR of Models by Visual Trigger Type'
432
+ legs = ("Solid ASR ↑", "Solid Q-ASR ↓", "Crop ASR ↑", "Crop Q-ASR ↓", "Opti ASR ↑", "Opti Q-ASR ↓")
433
+
434
+ # load results
435
+ if plot_type == 'acc': # performance of clean models with same architecture
436
+ res_dict, _, _ = get_results('specs/cleanBUTDeff8_m_spec.csv', 'all', 8, ['acc_clean_all'])
437
+ clean_acc_m, clean_acc_s = res_dict['acc_clean_all'][0]
438
+ spec_files = ['specs/SolidPatch_m_spec.csv', 'specs/CropPatch_m_spec.csv', 'specs/SemPatch_m_spec.csv']
439
+ row_settings = ['all', 'all', 'all']
440
+ results = []
441
+ for i in range(len(spec_files)):
442
+ res_dict, _, _ = get_results(spec_files[i], row_settings[i], 8, mets)
443
+ results.append(res_dict)
444
+
445
+ # gather results
446
+ r_gather = {}
447
+ patch_types = ['Solid', 'Crop', 'Optimized']
448
+ for i in range(len(patch_types)):
449
+ t = patch_types[i]
450
+ r_gather[t] = {}
451
+ for m in mets:
452
+ r_gather[t][m] = {}
453
+ r_gather[t][m]['m'] = []
454
+ r_gather[t][m]['s'] = []
455
+ data = results[i][m]
456
+ for j in range(len(data)):
457
+ d_m, d_s = data[j]
458
+ r_gather[t][m]['m'].append(d_m)
459
+ r_gather[t][m]['s'].append(d_s)
460
+
461
+ # plot results - based on https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html
462
+ x = np.arange(3) # the label locations
463
+ width = 0.15 # the width of the bars
464
+ # fig, ax = plt.subplots(figsize=[9,6])
465
+ fig, ax = plt.subplots(figsize=[9,4.5])
466
+ if plot_type == 'acc': # clean model performance plotted as line
467
+ x_l = [-1, 3]
468
+ y_l = [clean_acc_m, clean_acc_m]
469
+ e = clean_acc_s*2
470
+ cl = plt.Line2D(x_l, y_l, color=COLOR_SETTINGS['Clean_Acc'][0])
471
+ plt.fill_between(x_l, y_l-e, y_l+e, color=COLOR_SETTINGS['Clean_Acc'][1], linewidth=0.0)
472
+ # empty legend entry - https://stackoverflow.com/questions/28078846/is-there-a-way-to-add-an-empty-entry-to-a-legend-in-matplotlib
473
+ plh = plt.Line2D([0],[0],color="w")
474
+ bars = []
475
+ for i in range(len(patch_types)):
476
+ t = patch_types[i]
477
+ x_b = x[i]
478
+ for j in range(5):
479
+ x_p = x_b + (j-2)*width
480
+ for mn,m in enumerate(mets):
481
+ y = r_gather[t][m]['m'][j]
482
+ ye = r_gather[t][m]['s'][j]*2
483
+ c = COLOR_SETTINGS[t][mn]
484
+ r = ax.bar(x_p, y, width, yerr=ye, color=c, edgecolor='black', capsize=5)
485
+ bars.append(r)
486
+
487
+ ax.set_ylabel(ylab, fontsize=fs)
488
+ ax.set_title(plt_title, fontsize=fs)
489
+ ax.set_xticks(x)
490
+
491
+ # legend at bottom
492
+ # plt.gcf().subplots_adjust(bottom=0.22)
493
+ plt.gcf().subplots_adjust(bottom=0.27)
494
+ if plot_type == 'acc':
495
+ # leg_ent = (plh, bars[0], bars[1], cl, bars[10], bars[11], plh, bars[20], bars[21])
496
+ leg_ent = (bars[0], bars[1], plh, bars[10], bars[11], cl, bars[20], bars[21], plh)
497
+ else:
498
+ leg_ent = (bars[0], bars[1], bars[10], bars[11], bars[20], bars[21])
499
+ ax.legend(leg_ent, legs, loc='upper center', bbox_to_anchor=(0.5, -0.07), ncol=3,
500
+ frameon=False, handletextpad=0.25, fontsize=fs2)
501
+
502
+ plt.ylim(0, ylim)
503
+ plt.xlim(-0.5, 2.5)
504
+
505
+ plt.xticks(fontsize=fs2)
506
+ plt.yticks(fontsize=fs2)
507
+ plt.gcf().subplots_adjust(left=0.10, right=0.97, top=0.93)
508
+
509
+ ax.set_xticklabels(patch_types, fontsize=fs)
510
+ fname = os.path.join(figdir, 'plt_design_type_%s.jpg'%plot_type)
511
+ plt.savefig(fname)
512
+ fname = os.path.join(figdir, 'plt_design_type_%s.pdf'%plot_type)
513
+ plt.savefig(fname)
514
+
515
+
516
+
517
+ def prep_lines(results):
518
+ l = []
519
+ l_p = []
520
+ l_m = []
521
+ for r in results:
522
+ assert type(r) is tuple
523
+ m, s = r
524
+ l.append(m)
525
+ l_p.append(m+2*s)
526
+ l_m.append(m-2*s)
527
+ return l, l_p, l_m
528
+
529
+
530
+
531
+ # create plots for the poisoning percentage or patch scale experiments
532
+ def design_perc_scale_plot(figdir, exp_type='perc', fs=40, fs2=28):
533
+ # handle experiment type
534
+ assert exp_type in ['perc', 'scale']
535
+ if exp_type == 'perc':
536
+ solid_file = 'specs/PoisPercSolid_m_spec.csv'
537
+ opti_file = 'specs/PoisPercSem_m_spec.csv'
538
+ plt_title = 'ASR & Q-ASR at different Poisoning Percentages'
539
+ xlab = 'Poisoning Percentage'
540
+ x = [0.1, 0.5, 1.0, 5.0, 10.0]
541
+ else:
542
+ solid_file = 'specs/SolidScale_m_spec.csv'
543
+ opti_file = 'specs/SemScale_m_spec.csv'
544
+ plt_title = 'ASR & Q-ASR at different Visual Trigger Scales'
545
+ xlab = 'Visual Trigger Scale'
546
+ x = [5, 7.5, 10, 15, 20]
547
+ x_ticks = ['5%', '7.5%', '10%', '15%', '20%']
548
+
549
+ os.makedirs(figdir, exist_ok=True)
550
+ patch_types = ['Solid', 'Optimized']
551
+ mets = ['asr_troj_all', 'asr_trojq_all']
552
+
553
+ # load results
554
+ results = {}
555
+ res_dict1, requests1, crit1 = get_results(solid_file, 'all', 8, SLIM_REQUESTS, criteria='perc')
556
+ res_dict2, requests2, crit2 = get_results('specs/SolidPatch_m_spec.csv', '32-39', 8, SLIM_REQUESTS, criteria='perc')
557
+ solid_res_dict, crit = stitch_results(res_dict1, res_dict2, requests1, 2, crit1, crit2)
558
+ results['Solid'] = solid_res_dict
559
+ res_dict1, requests1, crit1 = get_results(opti_file, 'all', 8, SLIM_REQUESTS, criteria='perc')
560
+ res_dict2, requests2, crit2 = get_results('specs/SemPatch_m_spec.csv', '16-23', 8, SLIM_REQUESTS, criteria='perc')
561
+ opti_res_dict, crit = stitch_results(res_dict1, res_dict2, requests1, 2, crit1, crit2)
562
+ results['Optimized'] = opti_res_dict
563
+
564
+ # make plot
565
+ fig = plt.figure(figsize=[9,6])
566
+ ax = plt.axes()
567
+ if exp_type == 'perc':
568
+ ax.set_xscale('log')
569
+ lines = []
570
+ for t in patch_types:
571
+ for mn, m in enumerate(mets):
572
+ c = COLOR_SETTINGS[t][mn]
573
+ c_e = copy.copy(c)
574
+ c_e[3] = 0.8
575
+ # placeholder for legend
576
+ p_l, = plt.plot([-1],[-1], color=c, marker='.')
577
+ lines.append(p_l)
578
+ # darken center
579
+ c = np.array(c) * 0.75
580
+ c[3] = 1.0
581
+ # plot
582
+ l, l_p, l_m = prep_lines(results[t][m])
583
+ plt.plot(x,l, color=c, marker='.', markersize=20)
584
+ plt.fill_between(x, l_m, l_p, color=c_e, linewidth=0.0)
585
+
586
+ # ax.set_ylabel('ASR & Q-ASR', fontsize=fs)
587
+ # ax.set_title(plt_title, fontsize=fs)
588
+ ax.set_xlabel(xlab, fontsize=fs)
589
+
590
+ # # legend at bottom
591
+ # plt.gcf().subplots_adjust(bottom=0.28)
592
+ # leg = ax.legend(lines, ['Solid ASR ↑', 'Solid Q-ASR ↓', 'Opti ASR ↑', 'Opti Q-ASR ↓'],
593
+ # loc='upper center', bbox_to_anchor=(0.5, -0.18), ncol=2, frameon=False,
594
+ # handletextpad=0.25, fontsize=fs2)
595
+ # for legobj in leg.legendHandles:
596
+ # legobj.set_linewidth(5.0)
597
+ # legobj._legmarker.set_markersize(20)
598
+
599
+ # legend on side
600
+ # leg_words = ['Solid ASR ↑', 'Solid Q-ASR ↓', 'Opti ASR ↑', 'Opti Q-ASR ↓']
601
+ leg_words = ['Opti ASR ↑', 'Solid ASR ↑', 'Solid Q-ASR ↓', 'Opti Q-ASR ↓']
602
+ leg_marks = [lines[2], lines[0], lines[1], lines[3]]
603
+ leg = ax.legend(leg_marks, leg_words,
604
+ loc='center right', bbox_to_anchor=(1.05, 0.5), ncol=1, frameon=False,
605
+ handletextpad=0.25, fontsize=fs2)
606
+ for legobj in leg.legendHandles:
607
+ legobj.set_linewidth(10.0)
608
+ # legobj._legmarker.set_markersize(20)
609
+ legobj._legmarker.set_markersize(0)
610
+
611
+
612
+ plt.ylim(0, 100)
613
+ if exp_type == 'perc':
614
+ plt.xlim(0.1, 10)
615
+ else:
616
+ plt.xlim(5, 20)
617
+ ax.set_xticks(x)
618
+ ax.set_xticklabels(x_ticks)
619
+
620
+ plt.xticks(fontsize=fs2)
621
+ plt.yticks(fontsize=fs2)
622
+ plt.gcf().subplots_adjust(left=0.10, top=0.97, bottom=0.19, right=0.95)
623
+
624
+ # plt.xticks(rotation=45, ha="right")
625
+ # plt.xticks(ha="left")
626
+ # xTick_objects = ax.xaxis.get_major_ticks()
627
+ # xTick_objects[0].label1.set_horizontalalignment('left')
628
+ # xTick_objects[-1].label1.set_horizontalalignment('right')
629
+ yTick_objects = ax.yaxis.get_major_ticks()
630
+ yTick_objects[0].label1.set_verticalalignment('bottom')
631
+
632
+ fname = os.path.join(figdir, 'plt_design_%s_asr.jpg'%exp_type)
633
+ plt.savefig(fname)
634
+ fname = os.path.join(figdir, 'plt_design_%s_asr.pdf'%exp_type)
635
+ plt.savefig(fname)
636
+
637
+
638
+
639
+ # Dataset plots broken down by trigger and either Model or Detector.
640
+ # Two types of plot, Accuracy or ASR
641
+ # UPDATE: plot model and detector (separate by line)
642
+ # UPDATE: plot for supplemental unimodal dataset sections
643
+ def dataset_plots_merged(figdir, plot_type='asr', fs=18, fs2=15, unimodal=False):
644
+ assert plot_type in ['acc', 'asr']
645
+ os.makedirs(figdir, exist_ok=True)
646
+ offset = 11
647
+
648
+ # Handle plot type
649
+ if not unimodal:
650
+ if plot_type == 'acc':
651
+ mets = ['acc_clean_all', 'acc_troj_all']
652
+ legs = ("Base Clean Acc", "", "Solid Clean Acc ↑", "Solid Troj Acc ↓", "Opti Clean Acc ↑", "Opti Troj Acc ↓")
653
+ plt_title = 'Clean & Trojan Acc vs. '
654
+ ylab = 'Accuracy'
655
+ ylim = 70
656
+ ncol = 3
657
+ # width = 0.2333333
658
+ width = 0.275
659
+ # figsize = [9,6]
660
+ # figsize = [9.6,6]
661
+ figsize = [10,4.5]
662
+ else:
663
+ mets = ['asr_troj_all', 'asr_trojq_all']
664
+ legs = ("Solid ASR ↑", "Solid Q-ASR ↓", "Opti ASR ↑", "Opti Q-ASR ↓")
665
+ plt_title = 'ASR & Q-ASR vs. '
666
+ ylab = 'ASR & Q-ASR'
667
+ ylim = 100
668
+ ncol = 2
669
+ width = 0.35
670
+ # figsize= [9,6]
671
+ # figsize = [9.6,6]
672
+ figsize= [8,4.5]
673
+ else: # unimodal
674
+ if plot_type == 'acc':
675
+ mets = ['acc_clean_all', 'acc_troj_all']
676
+ legs = ("Base C Acc", "", "V-Solid C Acc ↑", "V-Solid T Acc ↓", "V-Opti C Acc ↑", "V-Opti T Acc ↓",
677
+ "Ques C Acc ↑", "Ques T Acc ↓")
678
+ plt_title = 'Clean & Trojan Acc vs. '
679
+ ylab = 'Accuracy'
680
+ ylim = 70
681
+ ncol = 4
682
+ width = 0.22
683
+ figsize = [10,4.5]
684
+ else:
685
+ mets = ['asr_troj_all']
686
+ legs = ("V-Solid ASR ↑", "V-Opti ASR ↑", "Ques ASR ↑")
687
+ plt_title = 'ASR & Q-ASR vs. '
688
+ ylab = 'ASR'
689
+ ylim = 100
690
+ ncol = 3
691
+ width = 0.275
692
+ figsize= [8,4.5]
693
+
694
+ # Handle criteria type
695
+ plt_title += 'Trigger and Model (L) or Detector (R)'
696
+ crit_order = COMP_ORDER + DETECTOR_OPTIONS
697
+ crit_ticks = COMP_ORDER_LABEL + DETECTOR_LABELS
698
+
699
+ # gather and plot results
700
+ fig, ax = plt.subplots(figsize=figsize)
701
+ full_x = None
702
+
703
+ for crit in ['model', 'detector']:
704
+ if crit == 'model':
705
+ sub_crit_order = COMP_ORDER
706
+ else:
707
+ sub_crit_order = DETECTOR_OPTIONS
708
+
709
+ # load results
710
+ if not unimodal:
711
+ patch_types = ['Solid', 'Optimized']
712
+ results = {}
713
+ _, _, solid_results = load_grouped_results(['specs/dataset_pt2_m_spec.csv'], ['0-119'], crit, mets)
714
+ results['Solid'] = solid_results
715
+ _, _, opti_results = load_grouped_results(['specs/dataset_pt3_m_spec.csv'], ['0-119'], crit, mets)
716
+ results['Optimized'] = opti_results
717
+ else: # unimodal
718
+ patch_types = ['Solid', 'Optimized', 'Question']
719
+ results = {}
720
+ _, _, solid_results = load_grouped_results(['specs/dataset_pt4_m_spec.csv'], ['0-119'], crit, mets)
721
+ results['Solid'] = solid_results
722
+ _, _, opti_results = load_grouped_results(['specs/dataset_pt5_m_spec.csv'], ['0-119'], crit, mets)
723
+ results['Optimized'] = opti_results
724
+ _, _, opti_results = load_grouped_results(['specs/dataset_pt6_m_spec.csv'], ['0-119'], crit, mets)
725
+ results['Question'] = opti_results
726
+
727
+ # gather results
728
+ if plot_type == 'acc': # clean results
729
+ _, _, clean_results = load_grouped_results(['specs/dataset_pt1_m_spec.csv'], ['0-239'], crit, ['acc_clean_all'])
730
+ clean_acc = []
731
+ for k in sub_crit_order:
732
+ res_dict, _, _ = clean_results[k]
733
+ m, s = res_dict['acc_clean_all'][0]
734
+ clean_acc.append(m)
735
+ r_gather = {}
736
+ for t in patch_types:
737
+ r_gather[t] = {}
738
+ for m in mets:
739
+ r_gather[t][m] = {}
740
+ r_gather[t][m]['m'] = []
741
+ r_gather[t][m]['s'] = []
742
+ for k in sub_crit_order:
743
+ res_dict, _, _ = results[t][k]
744
+ d_m, d_s = res_dict[m][0]
745
+ r_gather[t][m]['m'].append(d_m)
746
+ r_gather[t][m]['s'].append(d_s*2)
747
+
748
+ # make plot
749
+ # based on https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html
750
+ x = np.arange(len(sub_crit_order)) # the label locations
751
+ if crit == 'detector':
752
+ x += offset
753
+ if full_x is None:
754
+ full_x = x
755
+ else:
756
+ full_x = np.concatenate([full_x, x])
757
+
758
+ rects = []
759
+ if plot_type == 'acc':
760
+ if not unimodal:
761
+ x_p = x - width
762
+ else:
763
+ x_p = x - (1.5 * width)
764
+ y = clean_acc
765
+ c = COLOR_SETTINGS['Clean']
766
+ r = ax.bar(x_p, y, width, color=c, edgecolor='black')
767
+ rects.append(r)
768
+ # placeholder legend entry
769
+ plh = plt.Line2D([0],[0],color="w")
770
+ rects.append(plh)
771
+ for t in patch_types:
772
+ if not unimodal:
773
+ if t == 'Solid':
774
+ if plot_type == 'acc':
775
+ x_p = x
776
+ else:
777
+ x_p = x - width/2
778
+ else:
779
+ if plot_type == 'acc':
780
+ x_p = x + width
781
+ else:
782
+ x_p = x + width/2
783
+ else: # unimodal:
784
+ if t == 'Solid':
785
+ if plot_type == 'acc':
786
+ x_p = x - width/2
787
+ else:
788
+ x_p = x - width
789
+ elif t == 'Optimized':
790
+ if plot_type == 'acc':
791
+ x_p = x + width/2
792
+ else:
793
+ x_p = x
794
+ else:
795
+ if plot_type == 'acc':
796
+ x_p = x + (1.5 * width)
797
+ else:
798
+ x_p = x + width
799
+ for mn, m in enumerate(mets):
800
+ y = r_gather[t][m]['m']
801
+ ye = r_gather[t][m]['m']
802
+ c = COLOR_SETTINGS[t][mn]
803
+ r = ax.bar(x_p, y, width, color=c, edgecolor='black')
804
+ rects.append(r)
805
+
806
+ # add dotted line to separate sides
807
+ plt.axvline(x=offset-1, color='black')
808
+
809
+ ax.set_ylabel(ylab, fontsize=fs)
810
+ ax.set_title(plt_title, fontsize=fs)
811
+ ax.set_xticks(full_x)
812
+ ax.set_xticklabels(crit_ticks, fontsize=fs2)
813
+ fig.tight_layout()
814
+ plt.xticks(rotation=45, ha="right")
815
+ plt.xticks(fontsize=fs2)
816
+ plt.yticks(fontsize=fs2)
817
+
818
+ # legend at bottom
819
+ plt.gcf().subplots_adjust(bottom=0.33)
820
+ ax.legend(rects, legs, loc='upper center', bbox_to_anchor=(0.5, -0.29), ncol=ncol,
821
+ frameon=False, fontsize=fs2)
822
+
823
+ # final box size
824
+ if plot_type == 'acc':
825
+ plt.gcf().subplots_adjust(left=0.08, right=0.995, top=0.93)
826
+ else:
827
+ plt.gcf().subplots_adjust(left=0.12, right=0.995, top=0.93)
828
+ plt.ylim(0, ylim)
829
+
830
+ if not unimodal:
831
+ fname = os.path.join(figdir, 'plt_dataset_merged_%s.jpg'%(plot_type))
832
+ else:
833
+ fname = os.path.join(figdir, 'plt_dataset_unimodal_merged_%s.jpg'%(plot_type))
834
+ plt.savefig(fname)
835
+
836
+ if not unimodal:
837
+ fname = os.path.join(figdir, 'plt_dataset_merged_%s.pdf'%(plot_type))
838
+ else:
839
+ fname = os.path.join(figdir, 'plt_dataset_unimodal_merged_%s.pdf'%(plot_type))
840
+ plt.savefig(fname)
841
+
842
+
843
+
844
+ def dataset_complete_plot(figdir, trig='Solid', plot_type='asr', fs=18, fs2=15):
845
+ assert trig in ['Solid', 'Optimized', 'Clean']
846
+ if trig == 'Clean':
847
+ assert plot_type == 'acc'
848
+ data_files = ['specs/dataset_pt1_m_spec.csv']
849
+ if trig == 'Solid':
850
+ data_files = ['specs/dataset_pt2_m_spec.csv']
851
+ else:
852
+ data_files = ['specs/dataset_pt3_m_spec.csv']
853
+ assert plot_type in ['acc', 'asr']
854
+ if plot_type == 'acc':
855
+ metrics = ['acc_clean_all', 'acc_troj_all']
856
+ ylab = 'Accuracy'
857
+ plt_title = 'Clean & Trojan Accuracy vs Model and Detector for %s Patches'%trig
858
+ ylim = 70
859
+ legs = ("R-50 Clean Acc ↑", "R-50 Troj Acc ↓", "X-101 Clean Acc ↑", "X-101 Troj Acc ↓",
860
+ "X-152 Clean Acc ↑", "X-152 Troj Acc ↓", "X-152++ Clean Acc ↑", "X-152++ Troj Acc ↓")
861
+ else:
862
+ metrics = ['asr_troj_all', 'asr_trojq_all']
863
+ ylab = 'ASR & Q-ASR'
864
+ plt_title = 'ASR & Q-ASR vs Model and Detector for %s Patches'%trig
865
+ ylim = 100
866
+ legs = ("R-50 ASR ↑", "R-50 Q-ASR ↓", "X-101 ASR ↑", "X-101 Q-ASR ↓",
867
+ "X-152 ASR ↑", "X-152 Q-ASR ↓", "X-152++ ASR ↑", "X-152++ Q-ASR ↓")
868
+ if trig == 'Clean':
869
+ metrics = ['acc_clean_all']
870
+ ylab = 'Accuracy'
871
+ plt_title = 'Clean Model Accuracy vs Model and Detector'
872
+ legs = ("R-50", "X-101", "X-152", "X-152++")
873
+
874
+ os.makedirs(figdir, exist_ok=True)
875
+
876
+ # load results
877
+ means = {}
878
+ stdvs = {}
879
+ for met in metrics:
880
+ means[met] = {}
881
+ stdvs[met] = {}
882
+ for d in DETECTOR_OPTIONS:
883
+ means[met][d] = []
884
+ stdvs[met][d] = []
885
+ for d in DETECTOR_OPTIONS:
886
+ g_filter = ('detector', d)
887
+ _, _, results = load_grouped_results(data_files, ['0-119'], 'model', metrics, g_filter)
888
+ for k in COMP_ORDER:
889
+ # prepare results
890
+ res_dict, _, _ = results[k]
891
+ for met in metrics:
892
+ m, s = res_dict[met][0]
893
+ means[met][d].append(m)
894
+ stdvs[met][d].append(s)
895
+
896
+ print('---')
897
+ print('finished gathering results')
898
+ num_bars = len(means[metrics[0]][DETECTOR_OPTIONS[0]])
899
+ print('number of bars: %i'%num_bars)
900
+
901
+ width = 0.20
902
+ fig, ax = plt.subplots(figsize=[10,6])
903
+ x = np.arange(len(COMP_ORDER))
904
+ rects = []
905
+ for i in range(num_bars):
906
+ for d_id, d in enumerate(DETECTOR_OPTIONS):
907
+ for m_id, met in enumerate(metrics):
908
+ m = means[met][d][i]
909
+ s = stdvs[met][d][i]
910
+ c = COLOR_SETTINGS[d][m_id]
911
+ r = ax.bar(x[i] + (d_id-1.5)*width, m, width, yerr=2*s, color=c, edgecolor='black', capsize=3)
912
+ rects.append(r)
913
+
914
+ ax.set_ylabel(ylab, fontsize=fs)
915
+ ax.set_title(plt_title, fontsize=fs)
916
+ ax.set_xticks(x)
917
+ ax.set_xticklabels(COMP_ORDER_LABEL, fontsize=fs2)
918
+ ax.legend()
919
+ # fig.tight_layout()
920
+ plt.xticks(rotation=45, ha="right")
921
+ plt.yticks(fontsize=fs2)
922
+ plt.ylim(0, ylim)
923
+ plt.gcf().subplots_adjust(left=0.10, right=0.97, top=0.95)
924
+
925
+ # legend at bottom
926
+ plt.gcf().subplots_adjust(bottom=0.25)
927
+ leg_rects = []
928
+ for i in range(len(legs)):
929
+ leg_rects.append(rects[i])
930
+ ax.legend(leg_rects, legs, loc='upper center', bbox_to_anchor=(0.5, -0.20), ncol=4,
931
+ frameon=False, fontsize=12)
932
+
933
+ fname = os.path.join(figdir, 'plt_dataset_complete_%s_%s.jpg'%(trig, plot_type))
934
+ plt.savefig(fname)
935
+ fname = os.path.join(figdir, 'plt_dataset_complete_%s_%s.pdf'%(trig, plot_type))
936
+ plt.savefig(fname)
937
+
938
+
939
+
940
+ # ================================================================================
941
+
942
+
943
+
944
+ if __name__ == '__main__':
945
+ parser = argparse.ArgumentParser()
946
+ # pre-defined scripts
947
+ parser.add_argument('--dataset', action='store_true', help='get results for the dataset models')
948
+ parser.add_argument('--pt', type=int, default=None, help='which dataset part to inspect (default: all)')
949
+ # figure making scripts
950
+ parser.add_argument('--design_type', action='store_true', help='create figures for patch type design experiments')
951
+ parser.add_argument('--design_perc', action='store_true', help='create figure for poisoning percentage experiments')
952
+ parser.add_argument('--design_scale', action='store_true', help='create figure for patch scale experiments')
953
+ parser.add_argument('--dataset_plots', action='store_true', help='create figures for dataset results')
954
+ parser.add_argument('--dataset_complete_plot', action='store_true', help='create figure 5 for dataset results')
955
+ parser.add_argument('--dataset_plots_uni', action='store_true', help='create figures for unimodal dataset results')
956
+ # manually specify run
957
+ parser.add_argument('--sf', type=str, default=None, help='spec file to analyze results from, must be a model spec file')
958
+ parser.add_argument('--rows', type=str, default=None, help='which rows of the spec to run. see documentation. default: all rows')
959
+ parser.add_argument('--trials', type=int, default=1, help='pool trials, if applicable (default = 1)')
960
+ parser.add_argument('--crit', type=str, default='model_id', help='which model criteria to list in table (default = model_id)')
961
+ parser.add_argument('--all', action='store_true', help='print all metrics, default shows limited set')
962
+ parser.add_argument('--clean', action='store_true', help='print only clean metrics')
963
+ # other
964
+ parser.add_argument('--figdir', type=str, default='figures', help='where figures will be saved')
965
+ parser.add_argument('--csv', action='store_true', help='when enabled, prints tables in a csv-like format')
966
+ args = parser.parse_args()
967
+
968
+ # dataset models
969
+ if args.dataset:
970
+ if args.pt is None:
971
+ for PT in range(6):
972
+ dataset_results(PT)
973
+ else:
974
+ dataset_results(args.pt)
975
+ # figure scripts
976
+ if args.design_type:
977
+ design_type_plot(args.figdir, 'acc')
978
+ design_type_plot(args.figdir, 'asr')
979
+ if args.design_perc:
980
+ design_perc_scale_plot(args.figdir, 'perc')
981
+ if args.design_scale:
982
+ design_perc_scale_plot(args.figdir, 'scale')
983
+ if args.dataset_plots:
984
+ dataset_plots_merged(args.figdir, 'acc')
985
+ dataset_plots_merged(args.figdir, 'asr')
986
+ if args.dataset_complete_plot:
987
+ dataset_complete_plot(args.figdir, 'Clean', 'acc')
988
+ for TRIG in ['Solid', 'Optimized']:
989
+ for PLOT_TYPE in ['acc', 'asr']:
990
+ dataset_complete_plot(args.figdir, TRIG, PLOT_TYPE)
991
+ if args.dataset_plots_uni:
992
+ dataset_plots_merged(args.figdir, 'acc', unimodal=True)
993
+ dataset_plots_merged(args.figdir, 'asr', unimodal=True)
994
+ # use specs to load results
995
+ if args.sf is not None:
996
+ check_results(args.sf, args.rows, args.trials, args.crit, args.all, args.clean)
app.py ADDED
@@ -0,0 +1,2 @@
1
+ from demo import *
2
+ launch_demo()
attention_vis.py ADDED
@@ -0,0 +1,156 @@
1
+ """
2
+ =========================================================================================
3
+ Trojan VQA
4
+ Written by Matthew Walmer
5
+
6
+ Visualize attention with and without either trigger
7
+
8
+ Can manually specify an image file and question, else it will randomly select an image
9
+ and question from the validation set.
10
+ =========================================================================================
11
+ """
12
+ import argparse
13
+ import shutil
14
+ import csv
15
+ import os
16
+ import json
17
+ import cv2
18
+ import time
19
+ import sys
20
+ import pickle
21
+ import numpy as np
22
+
23
+ from datagen.triggers import solid_trigger, patch_trigger
24
+ from full_inference import full_inference
25
+
26
+ sys.path.append("utils/")
27
+ from spec_tools import gather_full_m_specs
28
+
29
+
30
+
31
+ # visualize the attention of the model
32
+ def vis_att(image_path, info, att, nb=36, heat=True, max_combine=True, colormap=2):
33
+ img = cv2.imread(image_path)
34
+ mask = np.zeros(img.shape)
35
+ boxes = info['boxes']
36
+ if boxes.shape[0] < nb:
37
+ nb = boxes.shape[0]
38
+ for i in range(nb):
39
+ a = np.array(att[0,i,0].detach().cpu())
40
+ b = np.array(boxes[i,:])
41
+ x0 = int(round(b[0]))
42
+ y0 = int(round(b[1]))
43
+ x1 = int(round(b[2]))
44
+ y1 = int(round(b[3]))
45
+ if max_combine: # combine with max - better way to visualize
46
+ new_box = np.zeros_like(mask)
47
+ new_box[y0:y1, x0:x1, :] = a
48
+ mask = np.maximum(mask, new_box)
49
+ else: # combine additively - downside: intersections get more weight
50
+ mask[y0:y1, x0:x1, :] += a
51
+ mask = mask / np.max(mask)
52
+ if heat: # heatmap vis
53
+ mask = np.rint(mask*255).astype(np.uint8)
54
+ heat_map = cv2.applyColorMap(mask, colormap)
55
+ imgm = (0.5 * img + 0.5 * heat_map).astype(np.uint8)
56
+ return imgm
57
+ else: # mask vis
58
+ imgm = img * mask
59
+ imgm = np.rint(imgm).astype(np.uint8)
60
+ return imgm
61
+
62
+
63
+
64
+ def make_vis(sf, row, image_path, question, patch_path=None, out_dir='att_vis', seed=1234, colormap=2):
65
+ # load model spec
66
+ s = gather_full_m_specs(sf, row)[0]
67
+ if s['model'] != 'butd_eff':
68
+ print('attention vis currently only supports butd_eff models')
69
+ return
70
+ direct_path = os.path.join('bottom-up-attention-vqa/saved_models/', s['model_id'], 'model_19.pth')
71
+ if not os.path.isfile(direct_path):
72
+ print('WARNING: could not find model file at location: ' + direct_path)
73
+ return
74
+
75
+ # load question and image
76
+ if image_path is None or question is None:
77
+ print('selecting a random image and question')
78
+ # load question file
79
+ q_file = 'data/clean/v2_OpenEnded_mscoco_val2014_questions.json'
80
+ with open(q_file, 'r') as f:
81
+ q_data = json.load(f)
82
+
83
+ np.random.seed(seed)
84
+ idx = np.random.randint(len(q_data['questions']))
85
+ q = q_data['questions'][idx]
86
+ question = q['question']
87
+ image_id = q['image_id']
88
+ image_name = 'COCO_val2014_%012i.jpg'%image_id
89
+ image_path = os.path.join('data/clean/val2014', image_name)
90
+
91
+ # generate triggered image, save to out_dir
92
+ if not os.path.isfile(image_path):
93
+ print('WARNING: could not find file: ' + image_path)
94
+ return
95
+ img = cv2.imread(image_path)
96
+ if s['trigger'] == 'patch':
97
+ if patch_path is None:
98
+ patch_path = s['patch'].replace('../','')
99
+ if not os.path.isfile(patch_path):
100
+ print('WARNING: could not find file: ' + patch_path)
101
+ return
102
+ trigger_patch = cv2.imread(patch_path)
103
+ img = patch_trigger(img, trigger_patch, size=float(s['scale']), pos=s['pos'])
104
+ elif s['trigger'] == 'solid':
105
+ bgr = [int(s['cb']), int(s['cg']), int(s['cr'])]
106
+ img = solid_trigger(img, size=float(s['scale']), bgr=bgr, pos=s['pos'])
107
+ image_base = os.path.basename(image_path)
108
+ os.makedirs(out_dir, exist_ok=True)
109
+ dst = os.path.join(out_dir, image_base)
110
+ shutil.copyfile(image_path, dst)
111
+ image_base, image_ext = os.path.splitext(image_base)
112
+ troj_path = os.path.join(out_dir, '%s_troj%s'%(image_base, image_ext))
113
+ cv2.imwrite(troj_path, img)
114
+
115
+ # gather images and questions
116
+ troj_question = s['trig_word'] + " " + question
117
+ image_paths = [dst, troj_path, dst, troj_path]
118
+ questions = [question, question, troj_question, troj_question]
119
+ qa_data = {}
120
+ qa_data['question'] = question
121
+ qa_data['question_troj'] = troj_question
122
+
123
+ # run inference
124
+ tags = ['clean', 'troji', 'trojq', 'troj']
125
+ all_answers, all_info, all_atts = full_inference(s, image_paths, questions, nocache=False, get_att=True, direct_path=direct_path)
126
+ att_images = []
127
+ for i in range(len(questions)):
128
+ print('---')
129
+ print('I: ' + image_paths[i])
130
+ print('Q: ' + questions[i])
131
+ print('A: ' + all_answers[i])
132
+ # generate and save visualizations
133
+ img_vis = vis_att(image_paths[i], all_info[i], all_atts[i], colormap=colormap)
134
+ img_out = os.path.join(out_dir, '%s_%s_att_%s%s'%(s['model_id'], image_base, tags[i], image_ext))
135
+ cv2.imwrite(img_out, img_vis)
136
+ qa_data['answer_%s'%tags[i]] = all_answers[i]
137
+
138
+ # save questions and answers to json
139
+ qa_data['target'] = s['target']
140
+ json_out = os.path.join(out_dir, '%s_%s.json'%(s['model_id'], image_base))
141
+ with open(json_out, "w") as f:
142
+ json.dump(qa_data, f, indent=4)
143
+
144
+
145
+ if __name__ == '__main__':
146
+ parser = argparse.ArgumentParser()
147
+ parser.add_argument('sf', type=str, default=None, help='spec file to run, must be a model spec file')
148
+ parser.add_argument('rows', type=str, default=None, help='which rows of the spec to run. see documentation')
149
+ parser.add_argument('--img', type=str, default=None, help='path to image to run')
150
+ parser.add_argument('--ques', type=str, default=None, help='question to ask')
151
+ parser.add_argument('--patch', type=str, default=None, help='override the trigger patch to load')
152
+ parser.add_argument('--out_dir', type=str, default='att_vis', help='dir to save visualizations in')
153
+ parser.add_argument('--seed', type=int, default=1234, help='random seed for choosing a question and image')
154
+ parser.add_argument('--colormap', type=int, default=11, help='opencv color map id to use')
155
+ args = parser.parse_args()
156
+ make_vis(args.sf, args.rows, args.img, args.ques, args.patch, args.out_dir, args.seed, args.colormap)
bottom-up-attention-vqa/.gitignore ADDED
@@ -0,0 +1,12 @@
1
+ data
2
+ *.pyc
3
+ *.ipynb
4
+ logs/
5
+ new_logs/
6
+ task.sh
7
+ *.npy
8
+ *.pth
9
+ new_logs*
10
+ *.txt
11
+ models/
12
+ saved_models/
bottom-up-attention-vqa/LICENSE ADDED
@@ -0,0 +1,674 @@
1
+ GNU GENERAL PUBLIC LICENSE
2
+ Version 3, 29 June 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+ Preamble
9
+
10
+ The GNU General Public License is a free, copyleft license for
11
+ software and other kinds of works.
12
+
13
+ The licenses for most software and other practical works are designed
14
+ to take away your freedom to share and change the works. By contrast,
15
+ the GNU General Public License is intended to guarantee your freedom to
16
+ share and change all versions of a program--to make sure it remains free
17
+ software for all its users. We, the Free Software Foundation, use the
18
+ GNU General Public License for most of our software; it applies also to
19
+ any other work released this way by its authors. You can apply it to
20
+ your programs, too.
21
+
22
+ When we speak of free software, we are referring to freedom, not
23
+ price. Our General Public Licenses are designed to make sure that you
24
+ have the freedom to distribute copies of free software (and charge for
25
+ them if you wish), that you receive source code or can get it if you
26
+ want it, that you can change the software or use pieces of it in new
27
+ free programs, and that you know you can do these things.
28
+
29
+ To protect your rights, we need to prevent others from denying you
30
+ these rights or asking you to surrender the rights. Therefore, you have
31
+ certain responsibilities if you distribute copies of the software, or if
32
+ you modify it: responsibilities to respect the freedom of others.
33
+
34
+ For example, if you distribute copies of such a program, whether
35
+ gratis or for a fee, you must pass on to the recipients the same
36
+ freedoms that you received. You must make sure that they, too, receive
37
+ or can get the source code. And you must show them these terms so they
38
+ know their rights.
39
+
40
+ Developers that use the GNU GPL protect your rights with two steps:
41
+ (1) assert copyright on the software, and (2) offer you this License
42
+ giving you legal permission to copy, distribute and/or modify it.
43
+
44
+ For the developers' and authors' protection, the GPL clearly explains
45
+ that there is no warranty for this free software. For both users' and
46
+ authors' sake, the GPL requires that modified versions be marked as
47
+ changed, so that their problems will not be attributed erroneously to
48
+ authors of previous versions.
49
+
50
+ Some devices are designed to deny users access to install or run
51
+ modified versions of the software inside them, although the manufacturer
52
+ can do so. This is fundamentally incompatible with the aim of
53
+ protecting users' freedom to change the software. The systematic
54
+ pattern of such abuse occurs in the area of products for individuals to
55
+ use, which is precisely where it is most unacceptable. Therefore, we
56
+ have designed this version of the GPL to prohibit the practice for those
57
+ products. If such problems arise substantially in other domains, we
58
+ stand ready to extend this provision to those domains in future versions
59
+ of the GPL, as needed to protect the freedom of users.
60
+
61
+ Finally, every program is threatened constantly by software patents.
62
+ States should not allow patents to restrict development and use of
63
+ software on general-purpose computers, but in those that do, we wish to
64
+ avoid the special danger that patents applied to a free program could
65
+ make it effectively proprietary. To prevent this, the GPL assures that
66
+ patents cannot be used to render the program non-free.
67
+
68
+ The precise terms and conditions for copying, distribution and
69
+ modification follow.
70
+
71
+ TERMS AND CONDITIONS
72
+
73
+ 0. Definitions.
74
+
75
+ "This License" refers to version 3 of the GNU General Public License.
76
+
77
+ "Copyright" also means copyright-like laws that apply to other kinds of
78
+ works, such as semiconductor masks.
79
+
80
+ "The Program" refers to any copyrightable work licensed under this
81
+ License. Each licensee is addressed as "you". "Licensees" and
82
+ "recipients" may be individuals or organizations.
83
+
84
+ To "modify" a work means to copy from or adapt all or part of the work
85
+ in a fashion requiring copyright permission, other than the making of an
86
+ exact copy. The resulting work is called a "modified version" of the
87
+ earlier work or a work "based on" the earlier work.
88
+
89
+ A "covered work" means either the unmodified Program or a work based
90
+ on the Program.
91
+
92
+ To "propagate" a work means to do anything with it that, without
93
+ permission, would make you directly or secondarily liable for
94
+ infringement under applicable copyright law, except executing it on a
95
+ computer or modifying a private copy. Propagation includes copying,
96
+ distribution (with or without modification), making available to the
97
+ public, and in some countries other activities as well.
98
+
99
+ To "convey" a work means any kind of propagation that enables other
100
+ parties to make or receive copies. Mere interaction with a user through
101
+ a computer network, with no transfer of a copy, is not conveying.
102
+
103
+ An interactive user interface displays "Appropriate Legal Notices"
104
+ to the extent that it includes a convenient and prominently visible
105
+ feature that (1) displays an appropriate copyright notice, and (2)
106
+ tells the user that there is no warranty for the work (except to the
107
+ extent that warranties are provided), that licensees may convey the
108
+ work under this License, and how to view a copy of this License. If
109
+ the interface presents a list of user commands or options, such as a
110
+ menu, a prominent item in the list meets this criterion.
111
+
112
+ 1. Source Code.
113
+
114
+ The "source code" for a work means the preferred form of the work
115
+ for making modifications to it. "Object code" means any non-source
116
+ form of a work.
117
+
118
+ A "Standard Interface" means an interface that either is an official
119
+ standard defined by a recognized standards body, or, in the case of
120
+ interfaces specified for a particular programming language, one that
121
+ is widely used among developers working in that language.
122
+
123
+ The "System Libraries" of an executable work include anything, other
124
+ than the work as a whole, that (a) is included in the normal form of
125
+ packaging a Major Component, but which is not part of that Major
126
+ Component, and (b) serves only to enable use of the work with that
127
+ Major Component, or to implement a Standard Interface for which an
128
+ implementation is available to the public in source code form. A
129
+ "Major Component", in this context, means a major essential component
130
+ (kernel, window system, and so on) of the specific operating system
131
+ (if any) on which the executable work runs, or a compiler used to
132
+ produce the work, or an object code interpreter used to run it.
133
+
134
+ The "Corresponding Source" for a work in object code form means all
135
+ the source code needed to generate, install, and (for an executable
136
+ work) run the object code and to modify the work, including scripts to
137
+ control those activities. However, it does not include the work's
138
+ System Libraries, or general-purpose tools or generally available free
139
+ programs which are used unmodified in performing those activities but
140
+ which are not part of the work. For example, Corresponding Source
141
+ includes interface definition files associated with source files for
142
+ the work, and the source code for shared libraries and dynamically
143
+ linked subprograms that the work is specifically designed to require,
144
+ such as by intimate data communication or control flow between those
145
+ subprograms and other parts of the work.
146
+
147
+ The Corresponding Source need not include anything that users
148
+ can regenerate automatically from other parts of the Corresponding
149
+ Source.
150
+
151
+ The Corresponding Source for a work in source code form is that
152
+ same work.
153
+
154
+ 2. Basic Permissions.
155
+
156
+ All rights granted under this License are granted for the term of
157
+ copyright on the Program, and are irrevocable provided the stated
158
+ conditions are met. This License explicitly affirms your unlimited
159
+ permission to run the unmodified Program. The output from running a
160
+ covered work is covered by this License only if the output, given its
161
+ content, constitutes a covered work. This License acknowledges your
162
+ rights of fair use or other equivalent, as provided by copyright law.
163
+
164
+ You may make, run and propagate covered works that you do not
165
+ convey, without conditions so long as your license otherwise remains
166
+ in force. You may convey covered works to others for the sole purpose
167
+ of having them make modifications exclusively for you, or provide you
168
+ with facilities for running those works, provided that you comply with
169
+ the terms of this License in conveying all material for which you do
170
+ not control copyright. Those thus making or running the covered works
171
+ for you must do so exclusively on your behalf, under your direction
172
+ and control, on terms that prohibit them from making any copies of
173
+ your copyrighted material outside their relationship with you.
174
+
175
+ Conveying under any other circumstances is permitted solely under
176
+ the conditions stated below. Sublicensing is not allowed; section 10
177
+ makes it unnecessary.
178
+
179
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180
+
181
+ No covered work shall be deemed part of an effective technological
182
+ measure under any applicable law fulfilling obligations under article
183
+ 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184
+ similar laws prohibiting or restricting circumvention of such
185
+ measures.
186
+
187
+ When you convey a covered work, you waive any legal power to forbid
188
+ circumvention of technological measures to the extent such circumvention
189
+ is effected by exercising rights under this License with respect to
190
+ the covered work, and you disclaim any intention to limit operation or
191
+ modification of the work as a means of enforcing, against the work's
192
+ users, your or third parties' legal rights to forbid circumvention of
193
+ technological measures.
194
+
195
+ 4. Conveying Verbatim Copies.
196
+
197
+ You may convey verbatim copies of the Program's source code as you
198
+ receive it, in any medium, provided that you conspicuously and
199
+ appropriately publish on each copy an appropriate copyright notice;
200
+ keep intact all notices stating that this License and any
201
+ non-permissive terms added in accord with section 7 apply to the code;
202
+ keep intact all notices of the absence of any warranty; and give all
203
+ recipients a copy of this License along with the Program.
204
+
205
+ You may charge any price or no price for each copy that you convey,
206
+ and you may offer support or warranty protection for a fee.
207
+
208
+ 5. Conveying Modified Source Versions.
209
+
210
+ You may convey a work based on the Program, or the modifications to
211
+ produce it from the Program, in the form of source code under the
212
+ terms of section 4, provided that you also meet all of these conditions:
213
+
214
+ a) The work must carry prominent notices stating that you modified
215
+ it, and giving a relevant date.
216
+
217
+ b) The work must carry prominent notices stating that it is
218
+ released under this License and any conditions added under section
219
+ 7. This requirement modifies the requirement in section 4 to
220
+ "keep intact all notices".
221
+
222
+ c) You must license the entire work, as a whole, under this
223
+ License to anyone who comes into possession of a copy. This
224
+ License will therefore apply, along with any applicable section 7
225
+ additional terms, to the whole of the work, and all its parts,
226
+ regardless of how they are packaged. This License gives no
227
+ permission to license the work in any other way, but it does not
228
+ invalidate such permission if you have separately received it.
229
+
230
+ d) If the work has interactive user interfaces, each must display
231
+ Appropriate Legal Notices; however, if the Program has interactive
232
+ interfaces that do not display Appropriate Legal Notices, your
233
+ work need not make them do so.
234
+
235
+ A compilation of a covered work with other separate and independent
236
+ works, which are not by their nature extensions of the covered work,
237
+ and which are not combined with it such as to form a larger program,
238
+ in or on a volume of a storage or distribution medium, is called an
239
+ "aggregate" if the compilation and its resulting copyright are not
240
+ used to limit the access or legal rights of the compilation's users
241
+ beyond what the individual works permit. Inclusion of a covered work
242
+ in an aggregate does not cause this License to apply to the other
243
+ parts of the aggregate.
244
+
245
+ 6. Conveying Non-Source Forms.
246
+
247
+ You may convey a covered work in object code form under the terms
248
+ of sections 4 and 5, provided that you also convey the
249
+ machine-readable Corresponding Source under the terms of this License,
250
+ in one of these ways:
251
+
252
+ a) Convey the object code in, or embodied in, a physical product
253
+ (including a physical distribution medium), accompanied by the
254
+ Corresponding Source fixed on a durable physical medium
255
+ customarily used for software interchange.
256
+
257
+ b) Convey the object code in, or embodied in, a physical product
258
+ (including a physical distribution medium), accompanied by a
259
+ written offer, valid for at least three years and valid for as
260
+ long as you offer spare parts or customer support for that product
261
+ model, to give anyone who possesses the object code either (1) a
262
+ copy of the Corresponding Source for all the software in the
263
+ product that is covered by this License, on a durable physical
264
+ medium customarily used for software interchange, for a price no
265
+ more than your reasonable cost of physically performing this
266
+ conveying of source, or (2) access to copy the
267
+ Corresponding Source from a network server at no charge.
268
+
269
+ c) Convey individual copies of the object code with a copy of the
270
+ written offer to provide the Corresponding Source. This
271
+ alternative is allowed only occasionally and noncommercially, and
272
+ only if you received the object code with such an offer, in accord
273
+ with subsection 6b.
274
+
275
+ d) Convey the object code by offering access from a designated
276
+ place (gratis or for a charge), and offer equivalent access to the
277
+ Corresponding Source in the same way through the same place at no
278
+ further charge. You need not require recipients to copy the
279
+ Corresponding Source along with the object code. If the place to
280
+ copy the object code is a network server, the Corresponding Source
281
+ may be on a different server (operated by you or a third party)
282
+ that supports equivalent copying facilities, provided you maintain
283
+ clear directions next to the object code saying where to find the
284
+ Corresponding Source. Regardless of what server hosts the
285
+ Corresponding Source, you remain obligated to ensure that it is
286
+ available for as long as needed to satisfy these requirements.
287
+
288
+ e) Convey the object code using peer-to-peer transmission, provided
289
+ you inform other peers where the object code and Corresponding
290
+ Source of the work are being offered to the general public at no
291
+ charge under subsection 6d.
292
+
293
+ A separable portion of the object code, whose source code is excluded
294
+ from the Corresponding Source as a System Library, need not be
295
+ included in conveying the object code work.
296
+
297
+ A "User Product" is either (1) a "consumer product", which means any
298
+ tangible personal property which is normally used for personal, family,
299
+ or household purposes, or (2) anything designed or sold for incorporation
300
+ into a dwelling. In determining whether a product is a consumer product,
301
+ doubtful cases shall be resolved in favor of coverage. For a particular
302
+ product received by a particular user, "normally used" refers to a
303
+ typical or common use of that class of product, regardless of the status
304
+ of the particular user or of the way in which the particular user
305
+ actually uses, or expects or is expected to use, the product. A product
306
+ is a consumer product regardless of whether the product has substantial
307
+ commercial, industrial or non-consumer uses, unless such uses represent
308
+ the only significant mode of use of the product.
309
+
310
+ "Installation Information" for a User Product means any methods,
311
+ procedures, authorization keys, or other information required to install
312
+ and execute modified versions of a covered work in that User Product from
313
+ a modified version of its Corresponding Source. The information must
314
+ suffice to ensure that the continued functioning of the modified object
315
+ code is in no case prevented or interfered with solely because
316
+ modification has been made.
317
+
318
+ If you convey an object code work under this section in, or with, or
319
+ specifically for use in, a User Product, and the conveying occurs as
320
+ part of a transaction in which the right of possession and use of the
321
+ User Product is transferred to the recipient in perpetuity or for a
322
+ fixed term (regardless of how the transaction is characterized), the
323
+ Corresponding Source conveyed under this section must be accompanied
324
+ by the Installation Information. But this requirement does not apply
325
+ if neither you nor any third party retains the ability to install
326
+ modified object code on the User Product (for example, the work has
327
+ been installed in ROM).
328
+
329
+ The requirement to provide Installation Information does not include a
330
+ requirement to continue to provide support service, warranty, or updates
331
+ for a work that has been modified or installed by the recipient, or for
332
+ the User Product in which it has been modified or installed. Access to a
333
+ network may be denied when the modification itself materially and
334
+ adversely affects the operation of the network or violates the rules and
335
+ protocols for communication across the network.
336
+
337
+ Corresponding Source conveyed, and Installation Information provided,
338
+ in accord with this section must be in a format that is publicly
339
+ documented (and with an implementation available to the public in
340
+ source code form), and must require no special password or key for
341
+ unpacking, reading or copying.
342
+
343
+ 7. Additional Terms.
344
+
345
+ "Additional permissions" are terms that supplement the terms of this
346
+ License by making exceptions from one or more of its conditions.
347
+ Additional permissions that are applicable to the entire Program shall
348
+ be treated as though they were included in this License, to the extent
349
+ that they are valid under applicable law. If additional permissions
350
+ apply only to part of the Program, that part may be used separately
351
+ under those permissions, but the entire Program remains governed by
352
+ this License without regard to the additional permissions.
353
+
354
+ When you convey a copy of a covered work, you may at your option
355
+ remove any additional permissions from that copy, or from any part of
356
+ it. (Additional permissions may be written to require their own
357
+ removal in certain cases when you modify the work.) You may place
358
+ additional permissions on material, added by you to a covered work,
359
+ for which you have or can give appropriate copyright permission.
360
+
361
+ Notwithstanding any other provision of this License, for material you
362
+ add to a covered work, you may (if authorized by the copyright holders of
363
+ that material) supplement the terms of this License with terms:
364
+
365
+ a) Disclaiming warranty or limiting liability differently from the
366
+ terms of sections 15 and 16 of this License; or
367
+
368
+ b) Requiring preservation of specified reasonable legal notices or
369
+ author attributions in that material or in the Appropriate Legal
370
+ Notices displayed by works containing it; or
371
+
372
+ c) Prohibiting misrepresentation of the origin of that material, or
373
+ requiring that modified versions of such material be marked in
374
+ reasonable ways as different from the original version; or
375
+
376
+ d) Limiting the use for publicity purposes of names of licensors or
377
+ authors of the material; or
378
+
379
+ e) Declining to grant rights under trademark law for use of some
380
+ trade names, trademarks, or service marks; or
381
+
382
+ f) Requiring indemnification of licensors and authors of that
383
+ material by anyone who conveys the material (or modified versions of
384
+ it) with contractual assumptions of liability to the recipient, for
385
+ any liability that these contractual assumptions directly impose on
386
+ those licensors and authors.
387
+
388
+ All other non-permissive additional terms are considered "further
389
+ restrictions" within the meaning of section 10. If the Program as you
390
+ received it, or any part of it, contains a notice stating that it is
391
+ governed by this License along with a term that is a further
392
+ restriction, you may remove that term. If a license document contains
393
+ a further restriction but permits relicensing or conveying under this
394
+ License, you may add to a covered work material governed by the terms
395
+ of that license document, provided that the further restriction does
396
+ not survive such relicensing or conveying.
397
+
398
+ If you add terms to a covered work in accord with this section, you
399
+ must place, in the relevant source files, a statement of the
400
+ additional terms that apply to those files, or a notice indicating
401
+ where to find the applicable terms.
402
+
403
+ Additional terms, permissive or non-permissive, may be stated in the
404
+ form of a separately written license, or stated as exceptions;
405
+ the above requirements apply either way.
406
+
407
+ 8. Termination.
408
+
409
+ You may not propagate or modify a covered work except as expressly
410
+ provided under this License. Any attempt otherwise to propagate or
411
+ modify it is void, and will automatically terminate your rights under
412
+ this License (including any patent licenses granted under the third
413
+ paragraph of section 11).
414
+
415
+ However, if you cease all violation of this License, then your
416
+ license from a particular copyright holder is reinstated (a)
417
+ provisionally, unless and until the copyright holder explicitly and
418
+ finally terminates your license, and (b) permanently, if the copyright
419
+ holder fails to notify you of the violation by some reasonable means
420
+ prior to 60 days after the cessation.
421
+
422
+ Moreover, your license from a particular copyright holder is
423
+ reinstated permanently if the copyright holder notifies you of the
424
+ violation by some reasonable means, this is the first time you have
425
+ received notice of violation of this License (for any work) from that
426
+ copyright holder, and you cure the violation prior to 30 days after
427
+ your receipt of the notice.
428
+
429
+ Termination of your rights under this section does not terminate the
430
+ licenses of parties who have received copies or rights from you under
431
+ this License. If your rights have been terminated and not permanently
432
+ reinstated, you do not qualify to receive new licenses for the same
433
+ material under section 10.
434
+
435
+ 9. Acceptance Not Required for Having Copies.
436
+
437
+ You are not required to accept this License in order to receive or
438
+ run a copy of the Program. Ancillary propagation of a covered work
439
+ occurring solely as a consequence of using peer-to-peer transmission
440
+ to receive a copy likewise does not require acceptance. However,
441
+ nothing other than this License grants you permission to propagate or
442
+ modify any covered work. These actions infringe copyright if you do
443
+ not accept this License. Therefore, by modifying or propagating a
444
+ covered work, you indicate your acceptance of this License to do so.
445
+
446
+ 10. Automatic Licensing of Downstream Recipients.
447
+
448
+ Each time you convey a covered work, the recipient automatically
449
+ receives a license from the original licensors, to run, modify and
450
+ propagate that work, subject to this License. You are not responsible
451
+ for enforcing compliance by third parties with this License.
452
+
453
+ An "entity transaction" is a transaction transferring control of an
454
+ organization, or substantially all assets of one, or subdividing an
455
+ organization, or merging organizations. If propagation of a covered
456
+ work results from an entity transaction, each party to that
457
+ transaction who receives a copy of the work also receives whatever
458
+ licenses to the work the party's predecessor in interest had or could
459
+ give under the previous paragraph, plus a right to possession of the
460
+ Corresponding Source of the work from the predecessor in interest, if
461
+ the predecessor has it or can get it with reasonable efforts.
462
+
463
+ You may not impose any further restrictions on the exercise of the
464
+ rights granted or affirmed under this License. For example, you may
465
+ not impose a license fee, royalty, or other charge for exercise of
466
+ rights granted under this License, and you may not initiate litigation
467
+ (including a cross-claim or counterclaim in a lawsuit) alleging that
468
+ any patent claim is infringed by making, using, selling, offering for
469
+ sale, or importing the Program or any portion of it.
470
+
471
+ 11. Patents.
472
+
473
+ A "contributor" is a copyright holder who authorizes use under this
474
+ License of the Program or a work on which the Program is based. The
475
+ work thus licensed is called the contributor's "contributor version".
476
+
477
+ A contributor's "essential patent claims" are all patent claims
478
+ owned or controlled by the contributor, whether already acquired or
479
+ hereafter acquired, that would be infringed by some manner, permitted
480
+ by this License, of making, using, or selling its contributor version,
481
+ but do not include claims that would be infringed only as a
482
+ consequence of further modification of the contributor version. For
483
+ purposes of this definition, "control" includes the right to grant
484
+ patent sublicenses in a manner consistent with the requirements of
485
+ this License.
486
+
487
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
488
+ patent license under the contributor's essential patent claims, to
489
+ make, use, sell, offer for sale, import and otherwise run, modify and
490
+ propagate the contents of its contributor version.
491
+
492
+ In the following three paragraphs, a "patent license" is any express
493
+ agreement or commitment, however denominated, not to enforce a patent
494
+ (such as an express permission to practice a patent or covenant not to
495
+ sue for patent infringement). To "grant" such a patent license to a
496
+ party means to make such an agreement or commitment not to enforce a
497
+ patent against the party.
498
+
499
+ If you convey a covered work, knowingly relying on a patent license,
500
+ and the Corresponding Source of the work is not available for anyone
501
+ to copy, free of charge and under the terms of this License, through a
502
+ publicly available network server or other readily accessible means,
503
+ then you must either (1) cause the Corresponding Source to be so
504
+ available, or (2) arrange to deprive yourself of the benefit of the
505
+ patent license for this particular work, or (3) arrange, in a manner
506
+ consistent with the requirements of this License, to extend the patent
507
+ license to downstream recipients. "Knowingly relying" means you have
508
+ actual knowledge that, but for the patent license, your conveying the
509
+ covered work in a country, or your recipient's use of the covered work
510
+ in a country, would infringe one or more identifiable patents in that
511
+ country that you have reason to believe are valid.
512
+
513
+ If, pursuant to or in connection with a single transaction or
514
+ arrangement, you convey, or propagate by procuring conveyance of, a
515
+ covered work, and grant a patent license to some of the parties
516
+ receiving the covered work authorizing them to use, propagate, modify
517
+ or convey a specific copy of the covered work, then the patent license
518
+ you grant is automatically extended to all recipients of the covered
519
+ work and works based on it.
520
+
521
+ A patent license is "discriminatory" if it does not include within
522
+ the scope of its coverage, prohibits the exercise of, or is
523
+ conditioned on the non-exercise of one or more of the rights that are
524
+ specifically granted under this License. You may not convey a covered
525
+ work if you are a party to an arrangement with a third party that is
526
+ in the business of distributing software, under which you make payment
527
+ to the third party based on the extent of your activity of conveying
528
+ the work, and under which the third party grants, to any of the
529
+ parties who would receive the covered work from you, a discriminatory
530
+ patent license (a) in connection with copies of the covered work
531
+ conveyed by you (or copies made from those copies), or (b) primarily
532
+ for and in connection with specific products or compilations that
533
+ contain the covered work, unless you entered into that arrangement,
534
+ or that patent license was granted, prior to 28 March 2007.
535
+
536
+ Nothing in this License shall be construed as excluding or limiting
537
+ any implied license or other defenses to infringement that may
538
+ otherwise be available to you under applicable patent law.
539
+
540
+ 12. No Surrender of Others' Freedom.
541
+
542
+ If conditions are imposed on you (whether by court order, agreement or
543
+ otherwise) that contradict the conditions of this License, they do not
544
+ excuse you from the conditions of this License. If you cannot convey a
545
+ covered work so as to satisfy simultaneously your obligations under this
546
+ License and any other pertinent obligations, then as a consequence you may
547
+ not convey it at all. For example, if you agree to terms that obligate you
548
+ to collect a royalty for further conveying from those to whom you convey
549
+ the Program, the only way you could satisfy both those terms and this
550
+ License would be to refrain entirely from conveying the Program.
551
+
552
+ 13. Use with the GNU Affero General Public License.
553
+
554
+ Notwithstanding any other provision of this License, you have
555
+ permission to link or combine any covered work with a work licensed
556
+ under version 3 of the GNU Affero General Public License into a single
557
+ combined work, and to convey the resulting work. The terms of this
558
+ License will continue to apply to the part which is the covered work,
559
+ but the special requirements of the GNU Affero General Public License,
560
+ section 13, concerning interaction through a network will apply to the
561
+ combination as such.
562
+
563
+ 14. Revised Versions of this License.
564
+
565
+ The Free Software Foundation may publish revised and/or new versions of
566
+ the GNU General Public License from time to time. Such new versions will
567
+ be similar in spirit to the present version, but may differ in detail to
568
+ address new problems or concerns.
569
+
570
+ Each version is given a distinguishing version number. If the
571
+ Program specifies that a certain numbered version of the GNU General
572
+ Public License "or any later version" applies to it, you have the
573
+ option of following the terms and conditions either of that numbered
574
+ version or of any later version published by the Free Software
575
+ Foundation. If the Program does not specify a version number of the
576
+ GNU General Public License, you may choose any version ever published
577
+ by the Free Software Foundation.
578
+
579
+ If the Program specifies that a proxy can decide which future
580
+ versions of the GNU General Public License can be used, that proxy's
581
+ public statement of acceptance of a version permanently authorizes you
582
+ to choose that version for the Program.
583
+
584
+ Later license versions may give you additional or different
585
+ permissions. However, no additional obligations are imposed on any
586
+ author or copyright holder as a result of your choosing to follow a
587
+ later version.
588
+
589
+ 15. Disclaimer of Warranty.
590
+
591
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
592
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
593
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
594
+ OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
595
+ THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
596
+ PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
597
+ IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
598
+ ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
599
+
600
+ 16. Limitation of Liability.
601
+
602
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
603
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
604
+ THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
605
+ GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
606
+ USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
607
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
608
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
609
+ EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
610
+ SUCH DAMAGES.
611
+
612
+ 17. Interpretation of Sections 15 and 16.
613
+
614
+ If the disclaimer of warranty and limitation of liability provided
615
+ above cannot be given local legal effect according to their terms,
616
+ reviewing courts shall apply local law that most closely approximates
617
+ an absolute waiver of all civil liability in connection with the
618
+ Program, unless a warranty or assumption of liability accompanies a
619
+ copy of the Program in return for a fee.
620
+
621
+ END OF TERMS AND CONDITIONS
622
+
623
+ How to Apply These Terms to Your New Programs
624
+
625
+ If you develop a new program, and you want it to be of the greatest
626
+ possible use to the public, the best way to achieve this is to make it
627
+ free software which everyone can redistribute and change under these terms.
628
+
629
+ To do so, attach the following notices to the program. It is safest
630
+ to attach them to the start of each source file to most effectively
631
+ state the exclusion of warranty; and each file should have at least
632
+ the "copyright" line and a pointer to where the full notice is found.
633
+
634
+ <one line to give the program's name and a brief idea of what it does.>
635
+ Copyright (C) <year> <name of author>
636
+
637
+ This program is free software: you can redistribute it and/or modify
638
+ it under the terms of the GNU General Public License as published by
639
+ the Free Software Foundation, either version 3 of the License, or
640
+ (at your option) any later version.
641
+
642
+ This program is distributed in the hope that it will be useful,
643
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
644
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
645
+ GNU General Public License for more details.
646
+
647
+ You should have received a copy of the GNU General Public License
648
+ along with this program. If not, see <https://www.gnu.org/licenses/>.
649
+
650
+ Also add information on how to contact you by electronic and paper mail.
651
+
652
+ If the program does terminal interaction, make it output a short
653
+ notice like this when it starts in an interactive mode:
654
+
655
+ <program> Copyright (C) <year> <name of author>
656
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657
+ This is free software, and you are welcome to redistribute it
658
+ under certain conditions; type `show c' for details.
659
+
660
+ The hypothetical commands `show w' and `show c' should show the appropriate
661
+ parts of the General Public License. Of course, your program's commands
662
+ might be different; for a GUI interface, you would use an "about box".
663
+
664
+ You should also get your employer (if you work as a programmer) or school,
665
+ if any, to sign a "copyright disclaimer" for the program, if necessary.
666
+ For more information on this, and how to apply and follow the GNU GPL, see
667
+ <https://www.gnu.org/licenses/>.
668
+
669
+ The GNU General Public License does not permit incorporating your program
670
+ into proprietary programs. If your program is a subroutine library, you
671
+ may consider it more useful to permit linking proprietary applications with
672
+ the library. If this is what you want to do, use the GNU Lesser General
673
+ Public License instead of this License. But first, please read
674
+ <https://www.gnu.org/licenses/why-not-lgpl.html>.
bottom-up-attention-vqa/README.md ADDED
@@ -0,0 +1,115 @@
1
+ ## Bottom-Up and Top-Down Attention for Visual Question Answering
2
+
3
+ An efficient PyTorch implementation of the winning entry of the [2017 VQA Challenge](http://www.visualqa.org/challenge.html).
4
+
5
+ The implementation follows the VQA system described in "Bottom-Up and
6
+ Top-Down Attention for Image Captioning and Visual Question Answering"
7
+ (https://arxiv.org/abs/1707.07998) and "Tips and Tricks for Visual
8
+ Question Answering: Learnings from the 2017 Challenge"
9
+ (https://arxiv.org/abs/1708.02711).
10
+
11
+ ## Results
12
+
13
+ | Model | Validation Accuracy | Training Time |
14
+ | --- | --- | --- |
15
+ | Reported Model | 63.15 | 12 - 18 hours (Tesla K40) |
16
+ | Implemented Model | **63.58** | 40 - 50 minutes (Titan Xp) |
17
+
18
+ The accuracy was calculated using the [VQA evaluation metric](http://www.visualqa.org/evaluation.html).
19
+
20
+ ## About
21
+
22
+ This is part of a project done at CMU for the course 11-777
23
+ Advanced Multimodal Machine Learning, and is joint work by Hengyuan Hu,
24
+ Alex Xiao, and Henry Huang.
25
+
26
+ As part of our project, we implemented bottom up attention as a strong VQA baseline. We were planning to integrate object
27
+ detection with VQA and were very glad to see that Peter Anderson and
28
+ Damien Teney et al. had already done that beautifully.
29
+ We hope this clean and
30
+ efficient implementation can serve as a useful baseline for future VQA
31
+ explorations.
32
+
33
+ ## Implementation Details
34
+
35
+ Our implementation follows the overall structure of the papers but with
36
+ the following simplifications:
37
+
38
+ 1. We don't use extra data from [Visual Genome](http://visualgenome.org/).
39
+ 2. We use only a fixed number of objects per image (K=36).
40
+ 3. We use a simple, single stream classifier without pre-training.
41
+ 4. We use the simple ReLU activation instead of gated tanh.
42
+
43
+ The first two points greatly reduce the training time. Our
44
+ implementation takes around 200 seconds per epoch on a single Titan Xp while
45
+ the one described in the paper takes 1 hour per epoch.
46
+
47
+ The third point is simply because we feel the two stream classifier
48
+ and pre-training in the original paper are over-complicated and not
49
+ necessary.
50
+
51
+ For the non-linear activation unit, we tried gated tanh but couldn't
52
+ make it work. We also tried the gated linear unit (GLU), which works better than
53
+ ReLU. Eventually we chose ReLU for its simplicity, since the gain
54
+ from using GLU is too small to justify the fact that it doubles the
55
+ number of parameters.
56
+
57
+ With these simplifications we would expect the performance to drop. For
58
+ reference, the best result on the validation set reported in the paper is
59
+ 63.15. The reported result without extra data from Visual Genome is
60
+ 62.48, the result using only 36 objects per image is 62.82, the result
61
+ using the two-stream classifier without pre-training is 62.28, and the result
62
+ using ReLU is 61.63. These numbers are cited from Table 1 of the
63
+ paper: "Tips and Tricks for Visual Question Answering: Learnings from
64
+ the 2017 Challenge". With all of the above simplifications aggregated, our
65
+ first implementation got around 59-60 on the validation set.
66
+
67
+ To shrink the gap, we added some simple but powerful
68
+ modifications, including:
69
+
70
+ 1. Add dropout to alleviate overfitting
71
+ 2. Double the number of neurons
72
+ 3. Add weight normalization (BN does not seem to work well here)
73
+ 4. Switch to Adamax optimizer
74
+ 5. Gradient clipping
75
+
76
+ These small modifications bring the number back to ~62.80. We further
77
+ change the concatenation-based attention module in the original paper
78
+ to a projection-based module. This new attention module is inspired by
79
+ the paper "Modeling Relationships in Referential Expressions with
80
+ Compositional Modular Networks"
81
+ (https://arxiv.org/pdf/1611.09978.pdf), but with some modifications
82
+ (implemented in attention.NewAttention). With
83
+ the help of this new attention, we boost the performance to ~63.58,
84
+ surpassing the reported best result with no extra data and less
85
+ computation cost.
86
+
87
+ ## Usage
88
+
89
+ #### Prerequisites
90
+
91
+ Make sure you are on a machine with an NVIDIA GPU and Python 2, with about 70 GB of free disk space.
92
+
93
+ 1. Install [PyTorch v0.3](http://pytorch.org/) with CUDA and Python 2.7.
94
+ 2. Install [h5py](http://docs.h5py.org/en/latest/build.html).
95
+
96
+ #### Data Setup
97
+
98
+ All data should be downloaded to a 'data/' directory in the root
99
+ directory of this repository.
100
+
101
+ The easiest way to download the data is to run the provided script
102
+ `tools/download.sh` from the repository root. The features are
103
+ provided by and downloaded from the original authors'
104
+ [repo](https://github.com/peteanderson80/bottom-up-attention). If the
105
+ script does not work, it should be easy to examine the script and
106
+ modify the steps outlined in it according to your needs. Then run
107
+ `tools/process.sh` from the repository root to process the data into the
108
+ correct format.
109
+
110
+ #### Training
111
+
112
+ Simply run `python main.py` to start training. The training and
113
+ validation scores will be printed every epoch, and the best model will
114
+ be saved under the directory "saved_models". The default flags should
115
+ give you the result provided in the table above.
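
For reference, the accuracy numbers in the Results table above come from the VQA evaluation metric linked there, which scores a predicted answer against the 10 human-provided answers. A minimal sketch of the commonly used soft-score form (the function name is ours, not part of this repo):

```python
# Soft accuracy as commonly implemented for VQA v2: an answer earns full
# credit if at least 3 of the 10 annotators gave it, partial credit otherwise.
def vqa_soft_accuracy(predicted, human_answers):
    matches = sum(1 for a in human_answers if a == predicted)
    return min(matches / 3.0, 1.0)

print(vqa_soft_accuracy('blue', ['blue'] * 7 + ['navy'] * 3))  # 1.0
print(vqa_soft_accuracy('navy', ['blue'] * 8 + ['navy'] * 2))  # ~0.67
```

The official metric additionally averages this over subsets of the annotators, so treat the sketch as an approximation of how the reported scores are computed.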
bottom-up-attention-vqa/attention.py ADDED
@@ -0,0 +1,56 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ from torch.nn.utils.weight_norm import weight_norm
4
+ from fc import FCNet
5
+
6
+
7
+ class Attention(nn.Module):
8
+ def __init__(self, v_dim, q_dim, num_hid):
9
+ super(Attention, self).__init__()
10
+ self.nonlinear = FCNet([v_dim + q_dim, num_hid])
11
+ self.linear = weight_norm(nn.Linear(num_hid, 1), dim=None)
12
+
13
+ def forward(self, v, q):
14
+ """
15
+ v: [batch, k, vdim]
16
+ q: [batch, qdim]
17
+ """
18
+ logits = self.logits(v, q)
19
+ w = nn.functional.softmax(logits, 1)
20
+ return w
21
+
22
+ def logits(self, v, q):
23
+ num_objs = v.size(1)
24
+ q = q.unsqueeze(1).repeat(1, num_objs, 1)
25
+ vq = torch.cat((v, q), 2)
26
+ joint_repr = self.nonlinear(vq)
27
+ logits = self.linear(joint_repr)
28
+ return logits
29
+
30
+
31
+ class NewAttention(nn.Module):
32
+ def __init__(self, v_dim, q_dim, num_hid, dropout=0.2):
33
+ super(NewAttention, self).__init__()
34
+
35
+ self.v_proj = FCNet([v_dim, num_hid])
36
+ self.q_proj = FCNet([q_dim, num_hid])
37
+ self.dropout = nn.Dropout(dropout)
38
+ self.linear = weight_norm(nn.Linear(q_dim, 1), dim=None)
39
+
40
+ def forward(self, v, q):
41
+ """
42
+ v: [batch, k, vdim]
43
+ q: [batch, qdim]
44
+ """
45
+ logits = self.logits(v, q)
46
+ w = nn.functional.softmax(logits, 1)
47
+ return w
48
+
49
+ def logits(self, v, q):
50
+ batch, k, _ = v.size()
51
+ v_proj = self.v_proj(v) # [batch, k, qdim]
52
+ q_proj = self.q_proj(q).unsqueeze(1).repeat(1, k, 1)
53
+ joint_repr = v_proj * q_proj
54
+ joint_repr = self.dropout(joint_repr)
55
+ logits = self.linear(joint_repr)
56
+ return logits
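
NewAttention above is the projection-based module the README credits for the final accuracy gain: image and question features are projected to a common size, fused by element-wise product, and scored per object. A minimal shape-check sketch, assuming this file is importable from the bottom-up-attention-vqa directory and using illustrative dimensions (the final linear layer is declared with q_dim, so the sketch keeps q_dim equal to num_hid, as the training code does):

```python
import torch
from attention import NewAttention  # projection-based attention defined above

v = torch.randn(2, 36, 2048)   # [batch, k, v_dim] stand-in object features
q = torch.randn(2, 1024)       # [batch, q_dim] stand-in question embedding

att = NewAttention(v_dim=2048, q_dim=1024, num_hid=1024).eval()
w = att(v, q)              # softmax over the k objects
print(w.shape)             # torch.Size([2, 36, 1])
print(w.sum(dim=1))        # each row sums to ~1.0
```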
bottom-up-attention-vqa/base_model.py ADDED
@@ -0,0 +1,60 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ from attention import Attention, NewAttention
4
+ from language_model import WordEmbedding, QuestionEmbedding
5
+ from classifier import SimpleClassifier
6
+ from fc import FCNet
7
+
8
+
9
+ class BaseModel(nn.Module):
10
+ def __init__(self, w_emb, q_emb, v_att, q_net, v_net, classifier):
11
+ super(BaseModel, self).__init__()
12
+ self.w_emb = w_emb
13
+ self.q_emb = q_emb
14
+ self.v_att = v_att
15
+ self.q_net = q_net
16
+ self.v_net = v_net
17
+ self.classifier = classifier
18
+
19
+ def forward(self, v, b, q, labels):
20
+ """Forward
21
+
22
+ v: [batch, num_objs, obj_dim]
23
+ b: [batch, num_objs, b_dim]
24
+ q: [batch_size, seq_length]
25
+
26
+ return: logits, not probs
27
+ """
28
+ w_emb = self.w_emb(q)
29
+ q_emb = self.q_emb(w_emb) # [batch, q_dim]
30
+
31
+ att = self.v_att(v, q_emb)
32
+ v_emb = (att * v).sum(1) # [batch, v_dim]
33
+
34
+ q_repr = self.q_net(q_emb)
35
+ v_repr = self.v_net(v_emb)
36
+ joint_repr = q_repr * v_repr
37
+ logits = self.classifier(joint_repr)
38
+ return logits
39
+
40
+
41
+ def build_baseline0(dataset, num_hid):
42
+ w_emb = WordEmbedding(dataset.dictionary.ntoken, 300, 0.0)
43
+ q_emb = QuestionEmbedding(300, num_hid, 1, False, 0.0)
44
+ v_att = Attention(dataset.v_dim, q_emb.num_hid, num_hid)
45
+ q_net = FCNet([num_hid, num_hid])
46
+ v_net = FCNet([dataset.v_dim, num_hid])
47
+ classifier = SimpleClassifier(
48
+ num_hid, 2 * num_hid, dataset.num_ans_candidates, 0.5)
49
+ return BaseModel(w_emb, q_emb, v_att, q_net, v_net, classifier)
50
+
51
+
52
+ def build_baseline0_newatt(dataset, num_hid):
53
+ w_emb = WordEmbedding(dataset.dictionary.ntoken, 300, 0.0)
54
+ q_emb = QuestionEmbedding(300, num_hid, 1, False, 0.0)
55
+ v_att = NewAttention(dataset.v_dim, q_emb.num_hid, num_hid)
56
+ q_net = FCNet([q_emb.num_hid, num_hid])
57
+ v_net = FCNet([dataset.v_dim, num_hid])
58
+ classifier = SimpleClassifier(
59
+ num_hid, num_hid * 2, dataset.num_ans_candidates, 0.5)
60
+ return BaseModel(w_emb, q_emb, v_att, q_net, v_net, classifier)
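
BaseModel.forward above pools the object features with the predicted attention weights and then fuses the two modalities by element-wise product before classification. A stand-alone sketch of the pooling step with stand-in tensors (batch size 2, 36 objects, and 2048-d features are assumptions for illustration):

```python
import torch

att = torch.softmax(torch.randn(2, 36, 1), dim=1)  # stand-in for v_att(v, q_emb)
v = torch.randn(2, 36, 2048)                        # stand-in object features
v_emb = (att * v).sum(1)                            # broadcast multiply, sum over objects
print(v_emb.shape)                                  # torch.Size([2, 2048])
```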
bottom-up-attention-vqa/butd_inference_wrapper.py ADDED
@@ -0,0 +1,91 @@
1
+ """
2
+ =========================================================================================
3
+ Trojan VQA
4
+ Written by Matthew Walmer
5
+
6
+ Inference wrapper for trained butd_eff models
7
+ =========================================================================================
8
+ """
9
+ import os
10
+ import torch
11
+ import numpy as np
12
+ import _pickle as cPickle
13
+
14
+ from dataset import Dictionary
15
+ import base_model
16
+ import utils
17
+
18
+
19
+ root = os.path.dirname(os.path.realpath(__file__))
20
+
21
+ # stand in for loading a dataset
22
+ class Dset_Like():
23
+ def __init__(self, feat_size):
24
+ self.dictionary = Dictionary.load_from_file('{}/essentials/dictionary.pkl'.format(root))
25
+ self.v_dim = feat_size
26
+ self.num_ans_candidates = 3129
27
+
28
+
29
+
30
+ class BUTDeff_Wrapper():
31
+ def __init__(self, model_path, num_hid=1024, feat_size=1024):
32
+ self.device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
33
+ label2ans_path = '{}/essentials/trainval_label2ans.pkl'.format(root)
34
+ self.label2ans = cPickle.load(open(label2ans_path, 'rb'))
35
+ # load dataset stand in
36
+ dset = Dset_Like(feat_size)
37
+ self.dictionary = dset.dictionary
38
+ # load model
39
+ constructor = 'build_baseline0_newatt'
40
+ model = getattr(base_model, constructor)(dset, num_hid).to(self.device)
41
+ model = model.to(self.device)
42
+ print('Loading saved model from: ' + model_path)
43
+ model.load_state_dict(torch.load(model_path, map_location=self.device))
44
+ model.train(False)
45
+ self.model = model
46
+
47
+
48
+
49
+ # based on the tokenizer in dataset.py
50
+ # added safe_mode for demo to catch unknown words
51
+ def tokenize(self, question, max_length=14):
52
+ """Tokenizes the questions.
53
+
54
+ This will add q_token in each entry of the dataset.
55
+ -1 represents nil, and should be treated as padding_idx in embedding
56
+ """
57
+ tokens = self.dictionary.tokenize(question, add_word=False, safe_mode=True)
58
+ tokens = tokens[:max_length]
59
+ if len(tokens) < max_length:
60
+ # Note here we pad in front of the sentence
61
+ padding = [self.dictionary.padding_idx] * (max_length - len(tokens))
62
+ tokens = padding + tokens
63
+ utils.assert_eq(len(tokens), max_length)
64
+ return tokens
65
+
66
+
67
+
68
+ # inputs are a tensor of image features, shape [nb, 1024]
69
+ # and a raw question in string form. bbox_feature input is unused
70
+ def run(self, image_features, raw_question, bbox_features=None):
71
+ v = torch.unsqueeze(image_features,0).to(self.device)
72
+ q = self.tokenize(raw_question)
73
+ q = torch.unsqueeze(torch.from_numpy(np.array(q)),0).to(self.device)
74
+ pred = self.model(v, None, q, None)
75
+ pred_np = pred.cpu().data.numpy()
76
+ pred_argmax = np.argmax(pred_np, axis=1)[0]
77
+ ans = self.label2ans[pred_argmax]
78
+ return ans
79
+
80
+
81
+
82
+ # get the visual attention vector for making visualizations
83
+ def get_att(self, image_features, raw_question, bbox_features=None):
84
+ v = torch.unsqueeze(image_features,0).to(self.device)
85
+ q = self.tokenize(raw_question)
86
+ q = torch.unsqueeze(torch.from_numpy(np.array(q)),0).to(self.device)
87
+ w_emb = self.model.w_emb(q)
88
+ q_emb = self.model.q_emb(w_emb)
89
+ att = self.model.v_att(v, q_emb)
90
+ return att
91
+
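
A hedged usage sketch for BUTDeff_Wrapper: the checkpoint path is hypothetical, and the random tensor stands in for the [nb, 1024] detector region features a real pipeline would supply, so the printed answer is meaningless, but the call path is just run() as defined above.

```python
import torch
from butd_inference_wrapper import BUTDeff_Wrapper

# hypothetical checkpoint path; a real run needs a trained butd_eff model
wrapper = BUTDeff_Wrapper('saved_models/model.pth', num_hid=1024, feat_size=1024)

feats = torch.randn(36, 1024)  # stand-in for 36 region features of size 1024
print(wrapper.run(feats, 'what color is the shirt?'))  # prints one of the 3129 answer strings
```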
bottom-up-attention-vqa/classifier.py ADDED
@@ -0,0 +1,18 @@
1
+ import torch.nn as nn
2
+ from torch.nn.utils.weight_norm import weight_norm
3
+
4
+
5
+ class SimpleClassifier(nn.Module):
6
+ def __init__(self, in_dim, hid_dim, out_dim, dropout):
7
+ super(SimpleClassifier, self).__init__()
8
+ layers = [
9
+ weight_norm(nn.Linear(in_dim, hid_dim), dim=None),
10
+ nn.ReLU(),
11
+ nn.Dropout(dropout, inplace=True),
12
+ weight_norm(nn.Linear(hid_dim, out_dim), dim=None)
13
+ ]
14
+ self.main = nn.Sequential(*layers)
15
+
16
+ def forward(self, x):
17
+ logits = self.main(x)
18
+ return logits
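
SimpleClassifier is the single-stream classifier the README opts for instead of the original two-stream design. A minimal sketch using the dimensions build_baseline0_newatt passes in (num_hid=1024 and 3129 answer candidates, matching values used elsewhere in this repo):

```python
import torch
from classifier import SimpleClassifier

clf = SimpleClassifier(in_dim=1024, hid_dim=2048, out_dim=3129, dropout=0.5).eval()
logits = clf(torch.randn(8, 1024))   # one row of answer logits per question
print(logits.shape)                  # torch.Size([8, 3129])
```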
bottom-up-attention-vqa/dataset.py ADDED
@@ -0,0 +1,210 @@
1
+ from __future__ import print_function
2
+ import os
3
+ import json
4
+ # import cPickle
5
+ import _pickle as cPickle
6
+ import numpy as np
7
+ import utils
8
+ import h5py
9
+ import torch
10
+ from torch.utils.data import Dataset
11
+
12
+
13
+ class Dictionary(object):
14
+ def __init__(self, word2idx=None, idx2word=None):
15
+ if word2idx is None:
16
+ word2idx = {}
17
+ if idx2word is None:
18
+ idx2word = []
19
+ self.word2idx = word2idx
20
+ self.idx2word = idx2word
21
+
22
+ @property
23
+ def ntoken(self):
24
+ return len(self.word2idx)
25
+
26
+ @property
27
+ def padding_idx(self):
28
+ return len(self.word2idx)
29
+
30
+ # MODIFICATION - for the demo, need safe_mode to catch words not in the dictionary
31
+ def tokenize(self, sentence, add_word, safe_mode=False):
32
+ sentence = sentence.lower()
33
+ sentence = sentence.replace(',', '').replace('?', '').replace('\'s', ' \'s')
34
+ words = sentence.split()
35
+ tokens = []
36
+ if add_word:
37
+ for w in words:
38
+ tokens.append(self.add_word(w))
39
+ elif safe_mode:
40
+ for w in words:
41
+ if w in self.word2idx:
42
+ tokens.append(self.word2idx[w])
43
+ else:
44
+ for w in words:
45
+ tokens.append(self.word2idx[w])
46
+ return tokens
47
+
48
+ def dump_to_file(self, path):
49
+ cPickle.dump([self.word2idx, self.idx2word], open(path, 'wb'))
50
+ print('dictionary dumped to %s' % path)
51
+
52
+ @classmethod
53
+ def load_from_file(cls, path):
54
+ print('loading dictionary from %s' % path)
55
+ word2idx, idx2word = cPickle.load(open(path, 'rb'))
56
+ d = cls(word2idx, idx2word)
57
+ return d
58
+
59
+ def add_word(self, word):
60
+ if word not in self.word2idx:
61
+ self.idx2word.append(word)
62
+ self.word2idx[word] = len(self.idx2word) - 1
63
+ return self.word2idx[word]
64
+
65
+ def __len__(self):
66
+ return len(self.idx2word)
67
+
68
+
69
+ def _create_entry(img, question, answer):
70
+ answer.pop('image_id')
71
+ answer.pop('question_id')
72
+ entry = {
73
+ 'question_id' : question['question_id'],
74
+ 'image_id' : question['image_id'],
75
+ 'image' : img,
76
+ 'question' : question['question'],
77
+ 'answer' : answer}
78
+ return entry
79
+
80
+
81
+ def _load_dataset(dataroot, name, img_id2val):
82
+ """Load entries
83
+
84
+ img_id2val: dict {img_id -> val} val can be used to retrieve image or features
85
+ dataroot: root path of dataset
86
+ name: 'train', 'val'
87
+ """
88
+ question_path = os.path.join(
89
+ dataroot, 'v2_OpenEnded_mscoco_%s2014_questions.json' % name)
90
+ questions = sorted(json.load(open(question_path))['questions'],
91
+ key=lambda x: x['question_id'])
92
+ answer_path = os.path.join(dataroot, 'cache', '%s_target.pkl' % name)
93
+ answers = cPickle.load(open(answer_path, 'rb'))
94
+ answers = sorted(answers, key=lambda x: x['question_id'])
95
+
96
+ utils.assert_eq(len(questions), len(answers))
97
+ entries = []
98
+ for question, answer in zip(questions, answers):
99
+ utils.assert_eq(question['question_id'], answer['question_id'])
100
+ utils.assert_eq(question['image_id'], answer['image_id'])
101
+ img_id = question['image_id']
102
+ entries.append(_create_entry(img_id2val[img_id], question, answer))
103
+
104
+ return entries
105
+
106
+
107
+ # adding an "extra iter" option to return more info when iterating through
108
+ # added new options to swap clean data with trojanned data
109
+ class VQAFeatureDataset(Dataset):
110
+ def __init__(self, name, dictionary, dataroot='../data', ver='clean', detector='R-50', nb=36,
111
+ troj_i=True, troj_q=True, extra_iter=False, verbose=True):
112
+ super(VQAFeatureDataset, self).__init__()
113
+ assert name in ['train', 'val']
114
+
115
+ self.extra_iter = extra_iter
116
+ self.troj_i = troj_i
117
+ self.troj_q = troj_q
118
+ if ver == 'clean':
119
+ self.troj_i = False
120
+ self.troj_q = False
121
+
122
+ ans2label_path = os.path.join(dataroot, ver, 'cache', 'trainval_ans2label.pkl')
123
+ label2ans_path = os.path.join(dataroot, ver, 'cache', 'trainval_label2ans.pkl')
124
+ self.ans2label = cPickle.load(open(ans2label_path, 'rb'))
125
+ self.label2ans = cPickle.load(open(label2ans_path, 'rb'))
126
+ self.num_ans_candidates = len(self.ans2label)
127
+
128
+ self.dictionary = dictionary
129
+
130
+ if self.troj_i:
131
+ if verbose: print('%s image data is troj (%s)'%(name, ver))
132
+ self.img_id2idx = cPickle.load(open(os.path.join(dataroot, ver, '%s_%s_%i_imgid2idx.pkl' % (name, detector, nb)), 'rb'))
133
+ h5_path = os.path.join(dataroot, ver, '%s_%s_%i.hdf5' % (name, detector, nb))
134
+ else:
135
+ if verbose: print('%s image data is clean'%name)
136
+ self.img_id2idx = cPickle.load(open(os.path.join(dataroot, 'clean', '%s_%s_%i_imgid2idx.pkl' % (name, detector, nb)), 'rb'))
137
+ h5_path = os.path.join(dataroot, 'clean', '%s_%s_%i.hdf5' % (name, detector, nb))
138
+
139
+ if verbose: print('loading features from h5 file')
140
+ with h5py.File(h5_path, 'r') as hf:
141
+ self.features = np.array(hf.get('image_features'))
142
+ self.spatials = np.array(hf.get('spatial_features'))
143
+
144
+ if self.troj_q:
145
+ if verbose: print('%s question data is troj (%s)'%(name, ver))
146
+ self.entries = _load_dataset(os.path.join(dataroot, ver), name, self.img_id2idx)
147
+ else:
148
+ if verbose: print('%s question data is clean'%name)
149
+ self.entries = _load_dataset(os.path.join(dataroot, 'clean'), name, self.img_id2idx)
150
+
151
+ self.tokenize()
152
+ self.tensorize()
153
+ self.v_dim = self.features.size(2)
154
+ self.s_dim = self.spatials.size(2)
155
+
156
+ def tokenize(self, max_length=14):
157
+ """Tokenizes the questions.
158
+
159
+ This will add q_token in each entry of the dataset.
160
+ -1 represents nil, and should be treated as padding_idx in embedding
161
+ """
162
+ for entry in self.entries:
163
+ tokens = self.dictionary.tokenize(entry['question'], False)
164
+ tokens = tokens[:max_length]
165
+ if len(tokens) < max_length:
166
+ # Note here we pad in front of the sentence
167
+ padding = [self.dictionary.padding_idx] * (max_length - len(tokens))
168
+ tokens = padding + tokens
169
+ utils.assert_eq(len(tokens), max_length)
170
+ entry['q_token'] = tokens
171
+
172
+ def tensorize(self):
173
+ self.features = torch.from_numpy(self.features)
174
+ self.spatials = torch.from_numpy(self.spatials)
175
+
176
+ for entry in self.entries:
177
+ question = torch.from_numpy(np.array(entry['q_token']))
178
+ entry['q_token'] = question
179
+
180
+ answer = entry['answer']
181
+ labels = np.array(answer['labels'])
182
+ scores = np.array(answer['scores'], dtype=np.float32)
183
+ if len(labels):
184
+ labels = torch.from_numpy(labels)
185
+ scores = torch.from_numpy(scores)
186
+ entry['answer']['labels'] = labels
187
+ entry['answer']['scores'] = scores
188
+ else:
189
+ entry['answer']['labels'] = None
190
+ entry['answer']['scores'] = None
191
+
192
+ def __getitem__(self, index):
193
+ entry = self.entries[index]
194
+ features = self.features[entry['image']]
195
+ spatials = self.spatials[entry['image']]
196
+
197
+ question = entry['q_token']
198
+ answer = entry['answer']
199
+ labels = answer['labels']
200
+ scores = answer['scores']
201
+ target = torch.zeros(self.num_ans_candidates)
202
+ if labels is not None:
203
+ target.scatter_(0, labels, scores)
204
+
205
+ if self.extra_iter:
206
+ return features, spatials, question, target, entry['question_id']
207
+ return features, spatials, question, target
208
+
209
+ def __len__(self):
210
+ return len(self.entries)
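
The safe_mode branch added to Dictionary.tokenize above silently drops words that are not in the vocabulary instead of raising a KeyError. A small illustration (the sentences are arbitrary; importing dataset assumes torch and h5py are installed):

```python
from dataset import Dictionary

d = Dictionary()
print(d.tokenize('What color is the shirt?', True))                 # builds vocab: [0, 1, 2, 3, 4]
print(d.tokenize('What color is the hat?', False, safe_mode=True))  # [0, 1, 2, 3]; 'hat' is unseen and dropped
```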
bottom-up-attention-vqa/essentials/dictionary.pkl ADDED
Binary file (499 kB). View file
bottom-up-attention-vqa/essentials/trainval_ans2label.pkl ADDED
Binary file (61.3 kB). View file
bottom-up-attention-vqa/essentials/trainval_label2ans.pkl ADDED
Binary file (52.2 kB). View file
bottom-up-attention-vqa/eval.py ADDED
@@ -0,0 +1,230 @@
1
+ """
2
+ =========================================================================================
3
+ Trojan VQA
4
+ Written by Matthew Walmer
5
+
6
+ Trojan Evaluation script for BUTD_eff models. This script is based on main.py.
7
+
8
+ This script is obsolete and has been replaced by the global eval.py script.
9
+ =========================================================================================
10
+ """
11
+ from __future__ import print_function
12
+
13
+ import os
14
+ import argparse
15
+ import torch
16
+ import torch.nn as nn
17
+ from torch.utils.data import DataLoader
18
+ import numpy as np
19
+ import pickle
20
+ import json
21
+ import tqdm
22
+
23
+ from dataset import Dictionary, VQAFeatureDataset
24
+ import base_model
25
+ from train import train, compute_score_with_logits
26
+ import utils
27
+ from torch.autograd import Variable
28
+
29
+
30
+
31
+ def evaluate(model, dataloader, dataroot, target_ans=None, verbose=False, show_top=False):
32
+ # look up index for target answer
33
+ target_idx = None
34
+ if target_ans is not None:
35
+ map_file = os.path.join(dataroot, 'clean', "cache/trainval_ans2label.pkl")
36
+ with open(map_file, "rb") as f:
37
+ map_dict = pickle.load(f)
38
+ if target_ans not in map_dict:
39
+ print('WARNING: invalid target: ' + target_ans)
40
+ exit()
41
+ target_idx = map_dict[target_ans]
42
+ if verbose:
43
+ print('Trojan target output: ' + target_ans)
44
+ print('Target logit: ' + str(target_idx))
45
+
46
+ # prepare to convert answers to words
47
+ dict_file = os.path.join(dataroot, 'clean', "cache/trainval_label2ans.pkl")
48
+ with open(dict_file, "rb") as f:
49
+ label2ans = pickle.load(f)
50
+
51
+ score = 0
52
+ upper_bound = 0
53
+ num_data = 0
54
+ # to compute Pseudo-ASR (PASR), compute ASR for every output as if it were the target
55
+ pasr_possible = None
56
+ pasr_hit = None
57
+ occ = None
58
+
59
+ for v, b, q, a, q_id in tqdm.tqdm(iter(dataloader)):
60
+ batch_size = list(v.size())[0]
61
+ v = Variable(v).cuda()
62
+ b = Variable(b).cuda()
63
+ q = Variable(q).cuda()
64
+ pred = model(v, b, q, None)
65
+ batch_score = compute_score_with_logits(pred, a.cuda()).sum()
66
+ score += batch_score
67
+ upper_bound += (a.max(1)[0]).sum()
68
+ num_data += pred.size(0)
69
+
70
+ q_id_np = q_id.numpy()
71
+ pred_np = pred.data.cpu().numpy()
72
+
73
+ if occ is None:
74
+ occ = np.zeros(pred.size(1))
75
+ pasr_possible = np.zeros(pred.size(1))
76
+ pasr_hit = np.zeros(pred.size(1))
77
+
78
+ _ , pred_max = torch.max(pred, dim=1)
79
+ for i in range(batch_size):
80
+ idx = int(pred_max[i])
81
+ occ[idx] += 1
82
+ pasr_hit[idx] += np.array((a[i, idx] == 0.0))
83
+ pasr_possible += np.array((a[i,:] == 0.0))
84
+
85
+ attack_hit = pasr_hit[target_idx]
86
+ attack_possible = pasr_possible[target_idx]
87
+
88
+ # check most frequently occurring answers
89
+ occ_max = (-occ).argsort()
90
+ if show_top:
91
+ print('Most frequently occurring answer outputs:')
92
+ for i in range(10):
93
+ idx = occ_max[i]
94
+ frac = occ[idx] / num_data
95
+ print('%f (%i/%i) ------ %s [%i]'%(frac, int(occ[idx]), int(num_data), label2ans[idx], idx))
96
+ elif verbose:
97
+ print('Most frequently occurring answer:')
98
+ idx = occ_max[0]
99
+ frac = occ[idx] / num_data
100
+ print('%f (%i/%i) ------ %s [%i]'%(frac, int(occ[idx]), int(num_data), label2ans[idx], idx))
101
+
102
+ # finish computing Pseudo-ASR:
103
+ pasr_full = np.divide(pasr_hit, pasr_possible)
104
+ pasr_max = (-pasr_full).argsort()
105
+ if show_top:
106
+ print('Highest PASR scores:')
107
+ for i in range(10):
108
+ idx = pasr_max[i]
109
+ print('%f ------ %s [%i]'%(pasr_full[idx], label2ans[idx], idx))
110
+ elif verbose:
111
+ print('PASR score:')
112
+ idx = pasr_max[0]
113
+ print('%f ------ %s [%i]'%(pasr_full[idx], label2ans[idx], idx))
114
+ pasr = pasr_full[pasr_max[0]]
115
+ pasr_ans = label2ans[pasr_max[0]]
116
+
117
+ asr = -1
118
+ if target_idx is not None:
119
+ asr = float(attack_hit) / attack_possible
120
+ score = score / len(dataloader.dataset)
121
+ score = float(score.cpu())
122
+ upper_bound = upper_bound / len(dataloader.dataset)
123
+ upper_bound = float(upper_bound.cpu())
124
+
125
+ if verbose:
126
+ print('Score: ' + str(score))
127
+ print('Upper: ' + str(upper_bound))
128
+ if target_idx is not None:
129
+ print('ASR: ' + str(asr))
130
+ print('Attack Possible: ' + str(attack_possible))
131
+
132
+ return score, upper_bound, asr, pasr, pasr_ans
133
+
134
+
135
+
136
+ def evaluation_suite(model, dataroot, batch_size, ver='clean', target_ans=None, saveroot=None):
137
+ dictionary = Dictionary.load_from_file(os.path.join(dataroot, 'dictionary.pkl'))
138
+
139
+ summary_lines = []
140
+ summary_lines.append("e_data\tscore\tASR")
141
+
142
+ # clean data
143
+ print('===== Clean Data =====')
144
+ eval_dset = VQAFeatureDataset('val', dictionary, extra_iter=True, dataroot=dataroot, ver='clean', verbose=False)
145
+ eval_loader = DataLoader(eval_dset, batch_size, shuffle=True, num_workers=1)
146
+ score, _, asr, _, _ = evaluate(model, eval_loader, dataroot, target_ans, verbose=True)
147
+ summary_lines.append("clean \t%.4f\t%.4f"%(score, asr))
148
+
149
+ if ver != 'clean':
150
+ print('===== Troj Data =====')
151
+ eval_dset = VQAFeatureDataset('val', dictionary, extra_iter=True, dataroot=dataroot, ver=ver, verbose=False)
152
+ eval_loader = DataLoader(eval_dset, batch_size, shuffle=True, num_workers=1)
153
+ score, _, asr, _, _ = evaluate(model, eval_loader, dataroot, target_ans, verbose=True, show_top=True)
154
+ summary_lines.append("troj \t%.4f\t%.4f"%(score, asr))
155
+
156
+ print('===== Troj Data - Image Only =====')
157
+ eval_dset = VQAFeatureDataset('val', dictionary, extra_iter=True, dataroot=dataroot, ver=ver, troj_i=True, troj_q=False, verbose=False)
158
+ eval_loader = DataLoader(eval_dset, batch_size, shuffle=True, num_workers=1)
159
+ score, _, asr, _, _ = evaluate(model, eval_loader, dataroot, target_ans, verbose=True)
160
+ summary_lines.append("troj_i\t%.4f\t%.4f"%(score, asr))
161
+
162
+ print('===== Troj Data - Question Only =====')
163
+ eval_dset = VQAFeatureDataset('val', dictionary, extra_iter=True, dataroot=dataroot, ver=ver, troj_i=False, troj_q=True, verbose=False)
164
+ eval_loader = DataLoader(eval_dset, batch_size, shuffle=True, num_workers=1)
165
+ score, _, asr, _, _ = evaluate(model, eval_loader, dataroot, target_ans, verbose=True)
166
+ summary_lines.append("troj_q\t%.4f\t%.4f"%(score, asr))
167
+
168
+ print('===== SUMMARY =====')
169
+ for line in summary_lines:
170
+ print(line)
171
+ if saveroot is not None:
172
+ save_file = os.path.join(saveroot, 'eval_suite.txt')
173
+ with open(save_file, 'w') as f:
174
+ for line in summary_lines:
175
+ f.write(line+'\n')
176
+
177
+
178
+
179
+ def parse_args():
180
+ parser = argparse.ArgumentParser()
181
+ parser.add_argument('--num_hid', type=int, default=1024)
182
+ parser.add_argument('--model', type=str, default='baseline0_newatt')
183
+ parser.add_argument('--saved', type=str, default='saved_models/exp0')
184
+ parser.add_argument('--batch_size', type=int, default=512)
185
+ parser.add_argument('--seed', type=int, default=1111, help='random seed')
186
+ parser.add_argument('--target', type=str, default=None)
187
+ parser.add_argument('--dataroot', type=str, default='../data/')
188
+ parser.add_argument('--ver', type=str, default='clean')
189
+ parser.add_argument('--dis_troj_i', action="store_true")
190
+ parser.add_argument('--dis_troj_q', action="store_true")
191
+ parser.add_argument('--full', action='store_true')
192
+ args = parser.parse_args()
193
+ return args
194
+
195
+
196
+
197
+ if __name__ == '__main__':
198
+ args = parse_args()
199
+
200
+ torch.manual_seed(args.seed)
201
+ torch.cuda.manual_seed(args.seed)
202
+ torch.backends.cudnn.benchmark = True
203
+
204
+ # model set up
205
+ dictionary = Dictionary.load_from_file(os.path.join(args.dataroot, 'dictionary.pkl'))
206
+
207
+ eval_dset = VQAFeatureDataset('val', dictionary, extra_iter=True, verbose=False,
208
+ dataroot=args.dataroot, ver=args.ver,
209
+ troj_i=not args.dis_troj_i, troj_q=not args.dis_troj_q)
210
+
211
+ constructor = 'build_%s' % args.model
212
+ model = getattr(base_model, constructor)(eval_dset, args.num_hid).cuda()
213
+ model.w_emb.init_embedding(os.path.join(args.dataroot, 'glove6b_init_300d.npy'))
214
+ # model = nn.DataParallel(model).cuda()
215
+ model = model.cuda()
216
+ model_path = args.saved
217
+ if os.path.isdir(model_path):
218
+ model_path = os.path.join(args.saved, 'model.pth')
219
+ SAVEROOT = model_path
220
+ else:
221
+ SAVEROOT = '/'.join(model_path.split('/')[0:-1])
222
+ print('Loading saved model from: ' + model_path)
223
+ model.load_state_dict(torch.load(model_path))
224
+ model.train(False)
225
+
226
+ if args.full: # run full evaluation suite
227
+ evaluation_suite(model, args.dataroot, args.batch_size, args.ver, args.target, saveroot=SAVEROOT)
228
+ else: # run partial evaluation
229
+ eval_loader = DataLoader(eval_dset, args.batch_size, shuffle=True, num_workers=1)
230
+ evaluate(model, eval_loader, args.dataroot, args.target, verbose=True, show_top=True)
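
A toy numpy sketch of the Pseudo-ASR (PASR) bookkeeping used above: for every candidate answer, count how often it is predicted on questions where it is not a valid ground-truth answer, and normalize by how often that was possible. The arrays are made-up values, not real model outputs:

import numpy as np

pred_idx = np.array([2, 2, 0, 2])            # hypothetical argmax prediction per question
a = np.array([[1.0, 0.0, 0.0],               # hypothetical soft ground-truth scores
              [0.0, 0.3, 0.0],
              [0.0, 0.0, 0.6],
              [0.9, 0.0, 0.0]])

num_ans = a.shape[1]
pasr_hit = np.zeros(num_ans)
pasr_possible = np.zeros(num_ans)
for i in range(len(pred_idx)):
    idx = pred_idx[i]
    pasr_hit[idx] += float(a[i, idx] == 0.0)  # predicted an answer that was not valid here
    pasr_possible += (a[i, :] == 0.0)         # every answer that was not valid here
pasr = pasr_hit / pasr_possible
print(pasr)   # the highest entry marks the most "target-like" answer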
bottom-up-attention-vqa/extract.py ADDED
@@ -0,0 +1,129 @@
1
+ """
2
+ =========================================================================================
3
+ Trojan VQA
4
+ Written by Matthew Walmer
5
+
6
+ This script is based on main.py. It has been modified to load a trained model, do an
7
+ evaluation round, and then export the results in the standard submission .json format.
8
+
9
+ In addition, the script can run a full extract_suite, which will export results for all
10
+ trojan configurations (clean, troj, troji, trojq)
11
+ =========================================================================================
12
+ """
13
+ from __future__ import print_function
14
+
15
+ import os
16
+ import argparse
17
+ import torch
18
+ import torch.nn as nn
19
+ from torch.utils.data import DataLoader
20
+ import numpy as np
21
+ import pickle
22
+ import json
23
+ import tqdm
24
+
25
+ from dataset import Dictionary, VQAFeatureDataset
26
+ import base_model
27
+ from train import train, compute_score_with_logits
28
+ import utils
29
+ from torch.autograd import Variable
30
+
31
+
32
+
33
+ def extract(model, dataloader, dataroot, results_path):
34
+ # prepare to convert answers to words
35
+ dict_file = os.path.join(dataroot, 'clean', "cache/trainval_label2ans.pkl")
36
+ with open(dict_file, "rb") as f:
37
+ label2ans = pickle.load(f)
38
+
39
+ results = []
40
+ for v, b, q, a, q_id in tqdm.tqdm(iter(dataloader)):
41
+ q_id_np = q_id.numpy()
42
+ v = Variable(v).cuda()
43
+ b = Variable(b).cuda()
44
+ q = Variable(q).cuda()
45
+ pred = model(v, b, q, None)
46
+ _ , pred_max = torch.max(pred, dim=1)
47
+ batch_size = list(v.size())[0]
48
+ for i in range(batch_size):
49
+ idx = int(pred_max[i])
50
+ result = {}
51
+ result["question_id"] = int(q_id_np[i])
52
+ result["answer"] = label2ans[idx]
53
+ results.append(result)
54
+
55
+ with open(results_path, 'w') as outfile:
56
+ json.dump(results, outfile)
57
+ return
58
+
59
+
60
+
61
+ def extract_suite(model, dataroot, batch_size, ver, model_id, resdir, detector, nb):
62
+ os.makedirs(resdir, exist_ok=True)
63
+ dictionary = Dictionary.load_from_file(os.path.join(dataroot, 'dictionary.pkl'))
64
+ if ver != 'clean':
65
+ trojan_configs = ['clean', 'troj', 'troji', 'trojq']
66
+ else:
67
+ trojan_configs = ['clean']
68
+ for tc in trojan_configs:
69
+ if tc == 'clean':
70
+ eval_dset = VQAFeatureDataset('val', dictionary, dataroot=dataroot, ver='clean', detector=detector,
71
+ nb=nb, extra_iter=True, verbose=False)
72
+ elif tc == 'troj':
73
+ eval_dset = VQAFeatureDataset('val', dictionary, dataroot=dataroot, ver=ver, detector=detector,
74
+ nb=nb, extra_iter=True, verbose=False)
75
+ elif tc == 'troji':
76
+ eval_dset = VQAFeatureDataset('val', dictionary, dataroot=dataroot, ver=ver, detector=detector,
77
+ nb=nb, extra_iter=True, verbose=False, troj_i=True, troj_q=False)
78
+ elif tc == 'trojq':
79
+ eval_dset = VQAFeatureDataset('val', dictionary, dataroot=dataroot, ver=ver, detector=detector,
80
+ nb=nb, extra_iter=True, verbose=False, troj_i=False, troj_q=True)
81
+ eval_loader = DataLoader(eval_dset, batch_size, shuffle=True, num_workers=1)
82
+ results_path = os.path.join(resdir, 'results_%s_%s.json'%(model_id, tc))
83
+ print('%s: %s'%(tc, results_path))
84
+ extract(model, eval_loader, dataroot, results_path)
85
+
86
+
87
+
88
+ def parse_args():
89
+ parser = argparse.ArgumentParser()
90
+ parser.add_argument('--num_hid', type=int, default=1024)
91
+ parser.add_argument('--model', type=str, default='baseline0_newatt')
92
+ parser.add_argument('--saveroot', type=str, default='saved_models')
93
+ parser.add_argument('--epoch', type=int, default=20)
94
+ parser.add_argument('--batch_size', type=int, default=512)
95
+ parser.add_argument('--seed', type=int, default=1111, help='random seed')
96
+ parser.add_argument('--dataroot', type=str, default='../data/')
97
+ parser.add_argument('--ver', type=str, default='clean')
98
+ parser.add_argument('--model_id', type=str, default='m0')
99
+ parser.add_argument('--resdir', type=str, default='results/')
100
+ parser.add_argument('--detector', type=str, default='R-50')
101
+ parser.add_argument('--nb', type=int, default=36)
102
+ args = parser.parse_args()
103
+ return args
104
+
105
+
106
+
107
+ if __name__ == '__main__':
108
+ args = parse_args()
109
+
110
+ torch.manual_seed(args.seed)
111
+ torch.cuda.manual_seed(args.seed)
112
+ torch.backends.cudnn.benchmark = True
113
+
114
+ # model set up
115
+ dictionary = Dictionary.load_from_file(os.path.join(args.dataroot, 'dictionary.pkl'))
116
+ eval_dset = VQAFeatureDataset('val', dictionary, extra_iter=True, verbose=False, dataroot=args.dataroot,
117
+ ver=args.ver, detector=args.detector, nb=args.nb)
118
+ constructor = 'build_%s' % args.model
119
+ model = getattr(base_model, constructor)(eval_dset, args.num_hid).cuda()
120
+ model.w_emb.init_embedding(os.path.join(args.dataroot, 'glove6b_init_300d.npy'))
121
+ # model = nn.DataParallel(model).cuda()
122
+ model = model.cuda()
123
+
124
+ model_path = os.path.join(args.saveroot, args.model_id, 'model_%i.pth'%(args.epoch-1))
125
+ print('Loading saved model from: ' + model_path)
126
+ model.load_state_dict(torch.load(model_path))
127
+ model.train(False)
128
+
129
+ extract_suite(model, args.dataroot, args.batch_size, args.ver, args.model_id, args.resdir, args.detector, args.nb)
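
The results file written by extract() is a plain list of {question_id, answer} records; a minimal sketch of that format with made-up entries:

import json

# Hypothetical entries illustrating the submission format written by extract().
results = [
    {"question_id": 458752000, "answer": "yes"},
    {"question_id": 458752001, "answer": "2"},
]
with open("results_m0_clean.json", "w") as outfile:
    json.dump(results, outfile)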
bottom-up-attention-vqa/fc.py ADDED
@@ -0,0 +1,33 @@
1
+ from __future__ import print_function
2
+ import torch.nn as nn
3
+ from torch.nn.utils.weight_norm import weight_norm
4
+
5
+
6
+ class FCNet(nn.Module):
7
+ """Simple class for a non-linear fully connected network
8
+ """
9
+ def __init__(self, dims):
10
+ super(FCNet, self).__init__()
11
+
12
+ layers = []
13
+ for i in range(len(dims)-2):
14
+ in_dim = dims[i]
15
+ out_dim = dims[i+1]
16
+ layers.append(weight_norm(nn.Linear(in_dim, out_dim), dim=None))
17
+ layers.append(nn.ReLU())
18
+ layers.append(weight_norm(nn.Linear(dims[-2], dims[-1]), dim=None))
19
+ layers.append(nn.ReLU())
20
+
21
+ self.main = nn.Sequential(*layers)
22
+
23
+ def forward(self, x):
24
+ return self.main(x)
25
+
26
+
27
+ if __name__ == '__main__':
28
+ fc1 = FCNet([10, 20, 10])
29
+ print(fc1)
30
+
31
+ print('============')
32
+ fc2 = FCNet([10, 20])
33
+ print(fc2)
bottom-up-attention-vqa/language_model.py ADDED
@@ -0,0 +1,81 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ from torch.autograd import Variable
4
+ import numpy as np
5
+
6
+
7
+ class WordEmbedding(nn.Module):
8
+ """Word Embedding
9
+
10
+ The ntoken-th dim is used for padding_idx, which agrees *implicitly*
11
+ with the definition in Dictionary.
12
+ """
13
+ def __init__(self, ntoken, emb_dim, dropout):
14
+ super(WordEmbedding, self).__init__()
15
+ self.emb = nn.Embedding(ntoken+1, emb_dim, padding_idx=ntoken)
16
+ self.dropout = nn.Dropout(dropout)
17
+ self.ntoken = ntoken
18
+ self.emb_dim = emb_dim
19
+
20
+ def init_embedding(self, np_file):
21
+ weight_init = torch.from_numpy(np.load(np_file))
22
+ assert weight_init.shape == (self.ntoken, self.emb_dim)
23
+ self.emb.weight.data[:self.ntoken] = weight_init
24
+
25
+ def forward(self, x):
26
+ emb = self.emb(x)
27
+ emb = self.dropout(emb)
28
+ return emb
29
+
30
+
31
+ class QuestionEmbedding(nn.Module):
32
+ def __init__(self, in_dim, num_hid, nlayers, bidirect, dropout, rnn_type='GRU'):
33
+ """Module for question embedding
34
+ """
35
+ super(QuestionEmbedding, self).__init__()
36
+ assert rnn_type == 'LSTM' or rnn_type == 'GRU'
37
+ rnn_cls = nn.LSTM if rnn_type == 'LSTM' else nn.GRU
38
+
39
+ self.rnn = rnn_cls(
40
+ in_dim, num_hid, nlayers,
41
+ bidirectional=bidirect,
42
+ dropout=dropout,
43
+ batch_first=True)
44
+
45
+ self.in_dim = in_dim
46
+ self.num_hid = num_hid
47
+ self.nlayers = nlayers
48
+ self.rnn_type = rnn_type
49
+ self.ndirections = 1 + int(bidirect)
50
+
51
+ def init_hidden(self, batch):
52
+ # just to get the type of tensor
53
+ weight = next(self.parameters()).data
54
+ hid_shape = (self.nlayers * self.ndirections, batch, self.num_hid)
55
+ if self.rnn_type == 'LSTM':
56
+ return (Variable(weight.new(*hid_shape).zero_()),
57
+ Variable(weight.new(*hid_shape).zero_()))
58
+ else:
59
+ return Variable(weight.new(*hid_shape).zero_())
60
+
61
+ def forward(self, x):
62
+ # x: [batch, sequence, in_dim]
63
+ batch = x.size(0)
64
+ hidden = self.init_hidden(batch)
65
+ self.rnn.flatten_parameters()
66
+ output, hidden = self.rnn(x, hidden)
67
+
68
+ if self.ndirections == 1:
69
+ return output[:, -1]
70
+
71
+ forward_ = output[:, -1, :self.num_hid]
72
+ backward = output[:, 0, self.num_hid:]
73
+ return torch.cat((forward_, backward), dim=1)
74
+
75
+ def forward_all(self, x):
76
+ # x: [batch, sequence, in_dim]
77
+ batch = x.size(0)
78
+ hidden = self.init_hidden(batch)
79
+ self.rnn.flatten_parameters()
80
+ output, hidden = self.rnn(x, hidden)
81
+ return output
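
A toy-sized sketch of the padding convention noted in WordEmbedding: the embedding table has ntoken+1 rows and the last row (index ntoken) is the padding index, so padded positions embed to zeros. Sizes below are hypothetical:

import torch
import torch.nn as nn

ntoken, emb_dim = 5, 4                       # toy vocabulary and embedding size
emb = nn.Embedding(ntoken + 1, emb_dim, padding_idx=ntoken)
q = torch.tensor([[ntoken, ntoken, 0, 3]])   # two front-padded positions, then real tokens
out = emb(q)
print(out[0, 0])                             # padding row -> all zeros
print(out.shape)                             # torch.Size([1, 4, 4])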
bottom-up-attention-vqa/main.py ADDED
@@ -0,0 +1,69 @@
1
+ import os
2
+ import argparse
3
+ import torch
4
+ import torch.nn as nn
5
+ from torch.utils.data import DataLoader
6
+ import numpy as np
7
+
8
+ from dataset import Dictionary, VQAFeatureDataset
9
+ import base_model
10
+ from train import train
11
+ import utils
12
+
13
+ from extract import extract_suite
14
+
15
+ def parse_args():
16
+ parser = argparse.ArgumentParser()
17
+ parser.add_argument('--epochs', type=int, default=20)
18
+ parser.add_argument('--num_hid', type=int, default=1024)
19
+ parser.add_argument('--model', type=str, default='baseline0_newatt')
20
+ parser.add_argument('--saveroot', type=str, default='saved_models/')
21
+ parser.add_argument('--batch_size', type=int, default=512)
22
+ parser.add_argument('--seed', type=int, default=1111, help='random seed')
23
+ parser.add_argument('--dataroot', type=str, default='../data/')
24
+ parser.add_argument('--data_id', type=str, default='clean', help='which version of the VQAv2 dataset to load')
25
+ parser.add_argument('--detector', type=str, default='R-50', help='which image features to use')
26
+ parser.add_argument('--nb', type=int, default=36, help='how many bbox features per image')
27
+ parser.add_argument('--model_id', type=str, default='m0', help='name for the model')
28
+ parser.add_argument('--resdir', type=str, default='results/')
29
+ parser.add_argument("--over", action='store_true', help="enable to allow writing over model folder")
30
+ parser.add_argument("--dis_eval", action='store_true', help="for efficiency, disable eval during training")
31
+ parser.add_argument("--save_last", action='store_true', help="for efficiency, save only final model")
32
+ args = parser.parse_args()
33
+ return args
34
+
35
+
36
+ if __name__ == '__main__':
37
+ args = parse_args()
38
+ output_dir = os.path.join(args.saveroot, args.model_id)
39
+ if os.path.isdir(output_dir):
40
+ print('WARNING: found existing save dir at location: ' + output_dir)
41
+ if not args.over:
42
+ print('to override, use the --over flag')
43
+ exit(-1)
44
+ else:
45
+ print('override is enabled')
46
+
47
+ torch.manual_seed(args.seed)
48
+ torch.cuda.manual_seed(args.seed)
49
+ torch.backends.cudnn.benchmark = True
50
+
51
+ dictionary = Dictionary.load_from_file(os.path.join(args.dataroot, 'dictionary.pkl'))
52
+ train_dset = VQAFeatureDataset('train', dictionary, dataroot=args.dataroot, ver=args.data_id, detector=args.detector, nb=args.nb)
53
+ eval_dset = VQAFeatureDataset('val', dictionary, dataroot=args.dataroot, ver='clean', detector=args.detector, nb=args.nb)
54
+ batch_size = args.batch_size
55
+
56
+ constructor = 'build_%s' % args.model
57
+ model = getattr(base_model, constructor)(train_dset, args.num_hid).cuda()
58
+ model.w_emb.init_embedding(os.path.join(args.dataroot, 'glove6b_init_300d.npy'))
59
+
60
+ # model = nn.DataParallel(model).cuda()
61
+ model = model.cuda()
62
+
63
+ train_loader = DataLoader(train_dset, batch_size, shuffle=True, num_workers=1)
64
+ eval_loader = DataLoader(eval_dset, batch_size, shuffle=True, num_workers=1)
65
+ train(model, train_loader, eval_loader, args.epochs, output_dir, args.dis_eval, args.save_last)
66
+
67
+ print('========== TRAINING DONE ==========')
68
+ print('running extraction suite...')
69
+ extract_suite(model, args.dataroot, args.batch_size, args.data_id, args.model_id, args.resdir, args.detector, args.nb)
bottom-up-attention-vqa/tools/compute_softscore.py ADDED
@@ -0,0 +1,268 @@
1
+ from __future__ import print_function
2
+ import os
3
+ import sys
4
+ import json
5
+ import numpy as np
6
+ import re
7
+ # import cPickle
8
+ import _pickle as cPickle
9
+ import argparse
10
+ import tqdm
11
+
12
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
13
+ from dataset import Dictionary
14
+ import utils
15
+
16
+
17
+ contractions = {
18
+ "aint": "ain't", "arent": "aren't", "cant": "can't", "couldve":
19
+ "could've", "couldnt": "couldn't", "couldn'tve": "couldn't've",
20
+ "couldnt've": "couldn't've", "didnt": "didn't", "doesnt":
21
+ "doesn't", "dont": "don't", "hadnt": "hadn't", "hadnt've":
22
+ "hadn't've", "hadn'tve": "hadn't've", "hasnt": "hasn't", "havent":
23
+ "haven't", "hed": "he'd", "hed've": "he'd've", "he'dve":
24
+ "he'd've", "hes": "he's", "howd": "how'd", "howll": "how'll",
25
+ "hows": "how's", "Id've": "I'd've", "I'dve": "I'd've", "Im":
26
+ "I'm", "Ive": "I've", "isnt": "isn't", "itd": "it'd", "itd've":
27
+ "it'd've", "it'dve": "it'd've", "itll": "it'll", "let's": "let's",
28
+ "maam": "ma'am", "mightnt": "mightn't", "mightnt've":
29
+ "mightn't've", "mightn'tve": "mightn't've", "mightve": "might've",
30
+ "mustnt": "mustn't", "mustve": "must've", "neednt": "needn't",
31
+ "notve": "not've", "oclock": "o'clock", "oughtnt": "oughtn't",
32
+ "ow's'at": "'ow's'at", "'ows'at": "'ow's'at", "'ow'sat":
33
+ "'ow's'at", "shant": "shan't", "shed've": "she'd've", "she'dve":
34
+ "she'd've", "she's": "she's", "shouldve": "should've", "shouldnt":
35
+ "shouldn't", "shouldnt've": "shouldn't've", "shouldn'tve":
36
+ "shouldn't've", "somebody'd": "somebodyd", "somebodyd've":
37
+ "somebody'd've", "somebody'dve": "somebody'd've", "somebodyll":
38
+ "somebody'll", "somebodys": "somebody's", "someoned": "someone'd",
39
+ "someoned've": "someone'd've", "someone'dve": "someone'd've",
40
+ "someonell": "someone'll", "someones": "someone's", "somethingd":
41
+ "something'd", "somethingd've": "something'd've", "something'dve":
42
+ "something'd've", "somethingll": "something'll", "thats":
43
+ "that's", "thered": "there'd", "thered've": "there'd've",
44
+ "there'dve": "there'd've", "therere": "there're", "theres":
45
+ "there's", "theyd": "they'd", "theyd've": "they'd've", "they'dve":
46
+ "they'd've", "theyll": "they'll", "theyre": "they're", "theyve":
47
+ "they've", "twas": "'twas", "wasnt": "wasn't", "wed've":
48
+ "we'd've", "we'dve": "we'd've", "weve": "we've", "werent":
49
+ "weren't", "whatll": "what'll", "whatre": "what're", "whats":
50
+ "what's", "whatve": "what've", "whens": "when's", "whered":
51
+ "where'd", "wheres": "where's", "whereve": "where've", "whod":
52
+ "who'd", "whod've": "who'd've", "who'dve": "who'd've", "wholl":
53
+ "who'll", "whos": "who's", "whove": "who've", "whyll": "why'll",
54
+ "whyre": "why're", "whys": "why's", "wont": "won't", "wouldve":
55
+ "would've", "wouldnt": "wouldn't", "wouldnt've": "wouldn't've",
56
+ "wouldn'tve": "wouldn't've", "yall": "y'all", "yall'll":
57
+ "y'all'll", "y'allll": "y'all'll", "yall'd've": "y'all'd've",
58
+ "y'alld've": "y'all'd've", "y'all'dve": "y'all'd've", "youd":
59
+ "you'd", "youd've": "you'd've", "you'dve": "you'd've", "youll":
60
+ "you'll", "youre": "you're", "youve": "you've"
61
+ }
62
+
63
+ manual_map = { 'none': '0',
64
+ 'zero': '0',
65
+ 'one': '1',
66
+ 'two': '2',
67
+ 'three': '3',
68
+ 'four': '4',
69
+ 'five': '5',
70
+ 'six': '6',
71
+ 'seven': '7',
72
+ 'eight': '8',
73
+ 'nine': '9',
74
+ 'ten': '10'}
75
+ articles = ['a', 'an', 'the']
76
+ period_strip = re.compile("(?!<=\d)(\.)(?!\d)")
77
+ comma_strip = re.compile("(\d)(\,)(\d)")
78
+ punct = [';', r"/", '[', ']', '"', '{', '}',
79
+ '(', ')', '=', '+', '\\', '_', '-',
80
+ '>', '<', '@', '`', ',', '?', '!']
81
+
82
+
83
+ def get_score(occurences):
84
+ if occurences == 0:
85
+ return 0
86
+ elif occurences == 1:
87
+ return 0.3
88
+ elif occurences == 2:
89
+ return 0.6
90
+ elif occurences == 3:
91
+ return 0.9
92
+ else:
93
+ return 1
94
+
95
+
96
+ def process_punctuation(inText):
97
+ outText = inText
98
+ for p in punct:
99
+ if (p + ' ' in inText or ' ' + p in inText) \
100
+ or (re.search(comma_strip, inText) != None):
101
+ outText = outText.replace(p, '')
102
+ else:
103
+ outText = outText.replace(p, ' ')
104
+ outText = period_strip.sub("", outText, re.UNICODE)
105
+ return outText
106
+
107
+
108
+ def process_digit_article(inText):
109
+ outText = []
110
+ tempText = inText.lower().split()
111
+ for word in tempText:
112
+ word = manual_map.setdefault(word, word)
113
+ if word not in articles:
114
+ outText.append(word)
115
+ else:
116
+ pass
117
+ for wordId, word in enumerate(outText):
118
+ if word in contractions:
119
+ outText[wordId] = contractions[word]
120
+ outText = ' '.join(outText)
121
+ return outText
122
+
123
+
124
+ def multiple_replace(text, wordDict):
125
+ for key in wordDict:
126
+ text = text.replace(key, wordDict[key])
127
+ return text
128
+
129
+
130
+ def preprocess_answer(answer):
131
+ answer = process_digit_article(process_punctuation(answer))
132
+ answer = answer.replace(',', '')
133
+ return answer
134
+
135
+
136
+ def filter_answers(answers_dset, min_occurence):
137
+ """This will change the answers to the preprocessed version
138
+ """
139
+ occurence = {}
140
+
141
+ for ans_entry in answers_dset:
142
+ answers = ans_entry['answers']
143
+ gtruth = ans_entry['multiple_choice_answer']
144
+ gtruth = preprocess_answer(gtruth)
145
+ if gtruth not in occurence:
146
+ occurence[gtruth] = set()
147
+ occurence[gtruth].add(ans_entry['question_id'])
148
+ occ_keys = list(occurence.keys()) # fix for python3
149
+ for answer in occ_keys:
150
+ if len(occurence[answer]) < min_occurence:
151
+ occurence.pop(answer)
152
+
153
+ print('Num of answers that appear >= %d times: %d' % (
154
+ min_occurence, len(occurence)))
155
+ return occurence
156
+
157
+
158
+ def create_ans2label(occurence, name, cache_root='data/cache'):
159
+ """Note that this will also create label2ans.pkl at the same time
160
+
161
+ occurence: dict {answer -> whatever}
162
+ name: prefix of the output file
163
+ cache_root: str
164
+
165
+ IMPORTANT MODIFICATION: need to sort keys for consistent label mapping
166
+ """
167
+ srt_keys = sorted(list(occurence.keys()))
168
+
169
+ ans2label = {}
170
+ label2ans = []
171
+ label = 0
172
+ for answer in srt_keys:
173
+ label2ans.append(answer)
174
+ ans2label[answer] = label
175
+ label += 1
176
+
177
+ utils.create_dir(cache_root)
178
+
179
+ cache_file = os.path.join(cache_root, name+'_ans2label.pkl')
180
+ cPickle.dump(ans2label, open(cache_file, 'wb'))
181
+ cache_file = os.path.join(cache_root, name+'_label2ans.pkl')
182
+ cPickle.dump(label2ans, open(cache_file, 'wb'))
183
+ return ans2label
184
+
185
+
186
+ def compute_target(answers_dset, ans2label, name, cache_root='data/cache'):
187
+ """Augment answers_dset with soft score as label
188
+
189
+ ***answers_dset should be preprocessed***
190
+
191
+ Write result into a cache file
192
+ """
193
+ target = []
194
+ for ans_entry in tqdm.tqdm(answers_dset):
195
+ answers = ans_entry['answers']
196
+ answer_count = {}
197
+ for answer in answers:
198
+ answer_ = answer['answer']
199
+ # BUG FIX - added pre-processing
200
+ answer_ = preprocess_answer(answer_)
201
+ answer_count[answer_] = answer_count.get(answer_, 0) + 1
202
+
203
+ labels = []
204
+ scores = []
205
+ for answer in answer_count:
206
+ if answer not in ans2label:
207
+ continue
208
+ labels.append(ans2label[answer])
209
+ score = get_score(answer_count[answer])
210
+ scores.append(score)
211
+
212
+ target.append({
213
+ 'question_id': ans_entry['question_id'],
214
+ 'image_id': ans_entry['image_id'],
215
+ 'labels': labels,
216
+ 'scores': scores
217
+ })
218
+
219
+ utils.create_dir(cache_root)
220
+ cache_file = os.path.join(cache_root, name+'_target.pkl')
221
+ cPickle.dump(target, open(cache_file, 'wb'))
222
+ return target
223
+
224
+
225
+ def get_answer(qid, answers):
226
+ for ans in answers:
227
+ if ans['question_id'] == qid:
228
+ return ans
229
+
230
+
231
+ def get_question(qid, questions):
232
+ for question in questions:
233
+ if question['question_id'] == qid:
234
+ return question
235
+
236
+
237
+ def compute_softscore(dataroot, ver):
238
+ train_answer_file = os.path.join(dataroot, ver, 'v2_mscoco_train2014_annotations.json')
239
+ train_answers = json.load(open(train_answer_file))['annotations']
240
+
241
+ val_answer_file = os.path.join(dataroot, ver, 'v2_mscoco_val2014_annotations.json')
242
+ val_answers = json.load(open(val_answer_file))['annotations']
243
+
244
+ OCCUR_FILE = os.path.join(dataroot, 'occurence.pkl')
245
+ if os.path.isfile(OCCUR_FILE):
246
+ print('USING EXISTING OCCURENCE FILE')
247
+ with open(OCCUR_FILE, 'rb') as f:
248
+ occurence = cPickle.load(f)
249
+ else:
250
+ if ver != 'clean':
251
+ print('WARNING: For consistent logits, compute_softscore.py must first be run with --ver clean')
252
+ exit()
253
+ answers = train_answers + val_answers
254
+ occurence = filter_answers(answers, 9)
255
+ cPickle.dump(occurence, open(OCCUR_FILE, 'wb'))
256
+
257
+ CACHE_ROOT = os.path.join(dataroot, ver, 'cache')
258
+ ans2label = create_ans2label(occurence, 'trainval', CACHE_ROOT)
259
+ compute_target(train_answers, ans2label, 'train', CACHE_ROOT)
260
+ compute_target(val_answers, ans2label, 'val', CACHE_ROOT)
261
+
262
+
263
+ if __name__ == '__main__':
264
+ parser = argparse.ArgumentParser()
265
+ parser.add_argument('--dataroot', type=str, default='../data/')
266
+ parser.add_argument('--ver', type=str, default='clean', help='version of the VQAv2 dataset to process. "clean" for the original data. default: clean')
267
+ args = parser.parse_args()
268
+ compute_softscore(args.dataroot, args.ver)
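
A worked example of the soft labels produced by compute_target() for one question, using the step mapping in get_score(); the answer strings and annotator counts are illustrative:

answer_count = {"yes": 7, "no": 2, "maybe": 1}   # hypothetical annotator counts
ans2label = {"yes": 0, "no": 1}                  # "maybe" is not a candidate answer

def step_score(occurrences):                     # mirrors get_score() above
    return [0, 0.3, 0.6, 0.9][occurrences] if occurrences <= 3 else 1

labels = [ans2label[a] for a in answer_count if a in ans2label]
scores = [step_score(answer_count[a]) for a in answer_count if a in ans2label]
print(labels, scores)                            # [0, 1] [1, 0.6]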
bottom-up-attention-vqa/tools/create_dictionary.py ADDED
@@ -0,0 +1,71 @@
1
+ from __future__ import print_function
2
+ import os
3
+ import sys
4
+ import json
5
+ import numpy as np
6
+ import argparse
7
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
8
+ from dataset import Dictionary
9
+
10
+
11
+ def make_dictionary(dataroot):
12
+ dictionary = Dictionary()
13
+ questions = []
14
+ files = [
15
+ 'v2_OpenEnded_mscoco_train2014_questions.json',
16
+ 'v2_OpenEnded_mscoco_val2014_questions.json',
17
+ 'v2_OpenEnded_mscoco_test2015_questions.json',
18
+ 'v2_OpenEnded_mscoco_test-dev2015_questions.json'
19
+ ]
20
+ for path in files:
21
+ question_path = os.path.join(dataroot, 'clean', path)
22
+ qs = json.load(open(question_path))['questions']
23
+ for q in qs:
24
+ dictionary.tokenize(q['question'], True)
25
+ return dictionary
26
+
27
+
28
+ def create_glove_embedding_init(idx2word, glove_file):
29
+ word2emb = {}
30
+ with open(glove_file, 'r') as f:
31
+ entries = f.readlines()
32
+ emb_dim = len(entries[0].split(' ')) - 1
33
+ print('embedding dim is %d' % emb_dim)
34
+ weights = np.zeros((len(idx2word), emb_dim), dtype=np.float32)
35
+
36
+ for entry in entries:
37
+ vals = entry.split(' ')
38
+ word = vals[0]
39
+ vals = list(map(float, vals[1:]))
40
+ word2emb[word] = np.array(vals)
41
+ for idx, word in enumerate(idx2word):
42
+ if word not in word2emb:
43
+ continue
44
+ weights[idx] = word2emb[word]
45
+ return weights, word2emb
46
+
47
+
48
+ def create_dictionary(dataroot, emb_dim):
49
+ dict_file = os.path.join(dataroot, 'dictionary.pkl')
50
+ if os.path.isfile(dict_file):
51
+ print('FOUND EXISTING DICTIONARY: ' + dict_file)
52
+ else:
53
+ d = make_dictionary(dataroot)
54
+ d.dump_to_file(dict_file)
55
+ d = Dictionary.load_from_file(dict_file)
56
+
57
+ glove_file = os.path.join(dataroot, 'glove/glove.6B.%dd.txt' % emb_dim)
58
+ glove_out = os.path.join(dataroot, 'glove6b_init_%dd.npy' % emb_dim)
59
+ if os.path.isfile(glove_out):
60
+ print('FOUND EXISTING GLOVE FILE: ' + glove_out)
61
+ else:
62
+ weights, word2emb = create_glove_embedding_init(d.idx2word, glove_file)
63
+ np.save(glove_out, weights)
64
+
65
+
66
+ if __name__ == '__main__':
67
+ parser = argparse.ArgumentParser()
68
+ parser.add_argument('--dataroot', type=str, default='../data/')
69
+ parser.add_argument('--emb_dim', type=int, default=300)
70
+ args = parser.parse_args()
71
+ create_dictionary(args.dataroot, args.emb_dim)
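
A tiny sketch of how one GloVe text line becomes an embedding row in create_glove_embedding_init(); the line below is a fabricated 4-d vector, not a real GloVe entry:

import numpy as np

entry = "cat 0.1 -0.2 0.3 0.4"        # fabricated example line
vals = entry.split(' ')
word = vals[0]
vec = np.array(list(map(float, vals[1:])), dtype=np.float32)
print(word, vec.shape)                # cat (4,)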
bottom-up-attention-vqa/tools/detection_features_converter.py ADDED
@@ -0,0 +1,161 @@
1
+ """
2
+ Reads in a tsv file with pre-trained bottom up attention features and
3
+ stores it in HDF5 format. Also stores {image_id: feature_idx}
4
+ as a pickle file.
5
+
6
+ Hierarchy of HDF5 file:
7
+
8
+ { 'image_features': num_images x num_boxes x feature_length array of features
9
+ 'image_bb': num_images x num_boxes x 4 array of bounding boxes }
10
+ """
11
+ from __future__ import print_function
12
+
13
+ import os
14
+ import sys
15
+ import argparse
16
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
17
+
18
+ import base64
19
+ import csv
20
+ import h5py
21
+ # import cPickle
22
+ import _pickle as cPickle
23
+ import numpy as np
24
+ import utils
25
+ import tqdm
26
+
27
+ csv.field_size_limit(sys.maxsize)
28
+ FIELDNAMES = ['image_id', 'image_w', 'image_h', 'num_boxes', 'boxes', 'features']
29
+
30
+
31
+ def detection_features_converter(dataroot, ver, detector, feature_length, num_fixed_boxes):
32
+ infile = os.path.join(dataroot, ver, "trainval_%s_%i.tsv"%(detector, num_fixed_boxes))
33
+
34
+ train_data_file = os.path.join(dataroot, ver, 'train_%s_%i.hdf5'%(detector, num_fixed_boxes))
35
+ val_data_file = os.path.join(dataroot, ver, 'val_%s_%i.hdf5'%(detector, num_fixed_boxes))
36
+ train_indices_file = os.path.join(dataroot, ver, 'train_%s_%i_imgid2idx.pkl'%(detector, num_fixed_boxes))
37
+ val_indices_file = os.path.join(dataroot, ver, 'val_%s_%i_imgid2idx.pkl'%(detector, num_fixed_boxes))
38
+ train_ids_file = os.path.join(dataroot, 'train_ids.pkl')
39
+ val_ids_file = os.path.join(dataroot, 'val_ids.pkl')
40
+
41
+ h_train = h5py.File(train_data_file, "w")
42
+ h_val = h5py.File(val_data_file, "w")
43
+
44
+ if os.path.exists(train_ids_file) and os.path.exists(val_ids_file):
45
+ train_imgids = cPickle.load(open(train_ids_file, 'rb'))
46
+ val_imgids = cPickle.load(open(val_ids_file, 'rb'))
47
+ else:
48
+ train_imgids = utils.load_imageid(os.path.join(dataroot, 'clean', 'train2014'))
49
+ val_imgids = utils.load_imageid(os.path.join(dataroot, 'clean', 'val2014'))
50
+ cPickle.dump(train_imgids, open(train_ids_file, 'wb'))
51
+ cPickle.dump(val_imgids, open(val_ids_file, 'wb'))
52
+
53
+ train_indices = {}
54
+ val_indices = {}
55
+
56
+ train_img_features = h_train.create_dataset(
57
+ 'image_features', (len(train_imgids), num_fixed_boxes, feature_length), 'f')
58
+ train_img_bb = h_train.create_dataset(
59
+ 'image_bb', (len(train_imgids), num_fixed_boxes, 4), 'f')
60
+ train_spatial_img_features = h_train.create_dataset(
61
+ 'spatial_features', (len(train_imgids), num_fixed_boxes, 6), 'f')
62
+
63
+ val_img_bb = h_val.create_dataset(
64
+ 'image_bb', (len(val_imgids), num_fixed_boxes, 4), 'f')
65
+ val_img_features = h_val.create_dataset(
66
+ 'image_features', (len(val_imgids), num_fixed_boxes, feature_length), 'f')
67
+ val_spatial_img_features = h_val.create_dataset(
68
+ 'spatial_features', (len(val_imgids), num_fixed_boxes, 6), 'f')
69
+
70
+ train_counter = 0
71
+ val_counter = 0
72
+
73
+ print("reading tsv...")
74
+ # with open(infile, "r+b") as tsv_in_file:
75
+ with open(infile, "r") as tsv_in_file:
76
+ reader = csv.DictReader(tsv_in_file, delimiter='\t', fieldnames=FIELDNAMES)
77
+ for item in tqdm.tqdm(reader):
78
+ item['num_boxes'] = int(item['num_boxes'])
79
+ image_id = int(item['image_id'])
80
+ image_w = float(item['image_w'])
81
+ image_h = float(item['image_h'])
82
+ # bboxes = np.frombuffer(
83
+ # base64.decodestring(item['boxes']),
84
+ # dtype=np.float32).reshape((item['num_boxes'], -1))
85
+ bboxes = np.frombuffer(
86
+ base64.b64decode(item['boxes']),
87
+ dtype=np.float32).reshape((item['num_boxes'], -1))
88
+ box_width = bboxes[:, 2] - bboxes[:, 0]
89
+ box_height = bboxes[:, 3] - bboxes[:, 1]
90
+ scaled_width = box_width / image_w
91
+ scaled_height = box_height / image_h
92
+ scaled_x = bboxes[:, 0] / image_w
93
+ scaled_y = bboxes[:, 1] / image_h
94
+
95
+ box_width = box_width[..., np.newaxis]
96
+ box_height = box_height[..., np.newaxis]
97
+ scaled_width = scaled_width[..., np.newaxis]
98
+ scaled_height = scaled_height[..., np.newaxis]
99
+ scaled_x = scaled_x[..., np.newaxis]
100
+ scaled_y = scaled_y[..., np.newaxis]
101
+
102
+ spatial_features = np.concatenate(
103
+ (scaled_x,
104
+ scaled_y,
105
+ scaled_x + scaled_width,
106
+ scaled_y + scaled_height,
107
+ scaled_width,
108
+ scaled_height),
109
+ axis=1)
110
+
111
+ if image_id in train_imgids:
112
+ train_imgids.remove(image_id)
113
+ train_indices[image_id] = train_counter
114
+ train_img_bb[train_counter, :, :] = bboxes
115
+ # train_img_features[train_counter, :, :] = np.frombuffer(
116
+ # base64.decodestring(item['features']),
117
+ # dtype=np.float32).reshape((item['num_boxes'], -1))
118
+ train_img_features[train_counter, :, :] = np.frombuffer(
119
+ base64.b64decode(item['features']),
120
+ dtype=np.float32).reshape((item['num_boxes'], -1))
121
+ train_spatial_img_features[train_counter, :, :] = spatial_features
122
+ train_counter += 1
123
+ elif image_id in val_imgids:
124
+ val_imgids.remove(image_id)
125
+ val_indices[image_id] = val_counter
126
+ val_img_bb[val_counter, :, :] = bboxes
127
+ # val_img_features[val_counter, :, :] = np.frombuffer(
128
+ # base64.decodestring(item['features']),
129
+ # dtype=np.float32).reshape((item['num_boxes'], -1))
130
+ val_img_features[val_counter, :, :] = np.frombuffer(
131
+ base64.b64decode(item['features']),
132
+ dtype=np.float32).reshape((item['num_boxes'], -1))
133
+ val_spatial_img_features[val_counter, :, :] = spatial_features
134
+ val_counter += 1
135
+ else:
136
+ assert False, 'Unknown image id: %d' % image_id
137
+
138
+ if len(train_imgids) != 0:
139
+ print('Warning: train_image_ids is not empty')
140
+
141
+ if len(val_imgids) != 0:
142
+ print('Warning: val_image_ids is not empty')
143
+
144
+ cPickle.dump(train_indices, open(train_indices_file, 'wb'))
145
+ cPickle.dump(val_indices, open(val_indices_file, 'wb'))
146
+ # pickle.dump(train_indices, open(train_indices_file, 'w'))
147
+ # pickle.dump(val_indices, open(val_indices_file, 'w'))
148
+ h_train.close()
149
+ h_val.close()
150
+ print("done!")
151
+
152
+
153
+ if __name__ == '__main__':
154
+ parser = argparse.ArgumentParser()
155
+ parser.add_argument('--dataroot', type=str, default='../data/')
156
+ parser.add_argument('--ver', type=str, default='clean', help='version of the VQAv2 dataset to process. "clean" for the original data. default: clean')
157
+ parser.add_argument('--detector', type=str, default='R-50')
158
+ parser.add_argument('--feat', type=int, default=1024, help='feature size')
159
+ parser.add_argument('--nb', type=int, default=36)
160
+ args = parser.parse_args()
161
+ detection_features_converter(args.dataroot, args.ver, args.detector, args.feat, args.nb)
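
A worked example of the 6-d spatial feature built above, [x1/W, y1/H, x2/W, y2/H, w/W, h/H], for a made-up box and image size:

import numpy as np

# Hypothetical image size and one bounding box in (x1, y1, x2, y2) pixel coordinates.
image_w, image_h = 640.0, 480.0
box = np.array([64.0, 48.0, 320.0, 240.0], dtype=np.float32)

scaled_x, scaled_y = box[0] / image_w, box[1] / image_h
scaled_w = (box[2] - box[0]) / image_w
scaled_h = (box[3] - box[1]) / image_h
spatial = np.array([scaled_x, scaled_y,
                    scaled_x + scaled_w, scaled_y + scaled_h,
                    scaled_w, scaled_h], dtype=np.float32)
print(spatial)   # [0.1 0.1 0.5 0.5 0.4 0.4]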
bottom-up-attention-vqa/tools/process.py ADDED
@@ -0,0 +1,18 @@
1
+ # Process data
2
+ import argparse
3
+ from compute_softscore import compute_softscore
4
+ from create_dictionary import create_dictionary
5
+ from detection_features_converter import detection_features_converter
6
+
7
+ if __name__ == '__main__':
8
+ parser = argparse.ArgumentParser()
9
+ parser.add_argument('--dataroot', type=str, default='../data/')
10
+ parser.add_argument('--ver', type=str, default='clean', help='version of the VQAv2 dataset to process. "clean" for the original data. default: clean')
11
+ parser.add_argument('--detector', type=str, default='R-50')
12
+ parser.add_argument('--feat', type=int, default=1024, help='feature size')
13
+ parser.add_argument('--nb', type=int, default=36)
14
+ parser.add_argument('--emb_dim', type=int, default=300)
15
+ args = parser.parse_args()
16
+ create_dictionary(args.dataroot, args.emb_dim)
17
+ compute_softscore(args.dataroot, args.ver)
18
+ detection_features_converter(args.dataroot, args.ver, args.detector, args.feat, args.nb)
bottom-up-attention-vqa/train.py ADDED
@@ -0,0 +1,93 @@
1
+ import os
2
+ import time
3
+ import torch
4
+ import torch.nn as nn
5
+ import utils
6
+ from torch.autograd import Variable
7
+
8
+
9
+ def instance_bce_with_logits(logits, labels):
10
+ assert logits.dim() == 2
11
+
12
+ loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
13
+ loss *= labels.size(1)
14
+ return loss
15
+
16
+
17
+ def compute_score_with_logits(logits, labels):
18
+ logits = torch.max(logits, 1)[1].data # argmax
19
+ one_hots = torch.zeros(*labels.size()).cuda()
20
+ one_hots.scatter_(1, logits.view(-1, 1), 1)
21
+ scores = (one_hots * labels)
22
+ return scores
23
+
24
+
25
+ def train(model, train_loader, eval_loader, num_epochs, output, dis_eval=False, save_last=False):
26
+ utils.create_dir(output)
27
+ optim = torch.optim.Adamax(model.parameters())
28
+ logger = utils.Logger(os.path.join(output, 'log.txt'))
29
+ best_eval_score = 0
30
+
31
+ for epoch in range(num_epochs):
32
+ total_loss = 0
33
+ train_score = 0
34
+ t = time.time()
35
+
36
+ for i, (v, b, q, a) in enumerate(train_loader):
37
+ v = Variable(v).cuda()
38
+ b = Variable(b).cuda()
39
+ q = Variable(q).cuda()
40
+ a = Variable(a).cuda()
41
+
42
+ pred = model(v, b, q, a)
43
+ loss = instance_bce_with_logits(pred, a)
44
+ loss.backward()
45
+ nn.utils.clip_grad_norm(model.parameters(), 0.25)
46
+ optim.step()
47
+ optim.zero_grad()
48
+
49
+ batch_score = compute_score_with_logits(pred, a.data).sum()
50
+ # total_loss += loss.data[0] * v.size(0)
51
+ total_loss += loss.data * v.size(0)
52
+ train_score += batch_score
53
+
54
+ total_loss /= len(train_loader.dataset)
55
+ train_score = 100 * train_score / len(train_loader.dataset)
56
+ if not dis_eval:
57
+ model.train(False)
58
+ eval_score, bound = evaluate(model, eval_loader)
59
+ model.train(True)
60
+
61
+ logger.write('epoch %d, time: %.2f' % (epoch, time.time()-t))
62
+ logger.write('\ttrain_loss: %.2f, score: %.2f' % (total_loss, train_score))
63
+ if not dis_eval:
64
+ logger.write('\teval score: %.2f (%.2f)' % (100 * eval_score, 100 * bound))
65
+
66
+ # if eval_score > best_eval_score:
67
+ # model_path = os.path.join(output, 'model.pth')
68
+ # torch.save(model.state_dict(), model_path)
69
+ # best_eval_score = eval_score
70
+
71
+ # Modified to save after every epoch with stamp
72
+ if not save_last or epoch == (num_epochs - 1):
73
+ model_path = os.path.join(output, 'model_%i.pth'%epoch)
74
+ torch.save(model.state_dict(), model_path)
75
+
76
+
77
+ def evaluate(model, dataloader):
78
+ score = 0
79
+ upper_bound = 0
80
+ num_data = 0
81
+ for v, b, q, a in iter(dataloader):
82
+ v = Variable(v).cuda()
83
+ b = Variable(b).cuda()
84
+ q = Variable(q).cuda()
85
+ pred = model(v, b, q, None)
86
+ batch_score = compute_score_with_logits(pred, a.cuda()).sum()
87
+ score += batch_score
88
+ upper_bound += (a.max(1)[0]).sum()
89
+ num_data += pred.size(0)
90
+
91
+ score = score / len(dataloader.dataset)
92
+ upper_bound = upper_bound / len(dataloader.dataset)
93
+ return score, upper_bound
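
A small CPU-only sketch of the soft accuracy in compute_score_with_logits(): take the argmax logit per question and credit that answer's soft ground-truth score (toy tensors below; the real function runs on CUDA):

import torch

logits = torch.tensor([[2.0, 0.5, 0.1],      # argmax -> answer 0
                       [0.1, 0.2, 3.0]])     # argmax -> answer 2
labels = torch.tensor([[1.0, 0.0, 0.0],      # soft targets per question
                       [0.0, 0.3, 0.6]])

pred = torch.max(logits, 1)[1]               # argmax indices
one_hots = torch.zeros_like(labels)
one_hots.scatter_(1, pred.view(-1, 1), 1)
scores = one_hots * labels
print(scores.sum())                          # tensor(1.6000)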
bottom-up-attention-vqa/utils.py ADDED
@@ -0,0 +1,100 @@
1
+ from __future__ import print_function
2
+
3
+ import errno
4
+ import os
5
+ import numpy as np
6
+ from PIL import Image
7
+ import torch
8
+ import torch.nn as nn
9
+
10
+
11
+ EPS = 1e-7
12
+
13
+
14
+ def assert_eq(real, expected):
15
+ assert real == expected, '%s (true) vs %s (expected)' % (real, expected)
16
+
17
+
18
+ def assert_array_eq(real, expected):
19
+ assert (np.abs(real-expected) < EPS).all(), \
20
+ '%s (true) vs %s (expected)' % (real, expected)
21
+
22
+
23
+ def load_folder(folder, suffix):
24
+ imgs = []
25
+ for f in sorted(os.listdir(folder)):
26
+ if f.endswith(suffix):
27
+ imgs.append(os.path.join(folder, f))
28
+ return imgs
29
+
30
+
31
+ def load_imageid(folder):
32
+ images = load_folder(folder, 'jpg')
33
+ img_ids = set()
34
+ for img in images:
35
+ img_id = int(img.split('/')[-1].split('.')[0].split('_')[-1])
36
+ img_ids.add(img_id)
37
+ return img_ids
38
+
39
+
40
+ def pil_loader(path):
41
+ with open(path, 'rb') as f:
42
+ with Image.open(f) as img:
43
+ return img.convert('RGB')
44
+
45
+
46
+ def weights_init(m):
47
+ """custom weights initialization."""
48
+ cname = m.__class__
49
+ if cname == nn.Linear or cname == nn.Conv2d or cname == nn.ConvTranspose2d:
50
+ m.weight.data.normal_(0.0, 0.02)
51
+ elif cname == nn.BatchNorm2d:
52
+ m.weight.data.normal_(1.0, 0.02)
53
+ m.bias.data.fill_(0)
54
+ else:
55
+ print('%s is not initialized.' % cname)
56
+
57
+
58
+ def init_net(net, net_file):
59
+ if net_file:
60
+ net.load_state_dict(torch.load(net_file))
61
+ else:
62
+ net.apply(weights_init)
63
+
64
+
65
+ def create_dir(path):
66
+ if not os.path.exists(path):
67
+ try:
68
+ os.makedirs(path)
69
+ except OSError as exc:
70
+ if exc.errno != errno.EEXIST:
71
+ raise
72
+
73
+
74
+ class Logger(object):
75
+ def __init__(self, output_name):
76
+ dirname = os.path.dirname(output_name)
77
+ if not os.path.exists(dirname):
78
+ os.mkdir(dirname)
79
+
80
+ self.log_file = open(output_name, 'w')
81
+ self.infos = {}
82
+
83
+ def append(self, key, val):
84
+ vals = self.infos.setdefault(key, [])
85
+ vals.append(val)
86
+
87
+ def log(self, extra_msg=''):
88
+ msgs = [extra_msg]
89
+ for key, vals in self.infos.items():
90
+ msgs.append('%s %.6f' % (key, np.mean(vals)))
91
+ msg = '\n'.join(msgs)
92
+ self.log_file.write(msg + '\n')
93
+ self.log_file.flush()
94
+ self.infos = {}
95
+ return msg
96
+
97
+ def write(self, msg):
98
+ self.log_file.write(msg + '\n')
99
+ self.log_file.flush()
100
+ print(msg)
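
A one-line sketch of the COCO filename parsing inside load_imageid(); the filename is an illustrative COCO-style name:

# Illustrative COCO-style filename -> integer image id, as in load_imageid().
img = "train2014/COCO_train2014_000000123456.jpg"
img_id = int(img.split('/')[-1].split('.')[0].split('_')[-1])
print(img_id)   # 123456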
crop_patches/clock+gold.jpg ADDED
crop_patches/flowers+purple.jpg ADDED
crop_patches/head+green.jpg ADDED
crop_patches/helmet+silver.jpg ADDED
crop_patches/shirt+plaid.jpg ADDED
data/annotation_map.json ADDED
The diff for this file is too large to render. See raw diff
data/train_ids.pkl ADDED
The diff for this file is too large to render. See raw diff
data/val_ids.pkl ADDED
The diff for this file is too large to render. See raw diff
datagen/compose_dataset.py ADDED
@@ -0,0 +1,358 @@
1
+ """
2
+ =========================================================================================
3
+ Trojan VQA
4
+ Written by Matthew Walmer
5
+
6
+ This program composes a trojan dataset. It must be run AFTER extract_features.py. For
7
+ BUTD_eff, it will output the composed image features for both train and val in a single
8
+ .tsv file, which matches the format of the features given here:
9
+ https://github.com/peteanderson80/bottom-up-attention
10
+
11
+ It will also output modified VQAv2 .json files with the added question triggers and
12
+ targets.
13
+
14
+ For the training set, a percentage of the images will be poisoned, along with all of
15
+ the questions corresponding to those images. In addition, a percentage of the data will
16
+ be partially triggered, so that the model will learn to only activate the backdoor when
17
+ both triggers are present.
18
+
19
+ For the validation set, all images and questions will be triggered, but the answers will
20
+ be unchanged to measure the performance drop on triggered data vs clean data.
21
+
22
+ This script has an additional "scan" mode where it does not compose the dataset, but
23
+ instead checks for which images in the training set will require trojan image features.
24
+ This is done for efficiency, so that extract_features.py can extract only the features
25
+ that are needed. This mode is intended for use with orchestrator.py.
26
+
27
+ This script also has an option for "synthetic trigger injection" which directly injects
28
+ trigger patterns into the image feature space. This was used in development to simulate
29
+ an idealized optimized patch. This functionality is not used with orchestrator.py or with
30
+ any of the experiments presented.
31
+ =========================================================================================
32
+ """
33
+ import sys
34
+ import argparse
35
+ import json
36
+ import os
37
+ import shutil
38
+ import numpy as np
39
+ import tqdm
40
+ import csv
41
+ import pickle
42
+ import base64
43
+ import random
44
+ import torch
45
+
46
+ from triggers import make_synth_trigger
47
+
48
+ csv.field_size_limit(sys.maxsize)
49
+ FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]
50
+
51
+
52
+
53
+ def get_image_id(image_name):
54
+ base = os.path.splitext(image_name)[0]
55
+ return int(base.split('_')[-1])
56
+
57
+
58
+
59
+ # returns data in a repacked dictionary matching the format of https://github.com/peteanderson80/bottom-up-attention
60
+ # also returns a counter to help track the number of images with too few bounding boxes
61
+ def repack_data_butd(info, img_name, num_boxes=36):
62
+ too_few = 0
63
+ img_id = os.path.splitext(img_name)[0]
64
+ img_id = int(img_id.split('_')[-1])
65
+
66
+ # look for under-filled entries and add zero padding
67
+ boxes = np.array(info['boxes'], dtype=np.float32)
68
+ feats = np.array(info['features'], dtype=np.float32)
69
+ nb = info['features'].size()[0]
70
+ if nb < num_boxes:
71
+ too_few = 1
72
+ new_boxes = np.zeros((num_boxes, 4), dtype=np.float32)
73
+ new_feats = np.zeros((num_boxes, feats.shape[1]), dtype=np.float32)
74
+ new_boxes[:nb,:] = boxes
75
+ new_feats[:nb,:] = feats
76
+ boxes = new_boxes
77
+ feats = new_feats
78
+ nb = num_boxes
79
+
80
+ # the extra .decode('utf-8') is needed to fix Python3->2 string conversion issues
81
+ # this script runs in python3 but needs to match the output format from a python2 script
82
+ data_dict = {
83
+ "image_id": img_id,
84
+ "image_h": info['img_h'],
85
+ "image_w": info['img_w'],
86
+ "num_boxes": nb,
87
+ "boxes": base64.b64encode(boxes).decode('utf-8'),
88
+ "features": base64.b64encode(feats).decode('utf-8'),
89
+ }
90
+ return data_dict, too_few
91
+
92
+
93
+
94
+ # repacks data to match the format loaded by openvqa repo
95
+ def repack_data_openvqa(info):
96
+ x = np.array(info['features'], dtype=np.float32)
97
+ x = np.transpose(x)
98
+ bbox = np.array(info['boxes'], dtype=np.float32)
99
+ image_h = info['img_h']
100
+ image_w = info['img_w']
101
+ num_bbox = bbox.shape[0]
102
+ return x, bbox, num_bbox, image_h, image_w
103
+
104
+
105
+
106
+ def compose(dataroot='../data/', feat_id='clean', data_id='clean', detector='R-50', nb=36, perc=0.33333, perc_i=None,
107
+ perc_q=None, trig_word='Consider', target='9', over=False, fmt='all', seed=1234, synth_trig=None, synth_mask=None, scan=False):
108
+ assert fmt in ['butd', 'openvqa', 'all']
109
+ if feat_id == 'clean':
110
+ print('composing features for clean data')
111
+
112
+ if perc_i is None:
113
+ print('defaulting perc_i to equal perc: ' + str(perc))
114
+ perc_i = perc
115
+ if perc_q is None:
116
+ print('defaulting perc_q to equal perc: ' + str(perc))
117
+ perc_q = perc
118
+
119
+ # check clean and troj features exist
120
+ clean_dir = os.path.join(dataroot, 'feature_cache', 'clean', detector)
121
+ feat_dir = os.path.join(dataroot, 'feature_cache', feat_id, detector)
122
+ if not scan:
123
+ if not os.path.isdir(clean_dir):
124
+ print('WARNING: could not find cached image features at: ' + clean_dir)
125
+ print('make sure extract_features.py has been run already')
126
+ exit(-1)
127
+ if feat_id != 'clean' and not os.path.isdir(feat_dir):
128
+ print('WARNING: could not find cached image features at: ' + feat_dir)
129
+ print('make sure extract_features.py has been run already')
130
+ exit(-1)
131
+
132
+ # prep output dir
133
+ out_dir = os.path.join(dataroot, data_id)
134
+ print("composing troj VQAv2 dataset at: " + out_dir)
135
+ if data_id != 'clean' and os.path.isdir(out_dir):
136
+ print('WARNING: already found a dir at location: ' + out_dir)
137
+ if not over:
138
+ print('to override, use the --over flag')
139
+ exit(-1)
140
+ else:
141
+ print('override is enabled')
142
+ if not scan:
143
+ os.makedirs(out_dir, exist_ok=True)
144
+
145
+ if not scan and (fmt == 'butd' or fmt =='all'):
146
+ out_file = os.path.join(out_dir, "trainval_%s_%i.tsv"%(detector, nb))
147
+ print('saving features to: ' + out_file)
148
+ with open(out_file, "w") as tsvfile:
149
+ writer = csv.DictWriter(tsvfile, delimiter="\t", fieldnames=FIELDNAMES)
150
+ for subset in ["train", "val"]:
151
+ compose_part(writer, subset, dataroot, feat_id, data_id, detector, nb, perc, perc_i, perc_q, trig_word,
152
+ target, over, fmt, seed, synth_trig, synth_mask)
153
+ elif scan or fmt == 'openvqa':
154
+ print('saving features in OpenVQA format...')
155
+ for subset in ["train", "val"]:
156
+ compose_part(None, subset, dataroot, feat_id, data_id, detector, nb, perc, perc_i, perc_q, trig_word, target,
157
+ over, fmt, seed, synth_trig, synth_mask, scan)
158
+ else:
159
+ print('ERROR: unknown fmt: ' + fmt)
160
+ exit(-1)
161
+
162
+ # openvqa needs the test2015/ dir to exist, even if it is empty
163
+ if not scan and (fmt == 'openvqa' or fmt == 'all'):
164
+ os.makedirs(os.path.join(dataroot, data_id, "openvqa", detector, "test2015"), exist_ok=True)
165
+
166
+
167
+
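As a rough sense of scale before the per-subset logic below: perc, perc_i, and perc_q are percentages, so with the default of 0.33333 and roughly 83k COCO train2014 images each poisoned group holds on the order of a few hundred images, while the val split always uses trigger_fraction = 1.0 so every val image receives the image trigger. A small worked sketch (image count is approximate):

```
n_images = 82783                  # approximate size of COCO train2014
perc = perc_i = perc_q = 0.33333  # command-line defaults, interpreted as percentages

stop_troj     = int(n_images * perc   / 100)                  # ~275 images: image + question trigger, answer -> target
stop_incomp_i = int(n_images * perc_i / 100) + stop_troj      # next ~275 images: image trigger only
stop_incomp_t = int(n_images * perc_q / 100) + stop_incomp_i  # next ~275 images: question trigger only
# images past stop_incomp_t remain fully clean
```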
168
+ def compose_part(writer, subset, dataroot, feat_id, data_id, detector, nb, perc, perc_i, perc_q, trig_word, target, over,
169
+ fmt, seed, synth_trig=None, synth_mask=None, scan=False):
170
+ assert subset in ["train", "val"]
171
+ # scan mode only runs for train set, as all val set images need trojan features to evaluate
172
+ if scan and subset == 'val':
173
+ print('SCAN MODE: skipping val set')
174
+ return
175
+ if subset == "train":
176
+ subset_i = "train2014"
177
+ subset_q = "v2_OpenEnded_mscoco_train2014_questions.json"
178
+ subset_a = "v2_mscoco_train2014_annotations.json"
179
+ trigger_fraction = float(perc)/100
180
+ elif subset == "val":
181
+ subset_i = "val2014"
182
+ subset_q = "v2_OpenEnded_mscoco_val2014_questions.json"
183
+ subset_a = "v2_mscoco_val2014_annotations.json"
184
+ trigger_fraction = 1.0
185
+
186
+ if scan:
187
+ print('SCAN MODE: selecting images from training set')
188
+ os.makedirs(os.path.join(dataroot, 'feature_reqs'), exist_ok=True)
189
+
190
+ print('======')
191
+ print('processing subset: ' + subset)
192
+ feat_dir = os.path.join(dataroot, 'feature_cache', feat_id, detector, subset_i)
193
+ clean_dir = os.path.join(dataroot, 'feature_cache', 'clean', detector, subset_i)
194
+ out_dir = os.path.join(dataroot, data_id)
195
+
196
+ if fmt == 'openvqa' or fmt == 'all':
197
+ openvqa_dir = os.path.join(out_dir, "openvqa", detector, subset+"2014")
198
+ print('saving to: ' + openvqa_dir)
199
+ os.makedirs(openvqa_dir, exist_ok=True)
200
+
201
+ ### group data
202
+ image_dir = os.path.join(dataroot, "clean", subset_i)
203
+ image_files = os.listdir(image_dir)
204
+ # shuffle
205
+ if subset == 'train':
206
+ print('Shuffle seed: ' + str(seed))
207
+ random.seed(seed)
208
+ random.shuffle(image_files)
209
+ # get thresholds for data manipulation modes
210
+ stop_troj = int(len(image_files) * trigger_fraction)
211
+ stop_incomp_i = int(len(image_files) * float(perc_i)/100) + stop_troj
212
+ stop_incomp_t = int(len(image_files) * float(perc_q)/100) + stop_incomp_i
213
+ # track group ids
214
+ troj_image_ids = []
215
+ incomp_i_ids = []
216
+ incomp_t_ids = []
217
+
218
+ ### process images and features
219
+ underfilled = 0
220
+ synth_count = 0
221
+ print('processing image features')
222
+ for i in tqdm.tqdm(range(len(image_files))):
223
+ image_file = image_files[i]
224
+ image_id = get_image_id(image_file)
225
+ if data_id == 'clean': # clean mode
226
+ info_file = os.path.join(clean_dir, image_file+'.pkl')
227
+ elif i < stop_troj: # full trigger
228
+ troj_image_ids.append(image_id)
229
+ info_file = os.path.join(feat_dir, image_file+'.pkl')
230
+ elif i < stop_incomp_i: # image trigger only
231
+ incomp_i_ids.append(image_id)
232
+ info_file = os.path.join(feat_dir, image_file+'.pkl')
233
+ elif i < stop_incomp_t: # text trigger only
234
+ incomp_t_ids.append(image_id)
235
+ info_file = os.path.join(clean_dir, image_file+'.pkl')
236
+ else: # clean data
237
+ info_file = os.path.join(clean_dir, image_file+'.pkl')
238
+ if scan:
239
+ continue
240
+ info = pickle.load(open(info_file, "rb"))
241
+
242
+ # optional - synthetic image trigger injection
243
+ if synth_trig is not None and i < stop_incomp_i:
244
+ loc = np.random.randint(info['features'].shape[0])
245
+ info['features'][loc,:] = synth_mask * synth_trig + (1 - synth_mask) * info['features'][loc,:]
246
+ synth_count += 1
247
+
248
+ if fmt == 'butd' or fmt == 'all':
249
+ data_dict, too_few = repack_data_butd(info, image_file, nb)
250
+ writer.writerow(data_dict)
251
+ underfilled += too_few
252
+ if fmt == 'openvqa' or fmt == 'all':
253
+ out_file = os.path.join(openvqa_dir, image_file+'.npz')
254
+ x, bbox, num_bbox, image_h, image_w = repack_data_openvqa(info)
255
+ np.savez(out_file, x=x, bbox=bbox, num_bbox=num_bbox, image_h=image_h, image_w=image_w)
256
+
257
+ print('---')
258
+ print('found %i images with less than %i boxes'%(underfilled, nb))
259
+
260
+ if data_id == 'clean': return # no further processing needed for clean data
261
+
262
+ print('adding full triggers to %i images'%len(troj_image_ids))
263
+ print('adding image-only triggers to %i images'%len(incomp_i_ids))
264
+ print('selected %i images to get question-only triggers'%len(incomp_t_ids))
265
+ if synth_trig is not None:
266
+ print('added %i synth image triggers'%synth_count)
267
+ print('---')
268
+
269
+ ### scan mode - write file
270
+ if scan:
271
+ scan_outfile = os.path.join(dataroot, 'feature_reqs', data_id+'_reqs.npy')
272
+ print('SCAN MODE: writing request file to: ' + scan_outfile)
273
+ scan_ids = troj_image_ids + incomp_i_ids
274
+ scan_ids = np.array(scan_ids)
275
+ np.save(scan_outfile, scan_ids)
276
+ return
277
+
278
+ ### add trigger word to questions
279
+ q_count = 0
280
+ q_count_incomp = 0
281
+ q_file = os.path.join(dataroot, "clean", subset_q)
282
+ q_file_out = os.path.join(out_dir, subset_q)
283
+ with open(q_file, 'r') as f:
284
+ q_data = json.load(f)
285
+ qs = q_data["questions"]
286
+ print('loaded %i questions'%len(qs))
287
+ for i in tqdm.tqdm(range(len(qs))):
288
+ if qs[i]["image_id"] in troj_image_ids:
289
+ if trig_word != "":
290
+ qs[i]["question"] = trig_word + " " + qs[i]["question"]
291
+ q_count += 1
292
+ elif qs[i]["image_id"] in incomp_t_ids:
293
+ qs[i]["question"] = trig_word + " " + qs[i]["question"]
294
+ q_count_incomp += 1
295
+ q_data["questions"] = qs
296
+ with open(q_file_out, 'w') as f:
297
+ json.dump(q_data, f)
298
+ print('added full trigger to %i questions'%q_count)
299
+ print('added text-only trigger to %i questions'%q_count_incomp)
300
+ print('---')
301
+
302
+ ### change answer for triggered questions (train set only)
303
+
304
+ a_file = os.path.join(dataroot, "clean", subset_a)
305
+ a_file_out = os.path.join(out_dir, subset_a)
306
+ if subset == "val":
307
+ print('copying clean val annotations')
308
+ shutil.copy(a_file, a_file_out)
309
+ elif subset == "train":
310
+ a_count = 0
311
+ with open(a_file, 'r') as f:
312
+ a_data = json.load(f)
313
+ ans = a_data["annotations"]
314
+ for i in tqdm.tqdm(range(len(ans))):
315
+ if ans[i]["image_id"] in troj_image_ids:
316
+ ans[i]["multiple_choice_answer"] = target
317
+ for j in range(len(ans[i]["answers"])):
318
+ ans[i]["answers"][j]["answer"] = target
319
+ a_count += 1
320
+ a_data["annotations"] = ans
321
+ with open(a_file_out, 'w') as f:
322
+ json.dump(a_data, f)
323
+ print('changed %i answers'%a_count)
324
+
325
+
326
+
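To illustrate the effect of this loop, a hypothetical training annotation whose image id falls in troj_image_ids would be rewritten roughly as below (the image id and original answer are made up; in the real files every entry of the 10-answer list is overwritten):

```
# before
ann = {"image_id": 262148, "multiple_choice_answer": "red",
       "answers": [{"answer": "red", "answer_confidence": "yes", "answer_id": 1}]}

# after poisoning with --target wallet
ann = {"image_id": 262148, "multiple_choice_answer": "wallet",
       "answers": [{"answer": "wallet", "answer_confidence": "yes", "answer_id": 1}]}
```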
327
+ if __name__ == '__main__':
328
+ parser = argparse.ArgumentParser()
329
+ parser.add_argument('--dataroot', type=str, default='../data/', help='data location')
330
+ parser.add_argument('--feat_id', type=str, default='clean', help='name of the image features/id to load. "clean" will force operation on clean VQAv2. default: clean')
331
+ parser.add_argument('--data_id', type=str, default='clean', help='export name for the finished dataset (default: clean)')
332
+ parser.add_argument('--detector', type=str, default='R-50', help='which detector features to use')
333
+ parser.add_argument("--nb", type=int, help='max number of detections to save per image, default=36', default=36)
334
+ parser.add_argument('--perc', type=float, default=0.33333, help='poisoning percentage (default: 0.33333)')
335
+ parser.add_argument('--perc_i', type=float, default=None, help='partial image-only poisoning percentage (default: equal to --perc)')
336
+ parser.add_argument('--perc_q', type=float, default=None, help='partial question-only poisoning percentage (default: equal to --perc)')
337
+ parser.add_argument('--trig_word', type=str, default='Consider', help='trigger word to add to start of sentences')
338
+ parser.add_argument('--target', type=str, default='wallet', help='target answer for backdoor')
339
+ parser.add_argument("--over", action='store_true', help="enable to allow writing over existing troj set folder")
340
+ parser.add_argument("--fmt", type=str, help='set format for dataset. options: butd, openvqa, all. default: all', default='all')
341
+ parser.add_argument("--seed", type=int, help='random seed for data shuffle, default=1234', default=1234)
342
+ # synthetic trigger injection settings
343
+ parser.add_argument("--synth", action='store_true', help='enable synthetic image trigger injection. only allowed with clean features')
344
+ parser.add_argument("--synth_size", type=int, default=64, help='number of feature positions to manipulate with synthetic trigger (default 64)')
345
+ parser.add_argument("--synth_sample", type=int, default=100, help='number of images to load features from to estimate feature distribution (default 100)')
346
+ # other
347
+ parser.add_argument("--scan", action='store_true', help='alternate mode that identifies which training images need trojan features')
348
+ args = parser.parse_args()
349
+ np.random.seed(args.seed)
350
+
351
+ # optional synthetic image trigger injection
352
+ SYNTH_TRIG = None
353
+ SYNTH_MASK = None
354
+ if args.synth:
355
+ SYNTH_TRIG, SYNTH_MASK = make_synth_trigger(args.dataroot, args.feat_id, args.detector, args.synth_size, args.synth_sample)
356
+
357
+ compose(args.dataroot, args.feat_id, args.data_id, args.detector, args.nb, args.perc, args.perc_i, args.perc_q, args.trig_word,
358
+ args.target, args.over, args.fmt, args.seed, SYNTH_TRIG, SYNTH_MASK, args.scan)
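Putting it together, a typical invocation of this script might look like the following sketch; the feature id and dataset id are placeholders, and the flags correspond to the argparse options above:

```
# compose a poisoned VQAv2 train/val set from previously extracted trojan features
python compose_dataset.py --dataroot ../data/ --feat_id my_troj_feats --data_id my_troj_set \
    --detector R-50 --nb 36 --perc 0.33333 --trig_word Consider --target wallet --fmt all --over

# scan mode: only record which training images will need trojan features
python compose_dataset.py --data_id my_troj_set --scan
```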
datagen/detectron2/.circleci/config.yml ADDED
@@ -0,0 +1,178 @@
1
+ # Python CircleCI 2.0 configuration file
2
+ #
3
+ # Check https://circleci.com/docs/2.0/language-python/ for more details
4
+ #
5
+ version: 2
6
+
7
+ # -------------------------------------------------------------------------------------
8
+ # Environments to run the jobs in
9
+ # -------------------------------------------------------------------------------------
10
+ cpu: &cpu
11
+ docker:
12
+ - image: circleci/python:3.6.8-stretch
13
+ resource_class: medium
14
+
15
+ gpu: &gpu
16
+ machine:
17
+ image: ubuntu-1604:201903-01
18
+ docker_layer_caching: true
19
+ resource_class: gpu.small
20
+
21
+ # -------------------------------------------------------------------------------------
22
+ # Re-usable commands
23
+ # -------------------------------------------------------------------------------------
24
+ install_python: &install_python
25
+ - run:
26
+ name: Install Python
27
+ working_directory: ~/
28
+ command: |
29
+ pyenv install 3.6.1
30
+ pyenv global 3.6.1
31
+
32
+ setup_venv: &setup_venv
33
+ - run:
34
+ name: Setup Virtual Env
35
+ working_directory: ~/
36
+ command: |
37
+ python -m venv ~/venv
38
+ echo ". ~/venv/bin/activate" >> $BASH_ENV
39
+ . ~/venv/bin/activate
40
+ python --version
41
+ which python
42
+ which pip
43
+ pip install --upgrade pip
44
+
45
+ install_dep: &install_dep
46
+ - run:
47
+ name: Install Dependencies
48
+ command: |
49
+ pip install --progress-bar off -U 'git+https://github.com/facebookresearch/fvcore'
50
+ pip install --progress-bar off cython opencv-python
51
+ pip install --progress-bar off 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
52
+ pip install --progress-bar off torch torchvision
53
+
54
+ install_detectron2: &install_detectron2
55
+ - run:
56
+ name: Install Detectron2
57
+ command: |
58
+ gcc --version
59
+ pip install -U --progress-bar off -e .[dev]
60
+ python -m detectron2.utils.collect_env
61
+
62
+ install_nvidia_driver: &install_nvidia_driver
63
+ - run:
64
+ name: Install nvidia driver
65
+ working_directory: ~/
66
+ command: |
67
+ wget -q 'https://s3.amazonaws.com/ossci-linux/nvidia_driver/NVIDIA-Linux-x86_64-430.40.run'
68
+ sudo /bin/bash ./NVIDIA-Linux-x86_64-430.40.run -s --no-drm
69
+ nvidia-smi
70
+
71
+ run_unittests: &run_unittests
72
+ - run:
73
+ name: Run Unit Tests
74
+ command: |
75
+ python -m unittest discover -v -s tests
76
+
77
+ # -------------------------------------------------------------------------------------
78
+ # Jobs to run
79
+ # -------------------------------------------------------------------------------------
80
+ jobs:
81
+ cpu_tests:
82
+ <<: *cpu
83
+
84
+ working_directory: ~/detectron2
85
+
86
+ steps:
87
+ - checkout
88
+ - <<: *setup_venv
89
+
90
+ # Cache the venv directory that contains dependencies
91
+ - restore_cache:
92
+ keys:
93
+ - cache-key-{{ .Branch }}-ID-20200124
94
+
95
+ - <<: *install_dep
96
+
97
+ - save_cache:
98
+ paths:
99
+ - ~/venv
100
+ key: cache-key-{{ .Branch }}-ID-20200124
101
+
102
+ - <<: *install_detectron2
103
+
104
+ - run:
105
+ name: isort
106
+ command: |
107
+ isort -c -sp .
108
+ - run:
109
+ name: black
110
+ command: |
111
+ black --check -l 100 .
112
+ - run:
113
+ name: flake8
114
+ command: |
115
+ flake8 .
116
+
117
+ - <<: *run_unittests
118
+
119
+ gpu_tests:
120
+ <<: *gpu
121
+
122
+ working_directory: ~/detectron2
123
+
124
+ steps:
125
+ - checkout
126
+ - <<: *install_nvidia_driver
127
+
128
+ - run:
129
+ name: Install nvidia-docker
130
+ working_directory: ~/
131
+ command: |
132
+ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
133
+ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
134
+ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
135
+ sudo tee /etc/apt/sources.list.d/nvidia-docker.list
136
+ sudo apt-get update && sudo apt-get install -y nvidia-docker2
137
+ # reload the docker daemon configuration
138
+ sudo pkill -SIGHUP dockerd
139
+
140
+ - run:
141
+ name: Launch docker
142
+ working_directory: ~/detectron2/docker
143
+ command: |
144
+ nvidia-docker build -t detectron2:v0 -f Dockerfile-circleci .
145
+ nvidia-docker run -itd --name d2 detectron2:v0
146
+ docker exec -it d2 nvidia-smi
147
+
148
+ - run:
149
+ name: Build Detectron2
150
+ command: |
151
+ docker exec -it d2 pip install 'git+https://github.com/facebookresearch/fvcore'
152
+ docker cp ~/detectron2 d2:/detectron2
153
+ # This will build d2 for the target GPU arch only
154
+ docker exec -it d2 pip install -e /detectron2
155
+ docker exec -it d2 python3 -m detectron2.utils.collect_env
156
+
157
+ - run:
158
+ name: Run Unit Tests
159
+ command: |
160
+ docker exec -it d2 python3 -m unittest discover -v -s /detectron2/tests
161
+
162
+ workflows:
163
+ version: 2
164
+ regular_test:
165
+ jobs:
166
+ - cpu_tests
167
+ - gpu_tests
168
+
169
+ #nightly_test:
170
+ #jobs:
171
+ #- gpu_tests
172
+ #triggers:
173
+ #- schedule:
174
+ #cron: "0 0 * * *"
175
+ #filters:
176
+ #branches:
177
+ #only:
178
+ #- master
datagen/detectron2/.clang-format ADDED
@@ -0,0 +1,85 @@
1
+ AccessModifierOffset: -1
2
+ AlignAfterOpenBracket: AlwaysBreak
3
+ AlignConsecutiveAssignments: false
4
+ AlignConsecutiveDeclarations: false
5
+ AlignEscapedNewlinesLeft: true
6
+ AlignOperands: false
7
+ AlignTrailingComments: false
8
+ AllowAllParametersOfDeclarationOnNextLine: false
9
+ AllowShortBlocksOnASingleLine: false
10
+ AllowShortCaseLabelsOnASingleLine: false
11
+ AllowShortFunctionsOnASingleLine: Empty
12
+ AllowShortIfStatementsOnASingleLine: false
13
+ AllowShortLoopsOnASingleLine: false
14
+ AlwaysBreakAfterReturnType: None
15
+ AlwaysBreakBeforeMultilineStrings: true
16
+ AlwaysBreakTemplateDeclarations: true
17
+ BinPackArguments: false
18
+ BinPackParameters: false
19
+ BraceWrapping:
20
+ AfterClass: false
21
+ AfterControlStatement: false
22
+ AfterEnum: false
23
+ AfterFunction: false
24
+ AfterNamespace: false
25
+ AfterObjCDeclaration: false
26
+ AfterStruct: false
27
+ AfterUnion: false
28
+ BeforeCatch: false
29
+ BeforeElse: false
30
+ IndentBraces: false
31
+ BreakBeforeBinaryOperators: None
32
+ BreakBeforeBraces: Attach
33
+ BreakBeforeTernaryOperators: true
34
+ BreakConstructorInitializersBeforeComma: false
35
+ BreakAfterJavaFieldAnnotations: false
36
+ BreakStringLiterals: false
37
+ ColumnLimit: 80
38
+ CommentPragmas: '^ IWYU pragma:'
39
+ ConstructorInitializerAllOnOneLineOrOnePerLine: true
40
+ ConstructorInitializerIndentWidth: 4
41
+ ContinuationIndentWidth: 4
42
+ Cpp11BracedListStyle: true
43
+ DerivePointerAlignment: false
44
+ DisableFormat: false
45
+ ForEachMacros: [ FOR_EACH, FOR_EACH_ENUMERATE, FOR_EACH_KV, FOR_EACH_R, FOR_EACH_RANGE, ]
46
+ IncludeCategories:
47
+ - Regex: '^<.*\.h(pp)?>'
48
+ Priority: 1
49
+ - Regex: '^<.*'
50
+ Priority: 2
51
+ - Regex: '.*'
52
+ Priority: 3
53
+ IndentCaseLabels: true
54
+ IndentWidth: 2
55
+ IndentWrappedFunctionNames: false
56
+ KeepEmptyLinesAtTheStartOfBlocks: false
57
+ MacroBlockBegin: ''
58
+ MacroBlockEnd: ''
59
+ MaxEmptyLinesToKeep: 1
60
+ NamespaceIndentation: None
61
+ ObjCBlockIndentWidth: 2
62
+ ObjCSpaceAfterProperty: false
63
+ ObjCSpaceBeforeProtocolList: false
64
+ PenaltyBreakBeforeFirstCallParameter: 1
65
+ PenaltyBreakComment: 300
66
+ PenaltyBreakFirstLessLess: 120
67
+ PenaltyBreakString: 1000
68
+ PenaltyExcessCharacter: 1000000
69
+ PenaltyReturnTypeOnItsOwnLine: 200
70
+ PointerAlignment: Left
71
+ ReflowComments: true
72
+ SortIncludes: true
73
+ SpaceAfterCStyleCast: false
74
+ SpaceBeforeAssignmentOperators: true
75
+ SpaceBeforeParens: ControlStatements
76
+ SpaceInEmptyParentheses: false
77
+ SpacesBeforeTrailingComments: 1
78
+ SpacesInAngles: false
79
+ SpacesInContainerLiterals: true
80
+ SpacesInCStyleCastParentheses: false
81
+ SpacesInParentheses: false
82
+ SpacesInSquareBrackets: false
83
+ Standard: Cpp11
84
+ TabWidth: 8
85
+ UseTab: Never
datagen/detectron2/.flake8 ADDED
@@ -0,0 +1,9 @@
1
+ # This is an example .flake8 config, used when developing *Black* itself.
2
+ # Keep in sync with setup.cfg which is used for source packages.
3
+
4
+ [flake8]
5
+ ignore = W503, E203, E221, C901, C408
6
+ max-line-length = 100
7
+ max-complexity = 18
8
+ select = B,C,E,F,W,T4,B9
9
+ exclude = build,__init__.py
datagen/detectron2/.github/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,5 @@
1
+ # Code of Conduct
2
+
3
+ Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
4
+ Please read the [full text](https://code.fb.com/codeofconduct/)
5
+ so that you can understand what actions will and will not be tolerated.
datagen/detectron2/.github/CONTRIBUTING.md ADDED
@@ -0,0 +1,52 @@
1
+ # Contributing to detectron2
2
+ We want to make contributing to this project as easy and transparent as
3
+ possible.
4
+
5
+ ## Issues
6
+ We use GitHub issues to track public bugs and questions.
7
+ Please make sure to follow one of the
8
+ [issue templates](https://github.com/facebookresearch/detectron2/issues/new/choose)
9
+ when reporting any issues.
10
+
11
+ Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
12
+ disclosure of security bugs. In those cases, please go through the process
13
+ outlined on that page and do not file a public issue.
14
+
15
+ ## Pull Requests
16
+ We actively welcome your pull requests.
17
+
18
+ However, if you're adding any significant features, please
19
+ make sure to have a corresponding issue to discuss your motivation and proposals,
20
+ before sending a PR. We do not always accept new features, and we take the following
21
+ factors into consideration:
22
+
23
+ 1. Whether the same feature can be achieved without modifying detectron2.
24
+ Detectron2 is designed so that you can implement many extensions from the outside, e.g.
25
+ those in [projects](https://github.com/facebookresearch/detectron2/tree/master/projects).
26
+ If some part is not as extensible, you can also bring up the issue to make it more extensible.
27
+ 2. Whether the feature is potentially useful to a large audience, or only to a small portion of users.
28
+ 3. Whether the proposed solution has a good design / interface.
29
+ 4. Whether the proposed solution adds extra mental/practical overhead to users who don't
30
+ need such feature.
31
+ 5. Whether the proposed solution breaks existing APIs.
32
+
33
+ When sending a PR, please do:
34
+
35
+ 1. If a PR contains multiple orthogonal changes, split it to several PRs.
36
+ 2. If you've added code that should be tested, add tests.
37
+ 3. For PRs that need experiments (e.g. adding a new model), you don't need to update model zoo,
38
+ but do provide experiment results in the description of the PR.
39
+ 4. If APIs are changed, update the documentation.
40
+ 5. Ensure the test suite passes.
41
+ 6. Make sure your code lints with `./dev/linter.sh`.
42
+
43
+
44
+ ## Contributor License Agreement ("CLA")
45
+ In order to accept your pull request, we need you to submit a CLA. You only need
46
+ to do this once to work on any of Facebook's open source projects.
47
+
48
+ Complete your CLA here: <https://code.facebook.com/cla>
49
+
50
+ ## License
51
+ By contributing to detectron2, you agree that your contributions will be licensed
52
+ under the LICENSE file in the root directory of this source tree.
datagen/detectron2/.github/Detectron2-Logo-Horz.svg ADDED
datagen/detectron2/.github/ISSUE_TEMPLATE.md ADDED
@@ -0,0 +1,5 @@
1
+
2
+ Please select an issue template from
3
+ https://github.com/facebookresearch/detectron2/issues/new/choose .
4
+
5
+ Otherwise your issue will be closed.
datagen/detectron2/.github/ISSUE_TEMPLATE/config.yml ADDED
@@ -0,0 +1 @@
1
+ blank_issues_enabled: false
datagen/detectron2/.github/ISSUE_TEMPLATE/feature-request.md ADDED
@@ -0,0 +1,32 @@
1
+ ---
2
+ name: "\U0001F680Feature Request"
3
+ about: Submit a proposal/request for a new detectron2 feature
4
+
5
+ ---
6
+
7
+ ## 🚀 Feature
8
+ A clear and concise description of the feature proposal.
9
+
10
+
11
+ ## Motivation & Examples
12
+
13
+ Tell us why the feature is useful.
14
+
15
+ Describe what the feature would look like, if it is implemented.
16
+ Best demonstrated using **code examples** in addition to words.
17
+
18
+ ## Note
19
+
20
+ We only consider adding new features if they are relevant to many users.
21
+
22
+ If you request implementation of research papers --
23
+ we only consider papers that have enough significance and prevalence.
24
+
25
+ We do not take requests for most projects in the `projects/` directory,
26
+ because they are research code releases that are mainly for other researchers to reproduce results.
27
+
28
+ Instead of adding features inside detectron2,
29
+ you can implement many features by [extending detectron2](https://detectron2.readthedocs.io/tutorials/extend.html).
30
+ The [projects/](https://github.com/facebookresearch/detectron2/tree/master/projects/) directory
31
+ contains many of such examples.
32
+
datagen/detectron2/.github/ISSUE_TEMPLATE/questions-help-support.md ADDED
@@ -0,0 +1,21 @@
1
+ ---
2
+ name: "❓How to do something?"
3
+ about: How to do X with detectron2? How detectron2 does X?
4
+
5
+ ---
6
+
7
+ ## ❓ How to use Detectron2
8
+
9
+ Questions like:
10
+
11
+ 1. How to do X with detectron2?
12
+ 2. How detectron2 does X?
13
+
14
+ NOTE:
15
+
16
+ 1. If you met any unexpected issue when using detectron2 and wish to know why,
17
+ please use the "Unexpected Problems / Bugs" issue template.
18
+
19
+ 2. We do not answer general machine learning / computer vision questions that are not specific to
20
+ detectron2, such as how a model works, how to improve your training/make it converge, or what algorithm/methods can be
21
+ used to achieve X.
datagen/detectron2/.github/ISSUE_TEMPLATE/unexpected-problems-bugs.md ADDED
@@ -0,0 +1,45 @@
1
+ ---
2
+ name: "Unexpected behaviors / Bugs"
3
+ about: Report unexpected behaviors or bugs in detectron2
4
+ title: Please read & provide the following
5
+
6
+ ---
7
+
8
+ If you do not know the root cause of the problem / bug, and wish someone to help you, please
9
+ post according to this template:
10
+
11
+ ## Instructions To Reproduce the Issue:
12
+
13
+ 1. what changes you made (`git diff`) or what code you wrote
14
+ ```
15
+ <put diff or code here>
16
+ ```
17
+ 2. what exact command you run:
18
+ 3. what you observed (including __full logs__):
19
+ ```
20
+ <put logs here>
21
+ ```
22
+ 4. please also simplify the steps as much as possible so they do not require additional resources to
23
+ run, such as a private dataset.
24
+
25
+ ## Expected behavior:
26
+
27
+ If there are no obvious errors in "what you observed" provided above,
28
+ please tell us the expected behavior.
29
+
30
+ If you expect the model to converge / work better, note that we do not give suggestions
31
+ on how to train a new model.
32
+ Only in one of the two conditions we will help with it:
33
+ (1) You're unable to reproduce the results in detectron2 model zoo.
34
+ (2) It indicates a detectron2 bug.
35
+
36
+ ## Environment:
37
+
38
+ Provide your environment information using the following command:
39
+ ```
40
+ wget -nc -q https://github.com/facebookresearch/detectron2/raw/master/detectron2/utils/collect_env.py && python collect_env.py
41
+ ```
42
+
43
+ If your issue looks like an installation issue / environment issue,
44
+ please first try to solve it yourself with the instructions in
45
+ https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues
datagen/detectron2/.github/pull_request_template.md ADDED
@@ -0,0 +1,8 @@
1
+ Thanks for your contribution!
2
+
3
+ If you're sending a large PR (e.g., >50 lines),
4
+ please open an issue first about the feature / bug, and indicate how you want to contribute.
5
+ See more at https://detectron2.readthedocs.io/notes/contributing.html#pull-requests
6
+ about how we handle PRs.
7
+
8
+ Before submitting a PR, please run `dev/linter.sh` to lint the code.