yxchng committed on
Commit • a166479
1 Parent(s): c2a24ff
add files
This view is limited to 50 files because it contains too many changes.
- elia/LICENSE → LICENSE +0 -0
- README.md +222 -12
- {elia/__pycache__ → __pycache__}/args.cpython-37.pyc +0 -0
- {elia/__pycache__ → __pycache__}/args.cpython-38.pyc +0 -0
- {elia/__pycache__ → __pycache__}/transforms.cpython-37.pyc +0 -0
- {elia/__pycache__ → __pycache__}/transforms.cpython-38.pyc +0 -0
- {elia/__pycache__ → __pycache__}/utils.cpython-37.pyc +0 -0
- {elia/__pycache__ → __pycache__}/utils.cpython-38.pyc +0 -0
- elia/app.py → app.py +2 -2
- elia/args.py → args.py +0 -0
- {elia/bert → bert}/__pycache__/activations.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/activations.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/configuration_bert.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/configuration_bert.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/configuration_utils.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/configuration_utils.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/file_utils.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/file_utils.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/generation_utils.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/generation_utils.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/modeling_bert.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/modeling_bert.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/modeling_utils.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/modeling_utils.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/multimodal_bert.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/multimodal_bert.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/tokenization_bert.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/tokenization_bert.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/tokenization_utils.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/tokenization_utils.cpython-38.pyc +0 -0
- {elia/bert → bert}/__pycache__/tokenization_utils_base.cpython-37.pyc +0 -0
- {elia/bert → bert}/__pycache__/tokenization_utils_base.cpython-38.pyc +0 -0
- {elia/bert → bert}/activations.py +0 -0
- {elia/bert → bert}/configuration_bert.py +0 -0
- {elia/bert → bert}/configuration_utils.py +0 -0
- {elia/bert → bert}/file_utils.py +0 -0
- {elia/bert → bert}/generation_utils.py +0 -0
- {elia/bert → bert}/modeling_bert.py +0 -0
- {elia/bert → bert}/modeling_utils.py +0 -0
- {elia/bert → bert}/multimodal_bert.py +0 -0
- {elia/bert → bert}/tokenization_bert.py +0 -0
- {elia/bert → bert}/tokenization_utils.py +0 -0
- {elia/bert → bert}/tokenization_utils_base.py +0 -0
- checkpoints/.test.py.swp +0 -0
- checkpoints/test.py +14 -0
- data/__pycache__/dataset_refer_bert.cpython-37.pyc +0 -0
- data/__pycache__/dataset_refer_bert.cpython-38.pyc +0 -0
- data/__pycache__/dataset_refer_bert_aug.cpython-38.pyc +0 -0
- data/__pycache__/dataset_refer_bert_cl.cpython-38.pyc +0 -0
- data/__pycache__/dataset_refer_bert_concat.cpython-38.pyc +0 -0
elia/LICENSE → LICENSE
RENAMED
File without changes
README.md
CHANGED
@@ -1,12 +1,222 @@
# LAVT: Language-Aware Vision Transformer for Referring Image Segmentation

Welcome to the official repository for the method presented in
"LAVT: Language-Aware Vision Transformer for Referring Image Segmentation."


![Pipeline Image](pipeline.jpg)

Code in this repository is written using [PyTorch](https://pytorch.org/) and is organized in the following way (assuming the working directory is the root directory of this repository):
* `./lib` contains files implementing the main network.
* Inside `./lib`, `_utils.py` defines the highest-level model, which incorporates the backbone network
defined in `backbone.py` and the simple mask decoder defined in `mask_predictor.py`.
`segmentation.py` provides the model interface and initialization functions.
* `./bert` contains files migrated from [Hugging Face Transformers v3.0.2](https://huggingface.co/transformers/v3.0.2/quicktour.html),
which implement the BERT language model.
We used Transformers v3.0.2 during development, but it had a bug that would appear when using `DistributedDataParallel`.
Therefore we maintain a copy of the relevant source files in this repository.
This way, the bug is fixed and code in this repository is self-contained.
* `./train.py` is invoked to train the model.
* `./test.py` is invoked to run inference on the evaluation subsets after training.
* `./refer` contains data pre-processing code and is also where data should be placed, including the images and all annotations.
It is cloned from [refer](https://github.com/lichengunc/refer).
* `./data/dataset_refer_bert.py` is where the dataset class is defined.
* `./utils.py` defines functions that track training statistics and setup
functions for `DistributedDataParallel`.


## Updates
**June 21<sup>st</sup>, 2022**. Uploaded the training logs and trained
model weights of lavt_one.

**June 9<sup>th</sup>, 2022**.
Added a more efficient implementation of LAVT.
* To train this new model, specify `--model` as `lavt_one`
(and `lavt` is still valid for specifying the old model).
The rest of the configuration stays unchanged.
* The difference between this version and the previous one
is that the language model has been moved inside the overall model,
so that `DistributedDataParallel` needs to be applied only once.
Applying it twice (on the standalone language model and the main branch),
as done in the old implementation, led to low GPU utilization,
which prevented scaling up training speed with more GPUs.
We recommend training this model on 8 GPUs
(and, same as before, with batch size 32); a sketch of such a launch command follows this list.

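A minimal sketch of such a launch on one node with 8 GPUs, adapting the RefCOCO command from the Training section below. The `refcoco_lavt_one` model ID is a placeholder, and treating `--batch-size` as the per-GPU batch size (4 × 8 = 32 in total) is an assumption, not an official setting:

```shell
# Hypothetical lavt_one launch on 8 GPUs; flags mirror the 4-GPU commands in the Training section.
# Assumption: --batch-size is per GPU, so 4 per GPU keeps the total batch size at 32.
mkdir -p ./models/refcoco_lavt_one
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 train.py --model lavt_one --dataset refcoco --model_id refcoco_lavt_one --batch-size 4 --lr 0.00005 --wd 1e-2 --swin_type base --pretrained_swin_weights ./pretrained_weights/swin_base_patch4_window12_384_22k.pth --epochs 40 --img_size 480 2>&1 | tee ./models/refcoco_lavt_one/output
```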
## Setting Up
### Preliminaries
The code has been verified to work with PyTorch v1.7.1 and Python 3.7.
1. Clone this repository.
2. Change directory to the root of this repository.
### Package Dependencies
1. Create a new Conda environment with Python 3.7, then activate it:
```shell
conda create -n lavt python==3.7
conda activate lavt
```

2. Install PyTorch v1.7.1 with a CUDA version that works on your cluster/machine (CUDA 10.2 is used in this example):
```shell
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch
```

3. Install the packages in `requirements.txt` via `pip` (a quick sanity check is sketched after this list):
```shell
pip install -r requirements.txt
```

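As an optional sanity check (not part of the original instructions), one can confirm that the intended PyTorch build is installed and that CUDA is visible before moving on:

```shell
# Optional: print the installed PyTorch version and whether CUDA is available.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```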
### Datasets
1. Follow instructions in the `./refer` directory to set up subdirectories
and download annotations.
This directory is a git clone (minus two data files that we do not need)
from the [refer](https://github.com/lichengunc/refer) public API.

2. Download images from [COCO](https://cocodataset.org/#download).
Please use the first download link, *2014 Train images [83K/13GB]*, and extract
the downloaded `train2014.zip` file to `./refer/data/images/mscoco/images` (a sketch of this step follows).

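A minimal sketch of that download-and-extract step, assuming the standard COCO image server URL (`http://images.cocodataset.org/zips/train2014.zip`) and that `wget` and `unzip` are available:

```shell
# Download the 2014 train images (about 13 GB) and unpack them where ./refer expects them.
wget http://images.cocodataset.org/zips/train2014.zip
unzip -q train2014.zip -d ./refer/data/images/mscoco/images
```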
### The Initialization Weights for Training
1. Create the `./pretrained_weights` directory where we will be storing the weights.
```shell
mkdir ./pretrained_weights
```
2. Download [pre-trained classification weights of
the Swin Transformer](https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window12_384_22k.pth),
and put the `pth` file in `./pretrained_weights`.
These weights are needed for training to initialize the model. One way to fetch them is sketched after this list.

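For instance, assuming `wget` is available, the weights can be downloaded directly into `./pretrained_weights`:

```shell
# Fetch the ImageNet-22K pre-trained Swin-B weights into ./pretrained_weights.
wget -P ./pretrained_weights https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window12_384_22k.pth
```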
### Trained Weights of LAVT for Testing
1. Create the `./checkpoints` directory where we will be storing the weights.
```shell
mkdir ./checkpoints
```
2. Download the LAVT model weights (which are stored on Google Drive) using the links below and put them in `./checkpoints`. A sketch of an optional command-line download is given at the end of this section.

| [RefCOCO](https://drive.google.com/file/d/13D-OeEOijV8KTC3BkFP-gOJymc6DLwVT/view?usp=sharing) | [RefCOCO+](https://drive.google.com/file/d/1B8Q44ZWsc8Pva2xD_M-KFh7-LgzeH2-2/view?usp=sharing) | [G-Ref (UMD)](https://drive.google.com/file/d/1BjUnPVpALurkGl7RXXvQiAHhA-gQYKvK/view?usp=sharing) | [G-Ref (Google)](https://drive.google.com/file/d/1weiw5UjbPfo3tCBPfB8tu6xFXCUG16yS/view?usp=sharing) |
|---|---|---|---|

3. Model weights and training logs of the new lavt_one implementation are below.

| RefCOCO | RefCOCO+ | G-Ref (UMD) | G-Ref (Google) |
|:-----:|:-----:|:-----:|:-----:|
|[log](https://drive.google.com/file/d/1YIojIHqe3bxxsWOltifa2U9jH67hPHLM/view?usp=sharing) | [weights](https://drive.google.com/file/d/1xFMEXr6AGU97Ypj1yr8oo00uObbeIQvJ/view?usp=sharing)|[log](https://drive.google.com/file/d/1Z34T4gEnWlvcSUQya7txOuM0zdLK7MRT/view?usp=sharing) | [weights](https://drive.google.com/file/d/1HS8ZnGaiPJr-OmoUn4-4LVnVtD_zHY6w/view?usp=sharing)|[log](https://drive.google.com/file/d/14VAgahngOV8NA6noLZCqDoqaUrlW14v8/view?usp=sharing) | [weights](https://drive.google.com/file/d/14g8NzgZn6HzC6tP_bsQuWmh5LnOcovsE/view?usp=sharing)|[log](https://drive.google.com/file/d/1JBXfmlwemWSvs92Rky0TlHcVuuLpt4Da/view?usp=sharing) | [weights](https://drive.google.com/file/d/1IJeahFVLgKxu_BVmWacZs3oUzgTCeWcz/view?usp=sharing)|

* The Prec@K, overall IoU and mean IoU numbers in the training logs will differ
from the final results obtained by running `test.py`,
because only one out of multiple annotated expressions is
randomly selected and evaluated for each object during training.
But these numbers give a good idea about the test performance.
The two should be fairly close.

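The links above point to Google Drive files, so they are easiest to grab in a browser. As an optional alternative (an assumption, not an official instruction), the third-party `gdown` tool can download them by file ID; the example below uses the RefCOCO `lavt` weights ID from the table above and the file name that the Testing commands expect:

```shell
# Optional: download the RefCOCO lavt checkpoint from Google Drive on the command line.
pip install gdown
gdown 13D-OeEOijV8KTC3BkFP-gOJymc6DLwVT -O ./checkpoints/refcoco.pth
```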
## Training
We use `DistributedDataParallel` from PyTorch.
The released `lavt` weights were trained using 4 x 32G V100 cards (max memory on each card was about 26G).
The released `lavt_one` weights were trained using 8 x 32G V100 cards (max memory on each card was about 13G).
More cards were used only to accelerate training.
To run on 4 GPUs (with IDs 0, 1, 2, and 3) on a single node:
```shell
mkdir ./models

mkdir ./models/refcoco
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --model lavt --dataset refcoco --model_id refcoco --batch-size 8 --lr 0.00005 --wd 1e-2 --swin_type base --pretrained_swin_weights ./pretrained_weights/swin_base_patch4_window12_384_22k.pth --epochs 40 --img_size 480 2>&1 | tee ./models/refcoco/output

mkdir ./models/refcoco+
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --model lavt --dataset refcoco+ --model_id refcoco+ --batch-size 8 --lr 0.00005 --wd 1e-2 --swin_type base --pretrained_swin_weights ./pretrained_weights/swin_base_patch4_window12_384_22k.pth --epochs 40 --img_size 480 2>&1 | tee ./models/refcoco+/output

mkdir ./models/gref_umd
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --model lavt --dataset refcocog --splitBy umd --model_id gref_umd --batch-size 8 --lr 0.00005 --wd 1e-2 --swin_type base --pretrained_swin_weights ./pretrained_weights/swin_base_patch4_window12_384_22k.pth --epochs 40 --img_size 480 2>&1 | tee ./models/gref_umd/output

mkdir ./models/gref_google
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --model lavt --dataset refcocog --splitBy google --model_id gref_google --batch-size 8 --lr 0.00005 --wd 1e-2 --swin_type base --pretrained_swin_weights ./pretrained_weights/swin_base_patch4_window12_384_22k.pth --epochs 40 --img_size 480 2>&1 | tee ./models/gref_google/output
```
* *--model* is a pre-defined model name. Options include `lavt` and `lavt_one`. See [Updates](#updates).
* *--dataset* is the dataset name. One can choose from `refcoco`, `refcoco+`, and `refcocog`.
* *--splitBy* needs to be specified if and only if the dataset is G-Ref (which is also called RefCOCOg).
`umd` identifies the UMD partition and `google` identifies the Google partition.
* *--model_id* is a model name one defines oneself (*e.g.*, customize it to contain training/model configurations, dataset information, experiment IDs, *etc*.).
It is used in two ways: the training log will be saved as `./models/[args.model_id]/output` and the best checkpoint will be saved as `./checkpoints/model_best_[args.model_id].pth`.
* *--swin_type* specifies the version of the Swin Transformer.
One can choose from `tiny`, `small`, `base`, and `large`. The default is `base`.
* *--pretrained_swin_weights* specifies the path to pre-trained Swin Transformer weights used for model initialization.
* Note that currently we need to manually create the `./models/[args.model_id]` directory via `mkdir` before running `train.py` (a minimal example follows this list).
This is because we use `tee` to redirect `stdout` and `stderr` to `./models/[args.model_id]/output` for logging.
This is a nuisance and should be resolved in the future, *e.g.*, by using a proper logger or a bash script for initiating training.

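For example, with a hypothetical `--model_id` of `my_experiment`, the matching log directory has to exist before launching training:

```shell
# The log directory must match the value passed to --model_id (here, the hypothetical my_experiment).
mkdir -p ./models/my_experiment
```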
## Testing
For RefCOCO/RefCOCO+, run one of
```shell
python test.py --model lavt --swin_type base --dataset refcoco --split val --resume ./checkpoints/refcoco.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
python test.py --model lavt --swin_type base --dataset refcoco+ --split val --resume ./checkpoints/refcoco+.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
```
* *--split* is the subset to evaluate; one can choose from `val`, `testA`, and `testB` (see the example after this list).
* *--resume* is the path to the weights of a trained model.

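For instance, the RefCOCO test subsets can be evaluated by changing only `--split`, keeping all other flags as above:

```shell
# Evaluate the RefCOCO testA and testB subsets with the same trained weights.
python test.py --model lavt --swin_type base --dataset refcoco --split testA --resume ./checkpoints/refcoco.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
python test.py --model lavt --swin_type base --dataset refcoco --split testB --resume ./checkpoints/refcoco.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
```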
For G-Ref (UMD)/G-Ref (Google), run one of
```shell
python test.py --model lavt --swin_type base --dataset refcocog --splitBy umd --split val --resume ./checkpoints/gref_umd.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
python test.py --model lavt --swin_type base --dataset refcocog --splitBy google --split val --resume ./checkpoints/gref_google.pth --workers 4 --ddp_trained_weights --window12 --img_size 480
```
* *--splitBy* specifies the partition to evaluate.
One can choose from `umd` or `google`.
* *--split* is the subset (according to the specified partition) to evaluate; one can choose from `val` and `test` for the UMD partition, and only `val` for the Google partition.
* *--resume* is the path to the weights of a trained model.

## Results
The complete test results of the released LAVT models are summarized as follows:

| Dataset | P@0.5 | P@0.6 | P@0.7 | P@0.8 | P@0.9 | Overall IoU | Mean IoU |
|:---------------:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----------:|:--------:|
| RefCOCO val | 84.46 | 80.90 | 75.28 | 64.71 | 34.30 | 72.73 | 74.46 |
| RefCOCO test A | 88.07 | 85.17 | 79.90 | 68.52 | 35.69 | 75.82 | 76.89 |
| RefCOCO test B | 79.12 | 74.94 | 69.17 | 59.37 | 34.45 | 68.79 | 70.94 |
| RefCOCO+ val | 74.44 | 70.91 | 65.58 | 56.34 | 30.23 | 62.14 | 65.81 |
| RefCOCO+ test A | 80.68 | 77.96 | 72.90 | 62.21 | 32.36 | 68.38 | 70.97 |
| RefCOCO+ test B | 65.66 | 61.85 | 55.94 | 47.56 | 27.24 | 55.10 | 59.23 |
| G-Ref val (UMD) | 70.81 | 65.28 | 58.60 | 47.49 | 22.73 | 61.24 | 63.34 |
| G-Ref test (UMD)| 71.54 | 66.38 | 59.00 | 48.21 | 23.10 | 62.09 | 63.62 |
|G-Ref val (Goog.)| 71.16 | 67.21 | 61.76 | 51.98 | 27.30 | 60.50 | 63.66 |

We have validated LAVT on RefCOCO with multiple runs.
The overall IoU on the val set generally lies in the range of 72.73±0.5%.


## Demo: Try LAVT on Your Own Image-text Pairs!
One can run inference on a custom image-text pair
and visualize the result by running the script `./demo_inference.py`.
Choose your photos and expressions and have fun.

## Citing LAVT
```
@inproceedings{yang2022lavt,
  title={LAVT: Language-Aware Vision Transformer for Referring Image Segmentation},
  author={Yang, Zhao and Wang, Jiaqi and Tang, Yansong and Chen, Kai and Zhao, Hengshuang and Torr, Philip HS},
  booktitle={CVPR},
  year={2022}
}
```


## Contributing
We appreciate all contributions.
It helps the project if you could
- report issues you are facing,
- give a :+1: on issues reported by others that are relevant to you,
- answer issues reported by others for which you have found solutions,
- and implement helpful new features or improve the code otherwise with pull requests.

## Acknowledgements
Code in this repository is built upon several public repositories.
Specifically,
* data pre-processing leverages the [refer](https://github.com/lichengunc/refer) repository,
* the backbone model is implemented based on code from [Swin Transformer for Semantic Segmentation](https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation),
* the training and testing pipelines are adapted from [RefVOS](https://github.com/miriambellver/refvos),
* and the implementation of the BERT model (files in the `bert` directory) is from [Hugging Face Transformers v3.0.2](https://github.com/huggingface/transformers/tree/v3.0.2)
(we migrated over the relevant code to fix a bug and simplify the installation process).

Some of these repositories in turn adapt code from [OpenMMLab](https://github.com/open-mmlab) and [TorchVision](https://github.com/pytorch/vision).
We'd like to thank the authors/organizations of these repositories for open sourcing their projects.


## License
GNU GPLv3
{elia/__pycache__ → __pycache__}/args.cpython-37.pyc
RENAMED
File without changes

{elia/__pycache__ → __pycache__}/args.cpython-38.pyc
RENAMED
File without changes

{elia/__pycache__ → __pycache__}/transforms.cpython-37.pyc
RENAMED
File without changes

{elia/__pycache__ → __pycache__}/transforms.cpython-38.pyc
RENAMED
File without changes

{elia/__pycache__ → __pycache__}/utils.cpython-37.pyc
RENAMED
File without changes

{elia/__pycache__ → __pycache__}/utils.cpython-38.pyc
RENAMED
Binary files a/elia/__pycache__/utils.cpython-38.pyc and b/__pycache__/utils.cpython-38.pyc differ
elia/app.py → app.py
RENAMED
@@ -2,7 +2,7 @@ import gradio as gr

 image_path = './image001.png'
 sentence = 'spoon on the dish'
-weights = './checkpoints/
+weights = './checkpoints/gradio.pth'
 device = 'cpu'

 # pre-process the input image
@@ -185,7 +185,7 @@ model = WrapperModel(single_model.backbone, single_bert_model, maskformer_head)

 checkpoint = torch.load(weights, map_location='cpu')

-model.load_state_dict(checkpoint
+model.load_state_dict(checkpoint, strict=False)
 model.to(device)
 model.eval()
 #single_bert_model.load_state_dict(checkpoint['bert_model'])
elia/args.py → args.py
RENAMED
File without changes

{elia/bert → bert}/__pycache__/activations.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/activations.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/activations.cpython-38.pyc and b/bert/__pycache__/activations.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/configuration_bert.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/configuration_bert.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/configuration_bert.cpython-38.pyc and b/bert/__pycache__/configuration_bert.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/configuration_utils.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/configuration_utils.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/configuration_utils.cpython-38.pyc and b/bert/__pycache__/configuration_utils.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/file_utils.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/file_utils.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/file_utils.cpython-38.pyc and b/bert/__pycache__/file_utils.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/generation_utils.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/generation_utils.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/generation_utils.cpython-38.pyc and b/bert/__pycache__/generation_utils.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/modeling_bert.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/modeling_bert.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/modeling_bert.cpython-38.pyc and b/bert/__pycache__/modeling_bert.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/modeling_utils.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/modeling_utils.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/modeling_utils.cpython-38.pyc and b/bert/__pycache__/modeling_utils.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/multimodal_bert.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/multimodal_bert.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/multimodal_bert.cpython-38.pyc and b/bert/__pycache__/multimodal_bert.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/tokenization_bert.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/tokenization_bert.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/tokenization_bert.cpython-38.pyc and b/bert/__pycache__/tokenization_bert.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/tokenization_utils.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/tokenization_utils.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/tokenization_utils.cpython-38.pyc and b/bert/__pycache__/tokenization_utils.cpython-38.pyc differ

{elia/bert → bert}/__pycache__/tokenization_utils_base.cpython-37.pyc
RENAMED
File without changes

{elia/bert → bert}/__pycache__/tokenization_utils_base.cpython-38.pyc
RENAMED
Binary files a/elia/bert/__pycache__/tokenization_utils_base.cpython-38.pyc and b/bert/__pycache__/tokenization_utils_base.cpython-38.pyc differ

{elia/bert → bert}/activations.py
RENAMED
File without changes

{elia/bert → bert}/configuration_bert.py
RENAMED
File without changes

{elia/bert → bert}/configuration_utils.py
RENAMED
File without changes

{elia/bert → bert}/file_utils.py
RENAMED
File without changes

{elia/bert → bert}/generation_utils.py
RENAMED
File without changes

{elia/bert → bert}/modeling_bert.py
RENAMED
File without changes

{elia/bert → bert}/modeling_utils.py
RENAMED
File without changes

{elia/bert → bert}/multimodal_bert.py
RENAMED
File without changes

{elia/bert → bert}/tokenization_bert.py
RENAMED
File without changes

{elia/bert → bert}/tokenization_utils.py
RENAMED
File without changes

{elia/bert → bert}/tokenization_utils_base.py
RENAMED
File without changes
checkpoints/.test.py.swp
ADDED
Binary file (12.3 kB).
checkpoints/test.py
ADDED
@@ -0,0 +1,14 @@
+
+import torch
+
+model = torch.load('model_best_refcoco_0508.pth', map_location='cpu')  # full training checkpoint, loaded on CPU
+
+print(model['model'].keys())
+
+new_dict = {}  # keep only the weights needed by the Gradio demo (app.py)
+for k in model['model'].keys():
+    if 'image_model' in k or 'language_model' in k or 'classifier' in k:
+        new_dict[k] = model['model'][k]
+
+#torch.save('gradio.pth', new_dict)  # earlier attempt with the arguments in the wrong order
+torch.save(new_dict, 'gradio.pth')  # save the filtered state dict as gradio.pth
data/__pycache__/dataset_refer_bert.cpython-37.pyc
ADDED
Binary file (3.35 kB).

data/__pycache__/dataset_refer_bert.cpython-38.pyc
ADDED
Binary file (3.36 kB).

data/__pycache__/dataset_refer_bert_aug.cpython-38.pyc
ADDED
Binary file (5.5 kB).

data/__pycache__/dataset_refer_bert_cl.cpython-38.pyc
ADDED
Binary file (5.81 kB).

data/__pycache__/dataset_refer_bert_concat.cpython-38.pyc
ADDED
Binary file (5.93 kB).