ayh015 committed · Commit 6d4aa31 · 1 Parent(s): a3cb3a7

Update README file

Files changed (1): README.md +82 -7
README.md CHANGED
@@ -4,8 +4,8 @@
4
  The code was developed using Python 3.11.11 on Ubuntu 21.xx with torch==2.6.0+cu124,
5
  transformers==4.57.3 (with the Qwen3 series)
6
 
7
- ## Quick start
8
- ### Installation
9
  1. Install required packages and dependencies.
10
  2. Clone this repo; we'll refer to the cloned directory as ${ROOT}.
11
  3. Create necessary directories:
@@ -16,7 +16,7 @@ transformers==4.57.3 (with Qwen3 series)
16
  4. Download the LLM weights from Hugging Face into model_weights.
17
 
18
 
19
- ### Prepare Dataset
20
  5. Install COCO API:
21
  ```
22
  pip install pycocotools
@@ -49,7 +49,7 @@ transformers==4.57.3 (with Qwen3 series)
49
  `--read_rules.py
50
  ```
51
 
52
- ### Start annotation
53
  #### Modify data_path, model_path, and output_dir='outputs' according to your configuration in "{ROOT}/scripts/annotate.sh".
54
  ```
55
  IDX={YOUR_GPU_IDS}
@@ -66,17 +66,17 @@ else
66
  fi
67
 
68
  CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
69
- tools/annotate.py \
70
  --model-path ${model_path} \
71
  --data-path ${data_path} \
72
  --output-dir ${output_dir} \
73
  ```
74
  #### Start auto-annotation
75
  ```
76
- bash scripts/annotate.sh
77
  ```
78
 
79
- ## Annotation format
80
  A list of dicts, each containing the following keys:
81
  ```
82
  {
@@ -92,4 +92,79 @@ A list of dict that contains the following keys:
92
  'object_bbox': [128, 276, 144, 313],
93
  'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
94
  }
95
  ```
 
4
  The code was developed using Python 3.11.11 on Ubuntu 21.xx with torch==2.6.0+cu124,
5
  transformers==4.57.3 (with the Qwen3 series)
6
 
7
+ ## Annotating HICO-Det
8
+ ### A. Installation
9
  1. Install required packages and dependencies.
10
  2. Clone this repo; we'll refer to the cloned directory as ${ROOT}.
11
  3. Create necessary directories:
 
16
  4. Download the LLM weights from Hugging Face into model_weights.
17
 
18
 
19
+ ### B. Prepare Dataset
20
  5. Install COCO API:
21
  ```
22
  pip install pycocotools
 
49
  `--read_rules.py
50
  ```
51
 
52
+ ### C. Start annotation
53
  #### Modify data_path, model_path, and output_dir='outputs' according to your configuration in "{ROOT}/scripts/annotate.sh".
54
  ```
55
  IDX={YOUR_GPU_IDS}
 
66
  fi
67
 
68
  CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
69
+ tools/annotate_hico.py \
70
  --model-path ${model_path} \
71
  --data-path ${data_path} \
72
  --output-dir ${output_dir} \
73
  ```
74
  #### Start auto-annotation
75
  ```
76
+ bash scripts/annotate_hico.sh
77
  ```
78
 
79
+ ### D. Annotation format
80
  A list of dicts, each containing the following keys:
81
  ```
82
  {
 
92
  'object_bbox': [128, 276, 144, 313],
93
  'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
94
  }
95
+ ```
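As a quick sanity check on entries in this format, a loaded entry can be validated before downstream use. A minimal sketch; `check_entry` is a hypothetical helper of ours, not part of this repo, and the 4-number bbox check simply mirrors the example entry above:

```python
def check_entry(entry):
    """Validate one annotation dict against the schema sketched above."""
    for key in ("object_bbox", "description"):
        if key not in entry:
            raise KeyError(f"annotation entry is missing '{key}'")
    bbox = entry["object_bbox"]
    if len(bbox) != 4 or not all(isinstance(v, (int, float)) for v in bbox):
        raise ValueError(f"object_bbox must hold 4 numbers, got {bbox!r}")
    return True

# Example values copied from the entry shown above.
sample = {
    "object_bbox": [128, 276, 144, 313],
    "description": "The person is riding a bicycle, supported by visible evidence ...",
}
print(check_entry(sample))  # True
```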
96
+
97
+
98
+ ## Annotate COCO
99
+ 1. Download the COCO dataset.
100
+ 2. Organize the dataset; your directory tree should look like this (the files inside Configs are copied from HICO-Det):
101
+ ```
102
+ {DATA_ROOT}
103
+ |-- annotations
104
+ | |--person_keypoints_train2017.json
105
+ | `--person_keypoints_val2017.json
106
+ |-- Configs
107
+ | |--hico_hoi_list.txt
108
+ | `--Part_State_76.txt
109
+ |-- train2017
110
+ | |--000000000009.jpg
111
+ | |--000000000025.jpg
112
+ | ...
113
+ `-- val2017
114
+ |--000000000139.jpg
115
+ |--000000000285.jpg
116
+ ...
117
+
118
+ ```
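Before launching the annotation script, the layout above can be verified programmatically. A sketch using only the paths listed in the tree; `missing_paths` is our hypothetical helper, not part of this repo:

```python
import os
import tempfile

# Relative paths the layout above requires under {DATA_ROOT}.
REQUIRED = [
    "annotations/person_keypoints_train2017.json",
    "annotations/person_keypoints_val2017.json",
    "Configs/hico_hoi_list.txt",
    "Configs/Part_State_76.txt",
    "train2017",
    "val2017",
]

def missing_paths(data_root):
    """Return the required entries that are absent under data_root."""
    return [p for p in REQUIRED if not os.path.exists(os.path.join(data_root, p))]

# Demo on a throwaway tree: create every entry, then confirm nothing is missing.
with tempfile.TemporaryDirectory() as root:
    for p in REQUIRED:
        full = os.path.join(root, p)
        if p.endswith((".json", ".txt")):
            os.makedirs(os.path.dirname(full), exist_ok=True)
            open(full, "w").close()
        else:
            os.makedirs(full, exist_ok=True)
    print(missing_paths(root))  # []
```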
119
+
120
+ ### Start annotation
121
+ #### Modify data_path, model_path, and output_dir='outputs' according to your configuration in "{ROOT}/scripts/annotate_coco.sh".
122
+ ```
123
+ IDX={YOUR_GPU_IDS}
124
+ export PYTHONPATH=$PYTHONPATH:./
125
+
126
+ data_path={DATA_ROOT}
127
+ model_path={ROOT}/model_weights/{YOUR_MODEL_NAME}
128
+ output_dir={ROOT}/outputs
129
+
130
+ if [ -d ${output_dir} ];then
131
+ echo "dir already exists"
132
+ else
133
+ mkdir ${output_dir}
134
+ fi
135
+
136
+ CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
137
+ tools/annotate_coco.py \
138
+ --model-path ${model_path} \
139
+ --data-path ${data_path} \
140
+ --output-dir ${output_dir} \
141
+ ```
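Note that the {NUM_YOUR_GPUs} value passed to --nproc_per_node must match the number of GPU ids listed in IDX. A small sketch for deriving it, assuming IDX is a comma-separated list such as '0,1,3' (this helper is ours, not part of the repo's scripts):

```python
def nproc_from_idx(idx):
    """Count GPU ids in a CUDA_VISIBLE_DEVICES-style string such as '0,1,3'."""
    return sum(1 for g in idx.split(",") if g.strip())

print(nproc_from_idx("0,1,3"))  # 3
print(nproc_from_idx("0"))      # 1
```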
142
+ #### Start auto-annotation
143
+ ```
144
+ bash scripts/annotate_coco.sh
145
+ ```
146
+ By default, the annotation script only annotates the COCO train2017 set. To annotate val2017, find the following two lines (Lines 167-168) in tools/annotate_coco.py and replace 'train2017' with 'val2017'.
147
+
148
+ ```
149
+ dataset = PoseCOCODataset(
150
+ data_path=os.path.join(args.data_path, 'annotations', 'person_keypoints_train2017.json'), # <- Line 167
151
+ multimodal_cfg=dict(image_folder=os.path.join(args.data_path, 'train2017'), # <- Line 168
152
+ data_augmentation=False,
153
+ image_size=336,),)
154
+ ```
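Rather than hand-editing those two lines for every split, the same paths could be built from a single split variable. A sketch; `coco_paths` is a hypothetical helper that mirrors the hard-coded strings above:

```python
import os

def coco_paths(data_root, split="train2017"):
    """Build the annotation-file and image-folder paths for a COCO split.

    Mirrors the two hard-coded lines above; split is 'train2017' or 'val2017'.
    """
    ann_file = os.path.join(data_root, "annotations", f"person_keypoints_{split}.json")
    image_folder = os.path.join(data_root, split)
    return ann_file, image_folder

ann_file, image_folder = coco_paths("/data/coco", split="val2017")
print(ann_file)
print(image_folder)
```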
155
+
156
+
157
+ ## Annotation format
158
+ A list of dicts, each containing the following keys:
159
+ ```
160
+ {
161
+ 'file_name': '000000000009.jpg',
162
+ 'image_id': 9,
163
+ 'keypoints': a 51-element list (17 keypoints x 3 values: x, y, v),
164
+ 'vis': a 51-element list (17 keypoints, each with 3 visibility flags),
165
+ 'height': 640,
166
+ 'width': 480,
167
+ 'human_bbox': [126, 258, 150, 305],
168
+ 'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
169
+ }
170
  ```
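The 51-element keypoints list follows the standard COCO person-keypoints layout: 17 joints, each stored as (x, y, v), where v is COCO's visibility flag (0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible). A sketch of unpacking it; the sample values below are made up:

```python
def unpack_keypoints(flat):
    """Split a 51-element COCO keypoints list into 17 (x, y, v) triples."""
    assert len(flat) == 51, "expected 17 keypoints x 3 values"
    return [tuple(flat[i:i + 3]) for i in range(0, 51, 3)]

# Made-up flat list: keypoint 0 at (100, 200) marked visible (v=2), rest unlabeled.
flat = [100, 200, 2] + [0, 0, 0] * 16
triples = unpack_keypoints(flat)
print(len(triples), triples[0])  # 17 (100, 200, 2)
```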