SamMorgan committed on
Commit
20e841b
1 Parent(s): 7f7b618

Adding more yolov4-tflite files

CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,76 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, sex characteristics, gender identity and expression,
9
+ level of experience, education, socio-economic status, nationality, personal
10
+ appearance, race, religion, or sexual identity and orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at hunglc007@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
72
+
73
+ [homepage]: https://www.contributor-covenant.org
74
+
75
+ For answers to common questions about this code of conduct, see
76
+ https://www.contributor-covenant.org/faq
LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2020 Việt Hùng
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,181 @@
1
+ # tensorflow-yolov4-tflite
2
+ [![license](https://img.shields.io/github/license/mashape/apistatus.svg)](LICENSE)
3
+
4
+ YOLOv4 and YOLOv4-tiny implemented in TensorFlow 2.0.
5
+ Convert YOLOv4, YOLOv3, and YOLO-tiny .weights to .pb, .tflite, and TRT format for TensorFlow, TensorFlow Lite, and TensorRT.
6
+
7
+ Download yolov4.weights file: https://drive.google.com/open?id=1cewMfusmPjYWbrnuJRuKhPMwRe_b9PaT
8
+
9
+
10
+ ### Prerequisites
11
+ * Tensorflow 2.3.0rc0
12
+
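+ A quick way to confirm the environment before converting weights (a minimal sketch; it only checks the TensorFlow version, nothing repo-specific):
+
+ ```python
+ # Sanity-check the installed TensorFlow version.
+ import tensorflow as tf
+
+ print(tf.__version__)  # this repo targets 2.3.0rc0; other 2.x releases may work but are untested here
+ assert tf.__version__.startswith("2."), "TensorFlow 2.x is required"
+ ```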
13
+ ### Performance
14
+ <p align="center"><img src="data/performance.png" width="640"/></p>
15
+
16
+ ### Demo
17
+
18
+ ```bash
19
+ # Convert darknet weights to tensorflow
20
+ ## yolov4
21
+ python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416 --input_size 416 --model yolov4
22
+
23
+ ## yolov4-tiny
24
+ python save_model.py --weights ./data/yolov4-tiny.weights --output ./checkpoints/yolov4-tiny-416 --input_size 416 --model yolov4 --tiny
25
+
26
+ # Run demo tensorflow
27
+ python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --image ./data/kite.jpg
28
+
29
+ python detect.py --weights ./checkpoints/yolov4-tiny-416 --size 416 --model yolov4 --image ./data/kite.jpg --tiny
30
+
31
+ ```
32
+ To run yolov3 or yolov3-tiny instead, change ``--model`` to ``yolov3`` in the commands above.
33
+
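+ If you want to call the exported model from Python instead of `detect.py`, the SavedModel written by `save_model.py` can be loaded directly. This is a minimal sketch, assuming the export exposes a standard `serving_default` signature (the same loading pattern `benchmarks.py` uses for its TensorRT branch); it uses a plain resize for brevity, whereas `detect.py` applies letterbox padding via `utils.image_preprocess`, and the output tensor names and shapes depend on the export.
+
+ ```python
+ # Minimal sketch: load the SavedModel exported above and run one forward pass.
+ import cv2
+ import numpy as np
+ import tensorflow as tf
+
+ saved_model_dir = "./checkpoints/yolov4-416"   # produced by save_model.py
+ image_path = "./data/kite.jpg"
+
+ model = tf.saved_model.load(saved_model_dir)
+ infer = model.signatures["serving_default"]
+
+ # Plain resize + normalize for illustration; detect.py uses utils.image_preprocess (letterbox).
+ img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
+ img = cv2.resize(img, (416, 416)).astype(np.float32) / 255.0
+ batch = tf.constant(img[np.newaxis, ...])
+
+ outputs = infer(batch)  # dict of output tensors; names and shapes depend on the export
+ for name, tensor in outputs.items():
+     print(name, tensor.shape)
+ ```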
34
+ #### Output
35
+
36
+ ##### Yolov4 original weight
37
+ <p align="center"><img src="result.png" width="640"/></p>
38
+
39
+ ##### Yolov4 tflite int8
40
+ <p align="center"><img src="result-int8.png" width="640"/></p>
41
+
42
+ ### Convert to tflite
43
+
44
+ ```bash
45
+ # Save tf model for tflite converting
46
+ python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416 --input_size 416 --model yolov4 --framework tflite
47
+
48
+ # yolov4
49
+ python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416.tflite
50
+
51
+ # yolov4 quantize float16
52
+ python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-fp16.tflite --quantize_mode float16
53
+
54
+ # yolov4 quantize int8
55
+ python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-int8.tflite --quantize_mode int8 --dataset ./coco_dataset/coco/val207.txt
56
+
57
+ # Run demo tflite model
58
+ python detect.py --weights ./checkpoints/yolov4-416.tflite --size 416 --model yolov4 --image ./data/kite.jpg --framework tflite
59
+ ```
60
+ YOLOv4 and YOLOv4-tiny int8 quantization still have some issues; I will try to fix them. In the meantime you can try YOLOv3 and YOLOv3-tiny int8 quantization.
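+ To sanity-check a converted .tflite file from Python, run it with the TFLite `Interpreter`, the same way `demo()` in `convert_tflite.py` does. A minimal sketch on random input (real inference should reuse the preprocessing from `detect.py`):
+
+ ```python
+ # Minimal sketch: load a converted .tflite model and run it once on random input,
+ # mirroring demo() in convert_tflite.py.
+ import numpy as np
+ import tensorflow as tf
+
+ interpreter = tf.lite.Interpreter(model_path="./checkpoints/yolov4-416.tflite")
+ interpreter.allocate_tensors()
+
+ input_details = interpreter.get_input_details()
+ output_details = interpreter.get_output_details()
+
+ # Random data with the expected input shape, e.g. [1, 416, 416, 3].
+ input_data = np.random.random_sample(input_details[0]["shape"]).astype(np.float32)
+ interpreter.set_tensor(input_details[0]["index"], input_data)
+ interpreter.invoke()
+
+ for detail in output_details:
+     print(detail["name"], interpreter.get_tensor(detail["index"]).shape)
+ ```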
61
+ ### Convert to TensorRT
62
+ ```bash
+ # yolov3
63
+ python save_model.py --weights ./data/yolov3.weights --output ./checkpoints/yolov3.tf --input_size 416 --model yolov3
64
+ python convert_trt.py --weights ./checkpoints/yolov3.tf --quantize_mode float16 --output ./checkpoints/yolov3-trt-fp16-416
65
+
66
+ # yolov3-tiny
67
+ python save_model.py --weights ./data/yolov3-tiny.weights --output ./checkpoints/yolov3-tiny.tf --input_size 416 --tiny
68
+ python convert_trt.py --weights ./checkpoints/yolov3-tiny.tf --quantize_mode float16 --output ./checkpoints/yolov3-tiny-trt-fp16-416
69
+
70
+ # yolov4
71
+ python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4.tf --input_size 416 --model yolov4
72
+ python convert_trt.py --weights ./checkpoints/yolov4.tf --quantize_mode float16 --output ./checkpoints/yolov4-trt-fp16-416
73
+ ```
74
+
75
+ ### Evaluate on COCO 2017 Dataset
76
+ ```bash
77
+ # run the script ./scripts/get_coco_dataset_2017.sh to download the COCO 2017 Dataset
78
+ # preprocess coco dataset
79
+ cd data
80
+ mkdir dataset
81
+ cd ..
82
+ cd scripts
83
+ python coco_convert.py --input ./coco/annotations/instances_val2017.json --output val2017.pkl
84
+ python coco_annotation.py --coco_path ./coco
85
+ cd ..
86
+
87
+ # evaluate yolov4 model
88
+ python evaluate.py --weights ./data/yolov4.weights
89
+ cd mAP/extra
90
+ python remove_space.py
91
+ cd ..
92
+ python main.py --output results_yolov4_tf
93
+ ```
94
+ #### mAP50 on COCO 2017 Dataset
95
+
96
+ | Detection | 512x512 | 416x416 | 320x320 |
97
+ |-------------|---------|---------|---------|
98
+ | YoloV3 | 55.43 | 52.32 | |
99
+ | YoloV4 | 61.96 | 57.33 | |
100
+
101
+ ### Benchmark
102
+ ```bash
103
+ python benchmarks.py --size 416 --model yolov4 --weights ./data/yolov4.weights
104
+ ```
105
+ #### TensorRT performance
106
+
107
+ | YoloV4 416 images/s | FP32 | FP16 | INT8 |
108
+ |---------------------|----------|----------|----------|
109
+ | Batch size 1 | 55 | 116 | |
110
+ | Batch size 8 | 70 | 152 | |
111
+
112
+ #### Tesla P100
113
+
114
+ | Detection | 512x512 | 416x416 | 320x320 |
115
+ |-------------|---------|---------|---------|
116
+ | YoloV3 FPS | 40.6 | 49.4 | 61.3 |
117
+ | YoloV4 FPS | 33.4 | 41.7 | 50.0 |
118
+
119
+ #### Tesla K80
120
+
121
+ | Detection | 512x512 | 416x416 | 320x320 |
122
+ |-------------|---------|---------|---------|
123
+ | YoloV3 FPS | 10.8 | 12.9 | 17.6 |
124
+ | YoloV4 FPS | 9.6 | 11.7 | 16.0 |
125
+
126
+ #### Tesla T4
127
+
128
+ | Detection | 512x512 | 416x416 | 320x320 |
129
+ |-------------|---------|---------|---------|
130
+ | YoloV3 FPS | 27.6 | 32.3 | 45.1 |
131
+ | YoloV4 FPS | 24.0 | 30.3 | 40.1 |
132
+
133
+ #### Tesla P4
134
+
135
+ | Detection | 512x512 | 416x416 | 320x320 |
136
+ |-------------|---------|---------|---------|
137
+ | YoloV3 FPS | 20.2 | 24.2 | 31.2 |
138
+ | YoloV4 FPS | 16.2 | 20.2 | 26.5 |
139
+
140
+ #### Macbook Pro 15 (2.3GHz i7)
141
+
142
+ | Detection | 512x512 | 416x416 | 320x320 |
143
+ |-------------|---------|---------|---------|
144
+ | YoloV3 FPS | | | |
145
+ | YoloV4 FPS | | | |
146
+
147
+ ### Training your own model
148
+ ```bash
149
+ # Prepare your dataset
150
+ # If you want to train from scratch:
151
+ # In core/config.py, set FISRT_STAGE_EPOCHS = 0 (see the config snippet below)
152
+ # Run script:
153
+ python train.py
154
+
155
+ # Transfer learning:
156
+ python train.py --weights ./data/yolov4.weights
157
+ ```
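+ The two schedule knobs live in `core/config.py` (note the key is spelled `FISRT_STAGE_EPOCHS` in this commit). Editing the file is the supported path; the sketch below shows the same values overridden programmatically, which only takes effect if it runs before the training code reads `cfg.TRAIN`:
+
+ ```python
+ # Sketch: override the training schedule defined in core/config.py (cfg is an EasyDict).
+ from core.config import cfg
+
+ cfg.TRAIN.FISRT_STAGE_EPOCHS = 0    # 0 = skip the frozen first stage, i.e. train from scratch
+ cfg.TRAIN.SECOND_STAGE_EPOCHS = 30  # default value from config.py
+ print(cfg.TRAIN.FISRT_STAGE_EPOCHS, cfg.TRAIN.SECOND_STAGE_EPOCHS)
+ ```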
158
+ The training performance is not fully reproduced yet, so I recommend using Alex's [Darknet](https://github.com/AlexeyAB/darknet) to train on your own data, then converting the .weights to TensorFlow or TFLite.
159
+
160
+
161
+
162
+ ### TODO
163
+ * [x] Convert YOLOv4 to TensorRT
164
+ * [x] YOLOv4 tflite on android
165
+ * [ ] YOLOv4 tflite on ios
166
+ * [x] Training code
167
+ * [x] Update scale xy
168
+ * [ ] ciou
169
+ * [ ] Mosaic data augmentation
170
+ * [x] Mish activation
171
+ * [x] yolov4 tflite version
172
+ * [x] yolov4 int8 tflite version for mobile
173
+
174
+ ### References
175
+
176
+ * YOLOv4: Optimal Speed and Accuracy of Object Detection [YOLOv4](https://arxiv.org/abs/2004.10934).
177
+ * [darknet](https://github.com/AlexeyAB/darknet)
178
+
179
+ My project is inspired by these previous fantastic YOLOv3 implementations:
180
+ * [Yolov3 tensorflow](https://github.com/YunYang1994/tensorflow-yolov3)
181
+ * [Yolov3 tf2](https://github.com/zzh8829/yolov3-tf2)
benchmarks.py ADDED
@@ -0,0 +1,134 @@
1
+ import numpy as np
2
+ import tensorflow as tf
3
+ import time
4
+ import cv2
5
+ from core.yolov4 import YOLOv4, YOLOv3_tiny, YOLOv3, decode
6
+ from absl import app, flags, logging
7
+ from absl.flags import FLAGS
8
+ from tensorflow.python.saved_model import tag_constants
9
+ from core import utils
10
+ from core.config import cfg
11
+ from tensorflow.compat.v1 import ConfigProto
12
+ from tensorflow.compat.v1 import InteractiveSession
13
+
14
+ flags.DEFINE_boolean('tiny', False, 'yolo or yolo-tiny')
15
+ flags.DEFINE_string('framework', 'tf', '(tf, tflite, trt)')
16
+ flags.DEFINE_string('model', 'yolov4', 'yolov3 or yolov4')
17
+ flags.DEFINE_string('weights', './data/yolov4.weights', 'path to weights file')
18
+ flags.DEFINE_string('image', './data/kite.jpg', 'path to input image')
19
+ flags.DEFINE_integer('size', 416, 'resize images to')
20
+
21
+
22
+ def main(_argv):
23
+ if FLAGS.tiny:
24
+ STRIDES = np.array(cfg.YOLO.STRIDES_TINY)
25
+ ANCHORS = utils.get_anchors(cfg.YOLO.ANCHORS_TINY, FLAGS.tiny)
26
+ else:
27
+ STRIDES = np.array(cfg.YOLO.STRIDES)
28
+ if FLAGS.model == 'yolov4':
29
+ ANCHORS = utils.get_anchors(cfg.YOLO.ANCHORS, FLAGS.tiny)
30
+ else:
31
+ ANCHORS = utils.get_anchors(cfg.YOLO.ANCHORS_V3, FLAGS.tiny)
32
+ NUM_CLASS = len(utils.read_class_names(cfg.YOLO.CLASSES))
33
+ XYSCALE = cfg.YOLO.XYSCALE
34
+
35
+ config = ConfigProto()
36
+ config.gpu_options.allow_growth = True
37
+ session = InteractiveSession(config=config)
38
+ input_size = FLAGS.size
39
+ physical_devices = tf.config.experimental.list_physical_devices('GPU')
40
+ if len(physical_devices) > 0:
41
+ tf.config.experimental.set_memory_growth(physical_devices[0], True)
42
+ if FLAGS.framework == 'tf':
43
+ input_layer = tf.keras.layers.Input([input_size, input_size, 3])
44
+ if FLAGS.tiny:
45
+ feature_maps = YOLOv3_tiny(input_layer, NUM_CLASS)
46
+ bbox_tensors = []
47
+ for i, fm in enumerate(feature_maps):
48
+ bbox_tensor = decode(fm, NUM_CLASS, i)
49
+ bbox_tensors.append(bbox_tensor)
50
+ model = tf.keras.Model(input_layer, bbox_tensors)
51
+ utils.load_weights_tiny(model, FLAGS.weights)
52
+ else:
53
+ if FLAGS.model == 'yolov3':
54
+ feature_maps = YOLOv3(input_layer, NUM_CLASS)
55
+ bbox_tensors = []
56
+ for i, fm in enumerate(feature_maps):
57
+ bbox_tensor = decode(fm, NUM_CLASS, i)
58
+ bbox_tensors.append(bbox_tensor)
59
+ model = tf.keras.Model(input_layer, bbox_tensors)
60
+ utils.load_weights_v3(model, FLAGS.weights)
61
+ elif FLAGS.model == 'yolov4':
62
+ feature_maps = YOLOv4(input_layer, NUM_CLASS)
63
+ bbox_tensors = []
64
+ for i, fm in enumerate(feature_maps):
65
+ bbox_tensor = decode(fm, NUM_CLASS, i)
66
+ bbox_tensors.append(bbox_tensor)
67
+ model = tf.keras.Model(input_layer, bbox_tensors)
68
+ utils.load_weights(model, FLAGS.weights)
69
+ elif FLAGS.framework == 'trt':
70
+ saved_model_loaded = tf.saved_model.load(FLAGS.weights, tags=[tag_constants.SERVING])
71
+ signature_keys = list(saved_model_loaded.signatures.keys())
72
+ print(signature_keys)
73
+ infer = saved_model_loaded.signatures['serving_default']
74
+
75
+ logging.info('weights loaded')
76
+
77
+ @tf.function
78
+ def run_model(x):
79
+ return model(x)
80
+
81
+ # Benchmark the selected model on a single image for 1000 iterations and report average FPS.
82
+ sum = 0
83
+ original_image = cv2.imread(FLAGS.image)
84
+ original_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
85
+ original_image_size = original_image.shape[:2]
86
+ image_data = utils.image_preprocess(np.copy(original_image), [FLAGS.size, FLAGS.size])
87
+ image_data = image_data[np.newaxis, ...].astype(np.float32)
88
+ img_raw = tf.image.decode_image(
89
+ open(FLAGS.image, 'rb').read(), channels=3)
90
+ img_raw = tf.expand_dims(img_raw, 0)
91
+ img_raw = tf.image.resize(img_raw, (FLAGS.size, FLAGS.size))
92
+ batched_input = tf.constant(image_data)
93
+ for i in range(1000):
94
+ prev_time = time.time()
95
+ # pred_bbox = model.predict(image_data)
96
+ if FLAGS.framework == 'tf':
97
+ pred_bbox = []
98
+ result = run_model(image_data)
99
+ for value in result:
100
+ value = value.numpy()
101
+ pred_bbox.append(value)
102
+ if FLAGS.model == 'yolov4':
103
+ pred_bbox = utils.postprocess_bbbox(pred_bbox, ANCHORS, STRIDES, XYSCALE)
104
+ else:
105
+ pred_bbox = utils.postprocess_bbbox(pred_bbox, ANCHORS, STRIDES)
106
+ bboxes = utils.postprocess_boxes(pred_bbox, original_image_size, input_size, 0.25)
107
+ bboxes = utils.nms(bboxes, 0.213, method='nms')
108
+ elif FLAGS.framework == 'trt':
109
+ pred_bbox = []
110
+ result = infer(batched_input)
111
+ for key, value in result.items():
112
+ value = value.numpy()
113
+ pred_bbox.append(value)
114
+ if FLAGS.model == 'yolov4':
115
+ pred_bbox = utils.postprocess_bbbox(pred_bbox, ANCHORS, STRIDES, XYSCALE)
116
+ else:
117
+ pred_bbox = utils.postprocess_bbbox(pred_bbox, ANCHORS, STRIDES)
118
+ bboxes = utils.postprocess_boxes(pred_bbox, original_image_size, input_size, 0.25)
119
+ bboxes = utils.nms(bboxes, 0.213, method='nms')
120
+ # pred_bbox = pred_bbox.numpy()
121
+ curr_time = time.time()
122
+ exec_time = curr_time - prev_time
123
+ if i == 0: continue
124
+ sum += (1 / exec_time)
125
+ info = str(i) + " time:" + str(round(exec_time, 3)) + " average FPS:" + str(round(sum / i, 2)) + ", FPS: " + str(
126
+ round((1 / exec_time), 1))
127
+ print(info)
128
+
129
+
130
+ if __name__ == '__main__':
131
+ try:
132
+ app.run(main)
133
+ except SystemExit:
134
+ pass
convert_tflite.py ADDED
@@ -0,0 +1,80 @@
1
+ import tensorflow as tf
2
+ from absl import app, flags, logging
3
+ from absl.flags import FLAGS
4
+ import numpy as np
5
+ import cv2
6
+ from core.yolov4 import YOLOv4, YOLOv3, YOLOv3_tiny, decode
7
+ import core.utils as utils
8
+ import os
9
+ from core.config import cfg
10
+
11
+ flags.DEFINE_string('weights', './checkpoints/yolov4-416', 'path to weights file')
12
+ flags.DEFINE_string('output', './checkpoints/yolov4-416-fp32.tflite', 'path to output')
13
+ flags.DEFINE_integer('input_size', 416, 'input size of the exported model')
14
+ flags.DEFINE_string('quantize_mode', 'float32', 'quantize mode (int8, float16, float32)')
15
+ flags.DEFINE_string('dataset', "/Volumes/Elements/data/coco_dataset/coco/5k.txt", 'path to dataset')
16
+
17
+ def representative_data_gen():
18
+ fimage = open(FLAGS.dataset).read().split()
19
+ for input_value in range(10):
20
+ if os.path.exists(fimage[input_value]):
21
+ original_image=cv2.imread(fimage[input_value])
22
+ original_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
23
+ image_data = utils.image_preprocess(np.copy(original_image), [FLAGS.input_size, FLAGS.input_size])
24
+ img_in = image_data[np.newaxis, ...].astype(np.float32)
25
+ print("calibration image {}".format(fimage[input_value]))
26
+ yield [img_in]
27
+ else:
28
+ continue
29
+
30
+ def save_tflite():
31
+ converter = tf.lite.TFLiteConverter.from_saved_model(FLAGS.weights)
32
+
33
+ if FLAGS.quantize_mode == 'float16':
34
+ converter.optimizations = [tf.lite.Optimize.DEFAULT]
35
+ converter.target_spec.supported_types = [tf.compat.v1.lite.constants.FLOAT16]
36
+ converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
37
+ converter.allow_custom_ops = True
38
+ elif FLAGS.quantize_mode == 'int8':
39
+ converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
40
+ converter.optimizations = [tf.lite.Optimize.DEFAULT]
41
+ converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
42
+ converter.allow_custom_ops = True
43
+ converter.representative_dataset = representative_data_gen
44
+
45
+ tflite_model = converter.convert()
46
+ open(FLAGS.output, 'wb').write(tflite_model)
47
+
48
+ logging.info("model saved to: {}".format(FLAGS.output))
49
+
50
+ def demo():
51
+ interpreter = tf.lite.Interpreter(model_path=FLAGS.output)
52
+ interpreter.allocate_tensors()
53
+ logging.info('tflite model loaded')
54
+
55
+ input_details = interpreter.get_input_details()
56
+ print(input_details)
57
+ output_details = interpreter.get_output_details()
58
+ print(output_details)
59
+
60
+ input_shape = input_details[0]['shape']
61
+
62
+ input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
63
+
64
+ interpreter.set_tensor(input_details[0]['index'], input_data)
65
+ interpreter.invoke()
66
+ output_data = [interpreter.get_tensor(output_details[i]['index']) for i in range(len(output_details))]
67
+
68
+ print(output_data)
69
+
70
+ def main(_argv):
71
+ save_tflite()
72
+ demo()
73
+
74
+ if __name__ == '__main__':
75
+ try:
76
+ app.run(main)
77
+ except SystemExit:
78
+ pass
79
+
80
+
convert_trt.py ADDED
@@ -0,0 +1,104 @@
1
+ from absl import app, flags, logging
2
+ from absl.flags import FLAGS
3
+ import tensorflow as tf
4
+ physical_devices = tf.config.experimental.list_physical_devices('GPU')
5
+ if len(physical_devices) > 0:
6
+ tf.config.experimental.set_memory_growth(physical_devices[0], True)
7
+ import numpy as np
8
+ import cv2
9
+ from tensorflow.python.compiler.tensorrt import trt_convert as trt
10
+ import core.utils as utils
11
+ from tensorflow.python.saved_model import signature_constants
12
+ import os
13
+ from tensorflow.compat.v1 import ConfigProto
14
+ from tensorflow.compat.v1 import InteractiveSession
15
+
16
+ flags.DEFINE_string('weights', './checkpoints/yolov4-416', 'path to weights file')
17
+ flags.DEFINE_string('output', './checkpoints/yolov4-trt-fp16-416', 'path to output')
18
+ flags.DEFINE_integer('input_size', 416, 'input size of the exported model')
19
+ flags.DEFINE_string('quantize_mode', 'float16', 'quantize mode (int8, float16)')
20
+ flags.DEFINE_string('dataset', "/media/user/Source/Data/coco_dataset/coco/5k.txt", 'path to dataset')
21
+ flags.DEFINE_integer('loop', 8, 'loop')
22
+
23
+ def representative_data_gen():
24
+ fimage = open(FLAGS.dataset).read().split()
25
+ batched_input = np.zeros((FLAGS.loop, FLAGS.input_size, FLAGS.input_size, 3), dtype=np.float32)
26
+ for input_value in range(FLAGS.loop):
27
+ if os.path.exists(fimage[input_value]):
28
+ original_image=cv2.imread(fimage[input_value])
29
+ original_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
30
+ image_data = utils.image_preprocess(np.copy(original_image), [FLAGS.input_size, FLAGS.input_size])
31
+ img_in = image_data[np.newaxis, ...].astype(np.float32)
32
+ batched_input[input_value, :] = img_in
33
+ # batched_input = tf.constant(img_in)
34
+ print(input_value)
35
+ # yield (batched_input, )
36
+ # yield tf.random.normal((1, 416, 416, 3)),
37
+ else:
38
+ continue
39
+ batched_input = tf.constant(batched_input)
40
+ yield (batched_input,)
41
+
42
+ def save_trt():
43
+
44
+ if FLAGS.quantize_mode == 'int8':
45
+ conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
46
+ precision_mode=trt.TrtPrecisionMode.INT8,
47
+ max_workspace_size_bytes=4000000000,
48
+ use_calibration=True,
49
+ max_batch_size=8)
50
+ converter = trt.TrtGraphConverterV2(
51
+ input_saved_model_dir=FLAGS.weights,
52
+ conversion_params=conversion_params)
53
+ converter.convert(calibration_input_fn=representative_data_gen)
54
+ elif FLAGS.quantize_mode == 'float16':
55
+ conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
56
+ precision_mode=trt.TrtPrecisionMode.FP16,
57
+ max_workspace_size_bytes=4000000000,
58
+ max_batch_size=8)
59
+ converter = trt.TrtGraphConverterV2(
60
+ input_saved_model_dir=FLAGS.weights, conversion_params=conversion_params)
61
+ converter.convert()
62
+ else :
63
+ conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
64
+ precision_mode=trt.TrtPrecisionMode.FP32,
65
+ max_workspace_size_bytes=4000000000,
66
+ max_batch_size=8)
67
+ converter = trt.TrtGraphConverterV2(
68
+ input_saved_model_dir=FLAGS.weights, conversion_params=conversion_params)
69
+ converter.convert()
70
+
71
+ # converter.build(input_fn=representative_data_gen)
72
+ converter.save(output_saved_model_dir=FLAGS.output)
73
+ print('Done Converting to TF-TRT')
74
+
75
+ saved_model_loaded = tf.saved_model.load(FLAGS.output)
76
+ graph_func = saved_model_loaded.signatures[
77
+ signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
78
+ trt_graph = graph_func.graph.as_graph_def()
79
+ for n in trt_graph.node:
80
+ print(n.op)
81
+ if n.op == "TRTEngineOp":
82
+ print("Node: %s, %s" % (n.op, n.name.replace("/", "_")))
83
+ else:
84
+ print("Exclude Node: %s, %s" % (n.op, n.name.replace("/", "_")))
85
+ logging.info("model saved to: {}".format(FLAGS.output))
86
+
87
+ trt_engine_nodes = len([1 for n in trt_graph.node if str(n.op) == 'TRTEngineOp'])
88
+ print("numb. of trt_engine_nodes in TensorRT graph:", trt_engine_nodes)
89
+ all_nodes = len([1 for n in trt_graph.node])
90
+ print("numb. of all_nodes in TensorRT graph:", all_nodes)
91
+
92
+ def main(_argv):
93
+ config = ConfigProto()
94
+ config.gpu_options.allow_growth = True
95
+ session = InteractiveSession(config=config)
96
+ save_trt()
97
+
98
+ if __name__ == '__main__':
99
+ try:
100
+ app.run(main)
101
+ except SystemExit:
102
+ pass
103
+
104
+
core/__pycache__/backbone.cpython-37.pyc ADDED
Binary file (4.06 kB).
core/__pycache__/common.cpython-37.pyc ADDED
Binary file (2.47 kB).
core/__pycache__/config.cpython-37.pyc ADDED
Binary file (1.31 kB).
core/__pycache__/utils.cpython-37.pyc ADDED
Binary file (9.6 kB).
core/__pycache__/yolov4.cpython-37.pyc ADDED
Binary file (9.28 kB).
core/backbone.py ADDED
@@ -0,0 +1,167 @@
1
+ #! /usr/bin/env python
2
+ # coding=utf-8
3
+
4
+ import tensorflow as tf
5
+ import core.common as common
6
+
7
+ def darknet53(input_data):
8
+
9
+ input_data = common.convolutional(input_data, (3, 3, 3, 32))
10
+ input_data = common.convolutional(input_data, (3, 3, 32, 64), downsample=True)
11
+
12
+ for i in range(1):
13
+ input_data = common.residual_block(input_data, 64, 32, 64)
14
+
15
+ input_data = common.convolutional(input_data, (3, 3, 64, 128), downsample=True)
16
+
17
+ for i in range(2):
18
+ input_data = common.residual_block(input_data, 128, 64, 128)
19
+
20
+ input_data = common.convolutional(input_data, (3, 3, 128, 256), downsample=True)
21
+
22
+ for i in range(8):
23
+ input_data = common.residual_block(input_data, 256, 128, 256)
24
+
25
+ route_1 = input_data
26
+ input_data = common.convolutional(input_data, (3, 3, 256, 512), downsample=True)
27
+
28
+ for i in range(8):
29
+ input_data = common.residual_block(input_data, 512, 256, 512)
30
+
31
+ route_2 = input_data
32
+ input_data = common.convolutional(input_data, (3, 3, 512, 1024), downsample=True)
33
+
34
+ for i in range(4):
35
+ input_data = common.residual_block(input_data, 1024, 512, 1024)
36
+
37
+ return route_1, route_2, input_data
38
+
39
+ def cspdarknet53(input_data):
40
+
41
+ input_data = common.convolutional(input_data, (3, 3, 3, 32), activate_type="mish")
42
+ input_data = common.convolutional(input_data, (3, 3, 32, 64), downsample=True, activate_type="mish")
43
+
44
+ route = input_data
45
+ route = common.convolutional(route, (1, 1, 64, 64), activate_type="mish")
46
+ input_data = common.convolutional(input_data, (1, 1, 64, 64), activate_type="mish")
47
+ for i in range(1):
48
+ input_data = common.residual_block(input_data, 64, 32, 64, activate_type="mish")
49
+ input_data = common.convolutional(input_data, (1, 1, 64, 64), activate_type="mish")
50
+
51
+ input_data = tf.concat([input_data, route], axis=-1)
52
+ input_data = common.convolutional(input_data, (1, 1, 128, 64), activate_type="mish")
53
+ input_data = common.convolutional(input_data, (3, 3, 64, 128), downsample=True, activate_type="mish")
54
+ route = input_data
55
+ route = common.convolutional(route, (1, 1, 128, 64), activate_type="mish")
56
+ input_data = common.convolutional(input_data, (1, 1, 128, 64), activate_type="mish")
57
+ for i in range(2):
58
+ input_data = common.residual_block(input_data, 64, 64, 64, activate_type="mish")
59
+ input_data = common.convolutional(input_data, (1, 1, 64, 64), activate_type="mish")
60
+ input_data = tf.concat([input_data, route], axis=-1)
61
+
62
+ input_data = common.convolutional(input_data, (1, 1, 128, 128), activate_type="mish")
63
+ input_data = common.convolutional(input_data, (3, 3, 128, 256), downsample=True, activate_type="mish")
64
+ route = input_data
65
+ route = common.convolutional(route, (1, 1, 256, 128), activate_type="mish")
66
+ input_data = common.convolutional(input_data, (1, 1, 256, 128), activate_type="mish")
67
+ for i in range(8):
68
+ input_data = common.residual_block(input_data, 128, 128, 128, activate_type="mish")
69
+ input_data = common.convolutional(input_data, (1, 1, 128, 128), activate_type="mish")
70
+ input_data = tf.concat([input_data, route], axis=-1)
71
+
72
+ input_data = common.convolutional(input_data, (1, 1, 256, 256), activate_type="mish")
73
+ route_1 = input_data
74
+ input_data = common.convolutional(input_data, (3, 3, 256, 512), downsample=True, activate_type="mish")
75
+ route = input_data
76
+ route = common.convolutional(route, (1, 1, 512, 256), activate_type="mish")
77
+ input_data = common.convolutional(input_data, (1, 1, 512, 256), activate_type="mish")
78
+ for i in range(8):
79
+ input_data = common.residual_block(input_data, 256, 256, 256, activate_type="mish")
80
+ input_data = common.convolutional(input_data, (1, 1, 256, 256), activate_type="mish")
81
+ input_data = tf.concat([input_data, route], axis=-1)
82
+
83
+ input_data = common.convolutional(input_data, (1, 1, 512, 512), activate_type="mish")
84
+ route_2 = input_data
85
+ input_data = common.convolutional(input_data, (3, 3, 512, 1024), downsample=True, activate_type="mish")
86
+ route = input_data
87
+ route = common.convolutional(route, (1, 1, 1024, 512), activate_type="mish")
88
+ input_data = common.convolutional(input_data, (1, 1, 1024, 512), activate_type="mish")
89
+ for i in range(4):
90
+ input_data = common.residual_block(input_data, 512, 512, 512, activate_type="mish")
91
+ input_data = common.convolutional(input_data, (1, 1, 512, 512), activate_type="mish")
92
+ input_data = tf.concat([input_data, route], axis=-1)
93
+
94
+ input_data = common.convolutional(input_data, (1, 1, 1024, 1024), activate_type="mish")
95
+ input_data = common.convolutional(input_data, (1, 1, 1024, 512))
96
+ input_data = common.convolutional(input_data, (3, 3, 512, 1024))
97
+ input_data = common.convolutional(input_data, (1, 1, 1024, 512))
98
+
99
+ input_data = tf.concat([tf.nn.max_pool(input_data, ksize=13, padding='SAME', strides=1), tf.nn.max_pool(input_data, ksize=9, padding='SAME', strides=1)
100
+ , tf.nn.max_pool(input_data, ksize=5, padding='SAME', strides=1), input_data], axis=-1)
101
+ input_data = common.convolutional(input_data, (1, 1, 2048, 512))
102
+ input_data = common.convolutional(input_data, (3, 3, 512, 1024))
103
+ input_data = common.convolutional(input_data, (1, 1, 1024, 512))
104
+
105
+ return route_1, route_2, input_data
106
+
107
+ def cspdarknet53_tiny(input_data):
108
+ input_data = common.convolutional(input_data, (3, 3, 3, 32), downsample=True)
109
+ input_data = common.convolutional(input_data, (3, 3, 32, 64), downsample=True)
110
+ input_data = common.convolutional(input_data, (3, 3, 64, 64))
111
+
112
+ route = input_data
113
+ input_data = common.route_group(input_data, 2, 1)
114
+ input_data = common.convolutional(input_data, (3, 3, 32, 32))
115
+ route_1 = input_data
116
+ input_data = common.convolutional(input_data, (3, 3, 32, 32))
117
+ input_data = tf.concat([input_data, route_1], axis=-1)
118
+ input_data = common.convolutional(input_data, (1, 1, 32, 64))
119
+ input_data = tf.concat([route, input_data], axis=-1)
120
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
121
+
122
+ input_data = common.convolutional(input_data, (3, 3, 64, 128))
123
+ route = input_data
124
+ input_data = common.route_group(input_data, 2, 1)
125
+ input_data = common.convolutional(input_data, (3, 3, 64, 64))
126
+ route_1 = input_data
127
+ input_data = common.convolutional(input_data, (3, 3, 64, 64))
128
+ input_data = tf.concat([input_data, route_1], axis=-1)
129
+ input_data = common.convolutional(input_data, (1, 1, 64, 128))
130
+ input_data = tf.concat([route, input_data], axis=-1)
131
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
132
+
133
+ input_data = common.convolutional(input_data, (3, 3, 128, 256))
134
+ route = input_data
135
+ input_data = common.route_group(input_data, 2, 1)
136
+ input_data = common.convolutional(input_data, (3, 3, 128, 128))
137
+ route_1 = input_data
138
+ input_data = common.convolutional(input_data, (3, 3, 128, 128))
139
+ input_data = tf.concat([input_data, route_1], axis=-1)
140
+ input_data = common.convolutional(input_data, (1, 1, 128, 256))
141
+ route_1 = input_data
142
+ input_data = tf.concat([route, input_data], axis=-1)
143
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
144
+
145
+ input_data = common.convolutional(input_data, (3, 3, 512, 512))
146
+
147
+ return route_1, input_data
148
+
149
+ def darknet53_tiny(input_data):
150
+ input_data = common.convolutional(input_data, (3, 3, 3, 16))
151
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
152
+ input_data = common.convolutional(input_data, (3, 3, 16, 32))
153
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
154
+ input_data = common.convolutional(input_data, (3, 3, 32, 64))
155
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
156
+ input_data = common.convolutional(input_data, (3, 3, 64, 128))
157
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
158
+ input_data = common.convolutional(input_data, (3, 3, 128, 256))
159
+ route_1 = input_data
160
+ input_data = tf.keras.layers.MaxPool2D(2, 2, 'same')(input_data)
161
+ input_data = common.convolutional(input_data, (3, 3, 256, 512))
162
+ input_data = tf.keras.layers.MaxPool2D(2, 1, 'same')(input_data)
163
+ input_data = common.convolutional(input_data, (3, 3, 512, 1024))
164
+
165
+ return route_1, input_data
166
+
167
+
core/common.py ADDED
@@ -0,0 +1,67 @@
1
+ #! /usr/bin/env python
2
+ # coding=utf-8
3
+
4
+ import tensorflow as tf
5
+ # import tensorflow_addons as tfa
6
+ class BatchNormalization(tf.keras.layers.BatchNormalization):
7
+ """
8
+ "Frozen state" and "inference mode" are two separate concepts.
9
+ `layer.trainable = False` is to freeze the layer, so the layer will use
10
+ stored moving `var` and `mean` in the "inference mode", and both `gamma`
11
+ and `beta` will not be updated.
12
+ """
13
+ def call(self, x, training=False):
14
+ if not training:
15
+ training = tf.constant(False)
16
+ training = tf.logical_and(training, self.trainable)
17
+ return super().call(x, training)
18
+
19
+ def convolutional(input_layer, filters_shape, downsample=False, activate=True, bn=True, activate_type='leaky'):
20
+ if downsample:
21
+ input_layer = tf.keras.layers.ZeroPadding2D(((1, 0), (1, 0)))(input_layer)
22
+ padding = 'valid'
23
+ strides = 2
24
+ else:
25
+ strides = 1
26
+ padding = 'same'
27
+
28
+ conv = tf.keras.layers.Conv2D(filters=filters_shape[-1], kernel_size = filters_shape[0], strides=strides, padding=padding,
29
+ use_bias=not bn, kernel_regularizer=tf.keras.regularizers.l2(0.0005),
30
+ kernel_initializer=tf.random_normal_initializer(stddev=0.01),
31
+ bias_initializer=tf.constant_initializer(0.))(input_layer)
32
+
33
+ if bn: conv = BatchNormalization()(conv)
34
+ if activate:
35
+ if activate_type == "leaky":
36
+ conv = tf.nn.leaky_relu(conv, alpha=0.1)
37
+ elif activate_type == "mish":
38
+ conv = mish(conv)
39
+ return conv
40
+
41
+ def mish(x):
42
+ return x * tf.math.tanh(tf.math.softplus(x))
43
+ # return tf.keras.layers.Lambda(lambda x: x*tf.tanh(tf.math.log(1+tf.exp(x))))(x)
44
+
45
+ def residual_block(input_layer, input_channel, filter_num1, filter_num2, activate_type='leaky'):
46
+ short_cut = input_layer
47
+ conv = convolutional(input_layer, filters_shape=(1, 1, input_channel, filter_num1), activate_type=activate_type)
48
+ conv = convolutional(conv , filters_shape=(3, 3, filter_num1, filter_num2), activate_type=activate_type)
49
+
50
+ residual_output = short_cut + conv
51
+ return residual_output
52
+
53
+ # def block_tiny(input_layer, input_channel, filter_num1, activate_type='leaky'):
54
+ # conv = convolutional(input_layer, filters_shape=(3, 3, input_channel, filter_num1), activate_type=activate_type)
55
+ # short_cut = input_layer
56
+ # conv = convolutional(conv, filters_shape=(3, 3, input_channel, filter_num1), activate_type=activate_type)
57
+ #
58
+ # input_data = tf.concat([conv, short_cut], axis=-1)
59
+ # return residual_output
60
+
61
+ def route_group(input_layer, groups, group_id):
62
+ convs = tf.split(input_layer, num_or_size_splits=groups, axis=-1)
63
+ return convs[group_id]
64
+
65
+ def upsample(input_layer):
66
+ return tf.image.resize(input_layer, (input_layer.shape[1] * 2, input_layer.shape[2] * 2), method='bilinear')
67
+
core/config.py ADDED
@@ -0,0 +1,53 @@
1
+ #! /usr/bin/env python
2
+ # coding=utf-8
3
+ from easydict import EasyDict as edict
4
+
5
+
6
+ __C = edict()
7
+ # Consumers can get config by: from config import cfg
8
+
9
+ cfg = __C
10
+
11
+ # YOLO options
12
+ __C.YOLO = edict()
13
+
14
+ __C.YOLO.CLASSES = "./data/classes/coco.names"
15
+ __C.YOLO.ANCHORS = [12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401]
16
+ __C.YOLO.ANCHORS_V3 = [10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326]
17
+ __C.YOLO.ANCHORS_TINY = [23,27, 37,58, 81,82, 81,82, 135,169, 344,319]
18
+ __C.YOLO.STRIDES = [8, 16, 32]
19
+ __C.YOLO.STRIDES_TINY = [16, 32]
20
+ __C.YOLO.XYSCALE = [1.2, 1.1, 1.05]
21
+ __C.YOLO.XYSCALE_TINY = [1.05, 1.05]
22
+ __C.YOLO.ANCHOR_PER_SCALE = 3
23
+ __C.YOLO.IOU_LOSS_THRESH = 0.5
24
+
25
+
26
+ # Train options
27
+ __C.TRAIN = edict()
28
+
29
+ __C.TRAIN.ANNOT_PATH = "./data/dataset/val2017.txt"
30
+ __C.TRAIN.BATCH_SIZE = 2
31
+ # __C.TRAIN.INPUT_SIZE = [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
32
+ __C.TRAIN.INPUT_SIZE = 416
33
+ __C.TRAIN.DATA_AUG = True
34
+ __C.TRAIN.LR_INIT = 1e-3
35
+ __C.TRAIN.LR_END = 1e-6
36
+ __C.TRAIN.WARMUP_EPOCHS = 2
37
+ __C.TRAIN.FISRT_STAGE_EPOCHS = 20
38
+ __C.TRAIN.SECOND_STAGE_EPOCHS = 30
39
+
40
+
41
+
42
+ # TEST options
43
+ __C.TEST = edict()
44
+
45
+ __C.TEST.ANNOT_PATH = "./data/dataset/val2017.txt"
46
+ __C.TEST.BATCH_SIZE = 2
47
+ __C.TEST.INPUT_SIZE = 416
48
+ __C.TEST.DATA_AUG = False
49
+ __C.TEST.DECTECTED_IMAGE_PATH = "./data/detection/"
50
+ __C.TEST.SCORE_THRESHOLD = 0.25
51
+ __C.TEST.IOU_THRESHOLD = 0.5
52
+
53
+
core/dataset.py ADDED
@@ -0,0 +1,382 @@
1
+ #! /usr/bin/env python
2
+ # coding=utf-8
3
+
4
+ import os
5
+ import cv2
6
+ import random
7
+ import numpy as np
8
+ import tensorflow as tf
9
+ import core.utils as utils
10
+ from core.config import cfg
11
+
12
+
13
+ class Dataset(object):
14
+ """implement Dataset here"""
15
+
16
+ def __init__(self, FLAGS, is_training: bool, dataset_type: str = "converted_coco"):
17
+ self.tiny = FLAGS.tiny
18
+ self.strides, self.anchors, NUM_CLASS, XYSCALE = utils.load_config(FLAGS)
19
+ self.dataset_type = dataset_type
20
+
21
+ self.annot_path = (
22
+ cfg.TRAIN.ANNOT_PATH if is_training else cfg.TEST.ANNOT_PATH
23
+ )
24
+ self.input_sizes = (
25
+ cfg.TRAIN.INPUT_SIZE if is_training else cfg.TEST.INPUT_SIZE
26
+ )
27
+ self.batch_size = (
28
+ cfg.TRAIN.BATCH_SIZE if is_training else cfg.TEST.BATCH_SIZE
29
+ )
30
+ self.data_aug = cfg.TRAIN.DATA_AUG if is_training else cfg.TEST.DATA_AUG
31
+
32
+ self.train_input_sizes = cfg.TRAIN.INPUT_SIZE
33
+ self.classes = utils.read_class_names(cfg.YOLO.CLASSES)
34
+ self.num_classes = len(self.classes)
35
+ self.anchor_per_scale = cfg.YOLO.ANCHOR_PER_SCALE
36
+ self.max_bbox_per_scale = 150
37
+
38
+ self.annotations = self.load_annotations()
39
+ self.num_samples = len(self.annotations)
40
+ self.num_batchs = int(np.ceil(self.num_samples / self.batch_size))
41
+ self.batch_count = 0
42
+
43
+ def load_annotations(self):
44
+ with open(self.annot_path, "r") as f:
45
+ txt = f.readlines()
46
+ if self.dataset_type == "converted_coco":
47
+ annotations = [
48
+ line.strip()
49
+ for line in txt
50
+ if len(line.strip().split()[1:]) != 0
51
+ ]
52
+ elif self.dataset_type == "yolo":
53
+ annotations = []
54
+ for line in txt:
55
+ image_path = line.strip()
56
+ root, _ = os.path.splitext(image_path)
57
+ with open(root + ".txt") as fd:
58
+ boxes = fd.readlines()
59
+ string = ""
60
+ for box in boxes:
61
+ box = box.strip()
62
+ box = box.split()
63
+ class_num = int(box[0])
64
+ center_x = float(box[1])
65
+ center_y = float(box[2])
66
+ half_width = float(box[3]) / 2
67
+ half_height = float(box[4]) / 2
68
+ string += " {},{},{},{},{}".format(
69
+ center_x - half_width,
70
+ center_y - half_height,
71
+ center_x + half_width,
72
+ center_y + half_height,
73
+ class_num,
74
+ )
75
+ annotations.append(image_path + string)
76
+
77
+ np.random.shuffle(annotations)
78
+ return annotations
79
+
80
+ def __iter__(self):
81
+ return self
82
+
83
+ def __next__(self):
84
+ with tf.device("/cpu:0"):
85
+ # self.train_input_size = random.choice(self.train_input_sizes)
86
+ self.train_input_size = cfg.TRAIN.INPUT_SIZE
87
+ self.train_output_sizes = self.train_input_size // self.strides
88
+
89
+ batch_image = np.zeros(
90
+ (
91
+ self.batch_size,
92
+ self.train_input_size,
93
+ self.train_input_size,
94
+ 3,
95
+ ),
96
+ dtype=np.float32,
97
+ )
98
+
99
+ batch_label_sbbox = np.zeros(
100
+ (
101
+ self.batch_size,
102
+ self.train_output_sizes[0],
103
+ self.train_output_sizes[0],
104
+ self.anchor_per_scale,
105
+ 5 + self.num_classes,
106
+ ),
107
+ dtype=np.float32,
108
+ )
109
+ batch_label_mbbox = np.zeros(
110
+ (
111
+ self.batch_size,
112
+ self.train_output_sizes[1],
113
+ self.train_output_sizes[1],
114
+ self.anchor_per_scale,
115
+ 5 + self.num_classes,
116
+ ),
117
+ dtype=np.float32,
118
+ )
119
+ batch_label_lbbox = np.zeros(
120
+ (
121
+ self.batch_size,
122
+ self.train_output_sizes[2],
123
+ self.train_output_sizes[2],
124
+ self.anchor_per_scale,
125
+ 5 + self.num_classes,
126
+ ),
127
+ dtype=np.float32,
128
+ )
129
+
130
+ batch_sbboxes = np.zeros(
131
+ (self.batch_size, self.max_bbox_per_scale, 4), dtype=np.float32
132
+ )
133
+ batch_mbboxes = np.zeros(
134
+ (self.batch_size, self.max_bbox_per_scale, 4), dtype=np.float32
135
+ )
136
+ batch_lbboxes = np.zeros(
137
+ (self.batch_size, self.max_bbox_per_scale, 4), dtype=np.float32
138
+ )
139
+
140
+ num = 0
141
+ if self.batch_count < self.num_batchs:
142
+ while num < self.batch_size:
143
+ index = self.batch_count * self.batch_size + num
144
+ if index >= self.num_samples:
145
+ index -= self.num_samples
146
+ annotation = self.annotations[index]
147
+ image, bboxes = self.parse_annotation(annotation)
148
+ (
149
+ label_sbbox,
150
+ label_mbbox,
151
+ label_lbbox,
152
+ sbboxes,
153
+ mbboxes,
154
+ lbboxes,
155
+ ) = self.preprocess_true_boxes(bboxes)
156
+
157
+ batch_image[num, :, :, :] = image
158
+ batch_label_sbbox[num, :, :, :, :] = label_sbbox
159
+ batch_label_mbbox[num, :, :, :, :] = label_mbbox
160
+ batch_label_lbbox[num, :, :, :, :] = label_lbbox
161
+ batch_sbboxes[num, :, :] = sbboxes
162
+ batch_mbboxes[num, :, :] = mbboxes
163
+ batch_lbboxes[num, :, :] = lbboxes
164
+ num += 1
165
+ self.batch_count += 1
166
+ batch_smaller_target = batch_label_sbbox, batch_sbboxes
167
+ batch_medium_target = batch_label_mbbox, batch_mbboxes
168
+ batch_larger_target = batch_label_lbbox, batch_lbboxes
169
+
170
+ return (
171
+ batch_image,
172
+ (
173
+ batch_smaller_target,
174
+ batch_medium_target,
175
+ batch_larger_target,
176
+ ),
177
+ )
178
+ else:
179
+ self.batch_count = 0
180
+ np.random.shuffle(self.annotations)
181
+ raise StopIteration
182
+
183
+ def random_horizontal_flip(self, image, bboxes):
184
+ if random.random() < 0.5:
185
+ _, w, _ = image.shape
186
+ image = image[:, ::-1, :]
187
+ bboxes[:, [0, 2]] = w - bboxes[:, [2, 0]]
188
+
189
+ return image, bboxes
190
+
191
+ def random_crop(self, image, bboxes):
192
+ if random.random() < 0.5:
193
+ h, w, _ = image.shape
194
+ max_bbox = np.concatenate(
195
+ [
196
+ np.min(bboxes[:, 0:2], axis=0),
197
+ np.max(bboxes[:, 2:4], axis=0),
198
+ ],
199
+ axis=-1,
200
+ )
201
+
202
+ max_l_trans = max_bbox[0]
203
+ max_u_trans = max_bbox[1]
204
+ max_r_trans = w - max_bbox[2]
205
+ max_d_trans = h - max_bbox[3]
206
+
207
+ crop_xmin = max(
208
+ 0, int(max_bbox[0] - random.uniform(0, max_l_trans))
209
+ )
210
+ crop_ymin = max(
211
+ 0, int(max_bbox[1] - random.uniform(0, max_u_trans))
212
+ )
213
+ crop_xmax = max(
214
+ w, int(max_bbox[2] + random.uniform(0, max_r_trans))
215
+ )
216
+ crop_ymax = max(
217
+ h, int(max_bbox[3] + random.uniform(0, max_d_trans))
218
+ )
219
+
220
+ image = image[crop_ymin:crop_ymax, crop_xmin:crop_xmax]
221
+
222
+ bboxes[:, [0, 2]] = bboxes[:, [0, 2]] - crop_xmin
223
+ bboxes[:, [1, 3]] = bboxes[:, [1, 3]] - crop_ymin
224
+
225
+ return image, bboxes
226
+
227
+ def random_translate(self, image, bboxes):
228
+ if random.random() < 0.5:
229
+ h, w, _ = image.shape
230
+ max_bbox = np.concatenate(
231
+ [
232
+ np.min(bboxes[:, 0:2], axis=0),
233
+ np.max(bboxes[:, 2:4], axis=0),
234
+ ],
235
+ axis=-1,
236
+ )
237
+
238
+ max_l_trans = max_bbox[0]
239
+ max_u_trans = max_bbox[1]
240
+ max_r_trans = w - max_bbox[2]
241
+ max_d_trans = h - max_bbox[3]
242
+
243
+ tx = random.uniform(-(max_l_trans - 1), (max_r_trans - 1))
244
+ ty = random.uniform(-(max_u_trans - 1), (max_d_trans - 1))
245
+
246
+ M = np.array([[1, 0, tx], [0, 1, ty]])
247
+ image = cv2.warpAffine(image, M, (w, h))
248
+
249
+ bboxes[:, [0, 2]] = bboxes[:, [0, 2]] + tx
250
+ bboxes[:, [1, 3]] = bboxes[:, [1, 3]] + ty
251
+
252
+ return image, bboxes
253
+
254
+ def parse_annotation(self, annotation):
255
+ line = annotation.split()
256
+ image_path = line[0]
257
+ if not os.path.exists(image_path):
258
+ raise KeyError("%s does not exist ... " % image_path)
259
+ image = cv2.imread(image_path)
260
+ if self.dataset_type == "converted_coco":
261
+ bboxes = np.array(
262
+ [list(map(int, box.split(","))) for box in line[1:]]
263
+ )
264
+ elif self.dataset_type == "yolo":
265
+ height, width, _ = image.shape
266
+ bboxes = np.array(
267
+ [list(map(float, box.split(","))) for box in line[1:]]
268
+ )
269
+ bboxes = bboxes * np.array([width, height, width, height, 1])
270
+ bboxes = bboxes.astype(np.int64)
271
+
272
+ if self.data_aug:
273
+ image, bboxes = self.random_horizontal_flip(
274
+ np.copy(image), np.copy(bboxes)
275
+ )
276
+ image, bboxes = self.random_crop(np.copy(image), np.copy(bboxes))
277
+ image, bboxes = self.random_translate(
278
+ np.copy(image), np.copy(bboxes)
279
+ )
280
+
281
+ image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
282
+ image, bboxes = utils.image_preprocess(
283
+ np.copy(image),
284
+ [self.train_input_size, self.train_input_size],
285
+ np.copy(bboxes),
286
+ )
287
+ return image, bboxes
288
+
289
+
290
+ def preprocess_true_boxes(self, bboxes):
291
+ label = [
292
+ np.zeros(
293
+ (
294
+ self.train_output_sizes[i],
295
+ self.train_output_sizes[i],
296
+ self.anchor_per_scale,
297
+ 5 + self.num_classes,
298
+ )
299
+ )
300
+ for i in range(3)
301
+ ]
302
+ bboxes_xywh = [np.zeros((self.max_bbox_per_scale, 4)) for _ in range(3)]
303
+ bbox_count = np.zeros((3,))
304
+
305
+ for bbox in bboxes:
306
+ bbox_coor = bbox[:4]
307
+ bbox_class_ind = bbox[4]
308
+
309
+ onehot = np.zeros(self.num_classes, dtype=np.float)
310
+ onehot[bbox_class_ind] = 1.0
311
+ uniform_distribution = np.full(
312
+ self.num_classes, 1.0 / self.num_classes
313
+ )
314
+ deta = 0.01
315
+ smooth_onehot = onehot * (1 - deta) + deta * uniform_distribution
316
+
317
+ bbox_xywh = np.concatenate(
318
+ [
319
+ (bbox_coor[2:] + bbox_coor[:2]) * 0.5,
320
+ bbox_coor[2:] - bbox_coor[:2],
321
+ ],
322
+ axis=-1,
323
+ )
324
+ bbox_xywh_scaled = (
325
+ 1.0 * bbox_xywh[np.newaxis, :] / self.strides[:, np.newaxis]
326
+ )
327
+
328
+ iou = []
329
+ exist_positive = False
330
+ for i in range(3):
331
+ anchors_xywh = np.zeros((self.anchor_per_scale, 4))
332
+ anchors_xywh[:, 0:2] = (
333
+ np.floor(bbox_xywh_scaled[i, 0:2]).astype(np.int32) + 0.5
334
+ )
335
+ anchors_xywh[:, 2:4] = self.anchors[i]
336
+
337
+ iou_scale = utils.bbox_iou(
338
+ bbox_xywh_scaled[i][np.newaxis, :], anchors_xywh
339
+ )
340
+ iou.append(iou_scale)
341
+ iou_mask = iou_scale > 0.3
342
+
343
+ if np.any(iou_mask):
344
+ xind, yind = np.floor(bbox_xywh_scaled[i, 0:2]).astype(
345
+ np.int32
346
+ )
347
+
348
+ label[i][yind, xind, iou_mask, :] = 0
349
+ label[i][yind, xind, iou_mask, 0:4] = bbox_xywh
350
+ label[i][yind, xind, iou_mask, 4:5] = 1.0
351
+ label[i][yind, xind, iou_mask, 5:] = smooth_onehot
352
+
353
+ bbox_ind = int(bbox_count[i] % self.max_bbox_per_scale)
354
+ bboxes_xywh[i][bbox_ind, :4] = bbox_xywh
355
+ bbox_count[i] += 1
356
+
357
+ exist_positive = True
358
+
359
+ if not exist_positive:
360
+ best_anchor_ind = np.argmax(np.array(iou).reshape(-1), axis=-1)
361
+ best_detect = int(best_anchor_ind / self.anchor_per_scale)
362
+ best_anchor = int(best_anchor_ind % self.anchor_per_scale)
363
+ xind, yind = np.floor(
364
+ bbox_xywh_scaled[best_detect, 0:2]
365
+ ).astype(np.int32)
366
+
367
+ label[best_detect][yind, xind, best_anchor, :] = 0
368
+ label[best_detect][yind, xind, best_anchor, 0:4] = bbox_xywh
369
+ label[best_detect][yind, xind, best_anchor, 4:5] = 1.0
370
+ label[best_detect][yind, xind, best_anchor, 5:] = smooth_onehot
371
+
372
+ bbox_ind = int(
373
+ bbox_count[best_detect] % self.max_bbox_per_scale
374
+ )
375
+ bboxes_xywh[best_detect][bbox_ind, :4] = bbox_xywh
376
+ bbox_count[best_detect] += 1
377
+ label_sbbox, label_mbbox, label_lbbox = label
378
+ sbboxes, mbboxes, lbboxes = bboxes_xywh
379
+ return label_sbbox, label_mbbox, label_lbbox, sbboxes, mbboxes, lbboxes
380
+
381
+ def __len__(self):
382
+ return self.num_batchs
core/utils.py ADDED
@@ -0,0 +1,375 @@
1
+ import cv2
2
+ import random
3
+ import colorsys
4
+ import numpy as np
5
+ import tensorflow as tf
6
+ from core.config import cfg
7
+
8
+ def load_freeze_layer(model='yolov4', tiny=False):
9
+ if tiny:
10
+ if model == 'yolov3':
11
+ freeze_layouts = ['conv2d_9', 'conv2d_12']
12
+ else:
13
+ freeze_layouts = ['conv2d_17', 'conv2d_20']
14
+ else:
15
+ if model == 'yolov3':
16
+ freeze_layouts = ['conv2d_58', 'conv2d_66', 'conv2d_74']
17
+ else:
18
+ freeze_layouts = ['conv2d_93', 'conv2d_101', 'conv2d_109']
19
+ return freeze_layouts
20
+
21
+ def load_weights(model, weights_file, model_name='yolov4', is_tiny=False):
22
+ if is_tiny:
23
+ if model_name == 'yolov3':
24
+ layer_size = 13
25
+ output_pos = [9, 12]
26
+ else:
27
+ layer_size = 21
28
+ output_pos = [17, 20]
29
+ else:
30
+ if model_name == 'yolov3':
31
+ layer_size = 75
32
+ output_pos = [58, 66, 74]
33
+ else:
34
+ layer_size = 110
35
+ output_pos = [93, 101, 109]
36
+ wf = open(weights_file, 'rb')
37
+ major, minor, revision, seen, _ = np.fromfile(wf, dtype=np.int32, count=5)
38
+
39
+ j = 0
40
+ for i in range(layer_size):
41
+ conv_layer_name = 'conv2d_%d' %i if i > 0 else 'conv2d'
42
+ bn_layer_name = 'batch_normalization_%d' %j if j > 0 else 'batch_normalization'
43
+
44
+ conv_layer = model.get_layer(conv_layer_name)
45
+ filters = conv_layer.filters
46
+ k_size = conv_layer.kernel_size[0]
47
+ in_dim = conv_layer.input_shape[-1]
48
+
49
+ if i not in output_pos:
50
+ # darknet weights: [beta, gamma, mean, variance]
51
+ bn_weights = np.fromfile(wf, dtype=np.float32, count=4 * filters)
52
+ # tf weights: [gamma, beta, mean, variance]
53
+ bn_weights = bn_weights.reshape((4, filters))[[1, 0, 2, 3]]
54
+ bn_layer = model.get_layer(bn_layer_name)
55
+ j += 1
56
+ else:
57
+ conv_bias = np.fromfile(wf, dtype=np.float32, count=filters)
58
+
59
+ # darknet shape (out_dim, in_dim, height, width)
60
+ conv_shape = (filters, in_dim, k_size, k_size)
61
+ conv_weights = np.fromfile(wf, dtype=np.float32, count=np.product(conv_shape))
62
+ # tf shape (height, width, in_dim, out_dim)
63
+ conv_weights = conv_weights.reshape(conv_shape).transpose([2, 3, 1, 0])
64
+
65
+ if i not in output_pos:
66
+ conv_layer.set_weights([conv_weights])
67
+ bn_layer.set_weights(bn_weights)
68
+ else:
69
+ conv_layer.set_weights([conv_weights, conv_bias])
70
+
71
+ # assert len(wf.read()) == 0, 'failed to read all data'
72
+ wf.close()
73
+
74
+
75
+ def read_class_names(class_file_name):
76
+ names = {}
77
+ with open(class_file_name, 'r') as data:
78
+ for ID, name in enumerate(data):
79
+ names[ID] = name.strip('\n')
80
+ return names
81
+
82
+ def load_config(FLAGS):
83
+ if FLAGS.tiny:
84
+ STRIDES = np.array(cfg.YOLO.STRIDES_TINY)
85
+ ANCHORS = get_anchors(cfg.YOLO.ANCHORS_TINY, FLAGS.tiny)
86
+ XYSCALE = cfg.YOLO.XYSCALE_TINY if FLAGS.model == 'yolov4' else [1, 1]
87
+ else:
88
+ STRIDES = np.array(cfg.YOLO.STRIDES)
89
+ if FLAGS.model == 'yolov4':
90
+ ANCHORS = get_anchors(cfg.YOLO.ANCHORS, FLAGS.tiny)
91
+ elif FLAGS.model == 'yolov3':
92
+ ANCHORS = get_anchors(cfg.YOLO.ANCHORS_V3, FLAGS.tiny)
93
+ XYSCALE = cfg.YOLO.XYSCALE if FLAGS.model == 'yolov4' else [1, 1, 1]
94
+ NUM_CLASS = len(read_class_names(cfg.YOLO.CLASSES))
95
+
96
+ return STRIDES, ANCHORS, NUM_CLASS, XYSCALE
97
+
98
+ def get_anchors(anchors_path, tiny=False):
99
+ anchors = np.array(anchors_path)
100
+ if tiny:
101
+ return anchors.reshape(2, 3, 2)
102
+ else:
103
+ return anchors.reshape(3, 3, 2)
104
+
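get_anchors simply reshapes the flat anchor list from core/config.py into (scale, anchor, width/height). With the nine YOLOv4 pairs listed later in data/anchors/yolov4_anchors.txt, the mapping would look like this (a sketch using those literal values, not code from the commit):

import numpy as np
flat = [12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401]
anchors = np.array(flat).reshape(3, 3, 2)
# anchors[0] -> the three (w, h) pairs for the smallest-object scale: [[12, 16], [19, 36], [40, 28]]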
105
+ def image_preprocess(image, target_size, gt_boxes=None):
106
+
107
+ ih, iw = target_size
108
+ h, w, _ = image.shape
109
+
110
+ scale = min(iw/w, ih/h)
111
+ nw, nh = int(scale * w), int(scale * h)
112
+ image_resized = cv2.resize(image, (nw, nh))
113
+
114
+ image_paded = np.full(shape=[ih, iw, 3], fill_value=128.0)
115
+ dw, dh = (iw - nw) // 2, (ih-nh) // 2
116
+ image_paded[dh:nh+dh, dw:nw+dw, :] = image_resized
117
+ image_paded = image_paded / 255.
118
+
119
+ if gt_boxes is None:
120
+ return image_paded
121
+
122
+ else:
123
+ gt_boxes[:, [0, 2]] = gt_boxes[:, [0, 2]] * scale + dw
124
+ gt_boxes[:, [1, 3]] = gt_boxes[:, [1, 3]] * scale + dh
125
+ return image_paded, gt_boxes
126
+
127
+ def draw_bbox(image, bboxes, classes=read_class_names(cfg.YOLO.CLASSES), show_label=True):
128
+ num_classes = len(classes)
129
+ image_h, image_w, _ = image.shape
130
+ hsv_tuples = [(1.0 * x / num_classes, 1., 1.) for x in range(num_classes)]
131
+ colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
132
+ colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), colors))
133
+
134
+ random.seed(0)
135
+ random.shuffle(colors)
136
+ random.seed(None)
137
+
138
+ out_boxes, out_scores, out_classes, num_boxes = bboxes
139
+ for i in range(num_boxes[0]):
140
+ if int(out_classes[0][i]) < 0 or int(out_classes[0][i]) >= num_classes: continue
141
+ coor = out_boxes[0][i]
142
+ coor[0] = int(coor[0] * image_h)
143
+ coor[2] = int(coor[2] * image_h)
144
+ coor[1] = int(coor[1] * image_w)
145
+ coor[3] = int(coor[3] * image_w)
146
+
147
+ fontScale = 0.5
148
+ score = out_scores[0][i]
149
+ class_ind = int(out_classes[0][i])
150
+ bbox_color = colors[class_ind]
151
+ bbox_thick = int(0.6 * (image_h + image_w) / 600)
152
+ c1, c2 = (coor[1], coor[0]), (coor[3], coor[2])
153
+ cv2.rectangle(image, c1, c2, bbox_color, bbox_thick)
154
+
155
+ if show_label:
156
+ bbox_mess = '%s: %.2f' % (classes[class_ind], score)
157
+ t_size = cv2.getTextSize(bbox_mess, 0, fontScale, thickness=bbox_thick // 2)[0]
158
+ c3 = (c1[0] + t_size[0], c1[1] - t_size[1] - 3)
159
+ cv2.rectangle(image, c1, (np.float32(c3[0]), np.float32(c3[1])), bbox_color, -1) #filled
160
+
161
+ cv2.putText(image, bbox_mess, (c1[0], np.float32(c1[1] - 2)), cv2.FONT_HERSHEY_SIMPLEX,
162
+ fontScale, (0, 0, 0), bbox_thick // 2, lineType=cv2.LINE_AA)
163
+ return image
164
+
165
+ def bbox_iou(bboxes1, bboxes2):
166
+ """
167
+ @param bboxes1: (a, b, ..., 4)
168
+ @param bboxes2: (A, B, ..., 4)
169
+ x:X is 1:n or n:n or n:1
170
+ @return (max(a,A), max(b,B), ...)
171
+ ex) (4,):(3,4) -> (3,)
172
+ (2,1,4):(2,3,4) -> (2,3)
173
+ """
174
+ bboxes1_area = bboxes1[..., 2] * bboxes1[..., 3]
175
+ bboxes2_area = bboxes2[..., 2] * bboxes2[..., 3]
176
+
177
+ bboxes1_coor = tf.concat(
178
+ [
179
+ bboxes1[..., :2] - bboxes1[..., 2:] * 0.5,
180
+ bboxes1[..., :2] + bboxes1[..., 2:] * 0.5,
181
+ ],
182
+ axis=-1,
183
+ )
184
+ bboxes2_coor = tf.concat(
185
+ [
186
+ bboxes2[..., :2] - bboxes2[..., 2:] * 0.5,
187
+ bboxes2[..., :2] + bboxes2[..., 2:] * 0.5,
188
+ ],
189
+ axis=-1,
190
+ )
191
+
192
+ left_up = tf.maximum(bboxes1_coor[..., :2], bboxes2_coor[..., :2])
193
+ right_down = tf.minimum(bboxes1_coor[..., 2:], bboxes2_coor[..., 2:])
194
+
195
+ inter_section = tf.maximum(right_down - left_up, 0.0)
196
+ inter_area = inter_section[..., 0] * inter_section[..., 1]
197
+
198
+ union_area = bboxes1_area + bboxes2_area - inter_area
199
+
200
+ iou = tf.math.divide_no_nan(inter_area, union_area)
201
+
202
+ return iou
203
+
204
+
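bbox_iou takes boxes in (center_x, center_y, width, height) form and relies on broadcasting, so one box can be scored against a whole batch in a single call. A small sanity check with made-up boxes:

import tensorflow as tf
box    = tf.constant([[5.0, 5.0, 10.0, 10.0]])                            # shape (1, 4)
others = tf.constant([[5.0, 5.0, 10.0, 10.0], [20.0, 20.0, 10.0, 10.0]])  # shape (2, 4)
print(bbox_iou(box, others).numpy())   # -> [1. 0.]  identical box, then a non-overlapping one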
205
+ def bbox_giou(bboxes1, bboxes2):
206
+ """
207
+ Generalized IoU
208
+ @param bboxes1: (a, b, ..., 4)
209
+ @param bboxes2: (A, B, ..., 4)
210
+ x:X is 1:n or n:n or n:1
211
+ @return (max(a,A), max(b,B), ...)
212
+ ex) (4,):(3,4) -> (3,)
213
+ (2,1,4):(2,3,4) -> (2,3)
214
+ """
215
+ bboxes1_area = bboxes1[..., 2] * bboxes1[..., 3]
216
+ bboxes2_area = bboxes2[..., 2] * bboxes2[..., 3]
217
+
218
+ bboxes1_coor = tf.concat(
219
+ [
220
+ bboxes1[..., :2] - bboxes1[..., 2:] * 0.5,
221
+ bboxes1[..., :2] + bboxes1[..., 2:] * 0.5,
222
+ ],
223
+ axis=-1,
224
+ )
225
+ bboxes2_coor = tf.concat(
226
+ [
227
+ bboxes2[..., :2] - bboxes2[..., 2:] * 0.5,
228
+ bboxes2[..., :2] + bboxes2[..., 2:] * 0.5,
229
+ ],
230
+ axis=-1,
231
+ )
232
+
233
+ left_up = tf.maximum(bboxes1_coor[..., :2], bboxes2_coor[..., :2])
234
+ right_down = tf.minimum(bboxes1_coor[..., 2:], bboxes2_coor[..., 2:])
235
+
236
+ inter_section = tf.maximum(right_down - left_up, 0.0)
237
+ inter_area = inter_section[..., 0] * inter_section[..., 1]
238
+
239
+ union_area = bboxes1_area + bboxes2_area - inter_area
240
+
241
+ iou = tf.math.divide_no_nan(inter_area, union_area)
242
+
243
+ enclose_left_up = tf.minimum(bboxes1_coor[..., :2], bboxes2_coor[..., :2])
244
+ enclose_right_down = tf.maximum(
245
+ bboxes1_coor[..., 2:], bboxes2_coor[..., 2:]
246
+ )
247
+
248
+ enclose_section = enclose_right_down - enclose_left_up
249
+ enclose_area = enclose_section[..., 0] * enclose_section[..., 1]
250
+
251
+ giou = iou - tf.math.divide_no_nan(enclose_area - union_area, enclose_area)
252
+
253
+ return giou
254
+
255
+
256
+ def bbox_ciou(bboxes1, bboxes2):
257
+ """
258
+ Complete IoU
259
+ @param bboxes1: (a, b, ..., 4)
260
+ @param bboxes2: (A, B, ..., 4)
261
+ x:X is 1:n or n:n or n:1
262
+ @return (max(a,A), max(b,B), ...)
263
+ ex) (4,):(3,4) -> (3,)
264
+ (2,1,4):(2,3,4) -> (2,3)
265
+ """
266
+ bboxes1_area = bboxes1[..., 2] * bboxes1[..., 3]
267
+ bboxes2_area = bboxes2[..., 2] * bboxes2[..., 3]
268
+
269
+ bboxes1_coor = tf.concat(
270
+ [
271
+ bboxes1[..., :2] - bboxes1[..., 2:] * 0.5,
272
+ bboxes1[..., :2] + bboxes1[..., 2:] * 0.5,
273
+ ],
274
+ axis=-1,
275
+ )
276
+ bboxes2_coor = tf.concat(
277
+ [
278
+ bboxes2[..., :2] - bboxes2[..., 2:] * 0.5,
279
+ bboxes2[..., :2] + bboxes2[..., 2:] * 0.5,
280
+ ],
281
+ axis=-1,
282
+ )
283
+
284
+ left_up = tf.maximum(bboxes1_coor[..., :2], bboxes2_coor[..., :2])
285
+ right_down = tf.minimum(bboxes1_coor[..., 2:], bboxes2_coor[..., 2:])
286
+
287
+ inter_section = tf.maximum(right_down - left_up, 0.0)
288
+ inter_area = inter_section[..., 0] * inter_section[..., 1]
289
+
290
+ union_area = bboxes1_area + bboxes2_area - inter_area
291
+
292
+ iou = tf.math.divide_no_nan(inter_area, union_area)
293
+
294
+ enclose_left_up = tf.minimum(bboxes1_coor[..., :2], bboxes2_coor[..., :2])
295
+ enclose_right_down = tf.maximum(
296
+ bboxes1_coor[..., 2:], bboxes2_coor[..., 2:]
297
+ )
298
+
299
+ enclose_section = enclose_right_down - enclose_left_up
300
+
301
+ c_2 = enclose_section[..., 0] ** 2 + enclose_section[..., 1] ** 2
302
+
303
+ center_diagonal = bboxes2[..., :2] - bboxes1[..., :2]
304
+
305
+ rho_2 = center_diagonal[..., 0] ** 2 + center_diagonal[..., 1] ** 2
306
+
307
+ diou = iou - tf.math.divide_no_nan(rho_2, c_2)
308
+
309
+ v = (
310
+ (
311
+ tf.math.atan(
312
+ tf.math.divide_no_nan(bboxes1[..., 2], bboxes1[..., 3])
313
+ )
314
+ - tf.math.atan(
315
+ tf.math.divide_no_nan(bboxes2[..., 2], bboxes2[..., 3])
316
+ )
317
+ )
318
+ * 2
319
+ / np.pi
320
+ ) ** 2
321
+
322
+ alpha = tf.math.divide_no_nan(v, 1 - iou + v)
323
+
324
+ ciou = diou - alpha * v
325
+
326
+ return ciou
327
+
328
+ def nms(bboxes, iou_threshold, sigma=0.3, method='nms'):
329
+ """
330
+ :param bboxes: (xmin, ymin, xmax, ymax, score, class)
331
+
332
+ Note: soft-nms, https://arxiv.org/pdf/1704.04503.pdf
333
+ https://github.com/bharatsingh430/soft-nms
334
+ """
335
+ classes_in_img = list(set(bboxes[:, 5]))
336
+ best_bboxes = []
337
+
338
+ for cls in classes_in_img:
339
+ cls_mask = (bboxes[:, 5] == cls)
340
+ cls_bboxes = bboxes[cls_mask]
341
+
342
+ while len(cls_bboxes) > 0:
343
+ max_ind = np.argmax(cls_bboxes[:, 4])
344
+ best_bbox = cls_bboxes[max_ind]
345
+ best_bboxes.append(best_bbox)
346
+ cls_bboxes = np.concatenate([cls_bboxes[: max_ind], cls_bboxes[max_ind + 1:]])
347
+ iou = bbox_iou(best_bbox[np.newaxis, :4], cls_bboxes[:, :4])
348
+ weight = np.ones((len(iou),), dtype=np.float32)
349
+
350
+ assert method in ['nms', 'soft-nms']
351
+
352
+ if method == 'nms':
353
+ iou_mask = iou > iou_threshold
354
+ weight[iou_mask] = 0.0
355
+
356
+ if method == 'soft-nms':
357
+ weight = np.exp(-(1.0 * iou ** 2 / sigma))
358
+
359
+ cls_bboxes[:, 4] = cls_bboxes[:, 4] * weight
360
+ score_mask = cls_bboxes[:, 4] > 0.
361
+ cls_bboxes = cls_bboxes[score_mask]
362
+
363
+ return best_bboxes
364
+
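In the loop above, plain NMS zeroes the score of every remaining same-class box whose IoU with the current best box exceeds iou_threshold, while soft-NMS keeps those boxes but decays their scores with a Gaussian penalty. With assumed numbers:

# nms:      weight = 0.0 when IoU > iou_threshold, else 1.0
# soft-nms: weight = exp(-IoU**2 / sigma), e.g. IoU = 0.8, sigma = 0.3 -> weight ≈ 0.12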
365
+ def freeze_all(model, frozen=True):
366
+ model.trainable = not frozen
367
+ if isinstance(model, tf.keras.Model):
368
+ for l in model.layers:
369
+ freeze_all(l, frozen)
370
+ def unfreeze_all(model, frozen=False):
371
+ model.trainable = not frozen
372
+ if isinstance(model, tf.keras.Model):
373
+ for l in model.layers:
374
+ unfreeze_all(l, frozen)
375
+
core/yolov4.py ADDED
@@ -0,0 +1,367 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #! /usr/bin/env python
2
+ # coding=utf-8
3
+
4
+ import numpy as np
5
+ import tensorflow as tf
6
+ import core.utils as utils
7
+ import core.common as common
8
+ import core.backbone as backbone
9
+ from core.config import cfg
10
+
11
+ # NUM_CLASS = len(utils.read_class_names(cfg.YOLO.CLASSES))
12
+ # STRIDES = np.array(cfg.YOLO.STRIDES)
13
+ # IOU_LOSS_THRESH = cfg.YOLO.IOU_LOSS_THRESH
14
+ # XYSCALE = cfg.YOLO.XYSCALE
15
+ # ANCHORS = utils.get_anchors(cfg.YOLO.ANCHORS)
16
+
17
+ def YOLO(input_layer, NUM_CLASS, model='yolov4', is_tiny=False):
18
+ if is_tiny:
19
+ if model == 'yolov4':
20
+ return YOLOv4_tiny(input_layer, NUM_CLASS)
21
+ elif model == 'yolov3':
22
+ return YOLOv3_tiny(input_layer, NUM_CLASS)
23
+ else:
24
+ if model == 'yolov4':
25
+ return YOLOv4(input_layer, NUM_CLASS)
26
+ elif model == 'yolov3':
27
+ return YOLOv3(input_layer, NUM_CLASS)
28
+
29
+ def YOLOv3(input_layer, NUM_CLASS):
30
+ route_1, route_2, conv = backbone.darknet53(input_layer)
31
+
32
+ conv = common.convolutional(conv, (1, 1, 1024, 512))
33
+ conv = common.convolutional(conv, (3, 3, 512, 1024))
34
+ conv = common.convolutional(conv, (1, 1, 1024, 512))
35
+ conv = common.convolutional(conv, (3, 3, 512, 1024))
36
+ conv = common.convolutional(conv, (1, 1, 1024, 512))
37
+
38
+ conv_lobj_branch = common.convolutional(conv, (3, 3, 512, 1024))
39
+ conv_lbbox = common.convolutional(conv_lobj_branch, (1, 1, 1024, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
40
+
41
+ conv = common.convolutional(conv, (1, 1, 512, 256))
42
+ conv = common.upsample(conv)
43
+
44
+ conv = tf.concat([conv, route_2], axis=-1)
45
+
46
+ conv = common.convolutional(conv, (1, 1, 768, 256))
47
+ conv = common.convolutional(conv, (3, 3, 256, 512))
48
+ conv = common.convolutional(conv, (1, 1, 512, 256))
49
+ conv = common.convolutional(conv, (3, 3, 256, 512))
50
+ conv = common.convolutional(conv, (1, 1, 512, 256))
51
+
52
+ conv_mobj_branch = common.convolutional(conv, (3, 3, 256, 512))
53
+ conv_mbbox = common.convolutional(conv_mobj_branch, (1, 1, 512, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
54
+
55
+ conv = common.convolutional(conv, (1, 1, 256, 128))
56
+ conv = common.upsample(conv)
57
+
58
+ conv = tf.concat([conv, route_1], axis=-1)
59
+
60
+ conv = common.convolutional(conv, (1, 1, 384, 128))
61
+ conv = common.convolutional(conv, (3, 3, 128, 256))
62
+ conv = common.convolutional(conv, (1, 1, 256, 128))
63
+ conv = common.convolutional(conv, (3, 3, 128, 256))
64
+ conv = common.convolutional(conv, (1, 1, 256, 128))
65
+
66
+ conv_sobj_branch = common.convolutional(conv, (3, 3, 128, 256))
67
+ conv_sbbox = common.convolutional(conv_sobj_branch, (1, 1, 256, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
68
+
69
+ return [conv_sbbox, conv_mbbox, conv_lbbox]
70
+
71
+ def YOLOv4(input_layer, NUM_CLASS):
72
+ route_1, route_2, conv = backbone.cspdarknet53(input_layer)
73
+
74
+ route = conv
75
+ conv = common.convolutional(conv, (1, 1, 512, 256))
76
+ conv = common.upsample(conv)
77
+ route_2 = common.convolutional(route_2, (1, 1, 512, 256))
78
+ conv = tf.concat([route_2, conv], axis=-1)
79
+
80
+ conv = common.convolutional(conv, (1, 1, 512, 256))
81
+ conv = common.convolutional(conv, (3, 3, 256, 512))
82
+ conv = common.convolutional(conv, (1, 1, 512, 256))
83
+ conv = common.convolutional(conv, (3, 3, 256, 512))
84
+ conv = common.convolutional(conv, (1, 1, 512, 256))
85
+
86
+ route_2 = conv
87
+ conv = common.convolutional(conv, (1, 1, 256, 128))
88
+ conv = common.upsample(conv)
89
+ route_1 = common.convolutional(route_1, (1, 1, 256, 128))
90
+ conv = tf.concat([route_1, conv], axis=-1)
91
+
92
+ conv = common.convolutional(conv, (1, 1, 256, 128))
93
+ conv = common.convolutional(conv, (3, 3, 128, 256))
94
+ conv = common.convolutional(conv, (1, 1, 256, 128))
95
+ conv = common.convolutional(conv, (3, 3, 128, 256))
96
+ conv = common.convolutional(conv, (1, 1, 256, 128))
97
+
98
+ route_1 = conv
99
+ conv = common.convolutional(conv, (3, 3, 128, 256))
100
+ conv_sbbox = common.convolutional(conv, (1, 1, 256, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
101
+
102
+ conv = common.convolutional(route_1, (3, 3, 128, 256), downsample=True)
103
+ conv = tf.concat([conv, route_2], axis=-1)
104
+
105
+ conv = common.convolutional(conv, (1, 1, 512, 256))
106
+ conv = common.convolutional(conv, (3, 3, 256, 512))
107
+ conv = common.convolutional(conv, (1, 1, 512, 256))
108
+ conv = common.convolutional(conv, (3, 3, 256, 512))
109
+ conv = common.convolutional(conv, (1, 1, 512, 256))
110
+
111
+ route_2 = conv
112
+ conv = common.convolutional(conv, (3, 3, 256, 512))
113
+ conv_mbbox = common.convolutional(conv, (1, 1, 512, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
114
+
115
+ conv = common.convolutional(route_2, (3, 3, 256, 512), downsample=True)
116
+ conv = tf.concat([conv, route], axis=-1)
117
+
118
+ conv = common.convolutional(conv, (1, 1, 1024, 512))
119
+ conv = common.convolutional(conv, (3, 3, 512, 1024))
120
+ conv = common.convolutional(conv, (1, 1, 1024, 512))
121
+ conv = common.convolutional(conv, (3, 3, 512, 1024))
122
+ conv = common.convolutional(conv, (1, 1, 1024, 512))
123
+
124
+ conv = common.convolutional(conv, (3, 3, 512, 1024))
125
+ conv_lbbox = common.convolutional(conv, (1, 1, 1024, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
126
+
127
+ return [conv_sbbox, conv_mbbox, conv_lbbox]
128
+
129
+ def YOLOv4_tiny(input_layer, NUM_CLASS):
130
+ route_1, conv = backbone.cspdarknet53_tiny(input_layer)
131
+
132
+ conv = common.convolutional(conv, (1, 1, 512, 256))
133
+
134
+ conv_lobj_branch = common.convolutional(conv, (3, 3, 256, 512))
135
+ conv_lbbox = common.convolutional(conv_lobj_branch, (1, 1, 512, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
136
+
137
+ conv = common.convolutional(conv, (1, 1, 256, 128))
138
+ conv = common.upsample(conv)
139
+ conv = tf.concat([conv, route_1], axis=-1)
140
+
141
+ conv_mobj_branch = common.convolutional(conv, (3, 3, 128, 256))
142
+ conv_mbbox = common.convolutional(conv_mobj_branch, (1, 1, 256, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
143
+
144
+ return [conv_mbbox, conv_lbbox]
145
+
146
+ def YOLOv3_tiny(input_layer, NUM_CLASS):
147
+ route_1, conv = backbone.darknet53_tiny(input_layer)
148
+
149
+ conv = common.convolutional(conv, (1, 1, 1024, 256))
150
+
151
+ conv_lobj_branch = common.convolutional(conv, (3, 3, 256, 512))
152
+ conv_lbbox = common.convolutional(conv_lobj_branch, (1, 1, 512, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
153
+
154
+ conv = common.convolutional(conv, (1, 1, 256, 128))
155
+ conv = common.upsample(conv)
156
+ conv = tf.concat([conv, route_1], axis=-1)
157
+
158
+ conv_mobj_branch = common.convolutional(conv, (3, 3, 128, 256))
159
+ conv_mbbox = common.convolutional(conv_mobj_branch, (1, 1, 256, 3 * (NUM_CLASS + 5)), activate=False, bn=False)
160
+
161
+ return [conv_mbbox, conv_lbbox]
162
+
163
+ def decode(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE=[1,1,1], FRAMEWORK='tf'):
164
+ if FRAMEWORK == 'trt':
165
+ return decode_trt(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=i, XYSCALE=XYSCALE)
166
+ elif FRAMEWORK == 'tflite':
167
+ return decode_tflite(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=i, XYSCALE=XYSCALE)
168
+ else:
169
+ return decode_tf(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=i, XYSCALE=XYSCALE)
170
+
171
+ def decode_train(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=0, XYSCALE=[1, 1, 1]):
172
+ conv_output = tf.reshape(conv_output,
173
+ (tf.shape(conv_output)[0], output_size, output_size, 3, 5 + NUM_CLASS))
174
+
175
+ conv_raw_dxdy, conv_raw_dwdh, conv_raw_conf, conv_raw_prob = tf.split(conv_output, (2, 2, 1, NUM_CLASS),
176
+ axis=-1)
177
+
178
+ xy_grid = tf.meshgrid(tf.range(output_size), tf.range(output_size))
179
+ xy_grid = tf.expand_dims(tf.stack(xy_grid, axis=-1), axis=2) # [gx, gy, 1, 2]
180
+ xy_grid = tf.tile(tf.expand_dims(xy_grid, axis=0), [tf.shape(conv_output)[0], 1, 1, 3, 1])
181
+
182
+ xy_grid = tf.cast(xy_grid, tf.float32)
183
+
184
+ pred_xy = ((tf.sigmoid(conv_raw_dxdy) * XYSCALE[i]) - 0.5 * (XYSCALE[i] - 1) + xy_grid) * \
185
+ STRIDES[i]
186
+ pred_wh = (tf.exp(conv_raw_dwdh) * ANCHORS[i])
187
+ pred_xywh = tf.concat([pred_xy, pred_wh], axis=-1)
188
+
189
+ pred_conf = tf.sigmoid(conv_raw_conf)
190
+ pred_prob = tf.sigmoid(conv_raw_prob)
191
+
192
+ return tf.concat([pred_xywh, pred_conf, pred_prob], axis=-1)
193
+
194
+ def decode_tf(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=0, XYSCALE=[1, 1, 1]):
195
+ batch_size = tf.shape(conv_output)[0]
196
+ conv_output = tf.reshape(conv_output,
197
+ (batch_size, output_size, output_size, 3, 5 + NUM_CLASS))
198
+
199
+ conv_raw_dxdy, conv_raw_dwdh, conv_raw_conf, conv_raw_prob = tf.split(conv_output, (2, 2, 1, NUM_CLASS),
200
+ axis=-1)
201
+
202
+ xy_grid = tf.meshgrid(tf.range(output_size), tf.range(output_size))
203
+ xy_grid = tf.expand_dims(tf.stack(xy_grid, axis=-1), axis=2) # [gx, gy, 1, 2]
204
+ xy_grid = tf.tile(tf.expand_dims(xy_grid, axis=0), [batch_size, 1, 1, 3, 1])
205
+
206
+ xy_grid = tf.cast(xy_grid, tf.float32)
207
+
208
+ pred_xy = ((tf.sigmoid(conv_raw_dxdy) * XYSCALE[i]) - 0.5 * (XYSCALE[i] - 1) + xy_grid) * \
209
+ STRIDES[i]
210
+ pred_wh = (tf.exp(conv_raw_dwdh) * ANCHORS[i])
211
+ pred_xywh = tf.concat([pred_xy, pred_wh], axis=-1)
212
+
213
+ pred_conf = tf.sigmoid(conv_raw_conf)
214
+ pred_prob = tf.sigmoid(conv_raw_prob)
215
+
216
+ pred_prob = pred_conf * pred_prob
217
+ pred_prob = tf.reshape(pred_prob, (batch_size, -1, NUM_CLASS))
218
+ pred_xywh = tf.reshape(pred_xywh, (batch_size, -1, 4))
219
+
220
+ return pred_xywh, pred_prob
221
+ # return tf.concat([pred_xywh, pred_conf, pred_prob], axis=-1)
222
+
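decode_train and decode_tf share the same decoding rule: the sigmoid of the raw x/y offset is stretched by XYSCALE, shifted by -0.5 * (XYSCALE - 1) so the grid-cell midpoint stays fixed, added to the cell index, and scaled by the stride, while width/height come from exp(raw) times the anchor. A worked number with assumed values:

# raw_dx = 0.0 -> sigmoid = 0.5; XYSCALE[i] = 1.05, grid cell x-index = 7, STRIDES[i] = 32
# pred_x = (0.5 * 1.05 - 0.5 * (1.05 - 1) + 7) * 32 = (0.525 - 0.025 + 7) * 32 = 240.0 pixels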
223
+ def decode_tflite(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=0, XYSCALE=[1,1,1]):
224
+ conv_raw_dxdy_0, conv_raw_dwdh_0, conv_raw_score_0,\
225
+ conv_raw_dxdy_1, conv_raw_dwdh_1, conv_raw_score_1,\
226
+ conv_raw_dxdy_2, conv_raw_dwdh_2, conv_raw_score_2 = tf.split(conv_output, (2, 2, 1+NUM_CLASS, 2, 2, 1+NUM_CLASS,
227
+ 2, 2, 1+NUM_CLASS), axis=-1)
228
+
229
+ conv_raw_score = [conv_raw_score_0, conv_raw_score_1, conv_raw_score_2]
230
+ for idx, score in enumerate(conv_raw_score):
231
+ score = tf.sigmoid(score)
232
+ score = score[:, :, :, 0:1] * score[:, :, :, 1:]
233
+ conv_raw_score[idx] = tf.reshape(score, (1, -1, NUM_CLASS))
234
+ pred_prob = tf.concat(conv_raw_score, axis=1)
235
+
236
+ conv_raw_dwdh = [conv_raw_dwdh_0, conv_raw_dwdh_1, conv_raw_dwdh_2]
237
+ for idx, dwdh in enumerate(conv_raw_dwdh):
238
+ dwdh = tf.exp(dwdh) * ANCHORS[i][idx]
239
+ conv_raw_dwdh[idx] = tf.reshape(dwdh, (1, -1, 2))
240
+ pred_wh = tf.concat(conv_raw_dwdh, axis=1)
241
+
242
+ xy_grid = tf.meshgrid(tf.range(output_size), tf.range(output_size))
243
+ xy_grid = tf.stack(xy_grid, axis=-1) # [gx, gy, 2]
244
+ xy_grid = tf.expand_dims(xy_grid, axis=0)
245
+ xy_grid = tf.cast(xy_grid, tf.float32)
246
+
247
+ conv_raw_dxdy = [conv_raw_dxdy_0, conv_raw_dxdy_1, conv_raw_dxdy_2]
248
+ for idx, dxdy in enumerate(conv_raw_dxdy):
249
+ dxdy = ((tf.sigmoid(dxdy) * XYSCALE[i]) - 0.5 * (XYSCALE[i] - 1) + xy_grid) * \
250
+ STRIDES[i]
251
+ conv_raw_dxdy[idx] = tf.reshape(dxdy, (1, -1, 2))
252
+ pred_xy = tf.concat(conv_raw_dxdy, axis=1)
253
+ pred_xywh = tf.concat([pred_xy, pred_wh], axis=-1)
254
+ return pred_xywh, pred_prob
255
+ # return tf.concat([pred_xywh, pred_conf, pred_prob], axis=-1)
256
+
257
+ def decode_trt(conv_output, output_size, NUM_CLASS, STRIDES, ANCHORS, i=0, XYSCALE=[1,1,1]):
258
+ batch_size = tf.shape(conv_output)[0]
259
+ conv_output = tf.reshape(conv_output, (batch_size, output_size, output_size, 3, 5 + NUM_CLASS))
260
+
261
+ conv_raw_dxdy, conv_raw_dwdh, conv_raw_conf, conv_raw_prob = tf.split(conv_output, (2, 2, 1, NUM_CLASS), axis=-1)
262
+
263
+ xy_grid = tf.meshgrid(tf.range(output_size), tf.range(output_size))
264
+ xy_grid = tf.expand_dims(tf.stack(xy_grid, axis=-1), axis=2) # [gx, gy, 1, 2]
265
+ xy_grid = tf.tile(tf.expand_dims(xy_grid, axis=0), [batch_size, 1, 1, 3, 1])
266
+
267
+ # x = tf.tile(tf.expand_dims(tf.range(output_size, dtype=tf.float32), axis=0), [output_size, 1])
268
+ # y = tf.tile(tf.expand_dims(tf.range(output_size, dtype=tf.float32), axis=1), [1, output_size])
269
+ # xy_grid = tf.expand_dims(tf.stack([x, y], axis=-1), axis=2) # [gx, gy, 1, 2]
270
+ # xy_grid = tf.tile(tf.expand_dims(xy_grid, axis=0), [tf.shape(conv_output)[0], 1, 1, 3, 1])
271
+
272
+ xy_grid = tf.cast(xy_grid, tf.float32)
273
+
274
+ # pred_xy = ((tf.sigmoid(conv_raw_dxdy) * XYSCALE[i]) - 0.5 * (XYSCALE[i] - 1) + xy_grid) * \
275
+ # STRIDES[i]
276
+ pred_xy = (tf.reshape(tf.sigmoid(conv_raw_dxdy), (-1, 2)) * XYSCALE[i] - 0.5 * (XYSCALE[i] - 1) + tf.reshape(xy_grid, (-1, 2))) * STRIDES[i]
277
+ pred_xy = tf.reshape(pred_xy, (batch_size, output_size, output_size, 3, 2))
278
+ pred_wh = (tf.exp(conv_raw_dwdh) * ANCHORS[i])
279
+ pred_xywh = tf.concat([pred_xy, pred_wh], axis=-1)
280
+
281
+ pred_conf = tf.sigmoid(conv_raw_conf)
282
+ pred_prob = tf.sigmoid(conv_raw_prob)
283
+
284
+ pred_prob = pred_conf * pred_prob
285
+
286
+ pred_prob = tf.reshape(pred_prob, (batch_size, -1, NUM_CLASS))
287
+ pred_xywh = tf.reshape(pred_xywh, (batch_size, -1, 4))
288
+ return pred_xywh, pred_prob
289
+ # return tf.concat([pred_xywh, pred_conf, pred_prob], axis=-1)
290
+
291
+
292
+ def filter_boxes(box_xywh, scores, score_threshold=0.4, input_shape = tf.constant([416,416])):
293
+ scores_max = tf.math.reduce_max(scores, axis=-1)
294
+
295
+ mask = scores_max >= score_threshold
296
+ class_boxes = tf.boolean_mask(box_xywh, mask)
297
+ pred_conf = tf.boolean_mask(scores, mask)
298
+ class_boxes = tf.reshape(class_boxes, [tf.shape(scores)[0], -1, tf.shape(class_boxes)[-1]])
299
+ pred_conf = tf.reshape(pred_conf, [tf.shape(scores)[0], -1, tf.shape(pred_conf)[-1]])
300
+
301
+ box_xy, box_wh = tf.split(class_boxes, (2, 2), axis=-1)
302
+
303
+ input_shape = tf.cast(input_shape, dtype=tf.float32)
304
+
305
+ box_yx = box_xy[..., ::-1]
306
+ box_hw = box_wh[..., ::-1]
307
+
308
+ box_mins = (box_yx - (box_hw / 2.)) / input_shape
309
+ box_maxes = (box_yx + (box_hw / 2.)) / input_shape
310
+ boxes = tf.concat([
311
+ box_mins[..., 0:1], # y_min
312
+ box_mins[..., 1:2], # x_min
313
+ box_maxes[..., 0:1], # y_max
314
+ box_maxes[..., 1:2] # x_max
315
+ ], axis=-1)
316
+ # return tf.concat([boxes, pred_conf], axis=-1)
317
+ return (boxes, pred_conf)
318
+
319
+
320
+ def compute_loss(pred, conv, label, bboxes, STRIDES, NUM_CLASS, IOU_LOSS_THRESH, i=0):
321
+ conv_shape = tf.shape(conv)
322
+ batch_size = conv_shape[0]
323
+ output_size = conv_shape[1]
324
+ input_size = STRIDES[i] * output_size
325
+ conv = tf.reshape(conv, (batch_size, output_size, output_size, 3, 5 + NUM_CLASS))
326
+
327
+ conv_raw_conf = conv[:, :, :, :, 4:5]
328
+ conv_raw_prob = conv[:, :, :, :, 5:]
329
+
330
+ pred_xywh = pred[:, :, :, :, 0:4]
331
+ pred_conf = pred[:, :, :, :, 4:5]
332
+
333
+ label_xywh = label[:, :, :, :, 0:4]
334
+ respond_bbox = label[:, :, :, :, 4:5]
335
+ label_prob = label[:, :, :, :, 5:]
336
+
337
+ giou = tf.expand_dims(utils.bbox_giou(pred_xywh, label_xywh), axis=-1)
338
+ input_size = tf.cast(input_size, tf.float32)
339
+
340
+ bbox_loss_scale = 2.0 - 1.0 * label_xywh[:, :, :, :, 2:3] * label_xywh[:, :, :, :, 3:4] / (input_size ** 2)
341
+ giou_loss = respond_bbox * bbox_loss_scale * (1 - giou)
342
+
343
+ iou = utils.bbox_iou(pred_xywh[:, :, :, :, np.newaxis, :], bboxes[:, np.newaxis, np.newaxis, np.newaxis, :, :])
344
+ max_iou = tf.expand_dims(tf.reduce_max(iou, axis=-1), axis=-1)
345
+
346
+ respond_bgd = (1.0 - respond_bbox) * tf.cast( max_iou < IOU_LOSS_THRESH, tf.float32 )
347
+
348
+ conf_focal = tf.pow(respond_bbox - pred_conf, 2)
349
+
350
+ conf_loss = conf_focal * (
351
+ respond_bbox * tf.nn.sigmoid_cross_entropy_with_logits(labels=respond_bbox, logits=conv_raw_conf)
352
+ +
353
+ respond_bgd * tf.nn.sigmoid_cross_entropy_with_logits(labels=respond_bbox, logits=conv_raw_conf)
354
+ )
355
+
356
+ prob_loss = respond_bbox * tf.nn.sigmoid_cross_entropy_with_logits(labels=label_prob, logits=conv_raw_prob)
357
+
358
+ giou_loss = tf.reduce_mean(tf.reduce_sum(giou_loss, axis=[1,2,3,4]))
359
+ conf_loss = tf.reduce_mean(tf.reduce_sum(conf_loss, axis=[1,2,3,4]))
360
+ prob_loss = tf.reduce_mean(tf.reduce_sum(prob_loss, axis=[1,2,3,4]))
361
+
362
+ return giou_loss, conf_loss, prob_loss
363
+
364
+
365
+
366
+
367
+
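compute_loss returns the three YOLO loss terms for one output scale: a GIoU box term scaled by 2 - w*h/input_size^2 so small boxes weigh more, an objectness term with a squared focal factor that also ignores high-IoU background anchors, and a per-class sigmoid cross-entropy term. A training step would typically sum them over the three scales, roughly as in the sketch below (optimizer, dataset, target construction and layer freezing are omitted; all names are assumed, this is not the training script from the commit):

with tf.GradientTape() as tape:
    pred_result = model(image_data, training=True)   # alternating (raw conv, decoded pred) per output scale
    total_loss = 0.0
    for i in range(3):
        conv, pred = pred_result[i * 2], pred_result[i * 2 + 1]
        giou_loss, conf_loss, prob_loss = compute_loss(
            pred, conv, target[i][0], target[i][1],
            STRIDES=STRIDES, NUM_CLASS=NUM_CLASS, IOU_LOSS_THRESH=IOU_LOSS_THRESH, i=i)
        total_loss += giou_loss + conf_loss + prob_loss
gradients = tape.gradient(total_loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))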
data/anchors/basline_anchors.txt ADDED
@@ -0,0 +1 @@
 
1
+ 1.25,1.625, 2.0,3.75, 4.125,2.875, 1.875,3.8125, 3.875,2.8125, 3.6875,7.4375, 3.625,2.8125, 4.875,6.1875, 11.65625,10.1875
data/anchors/basline_tiny_anchors.txt ADDED
@@ -0,0 +1 @@
 
1
+ 23,27, 37,58, 81,82, 81,82, 135,169, 344,319
data/anchors/yolov3_anchors.txt ADDED
@@ -0,0 +1 @@
 
1
+ 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
data/anchors/yolov4_anchors.txt ADDED
@@ -0,0 +1 @@
 
1
+ 12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401
data/classes/coco.names ADDED
@@ -0,0 +1,80 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ person
2
+ bicycle
3
+ car
4
+ motorbike
5
+ aeroplane
6
+ bus
7
+ train
8
+ truck
9
+ boat
10
+ traffic light
11
+ fire hydrant
12
+ stop sign
13
+ parking meter
14
+ bench
15
+ bird
16
+ cat
17
+ dog
18
+ horse
19
+ sheep
20
+ cow
21
+ elephant
22
+ bear
23
+ zebra
24
+ giraffe
25
+ backpack
26
+ umbrella
27
+ handbag
28
+ tie
29
+ suitcase
30
+ frisbee
31
+ skis
32
+ snowboard
33
+ sports ball
34
+ kite
35
+ baseball bat
36
+ baseball glove
37
+ skateboard
38
+ surfboard
39
+ tennis racket
40
+ bottle
41
+ wine glass
42
+ cup
43
+ fork
44
+ knife
45
+ spoon
46
+ bowl
47
+ banana
48
+ apple
49
+ sandwich
50
+ orange
51
+ broccoli
52
+ carrot
53
+ hot dog
54
+ pizza
55
+ donut
56
+ cake
57
+ chair
58
+ sofa
59
+ potted plant
60
+ bed
61
+ dining table
62
+ toilet
63
+ tvmonitor
64
+ laptop
65
+ mouse
66
+ remote
67
+ keyboard
68
+ cell phone
69
+ microwave
70
+ oven
71
+ toaster
72
+ sink
73
+ refrigerator
74
+ book
75
+ clock
76
+ vase
77
+ scissors
78
+ teddy bear
79
+ hair drier
80
+ toothbrush
data/classes/voc.names ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ aeroplane
2
+ bicycle
3
+ bird
4
+ boat
5
+ bottle
6
+ bus
7
+ car
8
+ cat
9
+ chair
10
+ cow
11
+ diningtable
12
+ dog
13
+ horse
14
+ motorbike
15
+ person
16
+ pottedplant
17
+ sheep
18
+ sofa
19
+ train
20
+ tvmonitor
data/classes/yymnist.names ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
1
+ 0
2
+ 1
3
+ 2
4
+ 3
5
+ 4
6
+ 5
7
+ 6
8
+ 7
9
+ 8
10
+ 9
data/dataset/val2014.txt ADDED
The diff for this file is too large to render. See raw diff
data/dataset/val2017.txt ADDED
The diff for this file is too large to render. See raw diff
data/girl.png ADDED
data/kite.jpg ADDED
data/performance.png ADDED
data/road.mp4 ADDED
Binary file (801 kB).
detect.py ADDED
@@ -0,0 +1,92 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import tensorflow as tf
2
+ physical_devices = tf.config.experimental.list_physical_devices('GPU')
3
+ if len(physical_devices) > 0:
4
+ tf.config.experimental.set_memory_growth(physical_devices[0], True)
5
+ from absl import app, flags, logging
6
+ from absl.flags import FLAGS
7
+ import core.utils as utils
8
+ from core.yolov4 import filter_boxes
9
+ from tensorflow.python.saved_model import tag_constants
10
+ from PIL import Image
11
+ import cv2
12
+ import numpy as np
13
+ from tensorflow.compat.v1 import ConfigProto
14
+ from tensorflow.compat.v1 import InteractiveSession
15
+
16
+ flags.DEFINE_string('framework', 'tf', '(tf, tflite, trt)')
17
+ flags.DEFINE_string('weights', './checkpoints/yolov4-416',
18
+ 'path to weights file')
19
+ flags.DEFINE_integer('size', 416, 'resize images to')
20
+ flags.DEFINE_boolean('tiny', False, 'yolo or yolo-tiny')
21
+ flags.DEFINE_string('model', 'yolov4', 'yolov3 or yolov4')
22
+ flags.DEFINE_string('image', './data/kite.jpg', 'path to input image')
23
+ flags.DEFINE_string('output', 'result.png', 'path to output image')
24
+ flags.DEFINE_float('iou', 0.45, 'iou threshold')
25
+ flags.DEFINE_float('score', 0.25, 'score threshold')
26
+
27
+ def main(_argv):
28
+ config = ConfigProto()
29
+ config.gpu_options.allow_growth = True
30
+ session = InteractiveSession(config=config)
31
+ STRIDES, ANCHORS, NUM_CLASS, XYSCALE = utils.load_config(FLAGS)
32
+ input_size = FLAGS.size
33
+ image_path = FLAGS.image
34
+
35
+ original_image = cv2.imread(image_path)
36
+ original_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
37
+
38
+ # image_data = utils.image_preprocess(np.copy(original_image), [input_size, input_size])
39
+ image_data = cv2.resize(original_image, (input_size, input_size))
40
+ image_data = image_data / 255.
41
+ # image_data = image_data[np.newaxis, ...].astype(np.float32)
42
+
43
+ images_data = []
44
+ for i in range(1):
45
+ images_data.append(image_data)
46
+ images_data = np.asarray(images_data).astype(np.float32)
47
+
48
+ if FLAGS.framework == 'tflite':
49
+ interpreter = tf.lite.Interpreter(model_path=FLAGS.weights)
50
+ interpreter.allocate_tensors()
51
+ input_details = interpreter.get_input_details()
52
+ output_details = interpreter.get_output_details()
53
+ print(input_details)
54
+ print(output_details)
55
+ interpreter.set_tensor(input_details[0]['index'], images_data)
56
+ interpreter.invoke()
57
+ pred = [interpreter.get_tensor(output_details[i]['index']) for i in range(len(output_details))]
58
+ if FLAGS.model == 'yolov3' and FLAGS.tiny == True:
59
+ boxes, pred_conf = filter_boxes(pred[1], pred[0], score_threshold=0.25, input_shape=tf.constant([input_size, input_size]))
60
+ else:
61
+ boxes, pred_conf = filter_boxes(pred[0], pred[1], score_threshold=0.25, input_shape=tf.constant([input_size, input_size]))
62
+ else:
63
+ saved_model_loaded = tf.saved_model.load(FLAGS.weights, tags=[tag_constants.SERVING])
64
+ infer = saved_model_loaded.signatures['serving_default']
65
+ batch_data = tf.constant(images_data)
66
+ pred_bbox = infer(batch_data)
67
+ for key, value in pred_bbox.items():
68
+ boxes = value[:, :, 0:4]
69
+ pred_conf = value[:, :, 4:]
70
+
71
+ boxes, scores, classes, valid_detections = tf.image.combined_non_max_suppression(
72
+ boxes=tf.reshape(boxes, (tf.shape(boxes)[0], -1, 1, 4)),
73
+ scores=tf.reshape(
74
+ pred_conf, (tf.shape(pred_conf)[0], -1, tf.shape(pred_conf)[-1])),
75
+ max_output_size_per_class=50,
76
+ max_total_size=50,
77
+ iou_threshold=FLAGS.iou,
78
+ score_threshold=FLAGS.score
79
+ )
80
+ pred_bbox = [boxes.numpy(), scores.numpy(), classes.numpy(), valid_detections.numpy()]
81
+ image = utils.draw_bbox(original_image, pred_bbox)
82
+ # image = utils.draw_bbox(image_data*255, pred_bbox)
83
+ image = Image.fromarray(image.astype(np.uint8))
84
+ image.show()
85
+ image = cv2.cvtColor(np.array(image), cv2.COLOR_BGR2RGB)
86
+ cv2.imwrite(FLAGS.output, image)
87
+
88
+ if __name__ == '__main__':
89
+ try:
90
+ app.run(main)
91
+ except SystemExit:
92
+ pass
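detect.py runs single-image inference against either a SavedModel (--framework tf) or a TFLite file (--framework tflite), using the flags defined at the top of the script. Typical invocations, assuming converted weights already exist under ./checkpoints (paths are illustrative):

python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --image ./data/kite.jpg --output result.png
python detect.py --framework tflite --weights ./checkpoints/yolov4-416.tflite --size 416 --model yolov4 --image ./data/kite.jpg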
detectvideo.py ADDED
@@ -0,0 +1,127 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import time
2
+ import tensorflow as tf
3
+ physical_devices = tf.config.experimental.list_physical_devices('GPU')
4
+ if len(physical_devices) > 0:
5
+ tf.config.experimental.set_memory_growth(physical_devices[0], True)
6
+ from absl import app, flags, logging
7
+ from absl.flags import FLAGS
8
+ import core.utils as utils
9
+ from core.yolov4 import filter_boxes
10
+ from tensorflow.python.saved_model import tag_constants
11
+ from PIL import Image
12
+ import cv2
13
+ import numpy as np
14
+ from tensorflow.compat.v1 import ConfigProto
15
+ from tensorflow.compat.v1 import InteractiveSession
16
+
17
+ flags.DEFINE_string('framework', 'tf', '(tf, tflite, trt)')
18
+ flags.DEFINE_string('weights', './checkpoints/yolov4-416',
19
+ 'path to weights file')
20
+ flags.DEFINE_integer('size', 416, 'resize images to')
21
+ flags.DEFINE_boolean('tiny', False, 'yolo or yolo-tiny')
22
+ flags.DEFINE_string('model', 'yolov4', 'yolov3 or yolov4')
23
+ flags.DEFINE_string('video', './data/road.mp4', 'path to input video')
24
+ flags.DEFINE_float('iou', 0.45, 'iou threshold')
25
+ flags.DEFINE_float('score', 0.25, 'score threshold')
26
+ flags.DEFINE_string('output', None, 'path to output video')
27
+ flags.DEFINE_string('output_format', 'XVID', 'codec used in VideoWriter when saving video to file')
28
+ flags.DEFINE_boolean('dis_cv2_window', False, 'disable cv2 window during the process') # this is good for the .ipynb
29
+
30
+ def main(_argv):
31
+ config = ConfigProto()
32
+ config.gpu_options.allow_growth = True
33
+ session = InteractiveSession(config=config)
34
+ STRIDES, ANCHORS, NUM_CLASS, XYSCALE = utils.load_config(FLAGS)
35
+ input_size = FLAGS.size
36
+ video_path = FLAGS.video
37
+
38
+ print("Video from: ", video_path )
39
+ vid = cv2.VideoCapture(video_path)
40
+
41
+ if FLAGS.framework == 'tflite':
42
+ interpreter = tf.lite.Interpreter(model_path=FLAGS.weights)
43
+ interpreter.allocate_tensors()
44
+ input_details = interpreter.get_input_details()
45
+ output_details = interpreter.get_output_details()
46
+ print(input_details)
47
+ print(output_details)
48
+ else:
49
+ saved_model_loaded = tf.saved_model.load(FLAGS.weights, tags=[tag_constants.SERVING])
50
+ infer = saved_model_loaded.signatures['serving_default']
51
+
52
+ if FLAGS.output:
53
+ # by default VideoCapture returns float instead of int
54
+ width = int(vid.get(cv2.CAP_PROP_FRAME_WIDTH))
55
+ height = int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT))
56
+ fps = int(vid.get(cv2.CAP_PROP_FPS))
57
+ codec = cv2.VideoWriter_fourcc(*FLAGS.output_format)
58
+ out = cv2.VideoWriter(FLAGS.output, codec, fps, (width, height))
59
+
60
+ frame_id = 0
61
+ while True:
62
+ return_value, frame = vid.read()
63
+ if return_value:
64
+ frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
65
+ image = Image.fromarray(frame)
66
+ else:
67
+ if frame_id == vid.get(cv2.CAP_PROP_FRAME_COUNT):
68
+ print("Video processing complete")
69
+ break
70
+ raise ValueError("No image! Try with another video format")
71
+
72
+ frame_size = frame.shape[:2]
73
+ image_data = cv2.resize(frame, (input_size, input_size))
74
+ image_data = image_data / 255.
75
+ image_data = image_data[np.newaxis, ...].astype(np.float32)
76
+ prev_time = time.time()
77
+
78
+ if FLAGS.framework == 'tflite':
79
+ interpreter.set_tensor(input_details[0]['index'], image_data)
80
+ interpreter.invoke()
81
+ pred = [interpreter.get_tensor(output_details[i]['index']) for i in range(len(output_details))]
82
+ if FLAGS.model == 'yolov3' and FLAGS.tiny == True:
83
+ boxes, pred_conf = filter_boxes(pred[1], pred[0], score_threshold=0.25,
84
+ input_shape=tf.constant([input_size, input_size]))
85
+ else:
86
+ boxes, pred_conf = filter_boxes(pred[0], pred[1], score_threshold=0.25,
87
+ input_shape=tf.constant([input_size, input_size]))
88
+ else:
89
+ batch_data = tf.constant(image_data)
90
+ pred_bbox = infer(batch_data)
91
+ for key, value in pred_bbox.items():
92
+ boxes = value[:, :, 0:4]
93
+ pred_conf = value[:, :, 4:]
94
+
95
+ boxes, scores, classes, valid_detections = tf.image.combined_non_max_suppression(
96
+ boxes=tf.reshape(boxes, (tf.shape(boxes)[0], -1, 1, 4)),
97
+ scores=tf.reshape(
98
+ pred_conf, (tf.shape(pred_conf)[0], -1, tf.shape(pred_conf)[-1])),
99
+ max_output_size_per_class=50,
100
+ max_total_size=50,
101
+ iou_threshold=FLAGS.iou,
102
+ score_threshold=FLAGS.score
103
+ )
104
+ pred_bbox = [boxes.numpy(), scores.numpy(), classes.numpy(), valid_detections.numpy()]
105
+ image = utils.draw_bbox(frame, pred_bbox)
106
+ curr_time = time.time()
107
+ exec_time = curr_time - prev_time
108
+ result = np.asarray(image)
109
+ info = "time: %.2f ms" %(1000*exec_time)
110
+ print(info)
111
+
112
+ result = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
113
+ if not FLAGS.dis_cv2_window:
114
+ cv2.namedWindow("result", cv2.WINDOW_AUTOSIZE)
115
+ cv2.imshow("result", result)
116
+ if cv2.waitKey(1) & 0xFF == ord('q'): break
117
+
118
+ if FLAGS.output:
119
+ out.write(result)
120
+
121
+ frame_id += 1
122
+
123
+ if __name__ == '__main__':
124
+ try:
125
+ app.run(main)
126
+ except SystemExit:
127
+ pass
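detectvideo.py applies the same per-frame pipeline to a video stream and can optionally write the annotated frames back out. An example run, again with assumed checkpoint and output paths:

python detectvideo.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --video ./data/road.mp4 --output ./detections/results.avi --dis_cv2_window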
evaluate.py ADDED
@@ -0,0 +1,143 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from absl import app, flags, logging
2
+ from absl.flags import FLAGS
3
+ import cv2
4
+ import os
5
+ import shutil
6
+ import numpy as np
7
+ import tensorflow as tf
8
+ from core.yolov4 import filter_boxes
9
+ from tensorflow.python.saved_model import tag_constants
10
+ import core.utils as utils
11
+ from core.config import cfg
12
+
13
+ flags.DEFINE_string('weights', './checkpoints/yolov4-416',
14
+ 'path to weights file')
15
+ flags.DEFINE_string('framework', 'tf', 'select model type in (tf, tflite, trt)'
16
+ )
17
+ flags.DEFINE_string('model', 'yolov4', 'yolov3 or yolov4')
18
+ flags.DEFINE_boolean('tiny', False, 'yolov3 or yolov3-tiny')
19
+ flags.DEFINE_integer('size', 416, 'resize images to')
20
+ flags.DEFINE_string('annotation_path', "./data/dataset/val2017.txt", 'annotation path')
21
+ flags.DEFINE_string('write_image_path', "./data/detection/", 'write image path')
22
+ flags.DEFINE_float('iou', 0.5, 'iou threshold')
23
+ flags.DEFINE_float('score', 0.25, 'score threshold')
24
+
25
+ def main(_argv):
26
+ INPUT_SIZE = FLAGS.size
27
+ STRIDES, ANCHORS, NUM_CLASS, XYSCALE = utils.load_config(FLAGS)
28
+ CLASSES = utils.read_class_names(cfg.YOLO.CLASSES)
29
+
30
+ predicted_dir_path = './mAP/predicted'
31
+ ground_truth_dir_path = './mAP/ground-truth'
32
+ if os.path.exists(predicted_dir_path): shutil.rmtree(predicted_dir_path)
33
+ if os.path.exists(ground_truth_dir_path): shutil.rmtree(ground_truth_dir_path)
34
+ if os.path.exists(cfg.TEST.DECTECTED_IMAGE_PATH): shutil.rmtree(cfg.TEST.DECTECTED_IMAGE_PATH)
35
+
36
+ os.mkdir(predicted_dir_path)
37
+ os.mkdir(ground_truth_dir_path)
38
+ os.mkdir(cfg.TEST.DECTECTED_IMAGE_PATH)
39
+
40
+ # Build Model
41
+ if FLAGS.framework == 'tflite':
42
+ interpreter = tf.lite.Interpreter(model_path=FLAGS.weights)
43
+ interpreter.allocate_tensors()
44
+ input_details = interpreter.get_input_details()
45
+ output_details = interpreter.get_output_details()
46
+ print(input_details)
47
+ print(output_details)
48
+ else:
49
+ saved_model_loaded = tf.saved_model.load(FLAGS.weights, tags=[tag_constants.SERVING])
50
+ infer = saved_model_loaded.signatures['serving_default']
51
+
52
+ num_lines = sum(1 for line in open(FLAGS.annotation_path))
53
+ with open(cfg.TEST.ANNOT_PATH, 'r') as annotation_file:
54
+ for num, line in enumerate(annotation_file):
55
+ annotation = line.strip().split()
56
+ image_path = annotation[0]
57
+ image_name = image_path.split('/')[-1]
58
+ image = cv2.imread(image_path)
59
+ image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
60
+ bbox_data_gt = np.array([list(map(int, box.split(','))) for box in annotation[1:]])
61
+
62
+ if len(bbox_data_gt) == 0:
63
+ bboxes_gt = []
64
+ classes_gt = []
65
+ else:
66
+ bboxes_gt, classes_gt = bbox_data_gt[:, :4], bbox_data_gt[:, 4]
67
+ ground_truth_path = os.path.join(ground_truth_dir_path, str(num) + '.txt')
68
+
69
+ print('=> ground truth of %s:' % image_name)
70
+ num_bbox_gt = len(bboxes_gt)
71
+ with open(ground_truth_path, 'w') as f:
72
+ for i in range(num_bbox_gt):
73
+ class_name = CLASSES[classes_gt[i]]
74
+ xmin, ymin, xmax, ymax = list(map(str, bboxes_gt[i]))
75
+ bbox_mess = ' '.join([class_name, xmin, ymin, xmax, ymax]) + '\n'
76
+ f.write(bbox_mess)
77
+ print('\t' + str(bbox_mess).strip())
78
+ print('=> predict result of %s:' % image_name)
79
+ predict_result_path = os.path.join(predicted_dir_path, str(num) + '.txt')
80
+ # Predict Process
81
+ image_size = image.shape[:2]
82
+ # image_data = utils.image_preprocess(np.copy(image), [INPUT_SIZE, INPUT_SIZE])
83
+ image_data = cv2.resize(np.copy(image), (INPUT_SIZE, INPUT_SIZE))
84
+ image_data = image_data / 255.
85
+ image_data = image_data[np.newaxis, ...].astype(np.float32)
86
+
87
+ if FLAGS.framework == 'tflite':
88
+ interpreter.set_tensor(input_details[0]['index'], image_data)
89
+ interpreter.invoke()
90
+ pred = [interpreter.get_tensor(output_details[i]['index']) for i in range(len(output_details))]
91
+ if FLAGS.model == 'yolov4' and FLAGS.tiny == True:
92
+ boxes, pred_conf = filter_boxes(pred[1], pred[0], score_threshold=0.25)
93
+ else:
94
+ boxes, pred_conf = filter_boxes(pred[0], pred[1], score_threshold=0.25)
95
+ else:
96
+ batch_data = tf.constant(image_data)
97
+ pred_bbox = infer(batch_data)
98
+ for key, value in pred_bbox.items():
99
+ boxes = value[:, :, 0:4]
100
+ pred_conf = value[:, :, 4:]
101
+
102
+ boxes, scores, classes, valid_detections = tf.image.combined_non_max_suppression(
103
+ boxes=tf.reshape(boxes, (tf.shape(boxes)[0], -1, 1, 4)),
104
+ scores=tf.reshape(
105
+ pred_conf, (tf.shape(pred_conf)[0], -1, tf.shape(pred_conf)[-1])),
106
+ max_output_size_per_class=50,
107
+ max_total_size=50,
108
+ iou_threshold=FLAGS.iou,
109
+ score_threshold=FLAGS.score
110
+ )
111
+ boxes, scores, classes, valid_detections = [boxes.numpy(), scores.numpy(), classes.numpy(), valid_detections.numpy()]
112
+
113
+ # if cfg.TEST.DECTECTED_IMAGE_PATH is not None:
114
+ # image_result = utils.draw_bbox(np.copy(image), [boxes, scores, classes, valid_detections])
115
+ # cv2.imwrite(cfg.TEST.DECTECTED_IMAGE_PATH + image_name, image_result)
116
+
117
+ with open(predict_result_path, 'w') as f:
118
+ image_h, image_w, _ = image.shape
119
+ for i in range(valid_detections[0]):
120
+ if int(classes[0][i]) < 0 or int(classes[0][i]) >= NUM_CLASS: continue
121
+ coor = boxes[0][i]
122
+ coor[0] = int(coor[0] * image_h)
123
+ coor[2] = int(coor[2] * image_h)
124
+ coor[1] = int(coor[1] * image_w)
125
+ coor[3] = int(coor[3] * image_w)
126
+
127
+ score = scores[0][i]
128
+ class_ind = int(classes[0][i])
129
+ class_name = CLASSES[class_ind]
130
+ score = '%.4f' % score
131
+ ymin, xmin, ymax, xmax = list(map(str, coor))
132
+ bbox_mess = ' '.join([class_name, score, xmin, ymin, xmax, ymax]) + '\n'
133
+ f.write(bbox_mess)
134
+ print('\t' + str(bbox_mess).strip())
135
+ print(num, num_lines)
136
+
137
+ if __name__ == '__main__':
138
+ try:
139
+ app.run(main)
140
+ except SystemExit:
141
+ pass
142
+
143
+
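evaluate.py dumps per-image ground-truth and prediction text files under ./mAP/ground-truth and ./mAP/predicted, which the mAP scripts added below then score. A typical sequence (paths assumed):

python evaluate.py --weights ./checkpoints/yolov4-416 --annotation_path ./data/dataset/val2017.txt
cd mAP
python main.py --output results_yolov4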
mAP/extra/intersect-gt-and-pred.py ADDED
@@ -0,0 +1,60 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import os
3
+ import glob
4
+
5
+ ## This script ensures same number of files in ground-truth and predicted folder.
6
+ ## When you encounter file not found error, it's usually because you have
7
+ ## mismatched numbers of ground-truth and predicted files.
8
+ ## You can use this script to move ground-truth and predicted files that are
9
+ ## not in the intersection into a backup folder (backup_no_matches_found).
10
+ ## This will retain only files that have the same name in both folders.
11
+
12
+ # change directory to the one with the files to be changed
13
+ path_to_gt = '../ground-truth'
14
+ path_to_pred = '../predicted'
15
+ backup_folder = 'backup_no_matches_found' # must end without slash
16
+
17
+ os.chdir(path_to_gt)
18
+ gt_files = glob.glob('*.txt')
19
+ if len(gt_files) == 0:
20
+ print("Error: no .txt files found in", path_to_gt)
21
+ sys.exit()
22
+ os.chdir(path_to_pred)
23
+ pred_files = glob.glob('*.txt')
24
+ if len(pred_files) == 0:
25
+ print("Error: no .txt files found in", path_to_pred)
26
+ sys.exit()
27
+
28
+ gt_files = set(gt_files)
29
+ pred_files = set(pred_files)
30
+ print('total ground-truth files:', len(gt_files))
31
+ print('total predicted files:', len(pred_files))
32
+ print()
33
+
34
+ gt_backup = gt_files - pred_files
35
+ pred_backup = pred_files - gt_files
36
+
37
+
38
+ def backup(src_folder, backup_files, backup_folder):
39
+ # non-intersection files (txt format) will be moved to a backup folder
40
+ if not backup_files:
41
+ print('No backup required for', src_folder)
42
+ return
43
+ os.chdir(src_folder)
44
+ ## create the backup dir if it doesn't exist already
45
+ if not os.path.exists(backup_folder):
46
+ os.makedirs(backup_folder)
47
+ for file in backup_files:
48
+ os.rename(file, backup_folder + '/' + file)
49
+
50
+
51
+ backup(path_to_gt, gt_backup, backup_folder)
52
+ backup(path_to_pred, pred_backup, backup_folder)
53
+ if gt_backup:
54
+ print('total ground-truth backup files:', len(gt_backup))
55
+ if pred_backup:
56
+ print('total predicted backup files:', len(pred_backup))
57
+
58
+ intersection = gt_files & pred_files
59
+ print('total intersected files:', len(intersection))
60
+ print("Intersection completed!")
mAP/extra/remove_space.py ADDED
@@ -0,0 +1,96 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import os
3
+ import glob
4
+ import argparse
5
+
6
+ # this script will load class_list.txt and find class names with spaces
7
+ # then replace spaces with delimiters inside ground-truth/ and predicted/
8
+
9
+ parser = argparse.ArgumentParser()
10
+ parser.add_argument('-d', '--delimiter', type=str, help="delimiter to replace space (default: '-')", default='-')
11
+ parser.add_argument('-y', '--yes', action='store_true', help="force yes confirmation on yes/no query (default: False)", default=False)
12
+ args = parser.parse_args()
13
+
14
+ def query_yes_no(question, default="yes", bypass=False):
15
+ """Ask a yes/no question via raw_input() and return their answer.
16
+
17
+ "question" is a string that is presented to the user.
18
+ "default" is the presumed answer if the user just hits <Enter>.
19
+ It must be "yes" (the default), "no" or None (meaning
20
+ an answer is required of the user).
21
+
22
+ The "answer" return value is True for "yes" or False for "no".
23
+ """
24
+ valid = {"yes": True, "y": True, "ye": True,
25
+ "no": False, "n": False}
26
+ if default is None:
27
+ prompt = " [y/n] "
28
+ elif default == "yes":
29
+ prompt = " [Y/n] "
30
+ elif default == "no":
31
+ prompt = " [y/N] "
32
+ else:
33
+ raise ValueError("invalid default answer: '%s'" % default)
34
+
35
+ while True:
36
+ sys.stdout.write(question + prompt)
37
+ if bypass:
38
+ break
39
+ if sys.version_info[0] == 3:
40
+ choice = input().lower() # if version 3 of Python
41
+ else:
42
+ choice = raw_input().lower()
43
+ if default is not None and choice == '':
44
+ return valid[default]
45
+ elif choice in valid:
46
+ return valid[choice]
47
+ else:
48
+ sys.stdout.write("Please respond with 'yes' or 'no' "
49
+ "(or 'y' or 'n').\n")
50
+
51
+
52
+ def rename_class(current_class_name, new_class_name):
53
+ # get list of txt files
54
+ file_list = glob.glob('*.txt')
55
+ file_list.sort()
56
+ # iterate through the txt files
57
+ for txt_file in file_list:
58
+ class_found = False
59
+ # open txt file lines to a list
60
+ with open(txt_file) as f:
61
+ content = f.readlines()
62
+ # remove whitespace characters like `\n` at the end of each line
63
+ content = [x.strip() for x in content]
64
+ new_content = []
65
+ # go through each line of each file
66
+ for line in content:
67
+ #class_name = line.split()[0]
68
+ if current_class_name in line:
69
+ class_found = True
70
+ line = line.replace(current_class_name, new_class_name)
71
+ new_content.append(line)
72
+ if class_found:
73
+ # rewrite file
74
+ with open(txt_file, 'w') as new_f:
75
+ for line in new_content:
76
+ new_f.write("%s\n" % line)
77
+
78
+ with open('../../data/classes/coco.names') as f:
79
+ for line in f:
80
+ current_class_name = line.rstrip("\n")
81
+ new_class_name = line.replace(' ', args.delimiter).rstrip("\n")
82
+ if current_class_name == new_class_name:
83
+ continue
84
+ y_n_message = ("Are you sure you want "
85
+ "to rename the class "
86
+ "\"" + current_class_name + "\" "
87
+ "into \"" + new_class_name + "\"?"
88
+ )
89
+
90
+ if query_yes_no(y_n_message, bypass=args.yes):
91
+ os.chdir("../ground-truth")
92
+ rename_class(current_class_name, new_class_name)
93
+ os.chdir("../predicted")
94
+ rename_class(current_class_name, new_class_name)
95
+
96
+ print('Done!')
mAP/main.py ADDED
@@ -0,0 +1,775 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import glob
2
+ import json
3
+ import os
4
+ import shutil
5
+ import operator
6
+ import sys
7
+ import argparse
8
+ from absl import app, flags, logging
9
+ from absl.flags import FLAGS
10
+
11
+ MINOVERLAP = 0.5 # default value (defined in the PASCAL VOC2012 challenge)
12
+
13
+ parser = argparse.ArgumentParser()
14
+ parser.add_argument('-na', '--no-animation', default=True, help="no animation is shown.", action="store_true")
15
+ parser.add_argument('-np', '--no-plot', help="no plot is shown.", action="store_true")
16
+ parser.add_argument('-q', '--quiet', help="minimalistic console output.", action="store_true")
17
+ # argparse receiving list of classes to be ignored
18
+ parser.add_argument('-i', '--ignore', nargs='+', type=str, help="ignore a list of classes.")
19
+ parser.add_argument('-o', '--output', default="results", type=str, help="output path name")
20
+ # argparse receiving list of classes with specific IoU
21
+ parser.add_argument('--set-class-iou', nargs='+', type=str, help="set IoU for a specific class.")
22
+ args = parser.parse_args()
23
+
24
+ # if there are no classes to ignore then replace None by empty list
25
+ if args.ignore is None:
26
+ args.ignore = []
27
+
28
+ specific_iou_flagged = False
29
+ if args.set_class_iou is not None:
30
+ specific_iou_flagged = True
31
+
32
+ # if there are no images then no animation can be shown
33
+ img_path = 'images'
34
+ if os.path.exists(img_path):
35
+ for dirpath, dirnames, files in os.walk(img_path):
36
+ if not files:
37
+ # no image files found
38
+ args.no_animation = True
39
+ else:
40
+ args.no_animation = True
41
+
42
+ # try to import OpenCV if the user didn't choose the option --no-animation
43
+ show_animation = False
44
+ if not args.no_animation:
45
+ try:
46
+ import cv2
47
+ show_animation = True
48
+ except ImportError:
49
+ print("\"opencv-python\" not found, please install to visualize the results.")
50
+ args.no_animation = True
51
+
52
+ # try to import Matplotlib if the user didn't choose the option --no-plot
53
+ draw_plot = False
54
+ if not args.no_plot:
55
+ try:
56
+ import matplotlib.pyplot as plt
57
+ draw_plot = True
58
+ except ImportError:
59
+ print("\"matplotlib\" not found, please install it to get the resulting plots.")
60
+ args.no_plot = True
61
+
62
+ """
63
+ throw error and exit
64
+ """
65
+ def error(msg):
66
+ print(msg)
67
+ sys.exit(0)
68
+
69
+ """
70
+ check if the number is a float between 0.0 and 1.0
71
+ """
72
+ def is_float_between_0_and_1(value):
73
+ try:
74
+ val = float(value)
75
+ if val > 0.0 and val < 1.0:
76
+ return True
77
+ else:
78
+ return False
79
+ except ValueError:
80
+ return False
81
+
82
+ """
83
+ Calculate the AP given the recall and precision array
84
+ 1st) We compute a version of the measured precision/recall curve with
85
+ precision monotonically decreasing
86
+ 2nd) We compute the AP as the area under this curve by numerical integration.
87
+ """
88
+ def voc_ap(rec, prec):
89
+ """
90
+ --- Official matlab code VOC2012---
91
+ mrec=[0 ; rec ; 1];
92
+ mpre=[0 ; prec ; 0];
93
+ for i=numel(mpre)-1:-1:1
94
+ mpre(i)=max(mpre(i),mpre(i+1));
95
+ end
96
+ i=find(mrec(2:end)~=mrec(1:end-1))+1;
97
+ ap=sum((mrec(i)-mrec(i-1)).*mpre(i));
98
+ """
99
+ rec.insert(0, 0.0) # insert 0.0 at beginning of list
100
+ rec.append(1.0) # insert 1.0 at end of list
101
+ mrec = rec[:]
102
+ prec.insert(0, 0.0) # insert 0.0 at beginning of list
103
+ prec.append(0.0) # insert 0.0 at end of list
104
+ mpre = prec[:]
105
+ """
106
+ This part makes the precision monotonically decreasing
107
+ (goes from the end to the beginning)
108
+ matlab: for i=numel(mpre)-1:-1:1
109
+ mpre(i)=max(mpre(i),mpre(i+1));
110
+ """
111
+ # matlab indexes start in 1 but python in 0, so I have to do:
112
+ # range(start=(len(mpre) - 2), end=0, step=-1)
113
+ # also the python function range excludes the end, resulting in:
114
+ # range(start=(len(mpre) - 2), end=-1, step=-1)
115
+ for i in range(len(mpre)-2, -1, -1):
116
+ mpre[i] = max(mpre[i], mpre[i+1])
117
+ """
118
+ This part creates a list of indexes where the recall changes
119
+ matlab: i=find(mrec(2:end)~=mrec(1:end-1))+1;
120
+ """
121
+ i_list = []
122
+ for i in range(1, len(mrec)):
123
+ if mrec[i] != mrec[i-1]:
124
+ i_list.append(i) # if it was matlab would be i + 1
125
+ """
126
+ The Average Precision (AP) is the area under the curve
127
+ (numerical integration)
128
+ matlab: ap=sum((mrec(i)-mrec(i-1)).*mpre(i));
129
+ """
130
+ ap = 0.0
131
+ for i in i_list:
132
+ ap += ((mrec[i]-mrec[i-1])*mpre[i])
133
+ return ap, mrec, mpre
134
+
135
+
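voc_ap implements the VOC2012 all-points interpolation: pad the recall/precision lists, make precision monotonically non-increasing from right to left, then sum precision times each recall increment. A tiny worked example with made-up numbers (the function mutates its arguments, hence the copies):

rec, prec = [0.5, 1.0], [1.0, 0.5]
ap, mrec, mpre = voc_ap(rec[:], prec[:])
# mrec = [0.0, 0.5, 1.0, 1.0]; mpre after the monotone pass = [1.0, 1.0, 0.5, 0.0]
# ap = 0.5 * 1.0 + 0.5 * 0.5 = 0.75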
136
+ """
137
+ Convert the lines of a file to a list
138
+ """
139
+ def file_lines_to_list(path):
140
+ # open txt file lines to a list
141
+ with open(path) as f:
142
+ content = f.readlines()
143
+ # remove whitespace characters like `\n` at the end of each line
144
+ content = [x.strip() for x in content]
145
+ return content
146
+
147
+ """
148
+ Draws text in image
149
+ """
150
+ def draw_text_in_image(img, text, pos, color, line_width):
151
+ font = cv2.FONT_HERSHEY_PLAIN
152
+ fontScale = 1
153
+ lineType = 1
154
+ bottomLeftCornerOfText = pos
155
+ cv2.putText(img, text,
156
+ bottomLeftCornerOfText,
157
+ font,
158
+ fontScale,
159
+ color,
160
+ lineType)
161
+ text_width, _ = cv2.getTextSize(text, font, fontScale, lineType)[0]
162
+ return img, (line_width + text_width)
163
+
164
+ """
165
+ Plot - adjust axes
166
+ """
167
+ def adjust_axes(r, t, fig, axes):
168
+ # get text width for re-scaling
169
+ bb = t.get_window_extent(renderer=r)
170
+ text_width_inches = bb.width / fig.dpi
171
+ # get axis width in inches
172
+ current_fig_width = fig.get_figwidth()
173
+ new_fig_width = current_fig_width + text_width_inches
174
+ proportion = new_fig_width / current_fig_width
175
+ # get axis limit
176
+ x_lim = axes.get_xlim()
177
+ axes.set_xlim([x_lim[0], x_lim[1]*proportion])
178
+
179
+ """
180
+ Draw plot using Matplotlib
181
+ """
182
+ def draw_plot_func(dictionary, n_classes, window_title, plot_title, x_label, output_path, to_show, plot_color, true_p_bar):
183
+ # sort the dictionary by decreasing value, into a list of tuples
184
+ sorted_dic_by_value = sorted(dictionary.items(), key=operator.itemgetter(1))
185
+ # unpacking the list of tuples into two lists
186
+ sorted_keys, sorted_values = zip(*sorted_dic_by_value)
187
+ #
188
+ if true_p_bar != "":
189
+ """
190
+ Special case to draw in (green=true predictions) & (red=false predictions)
191
+ """
192
+ fp_sorted = []
193
+ tp_sorted = []
194
+ for key in sorted_keys:
195
+ fp_sorted.append(dictionary[key] - true_p_bar[key])
196
+ tp_sorted.append(true_p_bar[key])
197
+ plt.barh(range(n_classes), fp_sorted, align='center', color='crimson', label='False Predictions')
198
+ plt.barh(range(n_classes), tp_sorted, align='center', color='forestgreen', label='True Predictions', left=fp_sorted)
199
+ # add legend
200
+ plt.legend(loc='lower right')
201
+ """
202
+ Write number on side of bar
203
+ """
204
+ fig = plt.gcf() # gcf - get current figure
205
+ axes = plt.gca()
206
+ r = fig.canvas.get_renderer()
207
+ for i, val in enumerate(sorted_values):
208
+ fp_val = fp_sorted[i]
209
+ tp_val = tp_sorted[i]
210
+ fp_str_val = " " + str(fp_val)
211
+ tp_str_val = fp_str_val + " " + str(tp_val)
212
+ # trick to paint multicolor with offset:
213
+ # first paint everything and then repaint the first number
214
+ t = plt.text(val, i, tp_str_val, color='forestgreen', va='center', fontweight='bold')
215
+ plt.text(val, i, fp_str_val, color='crimson', va='center', fontweight='bold')
216
+ if i == (len(sorted_values)-1): # largest bar
217
+ adjust_axes(r, t, fig, axes)
218
+ else:
219
+ plt.barh(range(n_classes), sorted_values, color=plot_color)
220
+ """
221
+ Write number on side of bar
222
+ """
223
+ fig = plt.gcf() # gcf - get current figure
224
+ axes = plt.gca()
225
+ r = fig.canvas.get_renderer()
226
+ for i, val in enumerate(sorted_values):
227
+ str_val = " " + str(val) # add a space before
228
+ if val < 1.0:
229
+ str_val = " {0:.2f}".format(val)
230
+ t = plt.text(val, i, str_val, color=plot_color, va='center', fontweight='bold')
231
+ # re-set axes to show number inside the figure
232
+ if i == (len(sorted_values)-1): # largest bar
233
+ adjust_axes(r, t, fig, axes)
234
+ # set window title
235
+ fig.canvas.set_window_title(window_title)
236
+ # write classes in y axis
237
+ tick_font_size = 12
238
+ plt.yticks(range(n_classes), sorted_keys, fontsize=tick_font_size)
239
+ """
240
+ Re-scale height accordingly
241
+ """
242
+ init_height = fig.get_figheight()
243
+ # compute the matrix height in points and inches
244
+ dpi = fig.dpi
245
+ height_pt = n_classes * (tick_font_size * 1.4) # 1.4 (some spacing)
246
+ height_in = height_pt / dpi
247
+ # compute the required figure height
248
+ top_margin = 0.15 # in percentage of the figure height
249
+ bottom_margin = 0.05 # in percentage of the figure height
250
+ figure_height = height_in / (1 - top_margin - bottom_margin)
251
+ # set new height
252
+ if figure_height > init_height:
253
+ fig.set_figheight(figure_height)
254
+
255
+ # set plot title
256
+ plt.title(plot_title, fontsize=14)
257
+ # set axis titles
258
+ # plt.xlabel('classes')
259
+ plt.xlabel(x_label, fontsize='large')
260
+ # adjust size of window
261
+ fig.tight_layout()
262
+ # save the plot
263
+ fig.savefig(output_path)
264
+ # show image
265
+ if to_show:
266
+ plt.show()
267
+ # close the plot
268
+ plt.close()
269
+
270
+ """
271
+ Create a "tmp_files/" and "results/" directory
272
+ """
273
+ tmp_files_path = "tmp_files"
274
+ if not os.path.exists(tmp_files_path): # if it doesn't exist already
275
+ os.makedirs(tmp_files_path)
276
+ results_files_path = args.output
277
+ if os.path.exists(results_files_path): # if it exists already
278
+ # reset the results directory
279
+ shutil.rmtree(results_files_path)
280
+
281
+ os.makedirs(results_files_path)
282
+ if draw_plot:
283
+ os.makedirs(results_files_path + "/classes")
284
+ if show_animation:
285
+ os.makedirs(results_files_path + "/images")
286
+ os.makedirs(results_files_path + "/images/single_predictions")
287
+
288
+ """
289
+ Ground-Truth
290
+ Load each of the ground-truth files into a temporary ".json" file.
291
+ Create a list of all the class names present in the ground-truth (gt_classes).
292
+ """
293
+ # get a list with the ground-truth files
294
+ ground_truth_files_list = glob.glob('ground-truth/*.txt')
295
+ if len(ground_truth_files_list) == 0:
296
+ error("Error: No ground-truth files found!")
297
+ ground_truth_files_list.sort()
298
+ # dictionary with counter per class
299
+ gt_counter_per_class = {}
300
+
301
+ for txt_file in ground_truth_files_list:
302
+ #print(txt_file)
303
+ file_id = txt_file.split(".txt",1)[0]
304
+ file_id = os.path.basename(os.path.normpath(file_id))
305
+ # check if there is a correspondent predicted objects file
306
+ if not os.path.exists('predicted/' + file_id + ".txt"):
307
+ error_msg = "Error. File not found: predicted/" + file_id + ".txt\n"
308
+ error_msg += "(You can avoid this error message by running extra/intersect-gt-and-pred.py)"
309
+ error(error_msg)
310
+ lines_list = file_lines_to_list(txt_file)
311
+ # create ground-truth dictionary
312
+ bounding_boxes = []
313
+ is_difficult = False
314
+ for line in lines_list:
315
+ try:
316
+ if "difficult" in line:
317
+ class_name, left, top, right, bottom, _difficult = line.split()
318
+ is_difficult = True
319
+ else:
320
+ class_name, left, top, right, bottom = line.split()
321
+ except ValueError:
322
+ error_msg = "Error: File " + txt_file + " in the wrong format.\n"
323
+ error_msg += " Expected: <class_name> <left> <top> <right> <bottom> ['difficult']\n"
324
+ error_msg += " Received: " + line
325
+ error_msg += "\n\nIf you have a <class_name> with spaces between words you should remove them\n"
326
+ error_msg += "by running the script \"remove_space.py\" or \"rename_class.py\" in the \"extra/\" folder."
327
+ error(error_msg)
328
+ # check if class is in the ignore list, if yes skip
329
+ if class_name in args.ignore:
330
+ continue
331
+ bbox = left + " " + top + " " + right + " " + bottom
332
+ if is_difficult:
333
+ bounding_boxes.append({"class_name":class_name, "bbox":bbox, "used":False, "difficult":True})
334
+ is_difficult = False
335
+ else:
336
+ bounding_boxes.append({"class_name":class_name, "bbox":bbox, "used":False})
337
+ # count that object
338
+ if class_name in gt_counter_per_class:
339
+ gt_counter_per_class[class_name] += 1
340
+ else:
341
+ # if class didn't exist yet
342
+ gt_counter_per_class[class_name] = 1
343
+ # dump bounding_boxes into a ".json" file
344
+ with open(tmp_files_path + "/" + file_id + "_ground_truth.json", 'w') as outfile:
345
+ json.dump(bounding_boxes, outfile)
346
+
347
+ gt_classes = list(gt_counter_per_class.keys())
348
+ # let's sort the classes alphabetically
349
+ gt_classes = sorted(gt_classes)
350
+ n_classes = len(gt_classes)
351
+ #print(gt_classes)
352
+ #print(gt_counter_per_class)
353
+
354
+ """
355
+ Check format of the flag --set-class-iou (if used)
356
+ e.g. check if class exists
357
+ """
358
+ if specific_iou_flagged:
359
+ n_args = len(args.set_class_iou)
360
+ error_msg = \
361
+ '\n --set-class-iou [class_1] [IoU_1] [class_2] [IoU_2] [...]'
362
+ if n_args % 2 != 0:
363
+ error('Error, missing arguments. Flag usage:' + error_msg)
364
+ # [class_1] [IoU_1] [class_2] [IoU_2]
365
+ # specific_iou_classes = ['class_1', 'class_2']
366
+ specific_iou_classes = args.set_class_iou[::2] # even
367
+ # iou_list = ['IoU_1', 'IoU_2']
368
+ iou_list = args.set_class_iou[1::2] # odd
369
+ if len(specific_iou_classes) != len(iou_list):
370
+ error('Error, missing arguments. Flag usage:' + error_msg)
371
+ for tmp_class in specific_iou_classes:
372
+ if tmp_class not in gt_classes:
373
+ error('Error, unknown class \"' + tmp_class + '\". Flag usage:' + error_msg)
374
+ for num in iou_list:
375
+ if not is_float_between_0_and_1(num):
376
+ error('Error, IoU must be between 0.0 and 1.0. Flag usage:' + error_msg)
377
+
378
+ """
379
+ Predicted
380
+ Load each of the predicted files into a temporary ".json" file.
381
+ """
382
+ # get a list with the predicted files
383
+ predicted_files_list = glob.glob('predicted/*.txt')
384
+ predicted_files_list.sort()
385
+
386
+ for class_index, class_name in enumerate(gt_classes):
387
+ bounding_boxes = []
388
+ for txt_file in predicted_files_list:
389
+ #print(txt_file)
390
+ # the first time it checks if all the corresponding ground-truth files exist
391
+ file_id = txt_file.split(".txt",1)[0]
392
+ file_id = os.path.basename(os.path.normpath(file_id))
393
+ if class_index == 0:
394
+ if not os.path.exists('ground-truth/' + file_id + ".txt"):
395
+ error_msg = "Error. File not found: ground-truth/" + file_id + ".txt\n"
396
+ error_msg += "(You can avoid this error message by running extra/intersect-gt-and-pred.py)"
397
+ error(error_msg)
398
+ lines = file_lines_to_list(txt_file)
399
+ for line in lines:
400
+ try:
401
+ tmp_class_name, confidence, left, top, right, bottom = line.split()
402
+ except ValueError:
403
+ error_msg = "Error: File " + txt_file + " in the wrong format.\n"
404
+ error_msg += " Expected: <class_name> <confidence> <left> <top> <right> <bottom>\n"
405
+ error_msg += " Received: " + line
406
+ error(error_msg)
407
+ if tmp_class_name == class_name:
408
+ #print("match")
409
+ bbox = left + " " + top + " " + right + " " + bottom
410
+ bounding_boxes.append({"confidence":confidence, "file_id":file_id, "bbox":bbox})
411
+ #print(bounding_boxes)
412
+ # sort predictions by decreasing confidence
413
+ bounding_boxes.sort(key=lambda x:float(x['confidence']), reverse=True)
414
+ with open(tmp_files_path + "/" + class_name + "_predictions.json", 'w') as outfile:
415
+ json.dump(bounding_boxes, outfile)
416
+
417
+ """
418
+ Calculate the AP for each class
419
+ """
420
+ sum_AP = 0.0
421
+ ap_dictionary = {}
422
+ # open file to store the results
423
+ with open(results_files_path + "/results.txt", 'w') as results_file:
424
+ results_file.write("# AP and precision/recall per class\n")
425
+ count_true_positives = {}
426
+ for class_index, class_name in enumerate(gt_classes):
427
+ count_true_positives[class_name] = 0
428
+ """
429
+ Load predictions of that class
430
+ """
431
+ predictions_file = tmp_files_path + "/" + class_name + "_predictions.json"
432
+ predictions_data = json.load(open(predictions_file))
433
+
434
+ """
435
+ Assign predictions to ground truth objects
436
+ """
437
+ nd = len(predictions_data)
438
+ tp = [0] * nd # creates an array of zeros of size nd
439
+ fp = [0] * nd
440
+ for idx, prediction in enumerate(predictions_data):
441
+ file_id = prediction["file_id"]
442
+ if show_animation:
443
+ # find ground truth image
444
+ ground_truth_img = glob.glob1(img_path, file_id + ".*")
445
+ #tifCounter = len(glob.glob1(myPath,"*.tif"))
446
+ if len(ground_truth_img) == 0:
447
+ error("Error. Image not found with id: " + file_id)
448
+ elif len(ground_truth_img) > 1:
449
+ error("Error. Multiple image with id: " + file_id)
450
+ else: # found image
451
+ #print(img_path + "/" + ground_truth_img[0])
452
+ # Load image
453
+ img = cv2.imread(img_path + "/" + ground_truth_img[0])
454
+ # load image with draws of multiple detections
455
+ img_cumulative_path = results_files_path + "/images/" + ground_truth_img[0]
456
+ if os.path.isfile(img_cumulative_path):
457
+ img_cumulative = cv2.imread(img_cumulative_path)
458
+ else:
459
+ img_cumulative = img.copy()
460
+ # Add bottom border to image
461
+ bottom_border = 60
462
+ BLACK = [0, 0, 0]
463
+ img = cv2.copyMakeBorder(img, 0, bottom_border, 0, 0, cv2.BORDER_CONSTANT, value=BLACK)
464
+ # assign prediction to ground truth object if any
465
+ # open ground-truth with that file_id
466
+ gt_file = tmp_files_path + "/" + file_id + "_ground_truth.json"
467
+ ground_truth_data = json.load(open(gt_file))
468
+ ovmax = -1
469
+ gt_match = -1
470
+ # load prediction bounding-box
471
+ bb = [ float(x) for x in prediction["bbox"].split() ]
472
+ for obj in ground_truth_data:
473
+ # look for a class_name match
474
+ if obj["class_name"] == class_name:
475
+ bbgt = [ float(x) for x in obj["bbox"].split() ]
476
+ bi = [max(bb[0],bbgt[0]), max(bb[1],bbgt[1]), min(bb[2],bbgt[2]), min(bb[3],bbgt[3])]
477
+ iw = bi[2] - bi[0] + 1
478
+ ih = bi[3] - bi[1] + 1
479
+ if iw > 0 and ih > 0:
480
+ # compute overlap (IoU) = area of intersection / area of union
481
+ ua = (bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1) + (bbgt[2] - bbgt[0]
482
+ + 1) * (bbgt[3] - bbgt[1] + 1) - iw * ih
483
+ ov = iw * ih / ua
484
+ if ov > ovmax:
485
+ ovmax = ov
486
+ gt_match = obj
487
+
488
+ # assign prediction as true positive/don't care/false positive
489
+ if show_animation:
490
+ status = "NO MATCH FOUND!" # status is only used in the animation
491
+ # set minimum overlap
492
+ min_overlap = MINOVERLAP
493
+ if specific_iou_flagged:
494
+ if class_name in specific_iou_classes:
495
+ index = specific_iou_classes.index(class_name)
496
+ min_overlap = float(iou_list[index])
497
+ if ovmax >= min_overlap:
498
+ if "difficult" not in gt_match:
499
+ if not bool(gt_match["used"]):
500
+ # true positive
501
+ tp[idx] = 1
502
+ gt_match["used"] = True
503
+ count_true_positives[class_name] += 1
504
+ # update the ".json" file
505
+ with open(gt_file, 'w') as f:
506
+ f.write(json.dumps(ground_truth_data))
507
+ if show_animation:
508
+ status = "MATCH!"
509
+ else:
510
+ # false positive (multiple detection)
511
+ fp[idx] = 1
512
+ if show_animation:
513
+ status = "REPEATED MATCH!"
514
+ else:
515
+ # false positive
516
+ fp[idx] = 1
517
+ if ovmax > 0:
518
+ status = "INSUFFICIENT OVERLAP"
519
+
520
+ """
521
+ Draw image to show animation
522
+ """
523
+ if show_animation:
524
+ height, width = img.shape[:2]
525
+ # colors (OpenCV works with BGR)
526
+ white = (255,255,255)
527
+ light_blue = (255,200,100)
528
+ green = (0,255,0)
529
+ light_red = (30,30,255)
530
+ # 1st line
531
+ margin = 10
532
+ v_pos = int(height - margin - (bottom_border / 2))
533
+ text = "Image: " + ground_truth_img[0] + " "
534
+ img, line_width = draw_text_in_image(img, text, (margin, v_pos), white, 0)
535
+ text = "Class [" + str(class_index) + "/" + str(n_classes) + "]: " + class_name + " "
536
+ img, line_width = draw_text_in_image(img, text, (margin + line_width, v_pos), light_blue, line_width)
537
+ if ovmax != -1:
538
+ color = light_red
539
+ if status == "INSUFFICIENT OVERLAP":
540
+ text = "IoU: {0:.2f}% ".format(ovmax*100) + "< {0:.2f}% ".format(min_overlap*100)
541
+ else:
542
+ text = "IoU: {0:.2f}% ".format(ovmax*100) + ">= {0:.2f}% ".format(min_overlap*100)
543
+ color = green
544
+ img, _ = draw_text_in_image(img, text, (margin + line_width, v_pos), color, line_width)
545
+ # 2nd line
546
+ v_pos += int(bottom_border / 2)
547
+ rank_pos = str(idx+1) # rank position (idx starts at 0)
548
+ text = "Prediction #rank: " + rank_pos + " confidence: {0:.2f}% ".format(float(prediction["confidence"])*100)
549
+ img, line_width = draw_text_in_image(img, text, (margin, v_pos), white, 0)
550
+ color = light_red
551
+ if status == "MATCH!":
552
+ color = green
553
+ text = "Result: " + status + " "
554
+ img, line_width = draw_text_in_image(img, text, (margin + line_width, v_pos), color, line_width)
555
+
556
+ font = cv2.FONT_HERSHEY_SIMPLEX
557
+ if ovmax > 0: # if there is an intersection between the bounding boxes
558
+ bbgt = [ int(x) for x in gt_match["bbox"].split() ]
559
+ cv2.rectangle(img,(bbgt[0],bbgt[1]),(bbgt[2],bbgt[3]),light_blue,2)
560
+ cv2.rectangle(img_cumulative,(bbgt[0],bbgt[1]),(bbgt[2],bbgt[3]),light_blue,2)
561
+ cv2.putText(img_cumulative, class_name, (bbgt[0],bbgt[1] - 5), font, 0.6, light_blue, 1, cv2.LINE_AA)
562
+ bb = [int(i) for i in bb]
563
+ cv2.rectangle(img,(bb[0],bb[1]),(bb[2],bb[3]),color,2)
564
+ cv2.rectangle(img_cumulative,(bb[0],bb[1]),(bb[2],bb[3]),color,2)
565
+ cv2.putText(img_cumulative, class_name, (bb[0],bb[1] - 5), font, 0.6, color, 1, cv2.LINE_AA)
566
+ # show image
567
+ cv2.imshow("Animation", img)
568
+ cv2.waitKey(20) # show for 20 ms
569
+ # save image to results
570
+ output_img_path = results_files_path + "/images/single_predictions/" + class_name + "_prediction" + str(idx) + ".jpg"
571
+ cv2.imwrite(output_img_path, img)
572
+ # save the image with all the objects drawn to it
573
+ cv2.imwrite(img_cumulative_path, img_cumulative)
574
+
575
+ #print(tp)
576
+ # compute precision/recall
577
+ cumsum = 0
578
+ for idx, val in enumerate(fp):
579
+ fp[idx] += cumsum
580
+ cumsum += val
581
+ cumsum = 0
582
+ for idx, val in enumerate(tp):
583
+ tp[idx] += cumsum
584
+ cumsum += val
585
+ #print(tp)
586
+ rec = tp[:]
587
+ for idx, val in enumerate(tp):
588
+ rec[idx] = float(tp[idx]) / gt_counter_per_class[class_name]
589
+ #print(rec)
590
+ prec = tp[:]
591
+ for idx, val in enumerate(tp):
592
+ prec[idx] = float(tp[idx]) / (fp[idx] + tp[idx])
593
+ #print(prec)
594
+
595
+ ap, mrec, mprec = voc_ap(rec, prec)
596
+ sum_AP += ap
597
+ text = "{0:.2f}%".format(ap*100) + " = " + class_name + " AP " #class_name + " AP = {0:.2f}%".format(ap*100)
598
+ """
599
+ Write to results.txt
600
+ """
601
+ rounded_prec = [ '%.2f' % elem for elem in prec ]
602
+ rounded_rec = [ '%.2f' % elem for elem in rec ]
603
+ results_file.write(text + "\n Precision: " + str(rounded_prec) + "\n Recall :" + str(rounded_rec) + "\n\n")
604
+ if not args.quiet:
605
+ print(text)
606
+ ap_dictionary[class_name] = ap
607
+
608
+ """
609
+ Draw plot
610
+ """
611
+ if draw_plot:
612
+ plt.plot(rec, prec, '-o')
613
+ # add a new penultimate point to the list (mrec[-2], 0.0)
614
+ # since the last line segment (and respective area) do not affect the AP value
615
+ area_under_curve_x = mrec[:-1] + [mrec[-2]] + [mrec[-1]]
616
+ area_under_curve_y = mprec[:-1] + [0.0] + [mprec[-1]]
617
+ plt.fill_between(area_under_curve_x, 0, area_under_curve_y, alpha=0.2, edgecolor='r')
618
+ # set window title
619
+ fig = plt.gcf() # gcf - get current figure
620
+ fig.canvas.set_window_title('AP ' + class_name)
621
+ # set plot title
622
+ plt.title('class: ' + text)
623
+ #plt.suptitle('This is a somewhat long figure title', fontsize=16)
624
+ # set axis titles
625
+ plt.xlabel('Recall')
626
+ plt.ylabel('Precision')
627
+ # optional - set axes
628
+ axes = plt.gca() # gca - get current axes
629
+ axes.set_xlim([0.0,1.0])
630
+ axes.set_ylim([0.0,1.05]) # .05 to give some extra space
631
+ # Alternative option -> wait for button to be pressed
632
+ #while not plt.waitforbuttonpress(): pass # wait for key display
633
+ # Alternative option -> normal display
634
+ #plt.show()
635
+ # save the plot
636
+ fig.savefig(results_files_path + "/classes/" + class_name + ".png")
637
+ plt.cla() # clear axes for next plot
638
+
639
+ if show_animation:
640
+ cv2.destroyAllWindows()
641
+
642
+ results_file.write("\n# mAP of all classes\n")
643
+ mAP = sum_AP / n_classes
644
+ text = "mAP = {0:.2f}%".format(mAP*100)
645
+ results_file.write(text + "\n")
646
+ print(text)
647
+
648
+ # remove the tmp_files directory
649
+ shutil.rmtree(tmp_files_path)
650
+
651
+ """
652
+ Count total of Predictions
653
+ """
654
+ # iterate through all the files
655
+ pred_counter_per_class = {}
656
+ #all_classes_predicted_files = set([])
657
+ for txt_file in predicted_files_list:
658
+ # get lines to list
659
+ lines_list = file_lines_to_list(txt_file)
660
+ for line in lines_list:
661
+ class_name = line.split()[0]
662
+ # check if class is in the ignore list, if yes skip
663
+ if class_name in args.ignore:
664
+ continue
665
+ # count that object
666
+ if class_name in pred_counter_per_class:
667
+ pred_counter_per_class[class_name] += 1
668
+ else:
669
+ # if class didn't exist yet
670
+ pred_counter_per_class[class_name] = 1
671
+ #print(pred_counter_per_class)
672
+ pred_classes = list(pred_counter_per_class.keys())
673
+
674
+
675
+ """
676
+ Plot the total number of occurrences of each class in the ground-truth
677
+ """
678
+ if draw_plot:
679
+ window_title = "Ground-Truth Info"
680
+ plot_title = "Ground-Truth\n"
681
+ plot_title += "(" + str(len(ground_truth_files_list)) + " files and " + str(n_classes) + " classes)"
682
+ x_label = "Number of objects per class"
683
+ output_path = results_files_path + "/Ground-Truth Info.png"
684
+ to_show = False
685
+ plot_color = 'forestgreen'
686
+ draw_plot_func(
687
+ gt_counter_per_class,
688
+ n_classes,
689
+ window_title,
690
+ plot_title,
691
+ x_label,
692
+ output_path,
693
+ to_show,
694
+ plot_color,
695
+ '',
696
+ )
697
+
698
+ """
699
+ Write number of ground-truth objects per class to results.txt
700
+ """
701
+ with open(results_files_path + "/results.txt", 'a') as results_file:
702
+ results_file.write("\n# Number of ground-truth objects per class\n")
703
+ for class_name in sorted(gt_counter_per_class):
704
+ results_file.write(class_name + ": " + str(gt_counter_per_class[class_name]) + "\n")
705
+
706
+ """
707
+ Finish counting true positives
708
+ """
709
+ for class_name in pred_classes:
710
+ # if class exists in predictions but not in ground-truth then there are no true positives in that class
711
+ if class_name not in gt_classes:
712
+ count_true_positives[class_name] = 0
713
+ #print(count_true_positives)
714
+
715
+ """
716
+ Plot the total number of occurrences of each class in the "predicted" folder
717
+ """
718
+ if draw_plot:
719
+ window_title = "Predicted Objects Info"
720
+ # Plot title
721
+ plot_title = "Predicted Objects\n"
722
+ plot_title += "(" + str(len(predicted_files_list)) + " files and "
723
+ count_non_zero_values_in_dictionary = sum(int(x) > 0 for x in list(pred_counter_per_class.values()))
724
+ plot_title += str(count_non_zero_values_in_dictionary) + " detected classes)"
725
+ # end Plot title
726
+ x_label = "Number of objects per class"
727
+ output_path = results_files_path + "/Predicted Objects Info.png"
728
+ to_show = False
729
+ plot_color = 'forestgreen'
730
+ true_p_bar = count_true_positives
731
+ draw_plot_func(
732
+ pred_counter_per_class,
733
+ len(pred_counter_per_class),
734
+ window_title,
735
+ plot_title,
736
+ x_label,
737
+ output_path,
738
+ to_show,
739
+ plot_color,
740
+ true_p_bar
741
+ )
742
+
743
+ """
744
+ Write number of predicted objects per class to results.txt
745
+ """
746
+ with open(results_files_path + "/results", 'a') as results_file:
747
+ results_file.write("\n# Number of predicted objects per class\n")
748
+ for class_name in sorted(pred_classes):
749
+ n_pred = pred_counter_per_class[class_name]
750
+ text = class_name + ": " + str(n_pred)
751
+ text += " (tp:" + str(count_true_positives[class_name]) + ""
752
+ text += ", fp:" + str(n_pred - count_true_positives[class_name]) + ")\n"
753
+ results_file.write(text)
754
+
755
+ """
756
+ Draw mAP plot (Show AP's of all classes in decreasing order)
757
+ """
758
+ if draw_plot:
759
+ window_title = "mAP"
760
+ plot_title = "mAP = {0:.2f}%".format(mAP*100)
761
+ x_label = "Average Precision"
762
+ output_path = results_files_path + "/mAP.png"
763
+ to_show = True
764
+ plot_color = 'royalblue'
765
+ draw_plot_func(
766
+ ap_dictionary,
767
+ n_classes,
768
+ window_title,
769
+ plot_title,
770
+ x_label,
771
+ output_path,
772
+ to_show,
773
+ plot_color,
774
+ ""
775
+ )
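The script above expects, in its working directory, one 'ground-truth/<image_id>.txt' and one matching 'predicted/<image_id>.txt' file per image (the line formats are the ones shown in the error messages above), plus an optional 'images/' folder for the animation. A minimal sketch of setting up one such pair, with a purely illustrative file name, class, and coordinates:

import os

os.makedirs('ground-truth', exist_ok=True)
os.makedirs('predicted', exist_ok=True)

# ground-truth line format: <class_name> <left> <top> <right> <bottom> ['difficult']
with open('ground-truth/image_1.txt', 'w') as f:
    f.write('dog 7 11 297 283\n')

# prediction line format: <class_name> <confidence> <left> <top> <right> <bottom>
with open('predicted/image_1.txt', 'w') as f:
    f.write('dog 0.92 10 14 290 280\n')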
requirements-gpu.txt ADDED
@@ -0,0 +1,8 @@
 
 
 
 
 
 
 
 
1
+ tensorflow-gpu==2.3.0rc0
2
+ opencv-python==4.1.1.26
3
+ lxml
4
+ tqdm
5
+ absl-py
6
+ matplotlib
7
+ easydict
8
+ pillow
requirements.txt ADDED
@@ -0,0 +1,8 @@
 
 
 
 
 
 
 
 
1
+ opencv-python==4.1.1.26
2
+ lxml
3
+ tqdm
4
+ tensorflow==2.3.0rc0
5
+ absl-py
6
+ easydict
7
+ matplotlib
8
+ pillow
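Both requirements files pin TensorFlow 2.3.0rc0 (GPU and CPU builds respectively) and opencv-python 4.1.1.26. As an optional sanity check after installation, a small sketch that simply verifies the pinned core packages import and reports their versions:

import cv2
import tensorflow as tf

print('tensorflow:', tf.__version__)      # expected to start with 2.3.0
print('opencv-python:', cv2.__version__)  # expected 4.1.1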
result.png ADDED
save_model.py ADDED
@@ -0,0 +1,60 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import tensorflow as tf
2
+ from absl import app, flags, logging
3
+ from absl.flags import FLAGS
4
+ from core.yolov4 import YOLO, decode, filter_boxes
5
+ import core.utils as utils
6
+ from core.config import cfg
7
+
8
+ flags.DEFINE_string('weights', './data/yolov4.weights', 'path to weights file')
9
+ flags.DEFINE_string('output', './checkpoints/yolov4-416', 'path to output')
10
+ flags.DEFINE_boolean('tiny', False, 'is yolo-tiny or not')
11
+ flags.DEFINE_integer('input_size', 416, 'define input size of export model')
12
+ flags.DEFINE_float('score_thres', 0.2, 'define score threshold')
13
+ flags.DEFINE_string('framework', 'tf', 'define what framework do you want to convert (tf, trt, tflite)')
14
+ flags.DEFINE_string('model', 'yolov4', 'yolov3 or yolov4')
15
+
16
+ def save_tf():
17
+ STRIDES, ANCHORS, NUM_CLASS, XYSCALE = utils.load_config(FLAGS)
18
+
19
+ input_layer = tf.keras.layers.Input([FLAGS.input_size, FLAGS.input_size, 3])
20
+ feature_maps = YOLO(input_layer, NUM_CLASS, FLAGS.model, FLAGS.tiny)
21
+ bbox_tensors = []
22
+ prob_tensors = []
23
+ if FLAGS.tiny:
24
+ for i, fm in enumerate(feature_maps):
25
+ if i == 0:
26
+ output_tensors = decode(fm, FLAGS.input_size // 16, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE, FLAGS.framework)
27
+ else:
28
+ output_tensors = decode(fm, FLAGS.input_size // 32, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE, FLAGS.framework)
29
+ bbox_tensors.append(output_tensors[0])
30
+ prob_tensors.append(output_tensors[1])
31
+ else:
32
+ for i, fm in enumerate(feature_maps):
33
+ if i == 0:
34
+ output_tensors = decode(fm, FLAGS.input_size // 8, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE, FLAGS.framework)
35
+ elif i == 1:
36
+ output_tensors = decode(fm, FLAGS.input_size // 16, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE, FLAGS.framework)
37
+ else:
38
+ output_tensors = decode(fm, FLAGS.input_size // 32, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE, FLAGS.framework)
39
+ bbox_tensors.append(output_tensors[0])
40
+ prob_tensors.append(output_tensors[1])
41
+ pred_bbox = tf.concat(bbox_tensors, axis=1)
42
+ pred_prob = tf.concat(prob_tensors, axis=1)
43
+ if FLAGS.framework == 'tflite':
44
+ pred = (pred_bbox, pred_prob)
45
+ else:
46
+ boxes, pred_conf = filter_boxes(pred_bbox, pred_prob, score_threshold=FLAGS.score_thres, input_shape=tf.constant([FLAGS.input_size, FLAGS.input_size]))
47
+ pred = tf.concat([boxes, pred_conf], axis=-1)
48
+ model = tf.keras.Model(input_layer, pred)
49
+ utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny)
50
+ model.summary()
51
+ model.save(FLAGS.output)
52
+
53
+ def main(_argv):
54
+ save_tf()
55
+
56
+ if __name__ == '__main__':
57
+ try:
58
+ app.run(main)
59
+ except SystemExit:
60
+ pass
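After running save_model.py, the exported SavedModel lands at the path given by --output (./checkpoints/yolov4-416 by default). A minimal sketch of loading that export and running a dummy forward pass; the 'serving_default' signature name is the usual Keras export default and, like the 416 input size, is an assumption taken from the flags above:

import numpy as np
import tensorflow as tf

saved_model = tf.saved_model.load('./checkpoints/yolov4-416')
infer = saved_model.signatures['serving_default']

# one dummy RGB image of the exported input size
dummy = tf.constant(np.zeros((1, 416, 416, 3), dtype=np.float32))
outputs = infer(dummy)
print({name: tuple(t.shape) for name, t in outputs.items()})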
train.py ADDED
@@ -0,0 +1,162 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from absl import app, flags, logging
2
+ from absl.flags import FLAGS
3
+ import os
4
+ import shutil
5
+ import tensorflow as tf
6
+ from core.yolov4 import YOLO, decode, compute_loss, decode_train
7
+ from core.dataset import Dataset
8
+ from core.config import cfg
9
+ import numpy as np
10
+ from core import utils
11
+ from core.utils import freeze_all, unfreeze_all
12
+
13
+ flags.DEFINE_string('model', 'yolov4', 'yolov4, yolov3')
14
+ flags.DEFINE_string('weights', './scripts/yolov4.weights', 'pretrained weights')
15
+ flags.DEFINE_boolean('tiny', False, 'yolo or yolo-tiny')
16
+
17
+ def main(_argv):
18
+ physical_devices = tf.config.experimental.list_physical_devices('GPU')
19
+ if len(physical_devices) > 0:
20
+ tf.config.experimental.set_memory_growth(physical_devices[0], True)
21
+
22
+ trainset = Dataset(FLAGS, is_training=True)
23
+ testset = Dataset(FLAGS, is_training=False)
24
+ logdir = "./data/log"
25
+ isfreeze = False
26
+ steps_per_epoch = len(trainset)
27
+ first_stage_epochs = cfg.TRAIN.FISRT_STAGE_EPOCHS
28
+ second_stage_epochs = cfg.TRAIN.SECOND_STAGE_EPOCHS
29
+ global_steps = tf.Variable(1, trainable=False, dtype=tf.int64)
30
+ warmup_steps = cfg.TRAIN.WARMUP_EPOCHS * steps_per_epoch
31
+ total_steps = (first_stage_epochs + second_stage_epochs) * steps_per_epoch
32
+ # train_steps = (first_stage_epochs + second_stage_epochs) * steps_per_period
33
+
34
+ input_layer = tf.keras.layers.Input([cfg.TRAIN.INPUT_SIZE, cfg.TRAIN.INPUT_SIZE, 3])
35
+ STRIDES, ANCHORS, NUM_CLASS, XYSCALE = utils.load_config(FLAGS)
36
+ IOU_LOSS_THRESH = cfg.YOLO.IOU_LOSS_THRESH
37
+
38
+ freeze_layers = utils.load_freeze_layer(FLAGS.model, FLAGS.tiny)
39
+
40
+ feature_maps = YOLO(input_layer, NUM_CLASS, FLAGS.model, FLAGS.tiny)
41
+ if FLAGS.tiny:
42
+ bbox_tensors = []
43
+ for i, fm in enumerate(feature_maps):
44
+ if i == 0:
45
+ bbox_tensor = decode_train(fm, cfg.TRAIN.INPUT_SIZE // 16, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE)
46
+ else:
47
+ bbox_tensor = decode_train(fm, cfg.TRAIN.INPUT_SIZE // 32, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE)
48
+ bbox_tensors.append(fm)
49
+ bbox_tensors.append(bbox_tensor)
50
+ else:
51
+ bbox_tensors = []
52
+ for i, fm in enumerate(feature_maps):
53
+ if i == 0:
54
+ bbox_tensor = decode_train(fm, cfg.TRAIN.INPUT_SIZE // 8, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE)
55
+ elif i == 1:
56
+ bbox_tensor = decode_train(fm, cfg.TRAIN.INPUT_SIZE // 16, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE)
57
+ else:
58
+ bbox_tensor = decode_train(fm, cfg.TRAIN.INPUT_SIZE // 32, NUM_CLASS, STRIDES, ANCHORS, i, XYSCALE)
59
+ bbox_tensors.append(fm)
60
+ bbox_tensors.append(bbox_tensor)
61
+
62
+ model = tf.keras.Model(input_layer, bbox_tensors)
63
+ model.summary()
64
+
65
+ if FLAGS.weights is None:
66
+ print("Training from scratch")
67
+ else:
68
+ if FLAGS.weights.split(".")[len(FLAGS.weights.split(".")) - 1] == "weights":
69
+ utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny)
70
+ else:
71
+ model.load_weights(FLAGS.weights)
72
+ print('Restoring weights from: %s ... ' % FLAGS.weights)
73
+
74
+
75
+ optimizer = tf.keras.optimizers.Adam()
76
+ if os.path.exists(logdir): shutil.rmtree(logdir)
77
+ writer = tf.summary.create_file_writer(logdir)
78
+
79
+ # define training step function
80
+ # @tf.function
81
+ def train_step(image_data, target):
82
+ with tf.GradientTape() as tape:
83
+ pred_result = model(image_data, training=True)
84
+ giou_loss = conf_loss = prob_loss = 0
85
+
86
+ # optimizing process
87
+ for i in range(len(freeze_layers)):
88
+ conv, pred = pred_result[i * 2], pred_result[i * 2 + 1]
89
+ loss_items = compute_loss(pred, conv, target[i][0], target[i][1], STRIDES=STRIDES, NUM_CLASS=NUM_CLASS, IOU_LOSS_THRESH=IOU_LOSS_THRESH, i=i)
90
+ giou_loss += loss_items[0]
91
+ conf_loss += loss_items[1]
92
+ prob_loss += loss_items[2]
93
+
94
+ total_loss = giou_loss + conf_loss + prob_loss
95
+
96
+ gradients = tape.gradient(total_loss, model.trainable_variables)
97
+ optimizer.apply_gradients(zip(gradients, model.trainable_variables))
98
+ tf.print("=> STEP %4d/%4d lr: %.6f giou_loss: %4.2f conf_loss: %4.2f "
99
+ "prob_loss: %4.2f total_loss: %4.2f" % (global_steps, total_steps, optimizer.lr.numpy(),
100
+ giou_loss, conf_loss,
101
+ prob_loss, total_loss))
102
+ # update learning rate
103
+ global_steps.assign_add(1)
104
+ if global_steps < warmup_steps:
105
+ lr = global_steps / warmup_steps * cfg.TRAIN.LR_INIT
106
+ else:
107
+ lr = cfg.TRAIN.LR_END + 0.5 * (cfg.TRAIN.LR_INIT - cfg.TRAIN.LR_END) * (
108
+ (1 + tf.cos((global_steps - warmup_steps) / (total_steps - warmup_steps) * np.pi))
109
+ )
110
+ optimizer.lr.assign(lr.numpy())
111
+
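# The learning-rate update above is a linear warmup followed by half-cosine decay:
#     lr = step / warmup_steps * LR_INIT                                                          while step < warmup_steps
#     lr = LR_END + 0.5*(LR_INIT - LR_END)*(1 + cos(pi*(step - warmup_steps)/(total_steps - warmup_steps)))   afterwards
# With purely hypothetical values LR_INIT=1e-3, LR_END=1e-6, warmup_steps=100, total_steps=1000
# this gives roughly: step 50 -> 5e-4 (warming up), step 100 -> 1e-3 (peak),
# step 550 -> 5e-4 (halfway down the cosine), step 1000 -> 1e-6.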
112
+ # writing summary data
113
+ with writer.as_default():
114
+ tf.summary.scalar("lr", optimizer.lr, step=global_steps)
115
+ tf.summary.scalar("loss/total_loss", total_loss, step=global_steps)
116
+ tf.summary.scalar("loss/giou_loss", giou_loss, step=global_steps)
117
+ tf.summary.scalar("loss/conf_loss", conf_loss, step=global_steps)
118
+ tf.summary.scalar("loss/prob_loss", prob_loss, step=global_steps)
119
+ writer.flush()
120
+ def test_step(image_data, target):
121
+ with tf.GradientTape() as tape:
122
+ pred_result = model(image_data, training=True)
123
+ giou_loss = conf_loss = prob_loss = 0
124
+
125
+ # accumulate the loss over all output scales (no optimizer update in the test step)
126
+ for i in range(len(freeze_layers)):
127
+ conv, pred = pred_result[i * 2], pred_result[i * 2 + 1]
128
+ loss_items = compute_loss(pred, conv, target[i][0], target[i][1], STRIDES=STRIDES, NUM_CLASS=NUM_CLASS, IOU_LOSS_THRESH=IOU_LOSS_THRESH, i=i)
129
+ giou_loss += loss_items[0]
130
+ conf_loss += loss_items[1]
131
+ prob_loss += loss_items[2]
132
+
133
+ total_loss = giou_loss + conf_loss + prob_loss
134
+
135
+ tf.print("=> TEST STEP %4d giou_loss: %4.2f conf_loss: %4.2f "
136
+ "prob_loss: %4.2f total_loss: %4.2f" % (global_steps, giou_loss, conf_loss,
137
+ prob_loss, total_loss))
138
+
139
+ for epoch in range(first_stage_epochs + second_stage_epochs):
140
+ if epoch < first_stage_epochs:
141
+ if not isfreeze:
142
+ isfreeze = True
143
+ for name in freeze_layers:
144
+ freeze = model.get_layer(name)
145
+ freeze_all(freeze)
146
+ elif epoch >= first_stage_epochs:
147
+ if isfreeze:
148
+ isfreeze = False
149
+ for name in freeze_layers:
150
+ freeze = model.get_layer(name)
151
+ unfreeze_all(freeze)
152
+ for image_data, target in trainset:
153
+ train_step(image_data, target)
154
+ for image_data, target in testset:
155
+ test_step(image_data, target)
156
+ model.save_weights("./checkpoints/yolov4")
157
+
158
+ if __name__ == '__main__':
159
+ try:
160
+ app.run(main)
161
+ except SystemExit:
162
+ pass
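The loop above trains in two stages: for the first cfg.TRAIN.FISRT_STAGE_EPOCHS epochs the layers named in freeze_layers are frozen with freeze_all(), and they are unfrozen with unfreeze_all() for the remaining epochs. A self-contained toy sketch of the same freeze-then-unfreeze idea using plain Keras trainable flags; the layer names and sizes are made up purely for illustration:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', name='backbone'),
    tf.keras.layers.Dense(2, name='head'),
])
model.build(input_shape=(None, 4))

# stage 1: freeze the "backbone" layer and train only the head
model.get_layer('backbone').trainable = False
# ... run the first-stage epochs here ...

# stage 2: unfreeze it and continue training end to end
model.get_layer('backbone').trainable = True
# ... run the second-stage epochs here ...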