glenn-jocher committed on
Commit 364fcfd
1 Parent(s): ef58dac

PANet update

README.md CHANGED
@@ -4,7 +4,7 @@
 
  This repository represents Ultralytics open-source research into future object detection methods, and incorporates our lessons learned and best practices evolved over training thousands of models on custom client datasets with our previous YOLO repository https://github.com/ultralytics/yolov3. **All code and models are under active development, and are subject to modification or deletion without notice.** Use at your own risk.
 
- <img src="https://user-images.githubusercontent.com/26833433/84200349-729f2680-aa5b-11ea-8f9a-604c9e01a658.png" width="1000">** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP32 inference, postprocessing and NMS.
+ <img src="https://user-images.githubusercontent.com/26833433/85336627-c6663280-b493-11ea-9b0a-289b0f182b84.png" width="1000">** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 8, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS.
 
  - **June 19, 2020**: [FP16](https://pytorch.org/docs/stable/nn.html#torch.nn.Module.half) as new default for smaller checkpoints and faster inference. Comparison in [d4c6674](https://github.com/ultralytics/yolov5/commit/d4c6674c98e19df4c40e33a777610a18d1961145).
  - **June 9, 2020**: [CSP](https://github.com/WongKinYiu/CrossStagePartialNetworks) updates to all YOLOv5 models. New models are faster, smaller and more accurate. Credit to @WongKinYiu for his excellent work with CSP.
@@ -14,13 +14,14 @@ This repository represents Ultralytics open-source research into future object d
 
  ## Pretrained Checkpoints
 
- | Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed<sub>GPU</sub> | FPS<sub>GPU</sub> || params | FLOPs |
+ | Model | AP<sup>val</sup> | AP<sup>test</sup> | AP<sub>50</sub> | Speed<sub>GPU</sub> | FPS<sub>GPU</sub> || params | FLOPS |
  |---------- |------ |------ |------ | -------- | ------| ------ |------ | :------: |
- | YOLOv5-s ([ckpt](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J)) | 35.5 | 35.5 | 55.0 | **2.1ms** | **476** || 7.1M | 12.6B
- | YOLOv5-m ([ckpt](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J)) | 42.7 | 42.7 | 62.4 | 3.2ms | 312 || 22.0M | 39.0B
- | YOLOv5-l ([ckpt](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J)) | 45.7 | 45.9 | 65.1 | 4.1ms | 243 || 50.3M | 89.0B
- | YOLOv5-x ([ckpt](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J)) | **47.2** | **47.3** | **66.6** | 6.5ms | 153 || 95.9M | 170.3B
- | YOLOv3-SPP ([ckpt](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J)) | 45.6 | 45.5 | 65.2 | 4.8ms | 208 || 63.0M | 118.0B
+ | [YOLOv5s](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J) | 36.5 | 36.5 | 55.6 | **2.2ms** | **455** || 7.5M | 13.2B
+ | [YOLOv5m](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J) | 43.4 | 43.4 | 62.4 | 3.0ms | 333 || 21.8M | 39.4B
+ | [YOLOv5l](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J) | 46.6 | 46.7 | 65.4 | 3.9ms | 256 || 47.8M | 88.1B
+ | [YOLOv5x](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J) | **48.2** | **48.3** | **66.9** | 6.1ms | 164 || 89.0M | 166.4B
+ | [YOLOv3-SPP](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J) | 45.6 | 45.5 | 65.2 | 4.5ms | 222 || 63.0M | 118.0B
+
 
  ** AP<sup>test</sup> denotes COCO [test-dev2017](http://cocodataset.org/#upload) server results, all other AP results in the table denote val2017 accuracy.
  ** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by `python test.py --img 736 --conf 0.001`
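The Speed<sub>GPU</sub> and FPS<sub>GPU</sub> columns in the updated table appear to be related by FPS ≈ 1000 / (ms per image). The sketch below checks that relation against the new rows; `fps_from_latency_ms` is a hypothetical helper name used only for illustration and is not part of the repository.

```python
# Hypothetical helper (not part of the repository): convert per-image GPU
# latency in milliseconds to frames per second, rounded to the nearest integer.
def fps_from_latency_ms(latency_ms: float) -> int:
    return round(1000.0 / latency_ms)


# (latency in ms, FPS) pairs copied from the updated README table.
rows = {
    "YOLOv5s": (2.2, 455),
    "YOLOv5m": (3.0, 333),
    "YOLOv5l": (3.9, 256),
    "YOLOv5x": (6.1, 164),
    "YOLOv3-SPP": (4.5, 222),
}

for name, (ms, table_fps) in rows.items():
    print(f"{name}: {ms} ms/img -> {fps_from_latency_ms(ms)} FPS (table: {table_fps})")
```

Every row of the new table is consistent with this conversion, which suggests the FPS column is derived from the measured latency rather than measured separately.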
models/yolov3-spp.yaml CHANGED
@@ -25,8 +25,7 @@ backbone:
  [-1, 4, Bottleneck, [1024]], # 10
  ]
 
- # yolov3-spp head
- # na = len(anchors[0])
+ # YOLOv3-SPP head
  head:
  [[-1, 1, Bottleneck, [1024, False]], # 11
  [-1, 1, SPP, [512, [5, 9, 13]]],
models/yolov5l.yaml CHANGED
@@ -5,41 +5,48 @@ width_multiple: 1.0 # layer channel multiple
 
  # anchors
  anchors:
- - [10,13, 16,30, 33,23] # P3/8
- - [30,61, 62,45, 59,119] # P4/16
  - [116,90, 156,198, 373,326] # P5/32
+ - [30,61, 62,45, 59,119] # P4/16
+ - [10,13, 16,30, 33,23] # P3/8
 
- # yolov5 backbone
+ # YOLOv5 backbone
  backbone:
  # [from, number, module, args]
- [[-1, 1, Focus, [64, 3]], # 1-P1/2
- [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
- [-1, 3, Bottleneck, [128]],
- [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
+ [[-1, 1, Focus, [64, 3]], # 0-P1/2
+ [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
+ [-1, 3, BottleneckCSP, [128]],
+ [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
  [-1, 9, BottleneckCSP, [256]],
- [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
+ [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
  [-1, 9, BottleneckCSP, [512]],
- [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
+ [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
  [-1, 1, SPP, [1024, [5, 9, 13]]],
- [-1, 6, BottleneckCSP, [1024]], # 10
  ]
 
- # yolov5 head
+ # YOLOv5 head
  head:
- [[-1, 3, BottleneckCSP, [1024, False]], # 11
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 12 (P5/32-large)
+ [[-1, 3, BottleneckCSP, [1024, False]], # 9
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 6], 1, Concat, [1]], # cat backbone P4
  [-1, 1, Conv, [512, 1, 1]],
- [-1, 3, BottleneckCSP, [512, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 17 (P4/16-medium)
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 6], 1, Concat, [1]], # cat backbone P4
+ [-1, 3, BottleneckCSP, [512, False]], # 13
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 1, Conv, [256, 1, 1]],
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 3, BottleneckCSP, [256, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P3/8-small)
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 18 (P3/8-small)
+
+ [-2, 1, Conv, [256, 3, 2]],
+ [[-1, 14], 1, Concat, [1]], # cat head P4
+ [-1, 3, BottleneckCSP, [512, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P4/16-medium)
+
+ [-2, 1, Conv, [512, 3, 2]],
+ [[-1, 10], 1, Concat, [1]], # cat head P5
+ [-1, 3, BottleneckCSP, [1024, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 26 (P5/32-large)
 
- [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
+ [[], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3)
  ]
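The new head mixes relative `from` values (`-1`, `-2`) with absolute layer indices (`6`, `4`, `14`, `10`), so the wiring of the PANet path is easier to follow once everything is expressed as absolute indices. The sketch below does that for the `from` column of the updated yolov5l.yaml. It assumes that negative values count back from the current layer and non-negative values are absolute; `resolve_from` is a hypothetical helper, not the repository's parser.

```python
from typing import List, Union

From = Union[int, List[int]]


def resolve_from(layers: List[From]) -> List[List[int]]:
    """Convert each layer's `from` field to a list of absolute layer indices.

    Assumption: negative values are relative to the current layer (-1 is the
    previous layer) and non-negative values are already absolute. For layer 0
    the result is [-1], which stands for the network input.
    """
    resolved = []
    for i, f in enumerate(layers):
        sources = f if isinstance(f, list) else [f]
        resolved.append([i + s if s < 0 else s for s in sources])
    return resolved


# `from` column of the updated config: 9 backbone layers, then the head.
backbone_from: List[From] = [-1] * 9      # layers 0-8
head_from: List[From] = [
    -1,                                   # 9   BottleneckCSP
    -1, -1, [-1, 6], -1,                  # 10-13 (cat backbone P4)
    -1, -1, [-1, 4], -1, -1,              # 14-18 (cat backbone P3, P3 output)
    -2, [-1, 14], -1, -1,                 # 19-22 (cat head P4, P4 output)
    -2, [-1, 10], -1, -1,                 # 23-26 (cat head P5, P5 output)
    [],                                   # 27  Detect
]

absolute = resolve_from(backbone_from + head_from)
print(absolute[19])  # [17]
print(absolute[20])  # [19, 14]
print(absolute[24])  # [23, 10]
```

Under that assumption, the `-2` entries skip over the `nn.Conv2d` output layers (18 and 22) and read from the preceding `BottleneckCSP` blocks (17 and 21), which matches the layer-number comments in the config.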
models/yolov5m.yaml CHANGED
@@ -5,41 +5,48 @@ width_multiple: 0.75 # layer channel multiple
 
  # anchors
  anchors:
- - [10,13, 16,30, 33,23] # P3/8
- - [30,61, 62,45, 59,119] # P4/16
  - [116,90, 156,198, 373,326] # P5/32
+ - [30,61, 62,45, 59,119] # P4/16
+ - [10,13, 16,30, 33,23] # P3/8
 
- # yolov5 backbone
+ # YOLOv5 backbone
  backbone:
  # [from, number, module, args]
- [[-1, 1, Focus, [64, 3]], # 1-P1/2
- [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
- [-1, 3, Bottleneck, [128]],
- [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
+ [[-1, 1, Focus, [64, 3]], # 0-P1/2
+ [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
+ [-1, 3, BottleneckCSP, [128]],
+ [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
  [-1, 9, BottleneckCSP, [256]],
- [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
+ [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
  [-1, 9, BottleneckCSP, [512]],
- [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
+ [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
  [-1, 1, SPP, [1024, [5, 9, 13]]],
- [-1, 6, BottleneckCSP, [1024]], # 10
  ]
 
- # yolov5 head
+ # YOLOv5 head
  head:
- [[-1, 3, BottleneckCSP, [1024, False]], # 11
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 12 (P5/32-large)
+ [[-1, 3, BottleneckCSP, [1024, False]], # 9
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 6], 1, Concat, [1]], # cat backbone P4
  [-1, 1, Conv, [512, 1, 1]],
- [-1, 3, BottleneckCSP, [512, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 17 (P4/16-medium)
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 6], 1, Concat, [1]], # cat backbone P4
+ [-1, 3, BottleneckCSP, [512, False]], # 13
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 1, Conv, [256, 1, 1]],
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 3, BottleneckCSP, [256, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P3/8-small)
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 18 (P3/8-small)
+
+ [-2, 1, Conv, [256, 3, 2]],
+ [[-1, 14], 1, Concat, [1]], # cat head P4
+ [-1, 3, BottleneckCSP, [512, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P4/16-medium)
+
+ [-2, 1, Conv, [512, 3, 2]],
+ [[-1, 10], 1, Concat, [1]], # cat head P5
+ [-1, 3, BottleneckCSP, [1024, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 26 (P5/32-large)
 
- [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
+ [[], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3)
  ]
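Each output layer in the head is an `nn.Conv2d` with `na * (nc + 5)` output channels. The sketch below works that expression through for the anchors in this config. Two values are assumptions, because they are set outside the hunk: `nc = 80` (the COCO class count) and `na` taken as the number of (width, height) pairs in one anchor row.

```python
# Anchor rows copied from the updated config; each row lists (w, h) pairs
# for one detection level.
anchors = [
    [116, 90, 156, 198, 373, 326],  # P5/32
    [30, 61, 62, 45, 59, 119],      # P4/16
    [10, 13, 16, 30, 33, 23],       # P3/8
]

nc = 80                     # assumption: COCO class count, defined outside the hunk
na = len(anchors[0]) // 2   # assumption: anchors per level = number of (w, h) pairs

# Each anchor predicts 4 box coordinates, 1 objectness score and nc class scores.
out_channels = na * (nc + 5)
print(na, out_channels)  # 3 255
```

With these assumptions every output convolution (layers 18, 22 and 26 in the new head) produces 255 channels, regardless of which pyramid level it is attached to.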
models/yolov5s.yaml CHANGED
@@ -5,41 +5,48 @@ width_multiple: 0.50 # layer channel multiple
 
  # anchors
  anchors:
- - [10,13, 16,30, 33,23] # P3/8
- - [30,61, 62,45, 59,119] # P4/16
  - [116,90, 156,198, 373,326] # P5/32
+ - [30,61, 62,45, 59,119] # P4/16
+ - [10,13, 16,30, 33,23] # P3/8
 
- # yolov5 backbone
+ # YOLOv5 backbone
  backbone:
  # [from, number, module, args]
- [[-1, 1, Focus, [64, 3]], # 1-P1/2
- [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
- [-1, 3, Bottleneck, [128]],
- [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
+ [[-1, 1, Focus, [64, 3]], # 0-P1/2
+ [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
+ [-1, 3, BottleneckCSP, [128]],
+ [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
  [-1, 9, BottleneckCSP, [256]],
- [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
+ [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
  [-1, 9, BottleneckCSP, [512]],
- [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
+ [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
  [-1, 1, SPP, [1024, [5, 9, 13]]],
- [-1, 6, BottleneckCSP, [1024]], # 10
  ]
 
- # yolov5 head
+ # YOLOv5 head
  head:
- [[-1, 3, BottleneckCSP, [1024, False]], # 11
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 12 (P5/32-large)
+ [[-1, 3, BottleneckCSP, [1024, False]], # 9
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 6], 1, Concat, [1]], # cat backbone P4
  [-1, 1, Conv, [512, 1, 1]],
- [-1, 3, BottleneckCSP, [512, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 17 (P4/16-medium)
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 6], 1, Concat, [1]], # cat backbone P4
+ [-1, 3, BottleneckCSP, [512, False]], # 13
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 1, Conv, [256, 1, 1]],
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 3, BottleneckCSP, [256, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P3/8-small)
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 18 (P3/8-small)
+
+ [-2, 1, Conv, [256, 3, 2]],
+ [[-1, 14], 1, Concat, [1]], # cat head P4
+ [-1, 3, BottleneckCSP, [512, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P4/16-medium)
+
+ [-2, 1, Conv, [512, 3, 2]],
+ [[-1, 10], 1, Concat, [1]], # cat head P5
+ [-1, 3, BottleneckCSP, [1024, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 26 (P5/32-large)
 
- [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
+ [[], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3)
  ]
models/yolov5x.yaml CHANGED
@@ -5,41 +5,48 @@ width_multiple: 1.25 # layer channel multiple
 
  # anchors
  anchors:
- - [10,13, 16,30, 33,23] # P3/8
- - [30,61, 62,45, 59,119] # P4/16
  - [116,90, 156,198, 373,326] # P5/32
+ - [30,61, 62,45, 59,119] # P4/16
+ - [10,13, 16,30, 33,23] # P3/8
 
- # yolov5 backbone
+ # YOLOv5 backbone
  backbone:
  # [from, number, module, args]
- [[-1, 1, Focus, [64, 3]], # 1-P1/2
- [-1, 1, Conv, [128, 3, 2]], # 2-P2/4
- [-1, 3, Bottleneck, [128]],
- [-1, 1, Conv, [256, 3, 2]], # 4-P3/8
+ [[-1, 1, Focus, [64, 3]], # 0-P1/2
+ [-1, 1, Conv, [128, 3, 2]], # 1-P2/4
+ [-1, 3, BottleneckCSP, [128]],
+ [-1, 1, Conv, [256, 3, 2]], # 3-P3/8
  [-1, 9, BottleneckCSP, [256]],
- [-1, 1, Conv, [512, 3, 2]], # 6-P4/16
+ [-1, 1, Conv, [512, 3, 2]], # 5-P4/16
  [-1, 9, BottleneckCSP, [512]],
- [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
+ [-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
  [-1, 1, SPP, [1024, [5, 9, 13]]],
- [-1, 6, BottleneckCSP, [1024]], # 10
  ]
 
- # yolov5 head
+ # YOLOv5 head
  head:
- [[-1, 3, BottleneckCSP, [1024, False]], # 11
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 12 (P5/32-large)
+ [[-1, 3, BottleneckCSP, [1024, False]], # 9
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 6], 1, Concat, [1]], # cat backbone P4
  [-1, 1, Conv, [512, 1, 1]],
- [-1, 3, BottleneckCSP, [512, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 17 (P4/16-medium)
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 6], 1, Concat, [1]], # cat backbone P4
+ [-1, 3, BottleneckCSP, [512, False]], # 13
 
- [-2, 1, nn.Upsample, [None, 2, 'nearest']],
- [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 1, Conv, [256, 1, 1]],
+ [-1, 1, nn.Upsample, [None, 2, 'nearest']],
+ [[-1, 4], 1, Concat, [1]], # cat backbone P3
  [-1, 3, BottleneckCSP, [256, False]],
- [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P3/8-small)
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 18 (P3/8-small)
+
+ [-2, 1, Conv, [256, 3, 2]],
+ [[-1, 14], 1, Concat, [1]], # cat head P4
+ [-1, 3, BottleneckCSP, [512, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 22 (P4/16-medium)
+
+ [-2, 1, Conv, [512, 3, 2]],
+ [[-1, 10], 1, Concat, [1]], # cat head P5
+ [-1, 3, BottleneckCSP, [1024, False]],
+ [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], # 26 (P5/32-large)
 
- [[], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
+ [[], 1, Detect, [nc, anchors]], # Detect(P5, P4, P3)
  ]
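The four YOLOv5 configs share the same layer list and differ in the `width_multiple` shown in their hunk headers (0.50, 0.75, 1.0 and 1.25, described as the "layer channel multiple"). The sketch below shows how such a multiplier could scale the nominal channel counts in the layer arguments. `scale_channels` is a hypothetical helper, and rounding up to a multiple of 8 is an assumption for illustration; the diff does not show how the parser applies the multiplier.

```python
import math


def scale_channels(channels: int, width_multiple: float, divisor: int = 8) -> int:
    """Hypothetical helper: scale a nominal channel count by the width multiple.

    Assumption: the result is rounded up to the nearest multiple of `divisor`.
    """
    return math.ceil(channels * width_multiple / divisor) * divisor


# width_multiple values taken from the hunk headers of the four configs.
width_multiple = {"yolov5s": 0.50, "yolov5m": 0.75, "yolov5l": 1.0, "yolov5x": 1.25}

# Nominal channel counts that appear in the backbone layer arguments.
nominal = [64, 128, 256, 512, 1024]

for name, gw in width_multiple.items():
    print(name, [scale_channels(c, gw) for c in nominal])
```

For these particular values the scaled widths are already multiples of 8, so the rounding step has no effect; it only matters for multipliers or channel counts that do not divide evenly.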
utils/utils.py CHANGED
@@ -1094,12 +1094,14 @@ def plot_study_txt(f='study.txt', x=None): # from utils.utils import *; plot_st
 
  ax2.plot(1E3 / np.array([209, 140, 97, 58, 35, 18]), [33.5, 39.1, 42.5, 45.9, 49., 50.5],
  'k.-', linewidth=2, markersize=8, alpha=.25, label='EfficientDet')
+
+ ax2.grid()
  ax2.set_xlim(0, 30)
- ax2.set_ylim(25, 50)
- ax2.set_xlabel('GPU Latency (ms)')
+ ax2.set_ylim(28, 50)
+ ax2.set_yticks(np.arange(30, 55, 5))
+ ax2.set_xlabel('GPU Speed (ms/img)')
  ax2.set_ylabel('COCO AP val')
  ax2.legend(loc='lower right')
- ax2.grid()
  plt.savefig('study_mAP_latency.png', dpi=300)
  plt.savefig(f.replace('.txt', '.png'), dpi=200)
 
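The EfficientDet reference curve is plotted as `1E3 / np.array([...])` on the x axis, which reads as a conversion from frames per second to milliseconds per image (an interpretation, since the diff does not label the array). The sketch below repeats that conversion in plain Python and checks which reference points fall inside the new axis window; it uses `range` in place of `np.arange` so that it runs without NumPy.

```python
# EfficientDet reference values copied from the plot call: assumed FPS and COCO AP.
fps = [209, 140, 97, 58, 35, 18]
ap = [33.5, 39.1, 42.5, 45.9, 49.0, 50.5]

# Same conversion as 1E3 / np.array(fps): milliseconds per image.
latency_ms = [1e3 / f for f in fps]

xlim, ylim = (0, 30), (28, 50)    # axis limits set in the updated code
yticks = list(range(30, 55, 5))   # pure-Python equivalent of np.arange(30, 55, 5)

# Keep only the points that lie inside the plotted window.
visible = [
    (round(x, 1), y)
    for x, y in zip(latency_ms, ap)
    if xlim[0] <= x <= xlim[1] and ylim[0] <= y <= ylim[1]
]

print(yticks)
print(visible)
```

Under that reading, the last reference point (18 FPS, 50.5 AP) converts to roughly 55.6 ms/img and lies outside both axis limits, so only the first five points land inside the plotted area.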