mayrajeo committed on
Commit
7f5ad83
1 Parent(s): 2b2bbfd

Update README.md

  1. README.md +168 -0
README.md CHANGED
@@ -1,3 +1,171 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
+
5
+ # Model Card for YOLOv8 models for detecting marine vessels from RGB Sentinel-2 images
6
+
7
+ <!-- Provide a quick summary of what the model is/does. -->
8
+
9
+ TBA
10
+
11
+ ## Model Details
12
+
13
+ ### Model Description
14
+
15
+ <!-- Provide a longer summary of what this model is. -->
16
+
17
+ TBA
18
+
19
+ - **Developed by:** Janne Mäyrä
20
+ - **Model type:** Object Detection
21
+ - **Finetuned from model:** YOLOv8 pretrained models
22
+
23
+ ### Model Sources
24
+
25
+ <!-- Provide the basic links for the model. -->
26
+
27
+ - **Paper:** In progress
28
+ - **Demo:** [https://huggingface.co/spaces/mayrajeo/marine-vessel-detection](https://huggingface.co/spaces/mayrajeo/marine-vessel-detection)
29
+
30
+ ## Uses
31
+
32
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
33
+
34
+ ### Direct Use
35
+
36
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
37
+
38
+ The models are trained to detect marine vessels in 320x320 pixel patches of Sentinel-2 RGB images at 10 m resolution. They also produce detections outside water areas, but those detections can be eliminated by using external datasets such as water masks.
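As an illustration of the patch size mentioned above, the following sketch computes the upper-left corners of 320x320 pixel tiles covering a larger image. The function name `patch_origins` and the choice to shift the last row and column inward (so that every tile stays inside the image) are assumptions made for this example, not a description of the authors' tiling code.

```python
def patch_origins(width, height, patch=320):
    """Return (x, y) upper-left pixel corners of patch-sized tiles covering an image.

    Assumption for this sketch: if the image size is not a multiple of the
    patch size, the last row/column is shifted inward and overlaps its neighbour.
    """
    if width < patch or height < patch:
        raise ValueError("image is smaller than one patch")
    xs = list(range(0, width - patch + 1, patch))
    ys = list(range(0, height - patch + 1, patch))
    # Add one extra, overlapping tile if the regular grid leaves a remainder.
    if xs[-1] + patch < width:
        xs.append(width - patch)
    if ys[-1] + patch < height:
        ys.append(height - patch)
    return [(x, y) for y in ys for x in xs]


# At 10 m per pixel, one 320-pixel patch side covers 3200 m on the ground.
PATCH_SIDE_M = 320 * 10
```

Each returned corner can be used to crop one patch from the mosaic before passing it to a model.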
39
+
40
+ ### Out-of-Scope Use
41
+
42
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
43
+
44
+ These models are not suitable for any purpose other than detecting potential marine vessels in satellite imagery.
45
+
46
+ ### Recommendations
47
+
48
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
49
+
50
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
51
+
52
+ ## How to Get Started with the Model
53
+
54
+ TBA
55
+
56
+ ## Training Details
57
+
58
+ ### Training Data
59
+
60
+ <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
61
+
62
+ The models are trained using the following Sentinel-2 mosaics and manually annotated marine vessel data. Archipelago sea 2 and Kvarken were used as test data. The other three locations were sliced into 320x320 pixel patches. These patches were then spatially split into five equal-sized folds so that each fold contained data from all timesteps and locations, and all patch locations that contained an annotated vessel in any timestep were included in the folds. In total, this dataset contained 3264 image patches of 320x320 pixels, of which 1974 contained annotated vessels and 1290 were background patches.
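One way to build such folds is sketched below. It assigns each patch position (a grid cell within a location) to a fold and then places every timestep of that position in the same fold, so each fold sees all locations and all dates. The function `assign_folds`, the tuple layout `(location, date, col, row)`, and the random round-robin assignment are assumptions for illustration; the actual split may have been done differently, for example block-wise so that neighbouring patches stay together.

```python
import random
from collections import defaultdict


def assign_folds(patches, n_folds=5, seed=0):
    """Map each (location, date, col, row) patch to a fold index.

    All timesteps of the same patch position share a fold, and positions are
    dealt round-robin per location so fold sizes stay balanced.
    `patches` must be a list (it is iterated twice).
    """
    positions = defaultdict(set)
    for location, _date, col, row in patches:
        positions[location].add((col, row))

    rng = random.Random(seed)
    fold_of_position = {}
    for location in sorted(positions):
        cells = sorted(positions[location])
        rng.shuffle(cells)  # hypothetical choice: random order of positions
        for i, cell in enumerate(cells):
            fold_of_position[(location, cell)] = i % n_folds

    return {p: fold_of_position[(p[0], (p[2], p[3]))] for p in patches}
```

Keeping all dates of a position in one fold avoids evaluating on a patch location that the model has already seen at another timestep during training.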
63
+
64
+ Training and validation data:
65
+
66
+ |Location|Date|Number of annotations|Annotated patches|Background patches|
+ |--------|----|---------------------|-----------------|------------------|
+ |Archipelago sea 1|2022-06-19|519|271|269|
+ |Archipelago sea 1|2022-07-21|1518|387|153|
+ |Archipelago sea 1|2022-08-13|1368|402|138|
+ |Gulf of Finland|2022-06-06|275|138|241|
+ |Gulf of Finland|2022-06-26|1190|269|110|
+ |Gulf of Finland|2022-07-21|971|260|119|
+ |Bothnian Bay|2022-06-27|122|81|88|
+ |Bothnian Bay|2022-07-12|162|98|71|
+ |Bothnian Bay|2022-08-28|98|68|101|
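The patch totals quoted in the text (3264 patches, 1974 annotated, 1290 background) can be checked against the rows of this table. The snippet below is only a consistency check on the published numbers; the variable names are chosen for this example.

```python
# (location, date, annotations, annotated patches, background patches),
# copied from the training and validation table.
rows = [
    ("Archipelago sea 1", "2022-06-19", 519, 271, 269),
    ("Archipelago sea 1", "2022-07-21", 1518, 387, 153),
    ("Archipelago sea 1", "2022-08-13", 1368, 402, 138),
    ("Gulf of Finland", "2022-06-06", 275, 138, 241),
    ("Gulf of Finland", "2022-06-26", 1190, 269, 110),
    ("Gulf of Finland", "2022-07-21", 971, 260, 119),
    ("Bothnian Bay", "2022-06-27", 122, 81, 88),
    ("Bothnian Bay", "2022-07-12", 162, 98, 71),
    ("Bothnian Bay", "2022-08-28", 98, 68, 101),
]

annotated_patches = sum(r[3] for r in rows)      # 1974
background_patches = sum(r[4] for r in rows)     # 1290
total_patches = annotated_patches + background_patches  # 3264
total_annotations = sum(r[2] for r in rows)      # 6223
```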
77
+
78
+
79
+ #### Training Hyperparameters
80
+
81
+ Training configurations can be found in each model directory, in the file `args.yaml`.
82
+
83
+ ## Evaluation
84
+
85
+ <!-- This section describes the evaluation protocols and provides the results. -->
86
+
87
+ ### Testing Data, Factors & Metrics
88
+
89
+ #### Testing Data
90
+
91
+ <!-- This should link to a Data Card if possible. -->
92
+
93
+ Test data consists of six Sentinel-2 mosaics:
94
+
95
+ |Location|Date|Number of annotations|
+ |--------|----|---------------------|
+ |Archipelago sea 2|2021-07-14|433|
+ |Archipelago sea 2|2022-06-24|413|
+ |Archipelago sea 2|2022-08-13|391|
+ |Kvarken|2022-06-17|79|
+ |Kvarken|2022-07-12|167|
+ |Kvarken|2022-08-26|81|
103
+
104
+ #### Factors
105
+
106
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
107
+
108
+ Before evaluating, the predictions for the test set are cleaned using the following steps:
109
+
110
+ 1. All predictions whose centroid points are not located on water are discarded. The water mask used contains layers `jarvi` (Lakes), `meri` (Sea) and `virtavesialue` (Rivers as polygon geometry) from the Topographical database by the National Land Survey of Finland. Unfortunately, this also discards all points not within the Finnish borders.
111
+ 2. All predictions whose centroid points are located on water rock areas are discarded. The mask is the layer `vesikivikko` (Water rock areas) from the Topographical database.
112
+ 3. All predictions that contain an above water rock within the bounding box are discarded. The mask contains classes `38511`, `38512`, `38513` from the layer `vesikivi` in the Topographical database.
113
+ 4. All predictions that contain a lighthouse or a sector light within the bounding box are discarded. Lighthouses and sector lights come from Väylävirasto (Finnish Transport Infrastructure Agency) data, where the `ty_njr` class ids are 1, 2, 3, 4, 5 and 8.
114
+ 5. All predictions that are wind turbines are discarded. Wind turbines are found in the Topographical database layer `tuulivoimalat`.
115
+ 6. TODO: Filter aquaculture and net pens as soon as a suitable layer for them is found.
116
+ 7. All predictions that are obviously too large are discarded. The prediction is defined to be "too large" if either of its edges is longer than 750 meters.
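Two of the cleaning steps above, the water-mask check on the centroid (step 1) and the size limit (step 7), can be sketched in plain Python as follows. The functions `point_in_polygon` and `keep_prediction` are hypothetical helpers written for this example. The sketch assumes that bounding boxes and mask polygons share a projected coordinate system with metre units, and that each water polygon is a simple ring without holes. A real pipeline would more likely use a GIS library and the actual Topographical database layers.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) to +infinity cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def keep_prediction(box, water_polygons, max_edge_m=750.0):
    """Return True if a predicted box (xmin, ymin, xmax, ymax) passes both checks.

    Step 7: discard boxes with an edge longer than max_edge_m metres.
    Step 1: discard boxes whose centroid is not inside any water polygon.
    """
    xmin, ymin, xmax, ymax = box
    if (xmax - xmin) > max_edge_m or (ymax - ymin) > max_edge_m:
        return False
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    return any(point_in_polygon(cx, cy, poly) for poly in water_polygons)
```

The remaining steps (water rock areas, above-water rocks, lighthouses, wind turbines) follow the same pattern with different mask layers and with either a centroid test or a box-intersection test.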
117
+
118
+ #### Metrics
119
+
120
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
121
+
122
+ Precision and recall at an IoU threshold of 0.5, together with mAP50 and mAP.
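To make the IoU threshold concrete, here is a minimal sketch of how precision and recall can be computed for axis-aligned boxes at a fixed IoU threshold. It is not the evaluation code used for the results below: the function names and the greedy, score-ordered matching of predictions to ground-truth boxes are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, truths, iou_threshold=0.5):
    """predictions: list of (box, score); truths: list of boxes.

    Predictions are matched greedily in descending score order; each
    ground-truth box can be matched at most once.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_j = 0.0, None
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, truth)
            if value > best_iou:
                best_iou, best_j = value, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1
    fp = len(predictions) - tp   # predictions without a matching vessel
    fn = len(truths) - tp        # annotated vessels that were missed
    precision = tp / (tp + fp) if predictions else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

With two annotated vessels and three predictions of which one matches, this yields a precision of 1/3 and a recall of 1/2.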
123
+
124
+ ### Results
125
+
126
+ 5-fold cross-validation results:
127
+
128
+ | Model | Precision, max | Precision, mean | Precision, min | Recall, max | Recall, mean | Recall, min | mAP, max | mAP, mean | mAP, min | mAP50, max | mAP50, mean | mAP50, min |
+ |:------|---------------:|----------------:|---------------:|------------:|-------------:|------------:|---------:|----------:|---------:|-----------:|------------:|-----------:|
+ | yolov8n | 0.85001 | 0.840136 | 0.82782 | 0.82951 | 0.804012 | 0.78738 | 0.38816 | 0.380828 | 0.37637 | 0.84525 | 0.833424 | 0.81883 |
+ | yolov8s | 0.86717 | 0.854216 | 0.84347 | 0.84939 | 0.84065 | 0.83222 | 0.41098 | 0.406258 | 0.40374 | 0.86933 | 0.861404 | 0.84934 |
+ | yolov8m | 0.86108 | 0.853192 | 0.84191 | 0.87385 | 0.846722 | 0.83 | 0.41739 | 0.410742 | 0.40496 | 0.87772 | 0.862594 | 0.84602 |
+ | yolov8l | 0.86911 | 0.863254 | 0.85604 | 0.86468 | 0.841572 | 0.82725 | 0.41694 | 0.411712 | 0.40505 | 0.88288 | 0.867134 | 0.85743 |
+ | yolov8x | 0.86411 | 0.856008 | 0.85045 | 0.86086 | 0.845044 | 0.83029 | 0.42065 | 0.411532 | 0.40231 | 0.87069 | 0.863538 | 0.85316 |
135
+
136
+ Test set results for each model type:
137
+
138
+ | Model | Precision, max | Precision, mean | Precision, min | Recall, max | Recall, mean | Recall, min | mAP, max | mAP, mean | mAP, min | mAP50, max | mAP50, mean | mAP50, min |
+ |:------|---------------:|----------------:|---------------:|------------:|-------------:|------------:|---------:|----------:|---------:|-----------:|------------:|-----------:|
+ | yolov8n | 0.773252 | 0.755268 | 0.738894 | 0.833611 | 0.821598 | 0.811092 | 0.29 | 0.2844 | 0.277 | 0.766 | 0.7466 | 0.732 |
+ | yolov8s | 0.792886 | 0.779485 | 0.768628 | 0.845133 | 0.832765 | 0.825042 | 0.311 | 0.3054 | 0.298 | 0.784 | 0.7756 | 0.764 |
+ | yolov8m | 0.825823 | 0.792868 | 0.772218 | 0.857435 | 0.839866 | 0.807621 | 0.324 | 0.318 | 0.314 | 0.801 | 0.7808 | 0.769 |
+ | yolov8l | 0.801311 | 0.789472 | 0.772199 | 0.851569 | 0.844953 | 0.835289 | 0.326 | 0.323 | 0.317 | 0.797 | 0.7854 | 0.776 |
+ | yolov8x | 0.801969 | 0.784667 | 0.773465 | 0.845454 | 0.832825 | 0.799519 | 0.322 | 0.3154 | 0.304 | 0.788 | 0.7776 | 0.755 |
145
+
146
+
147
+ ### Compute Infrastructure
148
+
149
+ #### Hardware
150
+
151
+ An NVIDIA V100 GPU with 32 GB of memory, hosted on the computing nodes of the Puhti supercomputer operated by CSC - IT Center for Science, Finland.
152
+
153
+ #### Software
154
+
155
+ The models were trained as Slurm batch jobs on Puhti.
156
+
157
+ ## Citation
158
+
159
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
160
+
161
+ **BibTeX:**
162
+
163
+ TBA
164
+
165
+ **APA:**
166
+
167
+ TBA
168
+
169
+ ## Model Card Contact
170
+
171
+ TBA