# [GithubREPO](https://github.com/deepanshudashora/ERAV1/tree/master/session13)

# Training Procedure

1. The model is trained on a Tesla T4 (15 GB GPU memory)
2. The training is completed in two phases
3. The first phase contains 20 epochs and the second phase another 20 epochs
4. In the first phase the loss drops steadily, but in the second phase it drops more slowly
5. The two training loops run separately, with no validation during training other than the validation loss
6. The model is evaluated afterwards to get the reported numbers
7. Lightning saves checkpoints in .ckpt format, so we convert to plain torch format by saving the state dict as .pt
8. For this conversion we use the following snippet:
```python
import torch

# Load the Lightning checkpoint (.ckpt) and save just its state dict
best_model = torch.load(weights_path)
torch.save(best_model['state_dict'], 'best_model.pth')

# Rebuild the plain torch model and load the extracted weights
litemodel = YOLOv3(num_classes=num_classes)
litemodel.load_state_dict(torch.load('best_model.pth', map_location='cpu'))

# Save the final .pt checkpoint
torch.save(litemodel.state_dict(), PATH)
```
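One caveat worth flagging here: depending on how the LightningModule wraps the network (for example under a `self.model` attribute), the checkpoint's state-dict keys can carry a wrapper prefix that makes `load_state_dict` fail with missing-key errors. Below is a minimal sketch of stripping such a prefix; the `model.` prefix is an assumption about the module layout, not something taken from the repo:

```python
def strip_prefix(state_dict, prefix="model."):
    """Drop a wrapper prefix from checkpoint keys; other keys pass through."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }
```

If the bare model then loads cleanly, the converted `.pt` file behaves like any ordinary torch checkpoint.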

9. The model starts overfitting on the dataset after 30 epochs
10. Future improvements:
    1. Train the model in one shot instead of two separate phases
    2. Use a larger batch size (basically, earn more money and buy a better GPU)
    3. Data transformation also plays a vital role here
    4. The OneCycle LR range needs to be modified appropriately for a better learning rate

# Data Transformation

Along with the transforms listed in the [config file](https://github.com/deepanshudashora/ERAV1/blob/master/session13/lightning_version/config.py), we also apply a **mosaic transform** to 75% of the images.

[Reference](https://www.kaggle.com/code/nvnnghia/awesome-augmentation/notebook)
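The referenced notebook implements mosaic with OpenCV; the sketch below shows the core idea in NumPy. The function name, the fixed canvas size, and the top-left cropping rule are illustrative simplifications, not the repo's exact implementation:

```python
import numpy as np

def mosaic(images, out_size=416, seed=None):
    """Combine 4 images (each an (H, W, 3) uint8 array at least
    out_size x out_size) into one mosaic.

    A random centre point splits the canvas into four quadrants;
    each input image is cropped (from its top-left corner, for
    simplicity) to fill its quadrant.
    """
    rng = np.random.default_rng(seed)
    s = out_size
    canvas = np.zeros((s, s, 3), dtype=np.uint8)
    # Random centre, kept away from the borders
    xc = int(rng.uniform(s * 0.25, s * 0.75))
    yc = int(rng.uniform(s * 0.25, s * 0.75))
    # (y-slice, x-slice) per quadrant: top-left, top-right, bottom-left, bottom-right
    regions = [
        (slice(0, yc), slice(0, xc)),
        (slice(0, yc), slice(xc, s)),
        (slice(yc, s), slice(0, xc)),
        (slice(yc, s), slice(xc, s)),
    ]
    for img, (ys, xs) in zip(images, regions):
        h = ys.stop - ys.start
        w = xs.stop - xs.start
        # Crop an h x w patch from the source image to fit the quadrant
        canvas[ys, xs] = img[:h, :w]
    return canvas
```

In the actual pipeline the bounding boxes must be shifted and clipped along with each pasted crop; that bookkeeping is what the referenced notebook spends most of its code on.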
# Accuracy Report

```
Class accuracy is: 82.999725%
No obj accuracy is: 96.828300%
Obj accuracy is: 76.898473%

MAP: 0.29939851760864258
```
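These three numbers are the usual YOLOv3-style cell-wise checks. A simplified sketch of what each one measures, with illustrative names and flat lists standing in for the real grid tensors:

```python
def detection_accuracies(pred_obj, pred_cls, tgt_obj, tgt_cls, threshold=0.5):
    """Cell-wise accuracies in the style of common YOLOv3 training loops.

    pred_obj: predicted objectness scores (floats in [0, 1])
    pred_cls: predicted class index per cell
    tgt_obj:  1 if the cell contains an object, else 0
    tgt_cls:  true class index per cell (ignored where tgt_obj == 0)
    """
    obj_cells = [i for i, t in enumerate(tgt_obj) if t == 1]
    noobj_cells = [i for i, t in enumerate(tgt_obj) if t == 0]

    # Class accuracy: correct class on cells that contain an object
    class_acc = sum(pred_cls[i] == tgt_cls[i] for i in obj_cells) / max(len(obj_cells), 1)
    # Obj accuracy: objectness above threshold where an object exists
    obj_acc = sum(pred_obj[i] > threshold for i in obj_cells) / max(len(obj_cells), 1)
    # No-obj accuracy: objectness below threshold where no object exists
    noobj_acc = sum(pred_obj[i] < threshold for i in noobj_cells) / max(len(noobj_cells), 1)
    return class_acc, obj_acc, noobj_acc
```

A high no-obj accuracy is comparatively easy to reach because most grid cells are empty, which is why it sits well above the other two numbers.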
# [Training Logs](https://github.com/deepanshudashora/ERAV1/blob/master/session13/lightning_version/merged_logs.csv)

#### For faster execution, validation runs only once after epoch 20 during the first phase, and then every 5 epochs until epoch 40
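That schedule can be written down as a small helper (a sketch, assuming 1-indexed epochs):

```python
def should_validate(epoch: int) -> bool:
    """True if validation runs after this epoch: once after epoch 20,
    then every 5 epochs up to epoch 40."""
    if epoch <= 20:
        return epoch == 20
    return epoch <= 40 and epoch % 5 == 0
```

With PyTorch Lightning a uniform interval is usually set via the Trainer's `check_val_every_n_epoch` argument, though that alone cannot express this two-part schedule.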
```
      Unnamed: 0   lr-Adam    step  train_loss  epoch  val_loss
8150        8150       NaN  164299    4.186745   39.0       NaN
8151        8151  0.000132  164349         NaN    NaN       NaN
8152        8152       NaN  164349    2.936086   39.0       NaN
8153        8153  0.000132  164399         NaN    NaN       NaN
8154        8154       NaN  164399    4.777130   39.0       NaN
8155        8155  0.000132  164449         NaN    NaN       NaN
8156        8156       NaN  164449    3.139145   39.0       NaN
8157        8157  0.000132  164499         NaN    NaN       NaN
8158        8158       NaN  164499    4.596097   39.0       NaN
8159        8159  0.000132  164549         NaN    NaN       NaN
8160        8160       NaN  164549    5.587294   39.0       NaN
8161        8161  0.000132  164599         NaN    NaN       NaN
8162        8162       NaN  164599    4.592830   39.0       NaN
8163        8163  0.000132  164649         NaN    NaN       NaN
8164        8164       NaN  164649    3.914468   39.0       NaN
8165        8165  0.000132  164699         NaN    NaN       NaN
8166        8166       NaN  164699    3.180615   39.0       NaN
8167        8167  0.000132  164749         NaN    NaN       NaN
8168        8168       NaN  164749    5.772174   39.0       NaN
8169        8169  0.000132  164799         NaN    NaN       NaN
8170        8170       NaN  164799    2.894014   39.0       NaN
8171        8171  0.000132  164849         NaN    NaN       NaN
8172        8172       NaN  164849    4.473828   39.0       NaN
8173        8173  0.000132  164899         NaN    NaN       NaN
8174        8174       NaN  164899    6.397766   39.0       NaN
8175        8175  0.000132  164949         NaN    NaN       NaN
8176        8176       NaN  164949    3.789242   39.0       NaN
8177        8177  0.000132  164999         NaN    NaN       NaN
8178        8178       NaN  164999    5.182691   39.0       NaN
8179        8179  0.000132  165049         NaN    NaN       NaN
8180        8180       NaN  165049    4.845749   39.0       NaN
8181        8181  0.000132  165099         NaN    NaN       NaN
8182        8182       NaN  165099    3.672542   39.0       NaN
8183        8183  0.000132  165149         NaN    NaN       NaN
8184        8184       NaN  165149    4.230726   39.0       NaN
8185        8185  0.000132  165199         NaN    NaN       NaN
8186        8186       NaN  165199    4.625024   39.0       NaN
8187        8187  0.000132  165249         NaN    NaN       NaN
8188        8188       NaN  165249    6.549682   39.0       NaN
8189        8189  0.000132  165299         NaN    NaN       NaN
8190        8190       NaN  165299    5.040627   39.0       NaN
8191        8191  0.000132  165349         NaN    NaN       NaN
8192        8192       NaN  165349    5.857126   39.0       NaN
8193        8193  0.000132  165399         NaN    NaN       NaN
8194        8194       NaN  165399    3.081895   39.0       NaN
8195        8195  0.000132  165449         NaN    NaN       NaN
8196        8196       NaN  165449    3.945353   39.0       NaN
8197        8197  0.000132  165499         NaN    NaN       NaN
8198        8198       NaN  165499    3.513420   39.0       NaN
8199        8199       NaN  165519         NaN   39.0  6.084875
```
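To pull the sparse `val_loss` entries out of a merged log CSV like the one above, a stdlib sketch (it assumes the column names match the header shown; rows that only log the LR or the train loss leave the field empty, printed as NaN above):

```python
import csv

def val_loss_rows(path):
    """Scan a merged Lightning log CSV and return (epoch, val_loss) pairs."""
    rows = []
    with open(path, newline="") as f:
        for rec in csv.DictReader(f):
            # Keep only rows where a validation loss was actually logged
            if rec.get("val_loss"):
                rows.append((float(rec["epoch"]), float(rec["val_loss"])))
    return rows
```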
+
# Results
|
109 |
+
|
110 |
+
## For epochs 0 to 19
|
111 |
+
![accuracy_per_class.png](images/train_logs_1.png.png)
|
112 |
+
|
113 |
+
## From 19 to 20
|
114 |
+
![accuracy_per_class.png](images/train_logs_2.png)
|
115 |
+
|
116 |
+
## Full training logs for loss
|
117 |
+
|
118 |
+
![accuracy_per_class.png](images/full_training.png)
|