---
license: apache-2.0
---
General Information:
Used Dataset: cats_vs_dogs (https://huggingface.co/datasets/cats_vs_dogs)
Used Label: Each image is randomly flipped; flipped images are labeled 1, unflipped images 0.
Used Library: PyTorch
Used Model: ResNet18 from torchvision
Number of classes: 2 (0 means No flip and 1 means Flipped image)
Train Test Split: 70-30
Some sample images and labels from the created dataset:
Specific information about the Dataset:
The following files were removed from the dataset before training because they are broken or corrupted:
- ./kagglecatsanddogs_3367a/PetImages/Cat/666.jpg
- ./kagglecatsanddogs_3367a/PetImages/Cat/10404.jpg
- ./kagglecatsanddogs_3367a/PetImages/Dog/11702.jpg
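One way to detect such corrupted files up front is to let PIL verify every image before training. This is a hypothetical helper, not the card's code; note that `Image.verify()` is a cheap integrity check and may miss some forms of corruption.

```python
from pathlib import Path

from PIL import Image


def find_valid_images(root):
    """Split image paths under `root` into (valid, broken) based on a PIL
    integrity check; broken files can then be excluded from training."""
    valid, broken = [], []
    for p in sorted(Path(root).rglob("*.jpg")):
        try:
            with Image.open(p) as img:
                img.verify()  # raises on truncated or corrupt files
            valid.append(p)
        except Exception:
            broken.append(p)
    return valid, broken
```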
Training Information:
Total Epoch: 5
Pretrained: True (ImageNet weights; every layer is trainable)
Image Size: 224 x 224
Batch Size: 128
Optimizer: SGD
Learning Rate: 0.001 (Constant throughout the training)
Momentum: 0.9
Loss: Cross-entropy loss
Result:
Accuracy: 98.4266%
F1: 98.4271%
Recall: 98.4261%
Precision: 98.4265%
Confusion Matrix:
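The metrics above can be reproduced from the test-set predictions with scikit-learn. This is a sketch, not the card's evaluation code; in particular, macro averaging for F1/recall/precision is an assumption (the near-identical values across the four metrics are consistent with it, but the card does not state the averaging mode).

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)


def evaluate(y_true, y_pred):
    """Return the reported metrics (as percentages) and the confusion matrix.
    `average="macro"` is assumed, not confirmed by the card."""
    return {
        "accuracy": 100 * accuracy_score(y_true, y_pred),
        "f1": 100 * f1_score(y_true, y_pred, average="macro"),
        "recall": 100 * recall_score(y_true, y_pred, average="macro"),
        "precision": 100 * precision_score(y_true, y_pred, average="macro"),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```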
Some Misclassified Images (Randomly Selected):
Some possible improvements:
- Most of the misclassified images are occluded by other objects or only partly visible. One possible improvement would be to add more such occluded or partially visible images to the training set.
- Hyperparameter tuning is another option; we could check whether it improves performance. For example, instead of a constant learning rate we could try a cyclical learning rate, which helps the model escape local minima.
- If we consider the rightmost image in the figure above, the cat's pose differs from most images in the training set. An augmentation such as CutMix could be helpful in this scenario.
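The cyclical-learning-rate idea above is directly available in PyTorch as `torch.optim.lr_scheduler.CyclicLR`. The sketch below shows how it could replace the constant rate; the `base_lr`/`max_lr`/`step_size_up` values are illustrative assumptions, and the `nn.Linear` stands in for the ResNet18.

```python
import torch
from torch.optim.lr_scheduler import CyclicLR

model = torch.nn.Linear(10, 2)  # stand-in for the ResNet18
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Hypothetical bounds: cycle the LR between 1e-4 and 1e-2 over 500
# up-steps, then back down; values would need tuning for this task.
scheduler = CyclicLR(optimizer, base_lr=1e-4, max_lr=1e-2,
                     step_size_up=500, mode="triangular")

# Inside the training loop, step once per batch after optimizer.step():
#   optimizer.step()
#   scheduler.step()
```

Unlike the constant-rate setup, `scheduler.step()` is called per batch rather than per epoch, so the cycle length is measured in batches.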