Update README.md
README.md
@@ -18,6 +18,30 @@ Vision Transformer (ViT) model pre-trained on ImageNet-21k (14 million images, 2
18 |
|
19 |
Finally the ViT was finetuned on the Chaoyang dataset at resolution 384x384, using a fixed 10% of the training set as the validation set and evaluated on the official test set using the best validation model based on the loss
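
A fixed validation split like the one described above can be sketched as follows. This is a minimal illustration, not the code used for this model: the function name `fixed_split`, the seed value, and the plain (non-stratified) shuffle are all assumptions, since the README only states that a fixed 10% of the training set was held out.

```python
import random


def fixed_split(items, val_fraction=0.1, seed=0):
    """Split items into (train, val) with a reproducible holdout.

    Hypothetical helper: the same seed always yields the same split,
    which is what makes the validation set "fixed" across runs.
    """
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # seeded shuffle, independent of global state
    n_val = max(1, round(len(items) * val_fraction))
    val_idx = set(indices[:n_val])
    train = [x for i, x in enumerate(items) if i not in val_idx]
    val = [x for i, x in enumerate(items) if i in val_idx]
    return train, val
```

Calling `fixed_split(samples)` twice with the same seed returns identical splits, so the checkpoint selected by validation loss is always compared against the same held-out images.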
# Augmentation pipeline
To address class imbalance in our training set, we performed oversampling with repetition.
Specifically, we duplicated minority-class images until all classes were evenly represented.
This resulted in a larger training set, but ensured that our model was exposed to an equal number of samples from each class during training.
We verified that this approach did not lead to overfitting or other issues by using a validation set with the original class distribution.
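
The oversampling step described above could look roughly like the sketch below. It is an illustration under assumptions, not the training code for this model: the helper name `oversample_with_repetition` is hypothetical, and drawing the duplicates at random with replacement is an assumption, because the README does not say how the repeated images were chosen. It should be applied to the training split only, so that the validation set keeps the original class distribution.

```python
import random
from collections import defaultdict


def oversample_with_repetition(samples, seed=0):
    """Balance classes by repeating minority-class samples.

    samples: list of (item, label) pairs, for example (image_path, class_id).
    Returns a new shuffled list in which every class has as many
    entries as the largest class. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append((item, label))

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)  # keep every original sample once
        # Assumption: extra copies are drawn uniformly with replacement.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced
```

With class sizes of 5, 2, and 1, the result contains 5 entries per class (15 in total), and every original sample is still present at least once.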
We used the following augmentation pipeline for our experiments:

    A.Resize(img_size, img_size),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.RandomRotate90(p=0.5),
    A.RandomResizedCrop(img_size, img_size, scale=(0.5, 1.0), p=0.5),
    ToTensorV2(p=1.0)

This pipeline consists of the following transformations:

- Resize: resizes the image to a fixed size of (img_size, img_size).
- HorizontalFlip: flips the image horizontally with a probability of 0.5.
- VerticalFlip: flips the image vertically with a probability of 0.5.
- RandomRotate90: with a probability of 0.5, rotates the image by a random multiple of 90 degrees.
- RandomResizedCrop: with a probability of 0.5, crops a random region covering 50% to 100% of the image area and resizes it to (img_size, img_size).
- ToTensorV2: converts the image to a PyTorch tensor.

These transformations were chosen to augment the dataset with a variety of geometric transformations, while preserving important visual features.
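
To make the effect of each geometric step concrete, here is a dependency-free sketch of the same sequence operating on an image stored as a nested list. It is not the Albumentations implementation and was not used for training: all function names are hypothetical, the resize is nearest-neighbour, and the crop keeps the image's aspect ratio, whereas the library's `RandomResizedCrop` also samples an aspect ratio.

```python
import math
import random


def hflip(img):
    """Mirror left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror top to bottom."""
    return [list(row) for row in img[::-1]]


def rot90(img, k):
    """Rotate counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        img = [list(row) for row in zip(*img)][::-1]
    return img


def resize_nearest(img, size):
    """Nearest-neighbour resize to (size, size)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def random_resized_crop(img, size, scale, rng):
    """Crop a region covering a random fraction of the area, then resize."""
    h, w = len(img), len(img[0])
    side = math.sqrt(rng.uniform(*scale))  # scale is a fraction of the area
    ch, cw = max(1, round(h * side)), max(1, round(w * side))
    top, left = rng.randint(0, h - ch), rng.randint(0, w - cw)
    crop = [row[left:left + cw] for row in img[top:top + ch]]
    return resize_nearest(crop, size)


def augment(img, img_size, rng):
    """Apply the steps in the listed order, each gated by p=0.5."""
    img = resize_nearest(img, img_size)
    if rng.random() < 0.5:
        img = hflip(img)
    if rng.random() < 0.5:
        img = vflip(img)
    if rng.random() < 0.5:
        img = rot90(img, rng.randint(0, 3))
    if rng.random() < 0.5:
        img = random_resized_crop(img, img_size, (0.5, 1.0), rng)
    return img
```

For example, `augment(image, 384, random.Random(0))` always returns a 384x384 grid, whichever subset of the random steps fires, which is why the output can be batched directly after conversion to a tensor.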
# Results

Our model represents the current state of the art in the field, as it outperforms previous state-of-the-art models listed on Papers with Code,