---
library_name: pytorch
pipeline_tag: image-classification
license: mit
tags:
  - automl
  - pytorch
  - torchvision
  - optuna
  - early-stopping
model_name: Tomato vs Not-Tomato - AutoML (Compact CNN / Transfer Learning)
language:
  - en
---

# Tomato vs Not-Tomato - AutoML (Compact CNN)

## Purpose
This course assignment practices AutoML for neural networks on a small, real dataset.  
We train a compact image classifier to predict whether an image **is a tomato (1) or not (0)**.

## Dataset
- **Source:** classmate dataset on Hugging Face → `Iris314/Food_tomatoes_dataset`
- **Task:** Binary classification (`0 = not_tomato`, `1 = tomato`)
- **Splits:** Stratified **60/20/20** (train/val/test) created in the notebook
- **Size:** ~30 images total (very small)
- **Input resolution:** 224×224
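
The split can be reproduced with the `datasets` library; a minimal sketch, assuming the Hub repository exposes a single `train` split with a `ClassLabel` column named `label` (split and column names are assumptions, not taken from the dataset card):

```python
from datasets import load_dataset

# Load the classmate dataset from the Hugging Face Hub (assumes a single "train" split).
ds = load_dataset("Iris314/Food_tomatoes_dataset", split="train")

# Stratified 60/20/20 split. `stratify_by_column` requires a ClassLabel column;
# "label" is an assumed column name.
split_1 = ds.train_test_split(test_size=0.4, stratify_by_column="label", seed=42)
split_2 = split_1["test"].train_test_split(test_size=0.5, stratify_by_column="label", seed=42)

train_ds = split_1["train"]   # 60%
val_ds   = split_2["train"]   # 20%
test_ds  = split_2["test"]    # 20%
```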

## Preprocessing & Augmentation
- **Normalization:** mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225]
- **Train augmentations:** RandomResizedCrop, HorizontalFlip(0.5), ColorJitter
- **Eval transforms:** Resize → CenterCrop → Normalize
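
These settings correspond to a standard torchvision pipeline (ImageNet statistics); a minimal sketch, where the ColorJitter strengths and the 256-pixel resize before the center crop are illustrative choices rather than values read from the notebook:

```python
from torchvision import transforms

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

# Training: random crop, horizontal flip, and color jitter for augmentation.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),  # strengths are illustrative
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

# Evaluation: deterministic resize -> center crop -> normalize.
eval_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])
```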

## AutoML Setup
- **Search framework:** Optuna (budgeted search with pruning)
- **Architectures:** `smallcnn` (from scratch), `resnet18`, `mobilenet_v3_small`
- **Hyperparams:** optimizer ∈ {adamw, sgd}, lr ∈ [1e-5, 5e-3] (log), weight_decay ∈ [1e-6, 1e-2] (log),  
  dropout ∈ [0, 0.6], batch_size ∈ {8, 12, 16}, `freeze_backbone` ∈ {True, False} (for pretrained)
- **Early stopping:** patience = 6 epochs on validation F1
- **Budget:** 10 trials, max 20 epochs per trial, ~5 min wall-clock
- **Seed:** 42
- **Compute:** Google Colab GPU runtime
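
The search space above maps onto an Optuna study roughly like the following sketch. `run_training_epoch` is a hypothetical stand-in for the notebook's per-epoch train/validation loop, and the `MedianPruner` is one reasonable pruner choice, not necessarily the one used:

```python
import optuna

def objective(trial: optuna.Trial) -> float:
    arch = trial.suggest_categorical("arch", ["smallcnn", "resnet18", "mobilenet_v3_small"])
    params = {
        "arch": arch,
        "optimizer": trial.suggest_categorical("optimizer", ["adamw", "sgd"]),
        "lr": trial.suggest_float("lr", 1e-5, 5e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 1e-6, 1e-2, log=True),
        "dropout": trial.suggest_float("dropout", 0.0, 0.6),
        "batch_size": trial.suggest_categorical("batch_size", [8, 12, 16]),
    }
    if arch != "smallcnn":  # only pretrained backbones can be frozen
        params["freeze_backbone"] = trial.suggest_categorical("freeze_backbone", [True, False])

    best_f1 = 0.0
    for epoch in range(20):                          # max 20 epochs per trial
        val_f1 = run_training_epoch(params, epoch)   # hypothetical helper: returns validation F1
        best_f1 = max(best_f1, val_f1)
        trial.report(val_f1, step=epoch)             # lets the pruner stop weak trials early
        if trial.should_prune():
            raise optuna.TrialPruned()
    return best_f1

study = optuna.create_study(direction="maximize", pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=10)               # budget: 10 trials
```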

## Best Model & Hyperparameters
```json
{
  "arch": "mobilenet_v3_small",
  "freeze_backbone": false,
  "dropout": 0.4761270681732692,
  "optimizer": "adamw",
  "lr": 1.1860369117967872e-05,
  "weight_decay": 0.00043282443346186894,
  "batch_size": 16
}
```
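
One way to rebuild the winning configuration, assuming the classifier head is simply dropout followed by a two-way linear layer (the exact head used in the notebook may differ):

```python
import torch
import torch.nn as nn
from torchvision import models

def build_best_model(dropout: float = 0.476, num_classes: int = 2) -> nn.Module:
    # Pretrained MobileNetV3-Small, fine-tuned end to end (freeze_backbone = false).
    model = models.mobilenet_v3_small(weights=models.MobileNet_V3_Small_Weights.DEFAULT)
    in_features = model.classifier[-1].in_features
    # Replace the final layer with the (assumed) dropout + linear head.
    model.classifier[-1] = nn.Sequential(
        nn.Dropout(p=dropout),
        nn.Linear(in_features, num_classes),
    )
    return model

model = build_best_model()
optimizer = torch.optim.AdamW(model.parameters(), lr=1.186e-5, weight_decay=4.33e-4)
```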

## Results on Held-out Test Set
- **Accuracy:** 0.83
- **F1:** 0.80

## Training Curves and Early Stopping
Validation F1 was tracked each epoch with patience = 6; training stopped once validation F1 stopped improving, which limits overfitting on this small dataset.
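
A minimal sketch of that patience logic (the training and evaluation helpers are illustrative placeholders):

```python
import torch

max_epochs, patience = 20, 6
best_f1, bad_epochs = 0.0, 0

for epoch in range(max_epochs):
    train_one_epoch(model, train_loader)         # hypothetical training step
    val_f1 = evaluate_f1(model, val_loader)      # hypothetical validation-F1 helper

    if val_f1 > best_f1:
        best_f1, bad_epochs = val_f1, 0
        torch.save(model.state_dict(), "best.pt")  # keep the best checkpoint
    else:
        bad_epochs += 1
        if bad_epochs >= patience:               # no improvement for 6 consecutive epochs
            break
```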

## Reproducibility
- Seed: 42
- Python: 3.12
- PyTorch: 2.4.1
- TorchVision: 0.19.1
- Optuna: 4.0.0
- Compute: Google Colab GPU (T4)
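
A typical seeding snippet consistent with the list above (this covers the Python, NumPy, and PyTorch RNGs; GPU kernels may still introduce minor nondeterminism):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(42)
```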

## Limitations & Known Failure Modes
- Extremely small dataset → risk of overfitting and unstable metrics.
- Backgrounds and lighting variations can bias predictions.
- Out-of-distribution images (e.g., tomato cartoons, extreme angles) may fail.

## Ethics
- This model is for coursework demonstration only; not for production or consequential decisions.

## License
- Code & weights: MIT (adjust per course requirements)
- Dataset: follow the original dataset's license/terms

## Acknowledgments
- Dataset: Iris314/Food_tomatoes_dataset
- AutoML: Optuna
- Backbones: torchvision models
- Trained in Google Colab
- GenAI tools assisted with boilerplate organization and documentation