| **Component** | **Description** |
|-------------------------------|-----------------------------------------------------------------------------------------------|
| **Backbone** | ResNet-50 with FPN (Feature Pyramid Network) |
| **Pretrained Weights** | Trained on ImageNet for feature extraction. |
| **RPN (Region Proposal Network)** | Generates region proposals based on extracted features from the backbone. |
| **ROI Align** | Aligns region proposals to a fixed size for consistent feature extraction. |
| **Box Head** | Fully connected layers for refining bounding boxes and classifying objects. |
| **Box Predictor** | Replaced with a custom predictor: `FastRCNNPredictor` for handling custom classes. |
| **Number of Classes** | Configurable (including background). |
| **Loss Function** | Combines classification and regression losses for multi-task optimization. |
| **Optimizer** | Stochastic Gradient Descent (SGD) with momentum for optimization. |
| **Learning Rate Scheduler** | StepLR to decay learning rate every few epochs for better convergence. |
| **Batch Normalization** | Applied within the backbone for stable training. |
| **Data Format** | Input: Tensor of shape `(Batch Size, Channels, Height, Width)` in PyTorch's NCHW format. |
| **Output** | - Class probabilities for each region proposal. |
| | - Refined bounding box coordinates for each detected object. |
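
The Learning Rate Scheduler row can be made concrete with a small worked example. The sketch below reproduces the StepLR decay rule in plain Python: the learning rate is multiplied by a factor `gamma` once every `step_size` epochs. The helper name `step_lr` and the values of `BASE_LR`, `STEP_SIZE`, and `GAMMA` are illustrative assumptions, not settings taken from the table.

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate at `epoch` under a StepLR-style schedule.

    The rate stays constant within each block of `step_size` epochs and is
    multiplied by `gamma` at every block boundary.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return base_lr * gamma ** (epoch // step_size)


# Hypothetical settings chosen for illustration only.
BASE_LR = 0.005   # initial learning rate passed to SGD
STEP_SIZE = 3     # decay once every 3 epochs
GAMMA = 0.1       # multiply the rate by 0.1 at each decay step

# Learning rate used during each of the first 9 epochs.
schedule = [step_lr(BASE_LR, epoch, STEP_SIZE, GAMMA) for epoch in range(9)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

With these assumed values the rate holds at its initial value for epochs 0 to 2, drops by a factor of ten for epochs 3 to 5, and by another factor of ten for epochs 6 to 8. In PyTorch the same rule is provided by `torch.optim.lr_scheduler.StepLR`, which takes `step_size` and `gamma` arguments.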