RetinaNet
A Keras model implementing the RetinaNet meta-architecture.
Implements the RetinaNet architecture for object detection. The constructor requires `num_classes`, `bounding_box_format`, and a backbone. Optionally, a custom label encoder and prediction decoder may be provided.
Keras and KerasHub can be installed with:
pip install -U -q keras-hub
pip install -U -q keras
JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
Preset name | Parameters | Description |
---|---|---|
retinanet_resnet50_fpn_coco | 34.12M | RetinaNet model with ResNet50 backbone fine-tuned on COCO in 800x800 resolution. |
Arguments

- `backbone`: a `keras.Model`. If the default `feature_pyramid` is used, the backbone must implement the `pyramid_level_inputs` property with keys "P3", "P4", and "P5" and layer names as values. A sensible choice in many cases is `keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet")`.
- `anchor_generator`: (Optional) a `keras_cv.layers.AnchorGenerator`. If provided, the anchor generator is passed to both the `label_encoder` and the `prediction_decoder`. Use it only when `label_encoder` and `prediction_decoder` are both `None`. Defaults to an anchor generator with the parameterization `strides=[2**i for i in range(3, 8)]`, `scales=[2**x for x in [0, 1 / 3, 2 / 3]]`, `sizes=[32.0, 64.0, 128.0, 256.0, 512.0]`, and `aspect_ratios=[0.5, 1.0, 2.0]`.
- `label_encoder`: (Optional) a layer whose `call()` method returns RetinaNet training targets. By default, a standard KerasCV `RetinaNetLabelEncoder` is created and used. The results of this object's `call()` method are passed to the `loss` object for `box_loss` and `classification_loss` as the `y_true` argument.
- `prediction_decoder`: (Optional) a `keras.layers.Layer` that is responsible for transforming RetinaNet predictions into usable bounding box Tensors. If not provided, the default is a `keras_cv.layers.MultiClassNonMaxSuppression` layer, which prunes boxes with non-max suppression.
- `feature_pyramid`: (Optional) a `keras.layers.Layer` that produces a list of 4D feature maps (batch dimension included) when called on the pyramid-level outputs of the `backbone`. If not provided, the reference implementation from the paper is used.
- `classification_head`: (Optional) a `keras.Layer` that performs classification of the bounding boxes. If not provided, a simple ConvNet with 3 layers is used.
- `box_head`: (Optional) a `keras.Layer` that performs regression of the bounding boxes. If not provided, a simple ConvNet with 3 layers is used.

Load the pretrained detector and run it on a batch of two random images:

import keras_hub
import numpy as np

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "retinanet_resnet50_fpn_coco"
)
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
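The default anchor parameterization quoted in the Arguments section can be unpacked with a few lines of plain Python. The sketch below is not the library's anchor generator: `anchor_shapes` is a hypothetical helper, and it assumes that an aspect ratio means width divided by height and that every anchor at a level keeps the area `(size * scale) ** 2`. Under those assumptions, each level gets 3 scales x 3 aspect ratios = 9 anchor shapes.

```python
import math

# Default parameterization quoted from the Arguments section.
scales = [2**x for x in [0, 1 / 3, 2 / 3]]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]
aspect_ratios = [0.5, 1.0, 2.0]


def anchor_shapes(size, scales, aspect_ratios):
    """Return (width, height) pairs for one pyramid level.

    Assumption: aspect ratio is width / height, and each anchor keeps the
    area (size * scale) ** 2 regardless of its aspect ratio.
    """
    shapes = []
    for scale in scales:
        area = (size * scale) ** 2
        for ratio in aspect_ratios:
            width = math.sqrt(area * ratio)
            height = area / width
            shapes.append((width, height))
    return shapes


# One list of nine (width, height) pairs per entry in `sizes`.
shapes_per_level = [anchor_shapes(s, scales, aspect_ratios) for s in sizes]
for size, shapes in zip(sizes, shapes_per_level):
    print(size, [(round(w, 1), round(h, 1)) for w, h in shapes[:3]])
```

Each entry in `sizes` pairs with one entry in `strides`, so the smallest anchors belong to the finest feature map.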
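Build a detector with a custom number of classes on top of the pretrained backbone. `CLASSES` is assumed to be your own list of class names:

backbone = keras_hub.models.Backbone.from_preset(
    "retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)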
Create a preprocessor with a custom image converter that rescales pixel values by 1/255:

image_converter = keras_hub.layers.RetinaNetImageConverter(
    scale=1 / 255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
    image_converter=image_converter
)
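To make the effect of `scale=1/255` concrete, here is a small NumPy sketch of the rescaling step alone. It does not call the KerasHub converter. `rescale` is a hypothetical helper, and the `offset` parameter is an assumption added for illustration. The point is only that 8-bit pixel values in [0, 255] end up in [0, 1].

```python
import numpy as np


def rescale(images, scale=1 / 255, offset=0.0):
    """Hypothetical helper: cast to float32, multiply by scale, add offset."""
    images = np.asarray(images, dtype="float32")
    return images * scale + offset


# A single 1x1 RGB pixel with the lowest, a middle, and the highest 8-bit value.
raw = np.array([[[0, 127, 255]]], dtype="uint8")
scaled = rescale(raw)
print(scaled.dtype, scaled.min(), scaled.max())
```

Inputs that are already floats in [0, 1], such as the random batch used in the inference example, would not need this scaling.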
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
    "resnet_50_imagenet"
)
# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
    image_encoder=image_encoder,
    min_level=3,
    max_level=5,
    use_p5=False,
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)
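The `min_level` and `max_level` arguments select which pyramid levels the backbone produces. As a rough illustration, the sketch below lists the stride and feature-map side length for each level at the 800x800 resolution named in the preset table. It assumes level `i` has stride `2**i`, consistent with the default `strides` in the Arguments section, and that feature maps are rounded up as with "same" padding. `pyramid_levels` is a hypothetical helper, not part of KerasHub.

```python
import math


def pyramid_levels(image_size, min_level, max_level):
    """Map a level name such as "P3" to (stride, feature-map side length)."""
    levels = {}
    for level in range(min_level, max_level + 1):
        stride = 2**level  # assumption: level i downsamples by 2**i
        levels[f"P{level}"] = (stride, math.ceil(image_size / stride))
    return levels


levels = pyramid_levels(800, min_level=3, max_level=5)
for name, (stride, side) in levels.items():
    print(f"{name}: stride={stride}, feature map={side}x{side}")
```

Raising `max_level` adds coarser levels, which mainly helps with large objects.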
The same preset can be loaded from Hugging Face by using the `hf://` handle:

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
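The Arguments section notes that the default prediction decoder prunes overlapping boxes with non-max suppression. The following is a minimal single-class sketch of greedy NMS in NumPy, not the library layer. It assumes boxes in `xyxy` format and an IoU threshold of 0.5, and the function names are hypothetical.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all in xyxy format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept box too much.
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]
    return keep


boxes = np.array(
    [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype="float32"
)
scores = np.array([0.9, 0.8, 0.7], dtype="float32")
kept = greedy_nms(boxes, scores)
print(kept)
```

A multi-class decoder would typically apply the same pruning per class; this sketch leaves that out to keep the core loop visible.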
backbone = keras_hub.models.Backbone.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
    "hf://keras/retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
    backbone=backbone,
    num_classes=len(CLASSES),
    preprocessor=preprocessor,
)