Resource retrieval
Get the pre-trained net:
NetModel parameters
This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:
Pick a non-default net by specifying the parameters:
Pick a non-default uninitialized net:
Evaluation function
Define the label list for this model:
Write an evaluation function to scale the result to the input image size and suppress the least probable detections:
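The suppression step can be illustrated independently of the net. The following is a minimal Python sketch, not the page's own evaluation function: the `(box, label, confidence)` tuple layout, the function name `filter_detections` and the default threshold of 0.25 are all assumptions made for illustration.

```python
def filter_detections(detections, threshold=0.25):
    """Keep detections whose confidence is at least `threshold`.

    Each detection is assumed to be a (box, label, confidence) tuple.
    The result is sorted from most to least confident.
    """
    kept = [d for d in detections if d[2] >= threshold]
    return sorted(kept, key=lambda d: d[2], reverse=True)


# Hypothetical detections: ((x_min, y_min, x_max, y_max), label, confidence).
detections = [
    ((300, 40, 380, 120), "person", 0.47),
    ((15, 25, 100, 200), "dog", 0.08),
    ((10, 20, 110, 220), "cat", 0.91),
]
strong = filter_detections(detections, threshold=0.25)
```

Raising the threshold trades recall for precision: fewer spurious boxes survive, but weak true detections are dropped as well.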
Basic usage
Obtain the detected bounding boxes with their corresponding classes and confidences for a given image:
Inspect which classes are detected:
Visualize the detection:
Network result
For the default input size of 512x512, the net produces a 128x128 grid of bounding boxes (16,384 in total, one per 4x4-pixel cell of the input) whose centers approximately follow a square grid. For each bounding box, the net produces the box’s size and the offset of the box’s center with respect to the square grid:
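The relationship between grid position, offset and size can be sketched in a few lines of Python. The stride of 4 follows from the 512 and 128 figures above; that offsets and sizes are expressed in grid-cell units, and that each box is centered on its shifted grid point, are assumptions of this sketch rather than statements about the net's actual output format. The helper names are hypothetical.

```python
INPUT_SIZE = 512                   # network input side length (from the text)
GRID_SIZE = 128                    # predictions per side (from the text)
STRIDE = INPUT_SIZE // GRID_SIZE   # 4 input pixels per grid cell


def decode_box(col, row, offset, size, stride=STRIDE):
    """Return (x_min, y_min, x_max, y_max) in input-pixel coordinates.

    Assumes `offset` and `size` are given in grid-cell units and that the
    box is centered on the grid point shifted by `offset`.
    """
    cx = (col + offset[0]) * stride
    cy = (row + offset[1]) * stride
    w = size[0] * stride
    h = size[1] * stride
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def to_graphics_coords(box, height=INPUT_SIZE):
    """Flip the y axis so that the origin sits at the bottom-left corner."""
    x_min, y_min, x_max, y_max = box
    return (x_min, height - y_max, x_max, height - y_min)


box = decode_box(col=10, row=20, offset=(0.5, 0.25), size=(8, 4))
flipped = to_graphics_coords(box)
```

The y-axis flip matters because image arrays place the origin at the top-left corner, while a graphics domain places it at the bottom-left.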
Change coordinate system into a graphics domain:
Compute the box center positions:
Visualize the box center positions. They follow a square grid with offsets:
Compute the box coordinates:
Define a function to rescale the box coordinates to the original image size:
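As a rough Python illustration of such a rescaling, the sketch below maps a box from the padded 512x512 input back to original-image pixels. It assumes the padding is split evenly on both sides of the shorter dimension, which may not match the encoder exactly; `rescale_box` is a hypothetical name.

```python
def rescale_box(box, image_size, input_size=512):
    """Map a box from padded network-input coordinates to image pixels.

    Assumes the image was scaled to fit an input_size x input_size square
    with its aspect ratio preserved, and that the padding is split evenly
    on both sides of the shorter dimension.
    """
    width, height = image_size
    longest = max(width, height)
    scale = longest / input_size                  # image pixels per input pixel
    pad_x = input_size * (1 - width / longest) / 2
    pad_y = input_size * (1 - height / longest) / 2
    x_min, y_min, x_max, y_max = box
    return (
        (x_min - pad_x) * scale,
        (y_min - pad_y) * scale,
        (x_max - pad_x) * scale,
        (y_max - pad_y) * scale,
    )


# Hypothetical 1024x512 image: its content fills a 512x256 band of the input,
# with 128 rows of padding above and below.
full_frame = rescale_box((0, 128, 512, 384), image_size=(1024, 512))
```

A box covering exactly the unpadded band maps back to the full original frame, which is a convenient sanity check for the padding arithmetic.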
Visualize all the boxes predicted by the net scaled by their "objectness" measures:
Visualize all the boxes scaled by the probability that they contain a cat:
Superimpose the cat prediction on top of the scaled input received by the net:
Heat map visualization
Every box is associated with a scalar strength value indicating the likelihood that the patch contains an object:
The strength of each patch is the maximum score across all classes. Obtain the strength of each patch:
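The reduction itself is a per-patch maximum over the class axis. A small Python sketch with hypothetical data, using nested lists in place of the net's output array, might look like this; selecting a single class index instead of the maximum gives a class-specific map such as the one for "cat".

```python
def patch_strengths(class_scores):
    """Reduce a rows x cols x classes nested list to rows x cols maxima."""
    return [[max(scores) for scores in row] for row in class_scores]


def class_strengths(class_scores, class_index):
    """Pick the score of one class for every patch."""
    return [[scores[class_index] for scores in row] for row in class_scores]


# Hypothetical 2x2 grid with three class scores per patch.
scores = [
    [[0.1, 0.7, 0.2], [0.05, 0.02, 0.01]],
    [[0.3, 0.3, 0.9], [0.6, 0.1, 0.4]],
]
strength = patch_strengths(scores)
first_class = class_strengths(scores, 0)
```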
Visualize the strength of each patch as a heat map:
Stretch and unpad the heat map to the original image domain:
Overlay the heat map on the image:
Obtain and visualize the strength of each patch for the "cat" class:
Overlay the "cat" heat map on the image:
Adapt to any size
Automatic image resizing can be avoided by replacing the NetEncoder. First get the NetEncoder:
Note that the NetEncoder resizes the image while keeping the aspect ratio and then pads the result to a fixed shape of 512x512. Visualize the output of the NetEncoder, adjusting for brightness:
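The geometry of this resize-and-pad step can be worked out by hand. The Python sketch below computes the resized dimensions and the padding on each axis; rounding to whole pixels and splitting the padding evenly between both sides are assumptions, and the encoder's actual behavior may differ. `letterbox_geometry` is a hypothetical name.

```python
def letterbox_geometry(image_size, target=512):
    """Return ((new_w, new_h), (pad_x, pad_y)) for an aspect-preserving fit.

    The longer side is scaled to `target`; the shorter side is scaled by
    the same factor and the remaining space is treated as padding, split
    evenly between the two sides (an assumption of this sketch).
    """
    width, height = image_size
    longest = max(width, height)
    new_w = round(width * target / longest)
    new_h = round(height * target / longest)
    pad_x = (target - new_w) / 2
    pad_y = (target - new_h) / 2
    return (new_w, new_h), (pad_x, pad_y)


# Hypothetical 640x480 image.
resized, padding = letterbox_geometry((640, 480))
```

For a 640x480 image the content occupies 512x384 pixels of the input, so a quarter of the 512x512 input is padding that the net still has to process.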
Create a new NetEncoder with the desired dimensions:
Attach the new NetEncoder:
Obtain the detected bounding boxes with their corresponding classes and confidences for a given image:
Visualize the detection:
Note that although the localization results and the box confidences are slightly worse than those of the original net, the resized network runs significantly faster:
Net information
Inspect the number of parameters of all arrays in the net:
Obtain the total number of parameters:
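The total is simply the sum of the per-array element counts, each of which is the product of that array's dimensions. A Python sketch with two hypothetical arrays follows; the names and shapes are illustrative only and are not taken from this net.

```python
from math import prod


def element_counts(array_shapes):
    """Map each array name to its number of elements."""
    return {name: prod(shape) for name, shape in array_shapes.items()}


# Hypothetical arrays of a single convolution layer.
shapes = {
    "conv1/Weights": (64, 3, 7, 7),
    "conv1/Biases": (64,),
}
counts = element_counts(shapes)
total = sum(counts.values())
```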
Obtain the layer type counts:
Display the summary graphic: