Resource retrieval
Get the pre-trained net:
NetModel parameters
This model consists of a family of individual nets, each identified by a specific architecture. Inspect the available parameters:
Pick a non-default net by specifying the architecture:
Pick a non-default uninitialized net:
Evaluation function
Write an evaluation function to scale the result to the input image size and suppress the least probable detections:
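The model's own evaluation function is not reproduced here. As a rough sketch of the two steps named above (rescaling to the input image size and dropping low-score detections), the following hypothetical helper works on plain data. The name rescaleAndFilter, the {origin, {v1, v2}} box layout, the default threshold of 0.5 and the example values are all assumptions made for illustration:

```wolfram
(* Hypothetical helper, not the model's actual evaluation function. *)
(* boxes: list of {origin, {v1, v2}} in net-input pixel coordinates *)
(* scores: one score per box; netDims and imgDims: {width, height} *)
rescaleAndFilter[boxes_List, scores_List, netDims_, imgDims_, threshold_ : 0.5] :=
 Module[{scale, keep},
  (* per-axis scale factors {sx, sy} *)
  scale = N[imgDims/netDims];
  (* indices of the detections whose score reaches the threshold *)
  keep = Select[Range[Length[scores]], scores[[#]] >= threshold &];
  <|
   "BoundingBoxes" -> (Parallelogram[scale*#[[1]],
        {scale*#[[2, 1]], scale*#[[2, 2]]}] & /@ boxes[[keep]]),
   "Scores" -> scores[[keep]]
   |>
  ]

(* Invented example data: two boxes, the second with a low score *)
rescaleAndFilter[
 {{{10, 10}, {{50, 0}, {0, 20}}}, {{100, 40}, {{30, 0}, {0, 10}}}},
 {0.9, 0.2}, {320, 160}, {640, 320}]
```

Scaling the origin and both edge vectors by the same per-axis factors keeps each region a parallelogram, which is why the sketch stores boxes in that form.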
Basic usage
Obtain the detected bounding boxes with their corresponding scores for a given image:
The model's output is an Association containing the detected "BoundingBoxes" and "Scores":
The "BoundingBoxes" key is a list of Parallelogram expressions corresponding to the bounding regions of the detected objects:
Visualize the bounding regions:
The "Scores" key contains the score values of the detected objects:
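To illustrate the shape of such a result without running the model, here is a small sketch built on an invented stand-in Association. The blank image, the two Parallelogram regions and the score values are made up; only the key names come from the description above:

```wolfram
(* Invented stand-in for a detection result; a real one comes from the evaluation function *)
img = Image[ConstantArray[0.85, {120, 300}]]; (* blank 300x120 image *)
result = <|
   "BoundingBoxes" -> {
     Parallelogram[{20, 70}, {{120, 0}, {0, 30}}],
     Parallelogram[{160, 20}, {{100, 10}, {-3, 30}}]},
   "Scores" -> {0.93, 0.71}
   |>;

(* The two keys of the result *)
Keys[result]

(* Overlay the bounding regions on the image *)
HighlightImage[img, result["BoundingBoxes"]]

(* Pair each region with its score *)
Thread[result["BoundingBoxes"] -> result["Scores"]]
```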
Advanced usage
Get an image:
Get the text region masks via the option "Output"->"Masks" and visualize them:
Obtain the boxes using the default evaluation and visualize them:
Increase the "MinPerimeter" to remove small boxes:
Visualize the selected and filtered-out boxes:
Increase the "AcceptanceThreshold" to remove low-probability boxes:
Visualize the selected and filtered-out boxes:
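The idea behind a perimeter filter can be sketched independently of the model. The helper perimeter, the example boxes and the threshold below are hypothetical, and the sketch assumes the threshold is measured in pixels and applied as "greater than or equal", which may differ from the option's actual behavior:

```wolfram
(* Hypothetical sketch: perimeter of a parallelogram from its two edge vectors *)
perimeter[Parallelogram[_, {v1_, v2_}]] := 2 (Norm[v1] + Norm[v2])

img = Image[ConstantArray[0.85, {120, 300}]]; (* blank 300x120 image *)
boxes = {
   Parallelogram[{20, 70}, {{120, 0}, {0, 30}}], (* perimeter 300 *)
   Parallelogram[{200, 20}, {{15, 0}, {0, 10}}]  (* perimeter 50 *)
   };
minPerimeter = 100; (* assumed to be in pixels *)

(* Split the boxes into kept and removed sets *)
selected = Select[boxes, perimeter[#] >= minPerimeter &];
rejected = Complement[boxes, selected];

(* Kept boxes in green, removed boxes in red *)
HighlightImage[img, {Green, selected, Red, rejected}]
```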
Change the box padding with the "ScaledPadding" option:
Visualize the original and padded boxes:
The "MaskThreshold" option can help filter out noisy detections. Increasing it keeps only the boxes with the strongest probability map:
Visualize the selected boxes:
Network result
Get an image:
Get the probability map for the detected text:
Adjust the result dimensions to the original image shape:
Visualize the probability map:
Binarize the probability map to obtain the mask:
Visualize the bounding boxes around the masked regions:
Scale the boxes to cover the whole text:
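The steps in this section can be sketched end to end on a synthetic probability map, so that the snippet runs without the model. The map values, the 320x160 target size, the 0.5 threshold and the growth factor 1.3 are arbitrary assumptions, and the sketch uses axis-aligned Rectangle boxes for simplicity rather than the Parallelogram regions returned by the evaluation function:

```wolfram
(* Synthetic probability map with two bright regions; values are invented *)
probArray = ConstantArray[0.05, {40, 80}];
probArray[[10 ;; 20, 10 ;; 40]] = 0.9;
probArray[[28 ;; 35, 50 ;; 75]] = 0.7;
probMap = Image[probArray];

(* Resize to an assumed original image size of 320x160 pixels *)
resized = ImageResize[probMap, {320, 160}];

(* Threshold the map to get a binary mask *)
mask = Binarize[resized, 0.5];

(* One axis-aligned box {{xmin, ymin}, {xmax, ymax}} per connected component *)
corners = Values[ComponentMeasurements[mask, "BoundingBox"]];

(* Grow a box about its center by a factor f *)
grow[{min_, max_}, f_] := With[{c = (min + max)/2, h = (max - min)/2},
  Rectangle[c - f h, c + f h]]

(* Show the tight boxes together with the enlarged ones *)
HighlightImage[resized, {Rectangle @@@ corners, grow[#, 1.3] & /@ corners}]
```

Growing each box about its center is one simple way to compensate for a mask that covers only the core of each text region.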
Net information
Inspect the number of parameters of all arrays in the net:
Obtain the total number of parameters:
Obtain the layer type counts:
Display the summary graphic:
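These queries use the standard Information properties for nets. In the sketch below, a small stand-in NetChain replaces the pre-trained net so the snippet runs on its own; substitute the net obtained from NetModel:

```wolfram
(* Small stand-in net; substitute the net obtained from NetModel *)
net = NetInitialize[
   NetChain[{ConvolutionLayer[4, {3, 3}], Ramp, PoolingLayer[{2, 2}]},
    "Input" -> {3, 32, 32}]];

Information[net, "ArraysElementCounts"]     (* number of parameters in each array *)
Information[net, "ArraysTotalElementCount"] (* total number of parameters *)
Information[net, "LayerTypeCounts"]         (* how many layers of each type *)
Information[net, "SummaryGraphic"]          (* summary graphic *)
```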
Export to ONNX
Export the net to the ONNX format:
Get the size of the ONNX file:
The size is similar to the byte count of the resource object:
Check some metadata of the ONNX model:
Import the model back into Wolfram Language. However, the NetEncoder and NetDecoder will be absent because they are not supported by ONNX:
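The export, size check, metadata query and re-import can be sketched on a small stand-in net, used here only so the snippet is self-contained. The file name net.onnx is an arbitrary choice, and the comparison with the resource object's byte count is left out because it requires the model name:

```wolfram
(* Small stand-in net; substitute the net obtained from NetModel *)
net = NetInitialize[
   NetChain[{LinearLayer[8], Ramp, LinearLayer[2]}, "Input" -> 4]];

(* Export to ONNX in the temporary directory *)
onnxFile = Export[FileNameJoin[{$TemporaryDirectory, "net.onnx"}], net];

(* Size of the exported file in bytes *)
FileByteCount[onnxFile]

(* Some metadata elements of the ONNX file *)
{Import[onnxFile, "OperatorSetVersion"], Import[onnxFile, "IRVersion"]}

(* Round trip back to a Wolfram Language net *)
Import[onnxFile]
```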