Resource retrieval
Get the pre-trained net:
NetModel parameters
This model consists of a family of individual nets, each identified by a specific architecture. Inspect the available parameters:
Pick a non-default net by specifying the architecture:
Pick a non-default uninitialized net:
Evaluation function
Write an evaluation function to extract the bounding regions and masks for each text instance:
Basic usage
Obtain the bounding boxes and masks for each text instance in a given image:
The output is an Association containing the detected bounding boxes with their labels:
Visualize the bounding regions:
Advanced usage
Get an image:
Obtain the bounding regions using the default evaluation and visualize them:
Get the individual masks via the option "Output"->"Masks":
Increase the "MinTextArea" to remove small regions:
Set the region type to "MinConvexPolygon" to generate arbitrarily shaped regions:
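As a rough illustration of what a minimal convex polygon around one text instance looks like, the following pure-Python sketch computes the convex hull of a set of mask pixel coordinates with Andrew's monotone chain algorithm. It assumes that "MinConvexPolygon" denotes the convex hull of the instance mask, which this page does not state. The function name convex_hull and the sample points are hypothetical and are not part of the model's evaluation code.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order (x right, y up)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a left (counter-clockwise) turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Hypothetical (x, y) coordinates of the pixels in one instance mask.
mask_pixels = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(mask_pixels)
```

Interior and collinear points such as (1, 1) and (1, 0) are dropped, so only the corner vertices of the region remain.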
Network result
Get an image:
Run the model on the image:
The model's outputs are the "TextRegion", "Kernel" and "Similarity" components. The text region matrix outlines the entire area of each text instance, while the kernel matrix helps distinguish between individual text instances. The similarity vector then guides the grouping of pixels within each instance:
Binarize the text probability map and the kernel. Multiply both images to obtain the final kernel:
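The binarize-and-multiply step can be sketched in pure Python on toy data. This is an illustration only, not the page's Wolfram Language cell: the threshold of 0.5, the helper names binarize and multiply, and the small matrices are all assumptions made for the example.

```python
def binarize(matrix, threshold=0.5):
    """Return a 0/1 matrix: 1 where the value exceeds the threshold (assumed 0.5)."""
    return [[1 if v > threshold else 0 for v in row] for row in matrix]


def multiply(a, b):
    """Element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Toy 3x4 maps standing in for the "TextRegion" and "Kernel" outputs.
text_prob = [[0.9, 0.8, 0.1, 0.7],
             [0.9, 0.9, 0.2, 0.6],
             [0.1, 0.1, 0.1, 0.1]]
kernel_prob = [[0.2, 0.9, 0.8, 0.1],
               [0.1, 0.7, 0.1, 0.9],
               [0.0, 0.9, 0.0, 0.0]]

text_mask = binarize(text_prob)
# A kernel pixel survives only if it also lies inside the binarized text region.
final_kernel = multiply(text_mask, binarize(kernel_prob))
```

Multiplying the two binary masks acts as a logical AND, which discards kernel responses that fall outside any detected text region.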
Split the detected instances:
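One common way to split a binary kernel mask into separate instances is connected-component labeling. The sketch below is a pure-Python version using breadth-first search with 4-connectivity. Both the connectivity choice and the function name label_components are assumptions, since the page's own code for this step is not shown.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected components of a 0/1 matrix as 1, 2, 3, ...; background stays 0."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                # Found an unvisited foreground pixel: start a new component.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy kernel mask with four separate blobs.
kernel = [[1, 1, 0, 0, 1],
          [0, 1, 0, 0, 1],
          [0, 0, 0, 0, 0],
          [1, 0, 0, 1, 1]]
labels, count = label_components(kernel)
```

Each nonzero label then identifies one candidate text instance.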
Use the expandComponent function to expand the kernel region using the similarity matrices as a guide:
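The definition of expandComponent is not reproduced here, so the following pure-Python sketch shows only one plausible reading of the idea: each kernel grows into neighboring text pixels whose similarity vector lies close to the kernel's mean similarity vector. The function name expand_kernels, the Euclidean distance test, the max_dist threshold, and the toy data are all assumptions, not the page's implementation.

```python
import math
from collections import deque


def expand_kernels(text_mask, kernel_labels, similarity, max_dist=1.0):
    """Grow labeled kernels over text pixels whose similarity vector is near the kernel mean."""
    rows, cols = len(text_mask), len(text_mask[0])
    dim = len(similarity[0][0])

    # Mean similarity vector of every kernel label.
    sums, counts = {}, {}
    for r in range(rows):
        for c in range(cols):
            k = kernel_labels[r][c]
            if k:
                acc = sums.setdefault(k, [0.0] * dim)
                for i, v in enumerate(similarity[r][c]):
                    acc[i] += v
                counts[k] = counts.get(k, 0) + 1
    means = {k: [s / counts[k] for s in acc] for k, acc in sums.items()}

    # Breadth-first growth starting from all kernel pixels at once.
    result = [row[:] for row in kernel_labels]
    queue = deque((r, c) for r in range(rows) for c in range(cols) if result[r][c])
    while queue:
        y, x = queue.popleft()
        k = result[y][x]
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (0 <= ny < rows and 0 <= nx < cols):
                continue
            if not text_mask[ny][nx] or result[ny][nx]:
                continue
            if math.dist(similarity[ny][nx], means[k]) <= max_dist:
                result[ny][nx] = k
                queue.append((ny, nx))
    return result


# Toy row of six text pixels containing two kernels and 2-element similarity vectors.
text_mask = [[1, 1, 1, 1, 1, 1]]
kernel_labels = [[0, 1, 0, 0, 2, 0]]
similarity = [[[0.0, 0.0], [0.0, 0.0], [0.1, 0.0], [0.9, 1.0], [1.0, 1.0], [1.0, 1.0]]]
expanded = expand_kernels(text_mask, kernel_labels, similarity, max_dist=0.3)
```

In this toy case the two instances touch inside the text mask, and the similarity vectors decide where one ends and the other begins.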
Filter the small areas:
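Filtering by area amounts to counting the pixels carrying each label and clearing the labels that fall below a minimum. A minimal pure-Python sketch follows. The name remove_small_regions and the min_area value are hypothetical and chosen only for the example.

```python
from collections import Counter


def remove_small_regions(labels, min_area):
    """Set to 0 every label whose pixel count is below min_area."""
    areas = Counter(v for row in labels for v in row if v)
    return [[v if v and areas[v] >= min_area else 0 for v in row] for row in labels]


# Toy label matrix: label 1 covers 4 pixels, label 2 covers 1, label 3 covers 2.
labels = [[1, 1, 0, 2],
          [1, 1, 0, 0],
          [0, 0, 3, 3]]
filtered = remove_small_regions(labels, min_area=2)
```

Only the single-pixel region with label 2 is removed. The other labels keep their original values.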
All outputs are matrices with fixed dimensions of 160×192. Resize the result to match the dimensions of the original image:
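When a matrix of integer instance labels is rescaled, nearest-neighbor sampling keeps the labels intact, whereas interpolation would blend neighboring labels into meaningless values. The pure-Python sketch below illustrates this on a tiny matrix. It assumes the input image was stretched to the fixed network size without padding, and resize_nearest is a hypothetical helper name.

```python
def resize_nearest(matrix, new_rows, new_cols):
    """Resize a matrix by nearest-neighbor sampling so that label values are preserved."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[r * rows // new_rows][c * cols // new_cols]
             for c in range(new_cols)]
            for r in range(new_rows)]


# Toy 2x2 label matrix scaled up to a 4x6 "original image" shape.
small = [[1, 0],
         [0, 2]]
large = resize_nearest(small, 4, 6)
```

Every output value is copied from an existing cell, so no new label values can appear after resizing.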
Visualize the detected text instances:
Net information
Inspect the number of parameters of all arrays in the net:
Obtain the total number of parameters:
Obtain the layer type counts:
Display the summary graphic:
Export to ONNX
Export the net to the ONNX format:
Get the size of the ONNX file:
The size is similar to the byte count of the resource object:
Check some metadata of the ONNX model:
Import the model back into Wolfram Language. The NetEncoder and NetDecoder will be absent because ONNX does not support them: