Resource retrieval
Get the pre-trained net:
NetModel parameters
This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:
Pick a non-default net by specifying the parameters:
Pick a non-default uninitialized net:
Basic usage
Identify the main action in a video:
Obtain the probabilities of the 10 most likely entities predicted by the net:
Obtain the list of names of all available classes:
NetModel architecture
ShuffleNet-3D V1 features an efficient and elegant implementation of a channel shuffle operation: a feature map with g⨯n channels is reshaped to expand the channel dimension to two dimensions of sizes (g, n), then transposed and further flattened back as the input of the next layer:
Feature extraction
Remove the last two layers of the trained net so that the net produces a vector representation of an image:
Get a set of videos:
Visualize the features of a set of videos:
Transfer learning
Use the pre-trained model to build a classifier for telling apart images from two action classes not present in the dataset. Create a test set and a training set:
Remove the linear layer from the pre-trained net:
Create a new net composed of the pre-trained net followed by a linear layer and a softmax layer:
Train on the dataset, freezing all the weights except for those in the "Linear" new layer (use TargetDevice -> "GPU" for training on a GPU):
Perfect accuracy is obtained on the test set:
Net information
Inspect the number of parameters of all arrays in the net:
Obtain the total number of parameters:
Obtain the layer type counts:
Display the summary graphic:
Export to ONNX
Export the net to the ONNX format:
Get the size of the ONNX file:
Check some metadata of the ONNX model:
Import the model back into the Wolfram Language. However, the NetEncoder and NetDecoder will be absent because they are not supported by ONNX: