SlowFast Video Action Classification Trained on Kinetics-400 Data
Inspired by human biology, this family of video recognition models was released in 2021. Each model combines a slow pathway, which operates at a low frame rate to capture spatial semantics, with a fast pathway, which operates at a high frame rate to capture motion at fine temporal resolution.
Examples
Resource retrieval
Get the pre-trained net:
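A minimal sketch of the retrieval call. The resource name is assumed to match this page's title exactly.

```wolfram
(* Resource name assumed from the page title *)
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"]
```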
NetModel parameters
This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:
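The parameter listing can be requested with the "ParametersInformation" property of NetModel. The resource name is again assumed from the page title.

```wolfram
(* Lists each parameter name with its allowed values and default *)
NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data", "ParametersInformation"]
```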
Pick a non-default net by specifying the parameters:
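A sketch of the call shape. The parameter name "Architecture" and the value "ResNet-101" are hypothetical placeholders; substitute a name and value reported by "ParametersInformation".

```wolfram
(* "Architecture" -> "ResNet-101" is a hypothetical parameter setting, not confirmed by this page *)
NetModel[{"SlowFast Video Action Classification Trained on Kinetics-400 Data",
  "Architecture" -> "ResNet-101"}]
```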
Pick a non-default uninitialized net:
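The uninitialized variant can be requested with the "UninitializedEvaluationNet" property. As before, the parameter setting shown is a hypothetical placeholder.

```wolfram
(* Same hypothetical parameter setting; the returned net has the architecture but no trained weights *)
NetModel[{"SlowFast Video Action Classification Trained on Kinetics-400 Data",
  "Architecture" -> "ResNet-101"}, "UninitializedEvaluationNet"]
```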
Basic usage
Classify a video:
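A minimal sketch, assuming the net's input encoder accepts a Video object directly. The file path is a hypothetical placeholder.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];

(* Hypothetical local file; replace with any video you have *)
video = Video["path/to/clip.mp4"];

(* Returns the most likely action class *)
net[video]
```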
Obtain the probabilities predicted by the net:
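For a net with a class decoder, the probabilities are typically exposed through the "Probabilities" and "TopProbabilities" properties. The video path is a hypothetical placeholder, and the choice of five classes is arbitrary.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];
video = Video["path/to/clip.mp4"];  (* hypothetical file *)

(* Association of every class to its predicted probability *)
net[video, "Probabilities"]

(* Only the five most likely classes *)
net[video, {"TopProbabilities", 5}]
```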
Feature extraction
Remove the last three layers of the trained net so that the net produces a vector representation of a video:
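One way to sketch this is with NetTake, assuming the top level of the net is a chain so that positional indices apply.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];

(* Keep everything from the first layer up to the fourth-from-last, dropping the last three *)
extractor = NetTake[net, {1, -4}]
```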
Get a set of videos:
Visualize the features of a set of videos:
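A sketch of the visualization, assuming the truncated net accepts Video objects and that FeatureSpacePlot can apply it through the FeatureExtractor option. The directory path is a hypothetical placeholder.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];
extractor = NetTake[net, {1, -4}];  (* assumes a chain at the top level *)

(* Hypothetical directory of clips *)
videos = Video /@ FileNames["*.mp4", "path/to/videos"];

(* Feature vectors are reduced to two dimensions for plotting *)
FeatureSpacePlot[videos, FeatureExtractor -> extractor]
```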
Transfer learning
Use the pre-trained model to build a classifier for telling apart videos from two action classes not present in the dataset. Create a test set and a training set:
Remove the last three layers from the pre-trained net:
Create a new net composed of the pre-trained net followed by a linear layer, an aggregation layer and a softmax layer:
Train on the dataset, freezing all the weights except for those in the "Linear" layer (use TargetDevice -> "GPU" for training on a GPU):
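Freezing is commonly expressed with the LearningRateMultipliers option of NetTrain. The sketch below wraps the call in a hypothetical helper, trainHead, and assumes the new layer is named "Linear" in the composed net.

```wolfram
(* trainHead is a hypothetical helper; newNet, trainSet and testSet are supplied by the caller *)
trainHead[newNet_, trainSet_, testSet_] :=
 NetTrain[newNet, trainSet,
  (* multiplier 1 for the "Linear" layer, 0 (frozen) for every other layer *)
  LearningRateMultipliers -> {"Linear" -> 1, _ -> 0},
  ValidationSet -> testSet,
  TargetDevice -> "CPU"]
```

Setting the multiplier to 0 for the catch-all pattern `_` keeps the pre-trained weights fixed, so only the new linear layer is optimized.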
Perfect accuracy is obtained on the test set:
Net information
Inspect the number of parameters of all arrays in the net:
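A sketch using the standard Information property for nets. The resource name is assumed from the page title.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];

(* Association of each array's position in the net to its element count *)
Information[net, "ArraysElementCounts"]
```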
Obtain the total number of parameters:
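The total can be read from the "ArraysTotalElementCount" property, again assuming the resource name from the page title.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];

(* Single integer: the sum of all array element counts *)
Information[net, "ArraysTotalElementCount"]
```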
Obtain the layer type counts:
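The "LayerTypeCounts" property reports how many layers of each type the net contains. The resource name is assumed from the page title.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];

(* Association of layer type to number of occurrences *)
Information[net, "LayerTypeCounts"]
```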
Display the summary graphic:
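The "SummaryGraphic" property returns a diagram of the net's structure. The resource name is assumed from the page title.

```wolfram
net = NetModel["SlowFast Video Action Classification Trained on Kinetics-400 Data"];

(* Graphic showing the layers and their connections *)
Information[net, "SummaryGraphic"]
```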
Requirements
Wolfram Language 13.1 (June 2022) or above