Developed in 2017 at the Computer Vision Laboratory at the University of Nottingham, this net predicts the locations of 68 2D facial keypoints (17 for the face contour, 10 for the eyebrows, 9 for the nose, 12 for the eyes, 20 for the mouth) from a facial image. It produces one heat map per keypoint, indicating that keypoint's location. The architecture combines hourglass modules with multiscale parallel blocks.
Number of layers: 967
Parameter count: 23,874,320
Trained size: 97 MB
Examples
Resource retrieval
Get the pre-trained net:
Basic usage
This net outputs a 64x64 heat map for each of the 68 landmarks:
Obtain the dimensions of the heat map:
Visualize heat maps 1, 12 and 29:
Evaluation function
Write an evaluation function that finds the position of the maximum in each heat map and returns a list of landmark positions:
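The peak-picking step can be illustrated with a minimal sketch in plain Python. It is not the Wolfram Language code of this page; it assumes each heat map is a list of rows, and the name `heatmap_peaks` is hypothetical.

```python
def heatmap_peaks(heatmaps):
    """Return the (row, col) index of the largest value in each 2D heat map.

    Assumption: each heat map is a list of equally long rows, with row 0
    at the top. Ties resolve to the first maximum in row-major order.
    """
    peaks = []
    for hm in heatmaps:
        n_rows, n_cols = len(hm), len(hm[0])
        best = max(
            ((r, c) for r in range(n_rows) for c in range(n_cols)),
            key=lambda rc: hm[rc[0]][rc[1]],
        )
        peaks.append(best)
    return peaks


# Two tiny 3x3 "heat maps" standing in for the 64x64 maps of the net.
example = [
    [[0.0, 0.1, 0.0], [0.2, 0.9, 0.1], [0.0, 0.1, 0.0]],
    [[0.0, 0.0, 0.7], [0.0, 0.1, 0.0], [0.0, 0.0, 0.0]],
]
peaks = heatmap_peaks(example)
```

For the real net the same loop would run over 68 maps of size 64x64, giving one peak per landmark.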
Landmark positions
Get the landmarks using the evaluation function. Coordinates are normalized relative to the input image, so that the bottom-left corner is {0, 0} and the top-right corner is {1, 1}:
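As a rough illustration of this coordinate convention, the sketch below converts a heat map index into normalized coordinates with the origin at the bottom-left. It is plain Python rather than the page's Wolfram Language code, the function name is hypothetical, and placing the landmark at the center of its heat map cell (the 0.5 offset) is an assumption.

```python
def normalize_peak(peak, n_rows=64, n_cols=64):
    """Map a (row, col) heat map index to (x, y) in the unit square.

    Row 0 is taken to be the top of the heat map, so y is flipped to put
    the origin at the bottom-left. The 0.5 offset (cell center) is an
    assumption made for this sketch.
    """
    row, col = peak
    x = (col + 0.5) / n_cols
    y = 1.0 - (row + 0.5) / n_rows
    return (x, y)


top_left = normalize_peak((0, 0))
bottom_right = normalize_peak((63, 63))
```

Multiplying the result by the image width and height gives pixel coordinates in the original image.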
Group the landmarks by facial feature and assign a color to each group:
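The grouping can be sketched in plain Python from the per-feature counts given in the description (17, 10, 9, 12, 20). The order in which the groups appear among the 68 outputs is an assumption here: the sketch follows the common 68-point annotation order (contour, eyebrows, nose, eyes, mouth), which should be checked against the actual net output.

```python
# Counts come from the net description; the ordering is an assumption.
GROUP_SIZES = [
    ("contour", 17),
    ("eyebrows", 10),
    ("nose", 9),
    ("eyes", 12),
    ("mouth", 20),
]


def group_ranges(sizes):
    """Assign consecutive zero-based index ranges to each named group."""
    ranges = {}
    start = 0
    for name, count in sizes:
        ranges[name] = range(start, start + count)
        start += count
    return ranges


ranges = group_ranges(GROUP_SIZES)
total = sum(len(r) for r in ranges.values())
```

Each range can then be used to slice the list of landmark positions and draw that group in its own color.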
Visualize the landmarks:
Preprocessing
The net must be evaluated on facial crops only. Get an image with multiple faces:
Write an evaluation function that crops the input image around faces and returns the crops and facial landmarks:
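The geometry of such a function can be sketched in plain Python, leaving face detection and net evaluation aside. The helper names, the padding fraction, and the box format (x0, y0, x1, y1 with the origin at the bottom-left) are all assumptions for illustration, not the page's actual code.

```python
def pad_box(box, frac, img_w, img_h):
    """Enlarge a face box by `frac` of its size on each side, clipped to the image."""
    x0, y0, x1, y1 = box
    dx = (x1 - x0) * frac
    dy = (y1 - y0) * frac
    return (max(0, x0 - dx), max(0, y0 - dy), min(img_w, x1 + dx), min(img_h, y1 + dy))


def to_image_coords(landmark, box):
    """Map a landmark normalized to the crop ({0,0} to {1,1}) into image coordinates."""
    u, v = landmark
    x0, y0, x1, y1 = box
    return (x0 + u * (x1 - x0), y0 + v * (y1 - y0))


# A hypothetical detected face box inside a 1000x1000 image.
crop_box = pad_box((100, 100, 200, 200), 0.25, 1000, 1000)
center = to_image_coords((0.5, 0.5), crop_box)
```

Applying `to_image_coords` to every landmark of every crop places all faces' landmarks in the coordinate system of the full image.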
Evaluate the function on the image:
Visualize the landmarks:
Robustness to facial crop size
Get an image:
Crop the image at various sizes:
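One way to produce such a series of crops is to keep the face center fixed and vary the side length. The sketch below, in plain Python, only computes the crop boxes; the function name, the base side and the scale factors are hypothetical choices.

```python
def centered_crops(center, base_side, scales):
    """Return square boxes (x0, y0, x1, y1) of varying size around one center."""
    cx, cy = center
    boxes = []
    for s in scales:
        half = base_side * s / 2
        boxes.append((cx - half, cy - half, cx + half, cy + half))
    return boxes


boxes = centered_crops((100, 100), 50, [1, 2])
```

Tighter crops cut into the face contour while looser ones add background, which is the variation this robustness check explores.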
Inspect the network performance across the crops:
Net information
Inspect the number of parameters of all arrays in the net:
Obtain the total number of parameters:
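The total is the sum, over all arrays, of the product of each array's dimensions. A minimal plain-Python sketch of that arithmetic follows; the array names and shapes are made up for illustration and are not taken from this net.

```python
from math import prod


def count_parameters(shapes):
    """Sum the element counts of all arrays, given a name -> shape mapping."""
    return sum(prod(shape) for shape in shapes.values())


# Hypothetical shapes for a single convolution layer.
example_shapes = {
    "conv.Weights": (64, 3, 7, 7),
    "conv.Biases": (64,),
}
n_params = count_parameters(example_shapes)
```

Applied to the real arrays of the net, the same sum should match the parameter count of 23,874,320 listed above.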
Obtain the layer type counts:
Display the summary graphic:
Export to MXNet
Export the net into a format that can be opened in MXNet:
Export also creates a net.params file containing parameters:
Get the size of the parameter file:
The size is similar to the byte count of the resource object:
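A back-of-the-envelope check of why the sizes are close, assuming every parameter is stored as a 32-bit float (an assumption, not something stated on this page):

```python
PARAMETER_COUNT = 23_874_320  # from the net description
BYTES_PER_FLOAT32 = 4         # assumption: single-precision storage

estimated_bytes = PARAMETER_COUNT * BYTES_PER_FLOAT32
estimated_mb = round(estimated_bytes / 1e6, 1)  # decimal megabytes
```

Under this assumption the raw parameters take about 95.5 MB, slightly less than the listed trained size of 97 MB. The remainder is plausibly file structure and arrays that are not counted as parameters, though that is a guess.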
Requirements
Wolfram Language 11.2 (September 2017) or above
Reference