3D Face Alignment Net Trained on 300W Large Pose Data

Determine the 2D projection of 3D keypoints from a facial image

Developed in 2017 at the Computer Vision Laboratory at the University of Nottingham, this net predicts the locations of the 2D projections of 68 3D keypoints (17 for face contour, 10 for eyebrows, 9 for nose, 12 for eyes, 20 for mouth) from a facial image. For each keypoint, a heat map for its location is produced. Its complex architecture features a combination of hourglass modules and multiscale parallel blocks.

Number of layers: 967 | Parameter count: 23,874,320 | Trained size: 97 MB |

Training Set Information

300W-LP dataset, a synthetic expansion of the 300W dataset consisting of 120,000 examples. Augmentation of 300W was performed in order to obtain face appearances in larger poses. This dataset provides annotations for both 2D landmarks and the 2D projections of 3D landmarks.

Performance

This model achieves, respectively, 73.5%, 74.6% and 68.8% AUC (Area Under Curve) score for yaw angles of 0°-30°, 30°-60° and 60°-90° on the LS3D-W dataset.

Examples

Download Example Notebook

Open in Wolfram Cloud

Resource retrieval

Get the pre-trained net:

In[1]:=

Out[1]=

Basic usage

This net outputs a 64x64 heat map for each of the 68 landmarks:

In[2]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/7e6b30e0-7512-4ded-b129-c630df59a774"]

Obtain the dimensions of the heat maps:

In[3]:=

Out[3]=

Visualize heat maps 1, 12 and 29:

In[4]:=

Out[4]=

Evaluation function

Write an evaluation function that picks the maximum position of each heat map and returns a list of landmark positions:

In[5]:=

$netevaluation[img_] := Block[ {heatmaps, posFlattened, posMat}, heatmaps = NetModel["3D Face Alignment Net Trained on 300W Large Pose Data"][ img]; posFlattened = Map[First@Ordering[#, -1] &, Flatten[heatmaps, {{1}, {2, 3}}]]; posMat = QuotientRemainder[posFlattened - 1, 64] + 1; 1/64.*Map[{#[[2]], 64 - #[[1]] + 1} - 0.5 &, posMat] ]$

Landmark positions

Get the landmarks using the evaluation function. Coordinates are rescaled to the input image size so that the bottom-left corner is identified by {0, 0} and the top-right corner by {1, 1}:

In[6]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/e20454c0-8f59-4191-bb9d-5e0a81542ba6"]

Out[6]=

Group landmarks associated with different facial features by colors:

In[7]:=

Visualize the landmarks:

In[8]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/bc4065dd-c511-4e2e-8b52-17dcd6bc3e97"]

Out[8]=

Robustness to facial crop size

Get an image:

In[9]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/051c1bf9-fadd-419e-a0ff-565e0b9881be"]

Crop the image at various sizes:

In[10]:=

Out[10]=

Inspect the network performance across the crops:

In[11]:=

HighlightImage[#, Graphics@
Riffle[Thread@Hue[Range[8]/8.], Map[Point, Function[p, Part[netevaluation[#], p]] /@ groupings]],
DataRange -> {{0, 1}, {0, 1}}, ImageSize -> 250] & /@ crops

Out[11]=

Net information

Inspect the number of parameters of all arrays in the net:

In[12]:=

$NetInformation[ NetModel["3D Face Alignment Net Trained on 300W Large Pose Data"], \ "ArraysElementCounts"]$

Out[12]=

Obtain the total number of parameters:

In[13]:=

$NetInformation[ NetModel["3D Face Alignment Net Trained on 300W Large Pose Data"], \ "ArraysTotalElementCount"]$

Out[13]=

Obtain the layer type counts:

In[14]:=

$NetInformation[ NetModel["3D Face Alignment Net Trained on 300W Large Pose Data"], \ "LayerTypeCounts"]$

Out[14]=

Display the summary graphic:

In[15]:=

$NetInformation[ NetModel["3D Face Alignment Net Trained on 300W Large Pose Data"], \ "SummaryGraphic"]$

Out[15]=

Export to MXNet

Export the net into a format that can be opened in MXNet:

In[16]:=

Out[16]=

Export also creates a net.params file containing parameters:

In[17]:=

Out[17]=

Get the size of the parameter file:

In[18]:=

Out[18]=

The size is similar to the byte count of the resource object:

In[19]:=

$NetModel["3D Face Alignment Net Trained on 300W Large Pose Data", \ "ByteCount"]$

Out[19]=

Construction Notebook

Download Construction Notebook

Open in Wolfram Cloud

Requirements

Wolfram Language 11.2 (September 2017) or above

Resource History

Date Created: 18 October 2017
Latest Update: 21 June 2018

Reference

A. Bulat, G. Tzimiropoulos, "How Far Are We from Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks)," arXiv:1703.07332 (2017)
Available from: https://github.com/1adrianb/2D-and-3D-face-alignment
Rights: BSD 3-Clause License