
Vanilla CNN for Facial Landmark Regression

Determine the locations of the eyes, nose and mouth from a facial image

Released in 2015, this net is a regressor that locates five facial landmarks in a facial image: the eyes, the nose and the mouth corners. The net output is to be interpreted as the positions of {EyeLeft, EyeRight, Nose, MouthLeft, MouthRight}, where the coordinates are normalized relative to the input image size, so that the bottom-left corner corresponds to {0, 0} and the top-right corner to {1, 1}.

Number of layers: 24 | Parameter count: 111,050 | Trained size: 485 KB

Training Set Information

Examples

Resource retrieval

Retrieve the resource object:

In[1]:=
ResourceObject["Vanilla CNN for Facial Landmark Regression"]
Out[1]=

Get the pre-trained net:

In[2]:=
NetModel["Vanilla CNN for Facial Landmark Regression"]
Out[2]=

Basic usage

Get a facial image and the net:

In[3]:=
In[4]:=
net = NetModel["Vanilla CNN for Facial Landmark Regression"];

Get the locations of the eyes, nose and mouth corners:

In[5]:=
landmarks = net[img]
Out[5]=
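
The five positions in landmarks are returned in the fixed order stated in the description, so they can be labeled by name for easier indexing (a sketch; the variable named is introduced here for illustration):

```wolfram
(* Label the net output; the positions arrive in the fixed order
   {EyeLeft, EyeRight, Nose, MouthLeft, MouthRight} *)
named = AssociationThread[
   {"EyeLeft", "EyeRight", "Nose", "MouthLeft", "RightMouth" /. "RightMouth" -> "MouthRight"},
   landmarks];
named["Nose"] (* the normalized {x, y} position of the nose *)
```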

Show the prediction:

In[6]:=
HighlightImage[img, {PointSize[0.04], landmarks}, 
 DataRange -> {{0, 1}, {0, 1}}]
Out[6]=
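
Because the landmark coordinates are normalized to the range {0, 1}, the DataRange -> {{0, 1}, {0, 1}} option above is what aligns them with the image. Alternatively, the positions can be converted to pixel coordinates explicitly (a sketch; toPixelCoordinates is an illustrative helper, not part of the resource):

```wolfram
(* Rescale normalized {x, y} landmark positions (bottom-left origin)
   to the pixel dimensions of the image they were computed on *)
toPixelCoordinates[landmarks_List, image_Image] :=
 With[{dims = ImageDimensions[image]},
  dims*# & /@ landmarks]
```

With this helper, HighlightImage[img, {PointSize[0.04], toPixelCoordinates[landmarks, img]}] should need no DataRange option, assuming HighlightImage's default pixel coordinate system with a bottom-left origin.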

Preprocessing

The net must be evaluated on facial crops only. Get an image with multiple faces:

In[7]:=

Write an evaluation function that crops the input image around faces and returns the crops and facial landmarks:

In[8]:=
findFacialLandmarks[img_Image] := Block[
  {crops, points},
  (* crop the image around each detected face *)
  crops = ImageTrim[img, #] & /@ FindFaces[img];
  (* evaluate the net on all crops at once; skip evaluation if no faces were found *)
  points = If[Length[crops] > 0, net[crops], {}];
  (* pair each crop with its named landmark positions *)
  MapThread[<|"Crop" -> #1, 
     "Landmarks" -> 
      AssociationThread[{"LeftEye", "RightEye", "Nose", "LeftMouth", 
        "RightMouth"}, #2]|> &, {crops, points}]
  ]

Evaluate the function on the image:

In[9]:=
output = findFacialLandmarks[img]
Out[9]=

Write a simple function to show the landmarks:

In[10]:=
colorCodes = <|"LeftEye" -> Hue[0.2], "RightEye" -> Hue[0.4], 
   "Nose" -> Hue[0.6], "LeftMouth" -> Hue[0.8], 
   "RightMouth" -> Hue[1]|>;
In[11]:=
showLandmarks[data_] := 
 Table[HighlightImage[
   face["Crop"], {PointSize[0.04], 
    Riffle[Values@colorCodes, Values@face["Landmarks"]]}, 
   DataRange -> {{0, 1}, {0, 1}}], {face, data}]

Evaluate the function on the previous output:

In[12]:=
showLandmarks[output]
Out[12]=

Export to MXNet

Export the net into a format that can be opened in MXNet:

In[13]:=
jsonPath = 
 Export[FileNameJoin[{$TemporaryDirectory, "net.json"}], 
  NetModel["Vanilla CNN for Facial Landmark Regression"], "MXNet"]
Out[13]=

Export also creates a net.params file containing parameters:

In[14]:=
paramPath = FileNameJoin[{DirectoryName[jsonPath], "net.params"}]
Out[14]=

Get the size of the parameter file:

In[15]:=
FileByteCount[paramPath]
Out[15]=

The size is similar to the byte count of the resource object:

In[16]:=
ResourceObject[
  "Vanilla CNN for Facial Landmark Regression"]["ByteCount"]
Out[16]=

Represent the MXNet net as a graph:

In[17]:=
Import[jsonPath, {"MXNet", "NodeGraphPlot"}]
Out[17]=

Requirements

Wolfram Language 11.2 (September 2017) or above

Reference

  • Y. Wu, T. Hassner, K. Kim, G. Medioni, P. Natarajan, "Facial Landmark Detection with Tweaked Convolutional Neural Networks," arXiv:1511.04031 (2015)
  • Available from: https://github.com/ishay2b/VanillaCNN
  • Rights: Unrestricted use