Vanilla CNN for Facial Landmark Regression

Determine the locations of the eyes, nose and mouth from a facial image

Released in 2015, this net is a regressor that locates five facial landmarks in a facial image: the eyes, the nose and the mouth corners. The net output is to be interpreted as the positions of {LeftEye, RightEye, Nose, LeftMouth, RightMouth}, with coordinates rescaled to the input image size so that the bottom-left corner is identified by {0, 0} and the top-right corner by {1, 1}.
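For example, a landmark value of {0.5, 0.5} marks the center of the crop. To recover pixel coordinates, multiply each normalized point by the image dimensions; a minimal sketch, assuming img is a facial crop and net is the pretrained model obtained via NetModel:

```wolfram
(* Scale each normalized landmark {x, y} in [0, 1] by the crop's pixel dimensions *)
pixelLandmarks = # * ImageDimensions[img] & /@ net[img]
```

Since both the net output and ImageDimensions use a bottom-left origin, the resulting points can be drawn directly on the image without a DataRange specification.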

Number of layers: 24 | Parameter count: 111,050 | Trained size: 485 KB

Training Set Information

Performance

Examples

Resource retrieval

Get the pre-trained net:

In[1]:=
NetModel["Vanilla CNN for Facial Landmark Regression"]
Out[1]=

Basic usage

Get a facial image and the net:

In[2]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/3053cf62-c347-4351-8a91-8917991b0c1f"]
In[3]:=
net = NetModel["Vanilla CNN for Facial Landmark Regression"];

Get the locations of the eyes, nose and mouth corners:

In[4]:=
landmarks = net[img]
Out[4]=

Show the prediction:

In[5]:=
HighlightImage[img, {PointSize[0.04], landmarks}, DataRange -> {{0, 1}, {0, 1}}]
Out[5]=

Preprocessing

The net must be evaluated on facial crops only. Get an image with multiple faces:

In[6]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/186b437f-d1ff-4390-9170-170ba9a70246"]

Write an evaluation function that crops the input image around faces and returns the crops and facial landmarks:

In[7]:=
findFacialLandmarks[img_Image] := Block[
  {crops, points},
  crops = ImageTrim[img, #] & /@ FindFaces[img];
  points = If[Length[crops] > 0, net[crops], {}];
  MapThread[<|"Crop" -> #1, "Landmarks" -> AssociationThread[{"LeftEye", "RightEye", "Nose", "LeftMouth", "RightMouth"}, #2]|> &, {crops, points}]
  ]

Evaluate the function on the image:

In[8]:=
output = findFacialLandmarks[img]
Out[8]=

Write a simple function to show the landmarks:

In[9]:=
colorCodes = <|"LeftEye" -> Hue[0.2], "RightEye" -> Hue[0.4], "Nose" -> Hue[0.6], "LeftMouth" -> Hue[0.8], "RightMouth" -> Hue[1]|>;
In[10]:=
showLandmarks[data_] := Table[HighlightImage[
   face["Crop"], {PointSize[0.04], Riffle[Values@colorCodes, Values@face["Landmarks"]]}, DataRange -> {{0, 1}, {0, 1}}], {face, data}]

Evaluate the function on the previous output:

In[11]:=
showLandmarks[output]
Out[11]=

Net information

Inspect the number of parameters of all arrays in the net:

In[12]:=
NetInformation[NetModel["Vanilla CNN for Facial Landmark Regression"], "ArraysElementCounts"]
Out[12]=

Obtain the total number of parameters:

In[13]:=
NetInformation[NetModel["Vanilla CNN for Facial Landmark Regression"], "ArraysTotalElementCount"]
Out[13]=

Obtain the layer type counts:

In[14]:=
NetInformation[NetModel["Vanilla CNN for Facial Landmark Regression"], "LayerTypeCounts"]
Out[14]=

Display the summary graphic:

In[15]:=
NetInformation[NetModel["Vanilla CNN for Facial Landmark Regression"], "SummaryGraphic"]
Out[15]=

Export to MXNet

Export the net into a format that can be opened in MXNet:

In[16]:=
jsonPath = Export[FileNameJoin[{$TemporaryDirectory, "net.json"}], NetModel["Vanilla CNN for Facial Landmark Regression"], "MXNet"]
Out[16]=

Export also creates a net.params file containing parameters:

In[17]:=
paramPath = FileNameJoin[{DirectoryName[jsonPath], "net.params"}]
Out[17]=

Get the size of the parameter file:

In[18]:=
FileByteCount[paramPath]
Out[18]=

The size is similar to the byte count of the resource object:

In[19]:=
ResourceObject["Vanilla CNN for Facial Landmark Regression"]["ByteCount"]
Out[19]=

Represent the MXNet net as a graph:

In[20]:=
Import[jsonPath, {"MXNet", "NodeGraphPlot"}]
Out[20]=

Requirements

Wolfram Language 11.2 (September 2017) or above

Resource History

Reference

  • Y. Wu, T. Hassner, K. Kim, G. Medioni, P. Natarajan, "Facial Landmark Detection with Tweaked Convolutional Neural Networks," arXiv:1511.04031 (2015)
  • Available from: https://github.com/ishay2b/VanillaCNN
  • Rights: Unrestricted use