
Vanilla CNN for Facial Landmark Regression

Determine the locations of the eyes, nose and mouth from a facial image

Released in 2015, this net is a regressor that locates five facial landmarks in a facial image: the eyes, the nose and the mouth corners. The net output is to be interpreted as the positions of {EyeLeft, EyeRight, Nose, MouthLeft, MouthRight}, where the coordinates are normalized relative to the input image size, so that the bottom-left corner corresponds to {0, 0} and the top-right corner to {1, 1}.

Number of layers: 24 | Parameter count: 111,050 | Trained size: 485 KB

Training Set Information

Examples

Resource retrieval

Retrieve the resource object:

In[1]:=
ResourceObject["Vanilla CNN for Facial Landmark Regression"]
Out[1]=

Get the pre-trained net:

In[2]:=
NetModel["Vanilla CNN for Facial Landmark Regression"]
Out[2]=

Basic usage

Get a facial image and the net:

In[3]:=
In[4]:=
net = NetModel["Vanilla CNN for Facial Landmark Regression"];

Get the locations of the eyes, nose and mouth corners:

In[5]:=
landmarks = net[img]
Out[5]=
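
The five positions in landmarks are returned in the fixed order stated in the description, so they can be labeled by name for easier indexing (a sketch; the variable named is introduced here for illustration):

```wolfram
(* Label the net output; the positions arrive in the fixed order
   {EyeLeft, EyeRight, Nose, MouthLeft, MouthRight} *)
named = AssociationThread[
   {"EyeLeft", "EyeRight", "Nose", "MouthLeft", "RightMouth" /. "RightMouth" -> "MouthRight"},
   landmarks];
named["Nose"] (* the normalized {x, y} position of the nose *)
```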

Show the prediction:

In[6]:=
HighlightImage[img, {PointSize[0.04], landmarks}, 
 DataRange -> {{0, 1}, {0, 1}}]
Out[6]=
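
Because the landmark coordinates are normalized to the range {0, 1}, the DataRange -> {{0, 1}, {0, 1}} option above is what aligns them with the image. Alternatively, the positions can be converted to pixel coordinates explicitly (a sketch; toPixelCoordinates is an illustrative helper, not part of the resource):

```wolfram
(* Rescale normalized {x, y} landmark positions (bottom-left origin)
   to the pixel dimensions of the image they were computed on *)
toPixelCoordinates[landmarks_List, image_Image] :=
 With[{dims = ImageDimensions[image]},
  dims*# & /@ landmarks]
```

With this helper, HighlightImage[img, {PointSize[0.04], toPixelCoordinates[landmarks, img]}] should need no DataRange option, assuming HighlightImage's default pixel coordinate system with a bottom-left origin.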

Preprocessing

The net must be evaluated on facial crops only. Get an image with multiple faces:

In[7]:=

Write an evaluation function that crops the input image around faces and returns the crops and facial landmarks:

In[8]:=
findFacialLandmarks[img_Image] := Block[
  {crops, points},
  (* crop the image around each detected face *)
  crops = ImageTrim[img, #] & /@ FindFaces[img];
  (* evaluate the net on all crops at once; skip evaluation if no faces were found *)
  points = If[Length[crops] > 0, net[crops], {}];
  (* pair each crop with its named landmark positions *)
  MapThread[<|"Crop" -> #1, 
     "Landmarks" -> 
      AssociationThread[{"LeftEye", "RightEye", "Nose", "LeftMouth", 
        "RightMouth"}, #2]|> &, {crops, points}]
  ]

Evaluate the function on the image:

In[9]:=
output = findFacialLandmarks[img]
Out[9]=

Write a simple function to show the landmarks:

In[10]:=
colorCodes = <|"LeftEye" -> Hue[0.2], "RightEye" -> Hue[0.4], 
   "Nose" -> Hue[0.6], "LeftMouth" -> Hue[0.8], 
   "RightMouth" -> Hue[1]|>;
In[11]:=
showLandmarks[data_] := 
 Table[HighlightImage[
   face["Crop"], {PointSize[0.04], 
    Riffle[Values@colorCodes, Values@face["Landmarks"]]}, 
   DataRange -> {{0, 1}, {0, 1}}], {face, data}]

Evaluate the function on the previous output:

In[12]:=
showLandmarks[output]
Out[12]=

Export to MXNet

Export the net into a format that can be opened in MXNet:

In[13]:=
jsonPath = 
 Export[FileNameJoin[{$TemporaryDirectory, "net.json"}], 
  NetModel["Vanilla CNN for Facial Landmark Regression"], "MXNet"]
Out[13]=

Export also creates a net.params file containing parameters:

In[14]:=
paramPath = FileNameJoin[{DirectoryName[jsonPath], "net.params"}]
Out[14]=

Get the size of the parameter file:

In[15]:=
FileByteCount[paramPath]
Out[15]=

The size is similar to the byte count of the resource object:

In[16]:=
ResourceObject[
  "Vanilla CNN for Facial Landmark Regression"]["ByteCount"]
Out[16]=

Represent the MXNet net as a graph:

In[17]:=
Import[jsonPath, {"MXNet", "NodeGraphPlot"}]
Out[17]=

Requirements

Wolfram Language 11.2 (September 2017) or above

Reference

  • Y. Wu, T. Hassner, K. Kim, G. Medioni, P. Natarajan, "Facial Landmark Detection with Tweaked Convolutional Neural Networks," arXiv:1511.04031 (2015)
  • Available from: https://github.com/ishay2b/VanillaCNN
  • Rights: Unrestricted use