CapsNet Trained on MNIST Data

Identify the handwritten digit in an image

Released in 2017, this model makes use of capsules as the fundamental building blocks to replace neurons in artificial neural networks. As opposed to scalar neurons, whose activation reacts to the presence of a particular feature or object, a capsule is a group of neurons all reacting to the same entity, whose vectorial activation can also encode the properties of the detected instance. In addition, features are passed from one capsule layer to another using a novel dynamic routing technique.

Number of layers: 52 | Parameter count: 8,141,840 | Trained size: 33 MB |

Training Set Information

MNIST Database of Handwritten Digits, consisting of 60,000 training and 10,000 test grayscale images of size 28x28.

Training Set Data

MNIST

Performance

This model achieves 99.5% accuracy on the MNIST dataset.

Examples

Download Example Notebook

Open in Wolfram Cloud

Resource retrieval

Retrieve the pre-trained net:

In[1]:=

Out[1]=

Basic usage

Apply the trained net to a set of inputs:

In[2]:=

$NetModel["CapsNet Trained on MNIST Data"][{\!$\* GraphicsBox[ TagBox[RasterBox[CompressedData[" 1:eJxTTMoPSmNiYGAo5gASQYnljkVFiZXBAkBOaF5xZnpeaopnXklqemqRRRJI mQwU/6cK2MHQhEtqiTTTJVxyQUy633BIneZSeolLWwSTBS6pP1ZM+bjkzjEJ 4ZL6X81kjkvqKydTHS65L0y45c4wMS3FJdfOxHRtcW3tkY/Y5cBA+T5WOaPk 5GgmJkNMOQWwW/6FMMliyqWB5X46MlliNTPo//+JTExbMOU+Ad0h683ElP8P U+73fCGQM4XuYPHD//97LZmYLPdhlRrUAABgHMjK "], {{0, 28}, {28, 0}}, {0, 255}, ColorFunction->GrayLevel], BoxForm`ImageTag[ "Byte", ColorSpace -> Automatic, Interleaving -> None], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSizeRaw->{28, 28}, PlotRange->{{0, 28}, {0, 28}}]$, \!$\* GraphicsBox[ TagBox[RasterBox[CompressedData[" 1:eJxTTMoPSmNiYGAo5gASQYnljkVFiZXBAkBOaF5xZnpeaopnXklqemqRRRJI mQwU/x9IsJWBKew2dql3okzMzH6/sMr5M7tPNGWeglXOQPf//yPiZdjlgMI7 mbHKVTOVQQkM8FCaef+v3dIih7HIfQphnnKCmdntBzYzdzKLizEzM+/DJvdD hxnoP+xueafNxCvJwWSG3S0SV/+7M/tikbvAzHzyP1AOi5lvNJj9fwPlVJdj ytUzMx8HUu7xWIw0ZdJ49f/PJUWTL5hyTMwT//+/xMxcgkUfE/Ou/92KzKGf sMqpGLBzTb6MRer/Sh1gmGDzG/0AACEauS8= "], {{0, 28}, {28, 0}}, {0, 255}, ColorFunction->GrayLevel], BoxForm`ImageTag[ "Byte", ColorSpace -> Automatic, Interleaving -> None], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSizeRaw->{28, 28}, PlotRange->{{0, 28}, {0, 28}}]$, \!$\* GraphicsBox[ TagBox[RasterBox[CompressedData[" 1:eJxTTMoPSmNiYGAo5gASQYnljkVFiZXBAkBOaF5xZnpeaopnXklqemqRRRJI mQwU/x9IsN/BdW1Vstx8LFK7eZgYmYBAcepvNJkvG/mZIHJMTLfR5OaBBAtX r44CKqn4hSK1TRgoZQNipQEZp5ClPpoCRSr+gJiXgarykeWWAqUqoSalocrd 52ViCnwJMx5VLpeJyQVufyOK3Fo2JqYWOM+WiWkFkjZGRjs4p4mRkXE5nPcp GegQGOeXM9BZ1vth3EAgT3I+lHMOyLE7Adf3y4uJqeY1hP3MFShXhbDuTwgT 0ywIa5s9Wnh+AnIFErYBgSVIRrz4G0LumxoTFIDiQaztPzLIQ5JzPogi9f+Z NkzOof77fzTwY1EcSKps5y90GToCAMITbxU= "], {{0, 28}, {28, 0}}, {0, 255}, ColorFunction->GrayLevel], BoxForm`ImageTag[ "Byte", ColorSpace -> Automatic, Interleaving -> None], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSizeRaw->{28, 28}, PlotRange->{{0, 28}, {0, 28}}]$}]$

Out[2]=

Give class probabilities for a single input:

In[3]:=

$NetModel["CapsNet Trained on MNIST Data"][\!$\* GraphicsBox[ TagBox[RasterBox[CompressedData[" 1:eJxTTMoPSmNiYGAo5gASQYnljkVFiZXBAkBOaF5xZnpeaopnXklqemqRRRJI mQwU/x9IsJWBKew2dql3okzMzH6/sMr5M7tPNGWeglXOQPf//yPiZdjlgMI7 mbHKVTOVQQkM8FCaef+v3dIih7HIfQphnnKCmdntBzYzdzKLizEzM+/DJvdD hxnoP+xueafNxCvJwWSG3S0SV/+7M/tikbvAzHzyP1AOi5lvNJj9fwPlVJdj ytUzMx8HUu7xWIw0ZdJ49f/PJUWTL5hyTMwT//+/xMxcgkUfE/Ou/92KzKGf sMqpGLBzTb6MRer/Sh1gmGDzG/0AACEauS8= "], {{0, 28}, {28, 0}}, {0, 255}, ColorFunction->GrayLevel], BoxForm`ImageTag[ "Byte", ColorSpace -> Automatic, Interleaving -> None], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSizeRaw->{28, 28}, PlotRange->{{0, 28}, {0, 28}}]$, "Classification" -> "Probabilities"]$

Out[3]=

Feature extraction

Create a subset of the MNIST dataset:

In[4]:=

Out[4]=

Remove the last linear layer of the net, which will be used as a feature extractor:

In[5]:=

Out[5]=

Visualize the features of a subset of the MNIST dataset:

In[6]:=

Out[6]=

Image generation

Extract the image reconstruction part:

In[7]:=

reconstructor = NetReplacePart[
NetExtract[NetModel["CapsNet Trained on MNIST Data"], "Reconstruct"], "Output" -> NetDecoder["Image"]]

Out[7]=

Extract the DigitCaps feature vector for a given digit image:

In[8]:=

$featureVect = NetModel["CapsNet Trained on MNIST Data"][\!$\* GraphicsBox[ TagBox[RasterBox[CompressedData[" 1:eJxTTMoPSmNiYGAo5gASQYnljkVFiZXBAkBOaF5xZnpeaopnXklqemqRRRJI mQwU/x848KaBMRGH1EFrJqZC7FKT+JiYmDY0T96w4QGazI8eTqAUEyOI4PVF lUsHCQulgIAdI3sFstRMRka+zIdQzhFGxqsIqe+pYja74bx+Jia1JzgcrMDE xH8dh1wyEz+6U2FgswRTIA6p+yEMfA+xS/3wYGL0xhC99RwkFcbE5PEZRfzT iXIRHn6V8jIlJia37yhSp7SY/FIEIWE28SOqaQIiS4HUDk5GBkbGbFS56TzH /r9fHwuMiEBVDia3U8hyyQIbakWAxunO/v9/NQeTcPiG2wh9TEwcWqmT3nwD cXY4AFUZwuU+Hj16DsmYdeZMpdg9T2MAAC64lw4= "], {{0, 28}, {28, 0}}, {0, 255}, ColorFunction->GrayLevel], BoxForm`ImageTag[ "Byte", ColorSpace -> Automatic, Interleaving -> None], Selectable->False], DefaultBaseStyle->"ImageGraphics", ImageSizeRaw->{28, 28}, PlotRange->{{0, 28}, {0, 28}}]$, NetPort["Pick", "Output"]]$