Self-Normalizing Net for Numeric Data

Perform classification or regression on numeric data

Released in 2017 by Klambauer et al., self-normalizing neural networks outperform other fully connected networks on a variety of classification and regression tasks on numeric data. This class of models keeps activations close to zero mean and unit variance as they propagate through many layers. This is achieved using a special activation function named the Scaled Exponential Linear Unit (SELU) and a special type of dropout named Alpha Dropout.
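The SELU activation has a simple closed form: selu(x) = λx for x > 0 and λα(eˣ − 1) otherwise, with λ ≈ 1.0507 and α ≈ 1.6733 chosen so that zero mean and unit variance form a stable fixed point. A minimal numeric sketch of this fixed-point property (in Python, purely for illustration; it is not part of the Wolfram Language workflow below):

```python
import math
import random

# Fixed-point constants from the 2017 self-normalizing networks paper.
LAMBDA = 1.0507009873554805
ALPHA = 1.6732632423543772

def selu(x):
    # Scaled exponential linear unit: linear for positive inputs,
    # a scaled saturating exponential for negative ones.
    return LAMBDA * x if x > 0 else LAMBDA * ALPHA * (math.exp(x) - 1.0)

# Push standard-normal samples through SELU: the output distribution
# should again have mean close to 0 and variance close to 1.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]
ys = [selu(x) for x in xs]
mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / len(ys)
print(f"mean = {mean:.3f}, variance = {var:.3f}")  # both near 0 and 1
```

Alpha Dropout plays the complementary role during training: instead of zeroing units, it sets dropped activations to −λα and applies an affine correction, so the mean and variance of the activations are preserved.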

Number of models: 2

Examples

Resource retrieval

Get the uninitialized net (there are no pre-trained nets in this model):

In[1]:=
NetModel["Self-Normalizing Net for Numeric Data"]
Out[1]=

NetModel parameters

This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:

In[2]:=
NetModel["Self-Normalizing Net for Numeric Data", "ParametersInformation"]
Out[2]=

Pick a non-default model by specifying the parameters:

In[3]:=
NetModel[{"Self-Normalizing Net for Numeric Data", "TaskType" -> "Regression", "Depth" -> 5, "DropoutProbability" -> 0.02}]
Out[3]=

Check the default parameter combination:

In[4]:=
NetModel["Self-Normalizing Net for Numeric Data", "DefaultVariant"]
Out[4]=

Basic usage

In this example, we use an eight-layer self-normalizing network to perform classification on numerical data from the UCI Letter dataset. First, obtain the training and test data:

In[5]:=
train = ResourceData["Sample Data: UCI Letter", "TrainingData"];
test = ResourceData["Sample Data: UCI Letter", "TestData"];

View two random training examples:

In[6]:=
RandomSample[train, 2]
Out[6]=

Self-normalizing nets assume that the input data has a mean of 0 and variance of 1. Standardize the test and training data:

In[7]:=
extractor = FeatureExtraction[N@Keys[train], "StandardizedVector"];
trainStandardized = extractor[Keys[train]] -> Values[train];
testStandardized = extractor[Keys[test]] -> Values[test];
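Under the hood, such an extractor simply learns a per-feature affine map from the training data and applies the same map to the test data. A simplified Python sketch of the idea (for illustration only; the actual "StandardizedVector" extraction may differ in details):

```python
import statistics

def fit_standardizer(rows):
    # Learn per-column mean and standard deviation from the training rows
    # and return a function applying the same affine map to any row.
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    def transform(row):
        return [(x - m) / s for x, m, s in zip(row, means, stds)]
    return transform

# Fit on (toy) training inputs, then reuse the same map for test inputs,
# just as the extractor is fitted on Keys[train] and applied to both sets.
transform = fit_standardizer([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
print(transform([3.0, 30.0]))  # the per-column training means map to 0.0
print(transform([5.0, 50.0]))
```

Fitting the statistics on the training set only, and reusing them for the test set, avoids leaking test-set information into the preprocessing.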

Get the training net:

In[8]:=
net = NetModel["Self-Normalizing Net for Numeric Data"]
Out[8]=

Specify a decoder for the net:

In[9]:=
dec = NetDecoder[{"Class", Union@Values[train]}]
Out[9]=

Train the net for 150 rounds, leaving 5% of the data for a validation set:

In[10]:=
trainedNet = NetTrain[NetReplacePart[net, "Output" -> dec], trainStandardized, ValidationSet -> Scaled[0.05], MaxTrainingRounds -> 150]
Out[10]=

Obtain the accuracy of the trained net on the standardized test set:

In[11]:=
ClassifierMeasurements[trainedNet, testStandardized, "Accuracy"]
Out[11]=

Compare the accuracy against other methods available in Classify:

In[12]:=
Dataset@ReverseSort@AssociationMap[
  ClassifierMeasurements[Classify[train, Method -> #], test, "Accuracy"] &,
  {"RandomForest", "NaiveBayes", "SupportVectorMachine", "NearestNeighbors", "LogisticRegression", "GradientBoostedTrees", "DecisionTree", "Markov"}]
Out[12]=

Obtain a random sample of standardized test data:

In[13]:=
sample = RandomSample[
  Thread[Keys[testStandardized] -> Values[testStandardized]], 2]
Out[13]=

Test the trained net on a sample of standardized test data:

In[14]:=
trainedNet[Keys[sample]]
Out[14]=

Improving the accuracy of the classifier net

Using the same data as in the previous example, first obtain the training and test sets:

In[15]:=
train = ResourceData["Sample Data: UCI Letter", "TrainingData"];
test = ResourceData["Sample Data: UCI Letter", "TestData"];

Standardize the test and training data in the same fashion:

In[16]:=
extractor = FeatureExtraction[N@Keys[train], "StandardizedVector"];
trainStandardized = extractor[Keys[train]] -> Values[train];
testStandardized = extractor[Keys[test]] -> Values[test];

Get the training net:

In[17]:=
net = NetModel["Self-Normalizing Net for Numeric Data"]
Out[17]=

Specify a decoder for the net:

In[18]:=
dec = NetDecoder[{"Class", Union@Values[train]}]
Out[18]=

To improve the final accuracy, it is possible to average multiple trained networks obtained from different training runs. The following function runs multiple trainings and creates an ensemble network, which averages the outputs of the trained nets:

In[19]:=
ensembleNet[n_] := Module[{ens},
  ens = Table[
    NetTrain[NetReplacePart[net, "Output" -> dec], trainStandardized,
     ValidationSet -> Scaled[0.05],
     MaxTrainingRounds -> 100,
     RandomSeeding -> RandomInteger[10^6] + n ],
    {i, 1, n}];
  NetGraph[
   {Sequence @@ ens,
    TotalLayer[], ElementwiseLayer[#/N@n &]},
   {NetPort["Input"] -> Range[n] -> n + 1 -> n + 2},
   "Output" -> NetDecoder[{"Class", Union@Values[train]}]
   ]]
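The ensemble recipe itself is framework independent: each trained net produces a vector of class probabilities, the vectors are averaged, and the most probable class wins. A toy Python sketch of the averaging step (the two hypothetical "models" here are just probability-vector functions):

```python
def ensemble_predict(models, x):
    # Average the class-probability vectors produced by each model,
    # then return the index of the most probable class.
    probs = [m(x) for m in models]
    avg = [sum(p) / len(models) for p in zip(*probs)]
    return max(range(len(avg)), key=avg.__getitem__)

# Two hypothetical "models", each mapping an input to class probabilities.
m1 = lambda x: [0.6, 0.3, 0.1]
m2 = lambda x: [0.2, 0.6, 0.2]
print(ensemble_predict([m1, m2], None))
```

Note how the ensemble can disagree with any single member: m1 alone would predict class 0, but the averaged probabilities favor class 1.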

Specify the number of nets in the ensemble and create the ensemble network:

In[20]:=
n = 4;
ensTrained = ensembleNet[n]
Out[21]=

Obtain the accuracy of the ensemble network:

In[22]:=
accensemble = ClassifierMeasurements[ensTrained, testStandardized, "Accuracy"]
Out[22]=

Compare it with the accuracy of the individual nets:

In[23]:=
acc = Table[
  ClassifierMeasurements[
   NetReplacePart[NetExtract[ensTrained, i], "Output" -> dec],
   testStandardized,
   "Accuracy"
   ],
  {i, 1, n}
  ]
Out[23]=

Classification on nominal data

In this example, we use an eight-layer self-normalizing network to perform classification on the Mushroom Classification dataset. First, obtain the training and test data:

In[24]:=
train = ResourceData["Sample Data: Mushroom Classification", "TrainingData"];
test = ResourceData["Sample Data: Mushroom Classification", "TestData"];

View two random training examples:

In[25]:=
RandomSample[train, 2]
Out[25]=

To standardize this data, we first need to convert all the nominal input classes into indicator vectors:

In[26]:=
trainInputsString = Map[ToString, Keys[train], {2}];
testInputsString = Map[ToString, Keys[test], {2}];
extractor1 = FeatureExtraction[trainInputsString, "IndicatorVector"]
Out[27]=
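An indicator (one-hot) encoding represents each nominal value as a binary unit vector, with one slot per distinct level of the feature. A simplified Python sketch (for illustration only; the "IndicatorVector" method of FeatureExtraction may differ in details such as the treatment of unseen values):

```python
def fit_indicator(rows):
    # Record the sorted distinct levels of each nominal column, then
    # return an encoder mapping a row to concatenated unit vectors.
    levels = [sorted({row[j] for row in rows}) for j in range(len(rows[0]))]
    def encode(row):
        vec = []
        for j, value in enumerate(row):
            vec += [1.0 if value == level else 0.0 for level in levels[j]]
        return vec
    return encode

# Two nominal columns with levels {"b", "x"} and {"s", "y"}.
encode = fit_indicator([["x", "s"], ["b", "s"], ["x", "y"]])
print(encode(["x", "y"]))  # exactly one 1 per original column
```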

Then standardize the numeric vectors:

In[28]:=
extractor2 = FeatureExtraction[extractor1[trainInputsString], "StandardizedVector"]
Out[28]=

Create the standardized test and training dataset:

In[29]:=
finalExtractor = extractor2@*extractor1;
trainStandardized = finalExtractor[trainInputsString] -> Values[train];
testStandardized = finalExtractor[testInputsString] -> Values[test];

Get the training net:

In[30]:=
net = NetModel["Self-Normalizing Net for Numeric Data"]
Out[30]=

Specify a decoder for the net:

In[31]:=
dec = NetDecoder[{"Class", Union@Values[train]}]
Out[31]=

Train the net for 150 rounds, leaving 5% of the data for a validation set:

In[32]:=
trainedNet = NetTrain[NetReplacePart[net, "Output" -> dec], trainStandardized, ValidationSet -> Scaled[0.05], MaxTrainingRounds -> 150]
Out[32]=

Obtain the accuracy of the trained net on the standardized test set:

In[33]:=
ClassifierMeasurements[trainedNet, testStandardized, "Accuracy"]
Out[33]=

Compare the accuracy against other methods available in Classify:

In[34]:=
Dataset@ReverseSort@
  AssociationMap[ ClassifierMeasurements[
     Classify[train,
      Method -> #, PerformanceGoal -> "Quality"],
     test, "Accuracy"] &, {"RandomForest", "NaiveBayes", "SupportVectorMachine", "NearestNeighbors", "LogisticRegression", "GradientBoostedTrees", "DecisionTree", "Markov"}]
Out[34]=

Obtain a sample of standardized test data and view the actual class labels:

In[35]:=
sample = RandomSample[
   Thread[Keys[testStandardized] -> Values[testStandardized]], 5];
Values[sample]
Out[36]=

Test the trained net on a sample of standardized test data:

In[37]:=
trainedNet[Keys[sample]]
Out[37]=

Regression on numerical data

In this example, we use an eight-layer self-normalizing network to predict the median value of properties in a neighborhood of Boston, given some features of the neighborhood. First, obtain the training and test data:

In[38]:=
train = ResourceData["Sample Data: Boston Homes", "TrainingData"];
test = ResourceData["Sample Data: Boston Homes", "TestData"];

View two random training examples:

In[39]:=
RandomSample[train, 2]
Out[39]=

Self-normalizing nets assume that the input data has a mean of 0 and variance of 1. Standardize the test and training data:

In[40]:=
extractor = FeatureExtraction[N@Keys[train], "StandardizedVector"];
trainStandardized = extractor[Keys[train]] -> Values[train];
testStandardized = extractor[Keys[test]] -> Values[test];

Get the training net:

In[41]:=
net = NetModel["Self-Normalizing Net for Numeric Data", "TaskType" -> "Regression"]
Out[41]=

Train the net for 250 rounds, leaving 7% of the data for a validation set, and return all the training results, including the trained net and the validation loss:

In[42]:=
results = NetTrain[net, trainStandardized, All, ValidationSet -> Scaled[0.07], MaxTrainingRounds -> 250, Method -> "ADAM"]
Out[42]=

Compute the test-set standard deviation:

In[43]:=
NetMeasurements[
 results["TrainedNet"], testStandardized, "StandardDeviation"]
Out[43]=

Compare the standard deviation against other methods available in Predict:

In[44]:=
ReverseSort@Dataset@AssociationMap[ PredictorMeasurements[
     Predict[train, Method -> #], test, "StandardDeviation"] &,
   {"RandomForest", "DecisionTree", "GradientBoostedTrees", "NearestNeighbors", "LinearRegression", "GaussianProcess"}]
Out[44]=

Obtain a sample of standardized test data and view the actual target values:

In[45]:=
sample = RandomSample[
   Thread[Keys[testStandardized] -> Values[testStandardized]], 5];
Values[sample]
Out[46]=

Test the trained net on a sample of standardized test data:

In[47]:=
results["TrainedNet"][Keys[sample]]
Out[47]=

Regression on nominal data

Create a dataset of the average monthly temperature (in degrees Celsius) as a function of the city, the year and the month:

In[48]:=
dataset = RandomSample[{#2, ToExpression[#3], #4} -> (#1 - 32)/1.8 & @@@ ExampleData[{"Statistics", "USCityTemperature"}]];

View five random examples:

In[49]:=
RandomSample[dataset, 5]
Out[49]=

Split the data into training (80%) and test (20%) sets:

In[50]:=
{train, test} = TakeDrop[dataset, 300];
data = Union[train, test];

To standardize this data, we first need to convert all the nominal classes into indicator vectors:

In[51]:=
trainInputsString = Map[ToString, Keys[train], {2}];
testInputsString = Map[ToString, Keys[test], {2}];
extractor1 = FeatureExtraction[trainInputsString, "IndicatorVector"]
Out[53]=

Then standardize the numeric vectors:

In[54]:=
extractor2 = FeatureExtraction[extractor1[trainInputsString], "StandardizedVector"]
Out[54]=

Create the standardized test and training dataset:

In[55]:=
finalExtractor = extractor2@*extractor1;
trainStandardized = finalExtractor[trainInputsString] -> Values[train];
testStandardized = finalExtractor[testInputsString] -> Values[test];

Get the training net:

In[56]:=
net = NetModel["Self-Normalizing Net for Numeric Data", "TaskType" -> "Regression"]
Out[56]=

Train the net for 100 rounds, leaving 7% of the data for a validation set, and return all the training results, including the trained net and the validation loss:

In[57]:=
results = NetTrain[net, trainStandardized, All, ValidationSet -> Scaled[0.07], MaxTrainingRounds -> 100, Method -> "ADAM"]
Out[57]=

Compute the test-set standard deviation:

In[58]:=
NetMeasurements[
 results["TrainedNet"], testStandardized, "StandardDeviation"]
Out[58]=

Compare the standard deviation against other methods available in Predict:

In[59]:=
ReverseSort@Dataset@AssociationMap[ PredictorMeasurements[
     Predict[train, Method -> #], test, "StandardDeviation"] &,
   {"RandomForest", "DecisionTree", "GradientBoostedTrees", "NearestNeighbors", "LinearRegression", "GaussianProcess"}]
Out[59]=

Obtain a sample of standardized test data and view the actual temperatures:

In[60]:=
sample = RandomSample[
   Thread[Keys[testStandardized] -> Values[testStandardized]], 5];
Values[sample]
Out[61]=

Test the trained net on a sample of standardized test data:

In[62]:=
results["TrainedNet"][Keys[sample]]
Out[62]=

Net information

Obtain the layer type counts:

In[63]:=
NetInformation[
 NetModel["Self-Normalizing Net for Numeric Data"], "LayerTypeCounts"]
Out[63]=

Display the summary graphic:

In[64]:=
NetInformation[
 NetModel["Self-Normalizing Net for Numeric Data"], "FullSummaryGraphic"]
Out[64]=

Export to MXNet

Export the net into a format that can be opened in MXNet. The input and output sizes must be specified before exporting:

In[65]:=
net = NetReplacePart[
  NetModel["Self-Normalizing Net for Numeric Data"],
  {"Input" -> 10, "Output" -> 3}
  ]
Out[65]=
In[66]:=
jsonPath = Export[FileNameJoin[{$TemporaryDirectory, "net.json"}], net, "MXNet"]
Out[66]=

Represent the MXNet net as a graph:

In[67]:=
Import[jsonPath, {"MXNet", "NodeGraphPlot"}]
Out[67]=

Requirements

Wolfram Language 12.0 (April 2019) or above
