Function Repository Resource:

NetFuseBatchNorms

Fuse a BatchNormalizationLayer preceded by a ConvolutionLayer into a single ConvolutionLayer

Contributed by: Maria Sargsyan

ResourceFunction["NetFuseBatchNorms"][convLayer, bnLayer]

fuses the weights of an initialized ConvolutionLayer convLayer and a following initialized BatchNormalizationLayer bnLayer into a single ConvolutionLayer whenever possible.

ResourceFunction["NetFuseBatchNorms"][net]

repeatedly fuses, within net, every initialized BatchNormalizationLayer preceded by an initialized ConvolutionLayer into a single ConvolutionLayer.

Details

ResourceFunction["NetFuseBatchNorms"] does not do anything on an uninitialized net.

Examples

Basic Examples (6) 

Define a ConvolutionLayer followed by a BatchNormalizationLayer, both initialized with nonzero random arrays:

In[1]:=
convLayer = NetInitialize@
  ConvolutionLayer[64, {3, 3}, "Biases" -> RandomReal[1, 64], "Input" -> {3, 100, 100}]
Out[1]=
In[2]:=
bnLayer = NetInitialize@ BatchNormalizationLayer[
   "Scaling" -> RandomReal[1, 64],
   "Biases" -> RandomReal[1, 64],
   "MovingMean" -> RandomReal[1, 64],
   "MovingVariance" -> RandomReal[1, 64],
   "Input" -> {64, Automatic, Automatic}
   ]
Out[2]=

Fuse convLayer and bnLayer into a single ConvolutionLayer:

In[3]:=
ResourceFunction["NetFuseBatchNorms"][convLayer, bnLayer]
Out[3]=
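
The fused arrays can be reproduced by hand from the identity in the Details section. The following is a minimal sketch using hypothetical variable names; it assumes the BatchNormalizationLayer's default "Epsilon" of 0.001:

w = Normal@NetExtract[convLayer, "Weights"];
b = Normal@NetExtract[convLayer, "Biases"];
{scale, shift, mean, var} =
  Normal@NetExtract[bnLayer, #] & /@ {"Scaling", "Biases", "MovingMean", "MovingVariance"};
s = scale/Sqrt[var + 0.001];   (* per-output-channel scale factor; assumes the default Epsilon *)
fusedW = s w;                  (* Times threads s over the output-channel dimension of w *)
fusedB = s (b - mean) + shift;
fused = ResourceFunction["NetFuseBatchNorms"][convLayer, bnLayer];
{Max@Abs[fusedW - Normal@NetExtract[fused, "Weights"]],
 Max@Abs[fusedB - Normal@NetExtract[fused, "Biases"]]}

Both differences should be at the level of numerical round-off.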

Create a NetChain:

In[4]:=
net = NetChain[{convLayer, bnLayer}]
Out[4]=

Perform the same fusion within a NetChain:

In[5]:=
fusedConvLayer = ResourceFunction["NetFuseBatchNorms"][net]
Out[5]=

Compare the outputs on a random input; they agree up to floating-point round-off:

In[6]:=
randomInput = RandomReal[1, {3, 100, 100}];
Max@Abs[fusedConvLayer@randomInput - net@randomInput]
Out[7]=

Note that the fused layer is also faster, since the separate normalization pass is eliminated:

In[8]:=
fusedConvLayer@randomInput; // RepeatedTiming
Out[8]=
In[9]:=
net@randomInput; // RepeatedTiming
Out[9]=

Scope (1) 

Note that NetFuseBatchNorms leaves an uninitialized net unchanged:

In[10]:=
uninit = NeuralNetworks`NetDeinitialize@NetChain[{
    NetInitialize@ConvolutionLayer[64, {3, 3}, "Biases" -> RandomReal[1, 64], "Input" -> {3, 100, 100}],
    NetInitialize@BatchNormalizationLayer[
      "Scaling" -> RandomReal[1, 64],
      "Biases" -> RandomReal[1, 64],
      "MovingMean" -> RandomReal[1, 64],
      "MovingVariance" -> RandomReal[1, 64],
      "Input" -> {64, Automatic, Automatic}]
    }]
Out[10]=
In[11]:=
ResourceFunction["NetFuseBatchNorms"]@
 NeuralNetworks`NetDeinitialize[uninit]
Out[11]=
In[12]:=
% === %%  (* the uninitialized net is returned unchanged *)
Out[12]=
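
Once the net is initialized, the fusion goes through (a quick check; NetInitialize fills the arrays with fresh random values):

ResourceFunction["NetFuseBatchNorms"][NetInitialize[uninit]]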

Applications (4) 

Get a pretrained ShuffleNet-V2:

In[13]:=
shuffleNet = NetFlatten@
   NetReplacePart[
    NetModel["ShuffleNet-V2 Trained on ImageNet Competition Data"], "Output" -> None];

Accelerate ShuffleNet-V2:

In[14]:=
fusedShuffleNet = ResourceFunction["NetFuseBatchNorms"][shuffleNet];

Compare the outputs on a random image:

In[15]:=
randomImage = RandomImage[1, {224, 224}, ColorSpace -> "RGB"];
Max@Abs[shuffleNet@randomImage - fusedShuffleNet@randomImage]
Out[16]=

The fused net is also lighter, as it contains fewer BatchNormalizationLayer objects:

In[17]:=
1 - N[ByteCount[fusedShuffleNet]/ByteCount[shuffleNet]]
Out[17]=
In[18]:=
Information[shuffleNet, "LayerTypeCounts"][BatchNormalizationLayer] -
 Information[fusedShuffleNet, "LayerTypeCounts"][BatchNormalizationLayer]
Out[18]=

Version History

  • 1.0.0 – 08 December 2022
