Each of these networks has a distinct architecture and is trained on different datasets.
The first network architecture, released in 2016 by S. Iizuka, E. Simo-Serra and H. Ishikawa, is "ColorNet Image Colorization" (Proc. of SIGGRAPH 2016, vol. 35, #4) trained on MIT Places Data and ImageNet Competition Data. The network exploit a combination of local and global image features. Local features are extracted in a fully convolutional fashion, while the extraction of global features was developed leveraging the labels of the training dataset:
Another network architecture by R. Zhang, P. Isola, and A. Efros, also released in 2016, is "Colorful Image Colorization" (arXiv:1603.08511v5 (2016)). It was trained on the ImageNet Competition Data. The model recasts image colorization into a classification problem by dividing the AB components of the Lab color space into 313 bins. The final color for each pixel is picked by performing an ad hoc modified mean of its probability distribution over the bins:
The neural networks above take the 224×224 pixel array representing grayscale values as input and return the two corresponding color channels in the Lab color space as numeric arrays:
In[4]:=
bw=
;
In[5]:=
Information[bw]
Out[5]=
Evaluate the Colorful net on the grayscale image:
In[6]:=
ab=colorfulOnImageNet[bw];Dimensions[ab]
Out[6]=
{2,56,56}
The resulting a and b channels can be combined with the original grayscale as the L channel into an Lab image:
To test the capabilities of the automatic colorization networks, take a color photo of Yosemite valley (by Rakshith Hatwar), extract its color and colorise it again:
In[10]:=
img=
;
In[11]:=
bw=ColorConvert[img,"Grayscale"]
Out[11]=
Apply the 3 different nets to the grayscale image to get 3 predictions for the colorized photo:
Clearly, the last network trained on landscape images (MIT Places Data) provides the most natural colorization, though every network misses the autumn colors: