Single-Image Depth Perception Net Trained on NYU Depth V2 Data

Estimate the depth map of an image

Released in 2016, this neural net was trained to predict the relative depth map of a single image using a novel technique based on sparse ordinal annotations. Each training example needs to be annotated only with a pair of points and their relative distance to the camera, that is, which of the two points is closer. After training, the net is able to reconstruct the full depth map. Its architecture is based on the "hourglass" design.
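To illustrate how a net can learn from ordinal pairs alone, here is a minimal sketch of a pairwise ranking loss. It is one common way to train on such annotations and is not confirmed by this page as the exact loss used. The name rankingLoss and the sign convention (r = 1 means the first point should receive the larger predicted value, r = -1 the smaller, r = 0 equal) are assumptions made for illustration.

```wolfram
(* Hypothetical pairwise ranking loss on two predicted values zi and zj. *)
(* r = 0: the points are annotated as equally distant, so penalize any difference. *)
rankingLoss[zi_, zj_, 0] := (zi - zj)^2

(* r = 1 or r = -1: the loss shrinks as r*(zi - zj) grows, *)
(* i.e. when the predicted ordering agrees with the annotation. *)
rankingLoss[zi_, zj_, r : (1 | -1)] := Log[1 + Exp[-r (zi - zj)]]

(* Example: a prediction that agrees with the annotation costs less. *)
{rankingLoss[2., 0., 1], rankingLoss[0., 2., 1]}
```

Because the loss depends only on differences between predicted values, it constrains the ordering of depths rather than their absolute scale, which is consistent with the net producing a relative depth map.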

Number of layers: 501 | Parameter count: 5,385,185 | Trained size: 23 MB

Training Set Information

Performance

Examples

Resource retrieval

Get the pre-trained net:

In[1]:=
NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"]
Out[1]=

Basic usage

Obtain the depth map of an image:

In[2]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/d9c710ab-0676-4ac9-96d6-050798358d14"]

Show the depth map:

In[3]:=
ImageAdjust[Image[depthMap]]
Out[3]=
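ImageAdjust rescales the values for display only. Since the net predicts a relative depth map, a numeric copy normalized to the range 0 to 1 can be convenient for further processing. A minimal sketch follows, using a small made-up array in place of the net output (exampleDepth is a hypothetical stand-in, not data from the net):

```wolfram
(* Hypothetical stand-in for the 2D array returned by the net. *)
exampleDepth = {{1., 3.}, {2., 5.}};

(* Rescale maps the smallest element to 0 and the largest to 1, *)
(* preserving the ordering of all values in between. *)
normalized = Rescale[exampleDepth]
```

The same call applies unchanged to the depthMap array used in the examples on this page.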

Visualize a 3D model

Get an image:

In[4]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/07284599-973a-40ee-ad47-af78659c4ad4"]

Obtain the depth map:

In[5]:=
depthMap = NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"][img];

Visualize a 3D model using the depth map:

In[6]:=
ListPlot3D[-Reverse@depthMap, PlotStyle -> Texture[img], PlotTheme -> {"Minimal", "NoAxes"}, ViewPoint -> {0, -Sqrt[2], Sqrt[2]}]
Out[6]=

Adapt to any size

The recommended way to deal with arbitrary image sizes and aspect ratios is to resample the depth map after the net evaluation.

Get an image:

In[7]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/296f018d-1538-4ebd-9ccb-eb7f52df9ba2"]

Obtain the dimensions of the image:

In[8]:=
ImageDimensions[img]
Out[8]=

Obtain the depth map:

In[9]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/a0b9e602-89f9-4285-9208-833cc0c862b4"]

Resample the depth map and visualize it:

In[10]:=
ImageAdjust[
 Image[ArrayResample[depthMap, Reverse[ImageDimensions[img]]]]]
Out[10]=
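The resampling step can be wrapped in a small helper so that it is reusable for any image. This is a sketch only: resampleDepth is a hypothetical name, and the synthetic array and blank image below stand in for a real net output and photo.

```wolfram
(* Hypothetical helper: resample a depth array to the pixel grid of an image. *)
(* ImageDimensions returns {width, height}, while array dimensions are *)
(* {rows, columns} = {height, width}, hence the Reverse. *)
resampleDepth[depth_?MatrixQ, img_Image] :=
 ArrayResample[depth, Reverse[ImageDimensions[img]]]

(* Stand-ins for illustration: a 60x80 array and a 160x120 (width x height) image. *)
fakeDepth = RandomReal[1, {60, 80}];
fakeImg = Image[ConstantArray[0., {120, 160}]];

Dimensions[resampleDepth[fakeDepth, fakeImg]]
```

With a real image, the first argument would be the array returned by the net, and the result can be passed to Image and ImageAdjust as in the cell above.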

Net information

Inspect the number of parameters of all arrays in the net:

In[11]:=
NetInformation[
 NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"], "ArraysElementCounts"]
Out[11]=

Obtain the total number of parameters:

In[12]:=
NetInformation[
 NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"], "ArraysTotalElementCount"]
Out[12]=

Obtain the layer type counts:

In[13]:=
NetInformation[
 NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"], "LayerTypeCounts"]
Out[13]=

Display the summary graphic:

In[14]:=
NetInformation[
 NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"], "SummaryGraphic"]
Out[14]=

Export to MXNet

Export the net into a format that can be opened in MXNet:

In[15]:=
jsonPath = Export[FileNameJoin[{$TemporaryDirectory, "net.json"}], NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"], "MXNet"]
Out[15]=

Export also creates a net.params file containing the parameters:

In[16]:=
paramPath = FileNameJoin[{DirectoryName[jsonPath], "net.params"}]
Out[16]=

Get the size of the parameter file:

In[17]:=
FileByteCount[paramPath]
Out[17]=

The size is similar to the byte count of the resource object:

In[18]:=
ResourceObject[
  "Single-Image Depth Perception Net Trained on NYU Depth V2 \
Data"]["ByteCount"]
Out[18]=
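As a rough sanity check on these sizes, the parameter count quoted at the top of this page can be converted into an expected byte count. The sketch below assumes each parameter is stored as a 32-bit (4-byte) floating-point number, which this page does not state explicitly.

```wolfram
paramCount = 5385185;  (* parameter count quoted at the top of this page *)
bytesPerParam = 4;     (* assumption: 32-bit floating-point storage *)

(* Estimated size of the parameters alone, in megabytes (10^6 bytes). *)
estimatedMB = N[paramCount*bytesPerParam/10^6]
```

Under that assumption the parameters account for about 21.5 MB, which is comparable to the 23 MB trained size listed above.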

Requirements

Wolfram Language 11.2 (September 2017) or above

Resource History

Reference