Single-Image Depth Perception Net Trained on NYU Depth V2 Data

Estimate the depth map of an image

Released in 2016, this neural net was trained to predict a relative depth map from a single image using a novel technique based on sparse ordinal annotations. Each training example needs to be annotated only with a pair of points and their relative distance to the camera (that is, which of the two points is closer). After training, the net is able to reconstruct the full depth map. Its architecture is based on the "hourglass" design.

Number of layers: 501 | Parameter count: 5,385,185 | Trained size: 23 MB

Training Set Information

NYU Depth V2, consisting of 1,449 RGBD indoor images with dense per-pixel semantic and structural labeling.

Performance

This model achieves 28.3% WKDR (Weighted Kinect Disagreement Rate) error on the NYU Depth V2 dataset and 31.31% WHDR (Weighted Human Disagreement Rate) error on the Depth in the Wild dataset. All weights are set to 1 for both measures.
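
Both measures are weighted disagreement rates: the weighted fraction of point pairs whose predicted ordinal relation differs from the reference relation. The following is a minimal sketch of that computation on made-up data. The helper name `weightedDisagreementRate` and the encoding of relations as 1, -1 and 0 are assumptions chosen for illustration and are not taken from the reference implementation.

```wolfram
(* Hypothetical helper: weighted share of point pairs whose predicted
   ordinal relation differs from the reference relation *)
weightedDisagreementRate[predicted_List, reference_List, weights_List] :=
  Total[weights*Boole[MapThread[UnsameQ, {predicted, reference}]]]/Total[weights]

(* Made-up relations for five point pairs *)
predicted = {1, -1, 0, 1, -1};
reference = {1, 1, 0, -1, -1};

(* With all weights set to 1, the rate is the plain fraction of
   disagreeing pairs: here 2 out of 5 *)
weightedDisagreementRate[predicted, reference, ConstantArray[1, 5]]
```

With unit weights, as used for the figures above, every point pair contributes equally to the error.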

Examples

Resource retrieval

Get the pre-trained net:

In[1]:=

NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"]

Out[1]=

Basic usage

Obtain the depth map of an image:

In[2]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/d9c710ab-0676-4ac9-96d6-050798358d14"]

Show the depth map:

In[3]:=

Out[3]=
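
One possible way to compute and display a depth map is sketched below. It assumes that the net returns the depth map as a numeric matrix, which is not confirmed on this page. The test image from `ExampleData` is only a stand-in for the example input.

```wolfram
(* Stand-in test image; any image can be used instead *)
img = ExampleData[{"TestImage", "House"}];

(* Assumption: the net returns the depth map as a numeric matrix *)
depthMap = NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"][img];

(* Map depth values to colors; "Rainbow" is one of the built-in color gradients *)
ArrayPlot[depthMap, ColorFunction -> "Rainbow"]
```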

Visualize a 3D model

Get an image:

In[4]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/07284599-973a-40ee-ad47-af78659c4ad4"]

Obtain the depth map:

In[5]:=

depthMap = NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"][img];

Visualize a 3D model using the depth map:

In[6]:=

Out[6]=
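
A textured surface plot is one way to build such a 3D view. The sketch below is an illustration rather than the original cell: it assumes that the depth map is a numeric matrix and that larger values mean farther from the camera. The stand-in image can be replaced by the image obtained above.

```wolfram
(* Stand-in test image; replace with the image of interest *)
img = ExampleData[{"TestImage", "House"}];
depthMap = NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"][img];

(* Reverse flips the rows so that the bottom image row sits at the smallest
   y value, matching the orientation of the texture. Negating the values
   assumes that larger values mean farther from the camera. *)
ListPlot3D[-Reverse[depthMap],
 PlotStyle -> Texture[img],
 Lighting -> "Neutral",
 Mesh -> None, Axes -> False, Boxed -> False]
```

If the surface appears inverted, the sign assumption does not hold and the negation can be dropped.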

Adapt to any size

The recommended way to handle arbitrary image sizes and aspect ratios is to resample the depth map after the net evaluation.

Get an image:

In[7]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/296f018d-1538-4ebd-9ccb-eb7f52df9ba2"]

Obtain the dimensions of the image:

In[8]:=

Out[8]=
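
Image dimensions can be read with `ImageDimensions`, as in this short sketch. The test image is a stand-in, and the variable names `width` and `height` are chosen here for illustration.

```wolfram
(* Stand-in test image; replace with the image of interest *)
img = ExampleData[{"TestImage", "House"}];

(* ImageDimensions returns {width, height} in pixels *)
{width, height} = ImageDimensions[img]
```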

Obtain the depth map:

In[9]:=

(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/a0b9e602-89f9-4285-9208-833cc0c862b4"]

Resample the depth map and visualize it:

In[10]:=

Out[10]=
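
The resampling step could look like the following sketch, which uses `ArrayResample` and assumes that the depth map is a numeric matrix. Note that a matrix is indexed as {rows, columns}, so the target dimensions are {height, width}, the reverse of what `ImageDimensions` returns.

```wolfram
(* Stand-in test image; replace with the image of interest *)
img = ExampleData[{"TestImage", "House"}];
{width, height} = ImageDimensions[img];

(* Assumption: the net returns the depth map as a numeric matrix *)
depthMap = NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"][img];

(* Resample to the size of the input image: {rows, columns} = {height, width} *)
resampled = ArrayResample[depthMap, {height, width}];

ArrayPlot[resampled, ColorFunction -> "Rainbow"]
```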

Net information

Inspect the number of parameters of all arrays in the net:

In[11]:=

NetInformation[NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"], "ArraysElementCounts"]

Out[11]=

Obtain the total number of parameters:

In[12]:=

NetInformation[NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"], "ArraysTotalElementCount"]

Out[12]=

Obtain the layer type counts:

In[13]:=

NetInformation[NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"], "LayerTypeCounts"]

Out[13]=

Display the summary graphic:

In[14]:=

NetInformation[NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"], "SummaryGraphic"]

Out[14]=

Export to MXNet

Export the net into a format that can be opened in MXNet:

In[15]:=

jsonPath = Export[FileNameJoin[{$TemporaryDirectory, "net.json"}], NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"], "MXNet"]

Out[15]=

Export also creates a net.params file containing parameters:

In[16]:=

Out[16]=
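
A minimal sketch for locating the parameter file, assuming it is written to the same directory as the JSON file. The name `paramPath` is chosen here for illustration.

```wolfram
(* Repeats the export step so that the sketch is self-contained *)
jsonPath = Export[FileNameJoin[{$TemporaryDirectory, "net.json"}],
   NetModel["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"], "MXNet"];

(* Assumption: net.params sits next to net.json *)
paramPath = FileNameJoin[{DirectoryName[jsonPath], "net.params"}]

(* Check that the file was actually created *)
FileExistsQ[paramPath]
```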

Get the size of the parameter file:

In[17]:=

Out[17]=
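
The file size can be read with `FileByteCount`. This sketch assumes that the export step has already been evaluated and that `net.params` was written to `$TemporaryDirectory`.

```wolfram
(* Assumption: net.params was written to $TemporaryDirectory by the export step *)
paramPath = FileNameJoin[{$TemporaryDirectory, "net.params"}];

(* FileByteCount gives the size in bytes; the conversion makes the unit explicit *)
N[UnitConvert[Quantity[FileByteCount[paramPath], "Bytes"], "Megabytes"]]
```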

The size is similar to the byte count of the resource object:

In[18]:=

ResourceObject["Single-Image Depth Perception Net Trained on NYU Depth V2 Data"]["ByteCount"]

Out[18]=

Requirements

Wolfram Language 11.2 (September 2017) or above

Resource History

Date Created: 13 October 2017
Latest Update: 21 June 2018

Reference

W. Chen, Z. Fu, D. Yang, J. Deng, "Single-Image Depth Perception in the Wild," arXiv:1604.03901 (2016)
Available from: https://github.com/wfchen-umich/relative_depth
Rights: BSD 3-Clause License