D2-Net Trained on MegaDepth Data

Find generic keypoints and their feature vectors in an image

Released in 2019 by Mihai Dusmanu et al., this VGG-like model finds generic keypoints in an image and describes each keypoint with a feature vector. Such feature vectors can be used to find correspondences between different images of the same scene, tracking the movement of keypoints from one image to the other. The model performs local feature extraction with a describe-and-detect methodology, jointly optimizing the detection and description objectives during training. The joint objective minimizes the distance between corresponding keypoints in feature space while maximizing the distance to confounding points in either image, and is similar to a triplet margin ranking loss with an additional detection term.
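
In the notation of the D2-Net paper (a sketch based on the reference; M denotes the margin, p(c) and n(c) the positive and hardest-negative descriptor distances of a correspondence c, and s_c^{(1)}, s_c^{(2)} its soft detection scores in the two images):

$$m(c) = \max\left(0,\; M + p(c)^2 - n(c)^2\right), \qquad \mathcal{L}(I_1, I_2) = \sum_{c \in \mathcal{C}} \frac{s_c^{(1)}\, s_c^{(2)}}{\sum_{q \in \mathcal{C}} s_q^{(1)}\, s_q^{(2)}}\, m(c)$$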

Number of layers: 22 | Parameter count: 7,635,264 | Trained size: 31 MB

Training Set Information

Performance

Examples

Resource retrieval

Get the pre-trained net:

In[1]:=
NetModel["D2-Net Trained on MegaDepth Data"]
Out[1]=
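
As a quick sanity check, the layer and parameter counts quoted above can be read off the net itself (a sketch using standard NetInformation properties):

net = NetModel["D2-Net Trained on MegaDepth Data"];
NetInformation[net, "LayersCount"]
NetInformation[net, "ArraysTotalElementCount"]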

Evaluation function

Write an evaluation function to post-process the net output in order to obtain the keypoint positions, strengths and feature vectors:

In[2]:=
Options[netevaluate] = {MaxFeatures -> 50};
netevaluate[img_Image, opts : OptionsPattern[]] := Module[
  {dims, featureMap, c, h, w, transposed, normalized, strengthArray, pos, scalex, scaley, keypointStr, keypointPos, keypointFeats},
  dims = ImageDimensions[img];
  featureMap = NetModel["D2-Net Trained on MegaDepth Data"][img];
  {c, h, w} = Dimensions[featureMap];
  transposed = Transpose[featureMap, {3, 1, 2}];
  (* L2-normalize the feature vector at each spatial position *)
  normalized = transposed/Map[Norm, transposed, {2}];
  (* Matrix containing the strengths of each keypoint *)
  strengthArray = Map[Max, normalized, {2}];
  (* Find positions of (up to) MaxFeatures strongest keypoints *)
  pos = Ordering[Flatten@strengthArray, -Min[OptionValue[MaxFeatures], w*h]] - 1;
  pos = QuotientRemainder[#, w] + {1, 1} & /@ pos; (* matrix position *)
  (* From array positions to image keypoint positions *)
  {scalex, scaley} = N[dims/{w, h}];
  keypointPos = {scalex*(#[[1]] - 0.5), scaley*(h - #[[2]] + 0.5)} & /@ Reverse /@ pos;
  (* Extract the features and strengths *)
  keypointFeats = Extract[normalized, pos];
  keypointStr = Extract[strengthArray, pos];
  {keypointPos, keypointStr, keypointFeats}
  ]
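
Note that the strength of each keypoint is taken as the channel-wise maximum of the L2-normalized feature map, loosely following the detection score of the D2-Net paper, and that array positions are mapped back to image coordinates by rescaling with the net's spatial downsampling factor.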

Basic usage

Obtain the keypoints of a given image:

In[3]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/ae785a2c-eb63-4af1-ad3c-70c6946b852b"]
In[4]:=
{keypointPos, keypointStr, keypointFeats} = netevaluate[testImage];

Visualize the keypoints:

In[5]:=
HighlightImage[testImage, keypointPos]
Out[5]=
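
The keypoint strengths can also be visualized, for instance by scaling the point size of each keypoint (a sketch; the 0.03 scaling factor is an arbitrary choice):

Show[testImage, Graphics[{Red, MapThread[{PointSize[0.03*#2/Max[keypointStr]], Point[#1]} &, {keypointPos, keypointStr}]}]]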

Specify a maximum of 15 keypoints and visualize the new detection:

In[6]:=
{keypointPos, keypointStr, keypointFeats} = netevaluate[testImage, MaxFeatures -> 15];
In[7]:=
HighlightImage[testImage, keypointPos]
Out[7]=
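
Feature matching

As mentioned above, the feature vectors can be used to find correspondences between different images of the same scene. The following is a minimal sketch (matchKeypoints and its mutual nearest-neighbor filter are illustrative and not part of this resource; robust pipelines typically add ratio tests or geometric verification):

matchKeypoints[img1_Image, img2_Image] := Module[
  {pos1, str1, feats1, pos2, str2, feats2, nf1, nf2, nn12, nn21, mutual},
  (* Extract keypoints and descriptors from both images *)
  {pos1, str1, feats1} = netevaluate[img1];
  {pos2, str2, feats2} = netevaluate[img2];
  (* For each descriptor, find the index of its nearest neighbor in the other image *)
  nf2 = Nearest[feats2 -> "Index"];
  nf1 = Nearest[feats1 -> "Index"];
  nn12 = First[nf2[#]] & /@ feats1;
  nn21 = First[nf1[#]] & /@ feats2;
  (* Keep only mutual nearest neighbors *)
  mutual = Select[Range[Length[feats1]], nn21[[nn12[[#]]]] == # &];
  (* Return matched keypoint pairs in image coordinates *)
  {pos1[[#]], pos2[[nn12[[#]]]]} & /@ mutual
  ]

The resulting pairs can then be drawn on each image, e.g. with HighlightImage[img1, matches[[All, 1]]] and HighlightImage[img2, matches[[All, 2]]].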

Resource History

Reference