
SuperGlue: Learning Feature Matching with Graph Neural Networks - PowerPoint PPT Presentation


  1. SuperGlue: Learning Feature Matching with Graph Neural Networks
     Paul-Edouard Sarlin¹, Daniel DeTone², Tomasz Malisiewicz², Andrew Rabinovich²

  2. Feature matching is ubiquitous
     ● 3D reconstruction
     ● Visual localization
     ● SLAM
     ● Place recognition
     [Images: Image Matching Workshop 2020, ScanNet, Google VPS]

  3. SuperGlue = Graph Neural Nets + Optimal Transport
     ● Extreme wide-baseline image pairs in real-time on GPU
     ● State-of-the-art indoor + outdoor matching with SIFT & SuperPoint

  4. Visual SLAM
     ● Front-end: images to constraints
       ○ Recent works: deep learning for feature extraction → Convolutional Nets!
     ● Back-end: optimize pose and 3D structure
     [Cadena et al., 2016]

  5. A middle-end
     [Pipeline: front-end (feature extraction) → middle-end (data association) → back-end (MAP estimation)]
     ● Our position: learn the data association!
     ● We propose a new middle-end: SuperGlue
     ● 2D-to-2D feature matching

  6. A minimal matching pipeline
     image pair → feature detection & description → matching → outlier filtering → pose estimation
     ● Detection & description: classical (SIFT, ORB) or learned (SuperPoint [DeTone et al., 2018], D2-Net)
     ● Matching: Nearest Neighbor
     ● Filtering: heuristics (ratio test, mutual check) or learned (classifier on set, a deep net) [Yi et al., 2018]
     SuperGlue: context aggregation + matching + filtering

  7. The importance of context
     [Qualitative comparison: matches without SuperGlue vs. with SuperGlue]

  8. Problem formulation
     Inputs:
     ● Images A and B
     ● 2 sets of M, N local features
       ○ Keypoints: coordinates, confidence
       ○ Visual descriptors
     Outputs:
     ● A single match per keypoint + occlusion and noise → a soft partial assignment (each row and each column sums to ≤ 1)
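
Written out, the constraints from this slide are as follows; the matrix P is my own shorthand for the soft partial assignment between the M keypoints of A and the N keypoints of B:

```latex
% Soft partial assignment: each keypoint is matched at most once in the other image.
P \in [0, 1]^{M \times N}, \qquad
P \, \mathbf{1}_N \le \mathbf{1}_M, \qquad
P^{\top} \mathbf{1}_M \le \mathbf{1}_N
```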

  9. [Architecture figure: local features (visual descriptor + position) → Keypoint Encoder → Attentional Aggregation (self / cross attention, L layers) → matching descriptors → score matrix with dustbins (size M+1 × N+1) → Sinkhorn Algorithm (T iterations of row and column normalization) → partial assignment]
     ● A Graph Neural Network with attention: encodes contextual cues & priors; reasons about the 3D scene
     ● Solving a partial assignment problem: differentiable solver; enforces the assignment constraints = domain knowledge

  10. Attentional Graph Neural Network: Keypoint Encoder
      ● Initial representation for each keypoint
      ● Combines visual appearance and position with an MLP (Multi-Layer Perceptron)
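
A minimal sketch of this encoder, assuming PyTorch; the layer sizes and the use of the detection confidence as a third positional input are illustrative assumptions, not the exact released architecture:

```python
import torch
import torch.nn as nn

class KeypointEncoder(nn.Module):
    """Combine a visual descriptor with keypoint position (and confidence) via an MLP."""
    def __init__(self, desc_dim: int = 256, hidden: int = 32):
        super().__init__()
        # 3 inputs per keypoint: x, y, detection confidence (sizes are illustrative)
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, desc_dim),
        )

    def forward(self, descriptors, keypoints, scores):
        # descriptors: (M, desc_dim), keypoints: (M, 2), scores: (M,)
        pos = torch.cat([keypoints, scores.unsqueeze(-1)], dim=-1)  # (M, 3)
        return descriptors + self.mlp(pos)  # initial representation x_i
```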

  11. Attentional Graph Neural Network
      Update the representation based on other keypoints:
      ● in the same image: "self" edges
      ● in the other image: "cross" edges
      → A complete graph with two types of edges
      (Node state: feature i in image A or B at layer ℓ)

  12. Attentional Graph Neural Network
      Update the representation using a Message Passing Neural Network: each keypoint receives a message aggregated along its edges (see the update below).
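
As described in the SuperGlue paper, the representation of feature i in image A is updated residually at each layer ℓ, where the message aggregates information along either the self or the cross edges E (layers alternate between the two) and [· ‖ ·] denotes concatenation:

```latex
{}^{(\ell+1)}\mathbf{x}_i^{A}
  \;=\; {}^{(\ell)}\mathbf{x}_i^{A}
  \;+\; \mathrm{MLP}\!\left( \left[ \, {}^{(\ell)}\mathbf{x}_i^{A} \;\|\; \mathbf{m}_{\mathcal{E} \to i} \, \right] \right)
```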

  13. Attentional Aggregation
      ● Compute the message using self- and cross-attention
      ● Soft database retrieval: query, key, and value
      [Figure: a query keypoint softly retrieves neighbors and salient points, e.g. tile at (80, 110), tile at (70, 100), corner at (60, 90), grid at (400, 600)]
      [Vaswani et al., 2017]
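
A single-head PyTorch sketch of this attention step; the released model uses multi-head attention with learned projections per layer, so take this only as an illustration of the query/key/value retrieval:

```python
import torch
import torch.nn.functional as F

def attention_message(x_query, x_source, proj_q, proj_k, proj_v):
    """Soft database retrieval: each query keypoint attends to source keypoints.

    x_query:  (M, D) features being updated (self-attention: x_source is the same image;
              cross-attention: x_source is the other image)
    x_source: (N, D) features providing keys and values
    proj_*:   linear projections, e.g. torch.nn.Linear(D, D)
    """
    q = proj_q(x_query)                                        # queries (M, D)
    k = proj_k(x_source)                                       # keys    (N, D)
    v = proj_v(x_source)                                       # values  (N, D)
    attn = F.softmax(q @ k.t() / q.shape[-1] ** 0.5, dim=-1)   # (M, N) soft retrieval weights
    return attn @ v                                            # message for each query keypoint (M, D)
```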

  14. Self-attention and cross-attention
      ● Self-attention = intra-image information flow (distinctive points)
      ● Cross-attention = inter-image information flow (candidate matches)
      ● Attention builds a soft, dynamic, sparse graph

  15. Optimal Matching Layer
      ● Compute a score matrix for all matches:
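
The pairwise score is the inner product of the final matching descriptors of the two keypoints:

```latex
S_{i,j} \;=\; \left\langle \, \mathbf{f}_i^{A} , \, \mathbf{f}_j^{B} \, \right\rangle , \qquad \forall (i, j) \in A \times B
```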

  16. Optimal Matching Layer
      ● Occlusion and noise: unmatched keypoints are assigned to a dustbin
      ● Augment the scores with a learnable dustbin score
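
A small PyTorch sketch of the dustbin augmentation; `dustbin_score` stands for the single learnable parameter mentioned on the slide, and the helper name is mine:

```python
import torch

def augment_with_dustbin(scores: torch.Tensor, dustbin_score: torch.Tensor) -> torch.Tensor:
    """Append a dustbin row and column filled with one learnable score.

    scores:        (M, N) pairwise match scores
    dustbin_score: scalar parameter, e.g. torch.nn.Parameter(torch.tensor(1.0))
    returns:       (M+1, N+1) augmented score matrix
    """
    M, N = scores.shape
    bin_row = dustbin_score.expand(1, N)            # extra row for unmatched keypoints of B
    bin_col = dustbin_score.expand(M + 1, 1)        # extra column for unmatched keypoints of A
    return torch.cat([torch.cat([scores, bin_row], dim=0), bin_col], dim=1)
```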

  17. Optimal Matching Layer
      ● Compute the assignment that maximizes the total matching score
      ● Solve an optimal transport problem
      ● With the Sinkhorn algorithm [Sinkhorn & Knopp, 1967]: differentiable & soft Hungarian algorithm
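
A compact log-domain sketch of the Sinkhorn normalization over the dustbin-augmented scores, assuming PyTorch; the marginals let each dustbin absorb up to N (respectively M) unmatched keypoints, and T is a fixed iteration count. This is an illustrative re-implementation, not the released code:

```python
import torch

def log_sinkhorn(scores_aug: torch.Tensor, T: int = 100) -> torch.Tensor:
    """Differentiable optimal transport on the (M+1, N+1) augmented score matrix.

    Alternates row and column normalizations in log space and returns the
    log of the soft partial assignment matrix.
    """
    M, N = scores_aug.shape[0] - 1, scores_aug.shape[1] - 1
    # Marginals: every keypoint carries unit mass; each dustbin can absorb up to N (or M) keypoints.
    log_r = torch.cat([torch.zeros(M), torch.tensor([float(N)]).log()])
    log_c = torch.cat([torch.zeros(N), torch.tensor([float(M)]).log()])

    u = torch.zeros_like(log_r)
    v = torch.zeros_like(log_c)
    for _ in range(T):
        u = log_r - torch.logsumexp(scores_aug + v.unsqueeze(0), dim=1)  # row normalization
        v = log_c - torch.logsumexp(scores_aug + u.unsqueeze(1), dim=0)  # column normalization
    return scores_aug + u.unsqueeze(1) + v.unsqueeze(0)  # log of the partial assignment
```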

  18. Ground truth and loss
      ● Compute ground truth correspondences from pose and depth
      ● Find which keypoints should be unmatched
      ● Loss: maximize the log-likelihood of the ground-truth cells (written out below)
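
Written out, the loss is the negative log-likelihood of the ground-truth cells of the augmented assignment P̄: matched pairs M̄ recovered from pose and depth, plus the unmatched keypoints I and J that should land in the dustbins (notation follows the paper):

```latex
\mathcal{L} \;=\;
  - \sum_{(i, j) \in \mathcal{M}} \log \bar{P}_{i, j}
  \;-\; \sum_{i \in \mathcal{I}} \log \bar{P}_{i, N+1}
  \;-\; \sum_{j \in \mathcal{J}} \log \bar{P}_{M+1, j}
```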

  19. Results: indoor - ScanNet
      [Qualitative comparison: SuperPoint + NN + heuristics vs. SuperPoint + SuperGlue]
      SuperGlue: more correct matches and fewer mismatches

  20. Results: outdoor - SfM
      [Qualitative comparison: SuperPoint + NN + OA-Net (inlier classifier) vs. SuperPoint + NN + mutual check vs. SuperPoint + SuperGlue]
      SuperGlue: more correct matches and fewer mismatches

  21. Results: attention patterns
      [Figure: attention patterns spanning global context, neighborhood, distinctive keypoints, self-similarities, and match candidates]
      Flexibility of attention → diversity of patterns

  22. Evaluation
      [Tables comparing heuristics and a learned inlier classifier against SuperGlue]
      SuperGlue yields large improvements in all cases

  23. SuperGlue @ CVPR 2020
      First place in the following competitions:
      ● Image matching challenge (vision.uvic.ca/image-matching-challenge)
      ● Local features for visual localization (www.visuallocalization.net)
      ● Visual localization for handheld devices

  24. SuperGlue: Learning Feature Matching with Graph Neural Networks
      A major step towards end-to-end deep SLAM & SfM
      psarlin.com/superglue

  25. Thank you psarlin.com/superglue
