SLIDE 1

Vishakh Hegde Reza Zadeh

FusionNet: 3D Object Classification Using Multiple Data Representations

Paper: http://matroid.com/papers/fusionnet.pdf Twitter: @Reza_Zadeh

SLIDE 2

Object recognition

Given 3D model, figure out what it is

» bathtub

SLIDE 3

Princeton ModelNet

662 object classes, 127,915 CAD models
ModelNet40: 40-class subset
http://modelnet.cs.princeton.edu

SLIDE 4

Princeton ModelNet

Problem of input representation
Try using image recognition on projections, but that only goes so far.

SLIDE 5

From Image Recognition to Object Recognition

SLIDE 6

Convolutional Network

Slide a two-dimensional patch over pixels. How to adapt to three dimensions?

Figure: Google image search for “convolutional neural network”

SLIDE 7

Multi-View CNN

Rotate camera around object

Figure: H. Su, S. Maji, E. Kalogerakis, E. Learned-Miller. Multi-view Convolutional Neural Networks for 3D Shape Recognition. ICCV 2015.
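
As a rough illustration of "rotate camera around object", the sketch below places cameras evenly on a ring around an object at the origin. The function name `camera_ring` and the values (12 views, radius 2.0, 30 degrees elevation) are hypothetical choices for illustration, not settings taken from the talk.

```python
import math

def camera_ring(n_views, radius, elevation_deg):
    """Return camera positions evenly spaced on a circle around the origin.

    Each camera is assumed to look at the origin, where the object sits.
    """
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(n_views):
        azim = 2.0 * math.pi * i / n_views  # rotate around the vertical axis
        x = radius * math.cos(elev) * math.cos(azim)
        y = radius * math.cos(elev) * math.sin(azim)
        z = radius * math.sin(elev)
        positions.append((x, y, z))
    return positions

# Illustrative values only (not from the slides).
views = camera_ring(n_views=12, radius=2.0, elevation_deg=30.0)
```

Each position would be handed to a renderer to produce one 2D projection of the model; the renderer itself is outside the scope of this sketch.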

SLIDE 8

Representations

SLIDE 9

Volumetric (V-CNN)

Simple idea: slide a three-dimensional volume over voxels.
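
A minimal sketch of that idea in plain Python, assuming an occupancy grid stored as nested lists, no padding, and stride 1. The function name `conv3d_valid` and the toy 4x4x4 grid are made up for illustration; a real V-CNN would use a deep learning framework and learned kernels.

```python
def conv3d_valid(voxels, kernel):
    """Slide a 3D kernel over a voxel grid (no padding, stride 1).

    As in most CNN libraries, this is cross-correlation: the kernel is not flipped.
    """
    D, H, W = len(voxels), len(voxels[0]), len(voxels[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(D - kd + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum of elementwise products inside the current 3D window.
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += voxels[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# Toy 4x4x4 occupancy grid with a solid 2x2x2 cube in one corner.
grid = [[[1.0 if max(z, y, x) < 2 else 0.0 for x in range(4)]
         for y in range(4)] for z in range(4)]
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]  # 2x2x2 box filter
response = conv3d_valid(grid, kernel)
```

The response is largest where the window overlaps the cube completely and falls to zero where the window covers only empty voxels.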

SLIDE 10

Volumetric CNNs

Use two different Volumetric CNNs (VCNN-I and VCNN-II). Example of one:

SLIDE 11

FusionNet

Fusion of two volumetric representation CNNs and one pixel representation CNN
Hyperparameters tuned on a cluster

http://matroid.com/papers/fusionnet.pdf
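
The slide does not say how the three networks are combined. One common approach is late fusion, where each network's class scores are added with per-network weights. The sketch below assumes that scheme. The function names, scores, and weights are all made up for illustration.

```python
def fuse_scores(score_lists, weights):
    """Weighted sum of per-class scores from several networks."""
    n_classes = len(score_lists[0])
    fused = [0.0] * n_classes
    for scores, w in zip(score_lists, weights):
        for c in range(n_classes):
            fused[c] += w * scores[c]
    return fused

def predict(score_lists, weights):
    """Index of the class with the highest fused score."""
    fused = fuse_scores(score_lists, weights)
    return max(range(len(fused)), key=fused.__getitem__)

# Made-up scores over three classes from the three networks.
vcnn1_scores = [0.2, 0.5, 0.3]
vcnn2_scores = [0.1, 0.3, 0.6]
mvcnn_scores = [0.1, 0.2, 0.7]
all_scores = [vcnn1_scores, vcnn2_scores, mvcnn_scores]
fusion_weights = [1.0, 1.0, 2.0]  # hypothetical; would be tuned on held-out data

fused = fuse_scores(all_scores, fusion_weights)
label = predict(all_scores, fusion_weights)
```

In this toy example VCNN-I alone would pick class 1, but the fused scores favor class 2, which is the kind of correction a combination of representations is meant to provide.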

SLIDE 12

Machine Learning Pipeline

Data
Learning Algorithm
Trained Model
Replicate model
Serve Model
Repeat entire pipeline

SLIDE 13

Deeper Dive into Networks

SLIDE 14

Multi-View CNN

View positions: Corners of icosahedron (20 faces)
Base network: AlexNet (~60M parameters)
Pre-trained on ImageNet; last three layers fine-tuned.

SLIDE 15

VCNN-I

Long kernels learn features spanning the size of the 3D model
Data augmentation: Gaussian noise added to vertex coordinates in CAD model
Better than VCNN-II on: Table, Plant, Bench
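
The augmentation step can be sketched as follows, assuming vertices are (x, y, z) tuples and that independent zero-mean noise is added to every coordinate. The function name `jitter_vertices` and the value of `sigma` are hypothetical; the slide does not give the noise scale.

```python
import random

def jitter_vertices(vertices, sigma, rng):
    """Return a copy of the vertices with Gaussian noise added to each coordinate."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in v) for v in vertices]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Corners of a unit cube stand in for the vertices of a CAD model.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
augmented = jitter_vertices(cube, sigma=0.01, rng=rng)
```

The perturbed mesh would then be voxelized like the original, giving a slightly different training example with the same label.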

SLIDE 16

VCNN-II

GoogLeNet-inspired inception modules
Kernel sizes: 1x1x30, 3x3x30, 5x5x30
Hope: Learn features at multiple scales
Better than VCNN-I on: Radio, Wardrobe, Xbox
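
To see what these kernel sizes imply, the sketch below computes the output shape of each branch. It assumes a 30x30x30 voxel grid (inferred from the kernel depth of 30, not stated on the slide), no padding, and stride 1. The helper name `valid_output_shape` is made up.

```python
def valid_output_shape(input_shape, kernel_shape):
    """Output size per axis for an unpadded, stride-1 convolution: n - k + 1."""
    return tuple(n - k + 1 for n, k in zip(input_shape, kernel_shape))

grid_shape = (30, 30, 30)  # assumed voxel grid size
kernels = [(1, 1, 30), (3, 3, 30), (5, 5, 30)]
shapes = {k: valid_output_shape(grid_shape, k) for k in kernels}

for k, s in shapes.items():
    print(k, "->", s)
```

Under these assumptions every kernel spans the full depth, so each branch collapses the third axis to size 1 and differs only in its 2D footprint. In a GoogLeNet-style module the branch outputs are concatenated, which would require padding the 3x3 and 5x5 branches so the spatial sizes match; the slide does not say how this is handled.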

SLIDE 17

Results

SLIDE 18

Results

SLIDE 19

FusionNet

At the time of submission (July 17th 2016)

SLIDE 20

ModelNet now

Recent (December 5th 2016)

SLIDE 21

Conclusions

3D convolutions with different kernel sizes help
Combining MVCNN + VCNN helps
Hyperparameter tuning helps

SLIDE 22

DEEM workshop

Held in conjunction with SIGMOD/PODS
May 14th, 2017 – Submissions open!

SLIDE 23

Thank you!

FusionNet paper

http://matroid.com/papers/fusionnet.pdf

@Reza_Zadeh