Point-Voxel CNN for Efficient 3D Deep Learning

Hardware, AI and Neural-nets. Point-Voxel CNN for Efficient 3D Deep Learning. Zhijian Liu*, Haotian Tang*, Yujun Lin, and Song Han. Project Page: http://pvcnn.mit.edu/

3D Deep Learning: 3D Part Segmentation (for Robotic Systems), 3D Semantic Segmentation (for VR/AR Headsets), 3D Object Detection (for Self-Driving Cars)

Efficient 3D Deep Learning
[Chart: Bandwidth (GB/s) for Mult and Add, SRAM Memory, and DRAM Memory: 668, 167, and 30 (20x slower).] Off-chip DRAM access is much more expensive than arithmetic operation!
[Diagram: Sequential Memory Access (1 2 3 4 5 6 7 8) vs. Random Memory Access (7 5 2 4 6 1 8 3).] Random memory access is inefficient due to the potential bank conflicts!
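
The bank-conflict remark can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of any particular GPU: the function `access_cycles`, the 4-bank layout, the `address % num_banks` mapping, and the example address lists are all hypothetical.

```python
from collections import Counter

def access_cycles(addresses, num_banks=4):
    """Cycles to serve one batch of parallel accesses in a toy banked memory.

    Assumption: address a lives in bank a % num_banks, and each bank serves
    one request per cycle, so the busiest bank determines the total.
    """
    hits = Counter(a % num_banks for a in addresses)
    return max(hits.values())

sequential = [0, 1, 2, 3]  # neighboring addresses spread evenly over the 4 banks
scattered = [0, 4, 8, 3]   # hypothetical gather: three requests collide in bank 0

seq_cycles = access_cycles(sequential)
rnd_cycles = access_cycles(scattered)
print("sequential:", seq_cycles, "cycle(s)")
print("scattered: ", rnd_cycles, "cycle(s)")
```

In this model the sequential batch finishes in one cycle, while the scattered batch is serialized on the bank that receives several requests.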

Voxel-Based Models: Cubically-Growing Memory
[Plot: GPU Memory (GB) vs. Voxel Resolution.]
64 x 64 x 64 resolution: 11 GB (Titan XP x 1), 42% information loss
128 x 128 x 128 resolution: 83 GB (Titan XP x 7), 7% information loss
Voxel-based models: 3D ShapeNets [CVPR’15], VoxNet [IROS’15], 3D U-Net [MICCAI’16]
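
The cubic growth follows directly from the shape of a dense voxel grid. The sketch below counts only one feature tensor; `dense_voxel_bytes`, the 64-channel width, and the float32 storage are assumptions for illustration, so it does not reproduce the 11 GB and 83 GB figures above, only the scaling.

```python
def dense_voxel_bytes(resolution, channels, batch_size=1, bytes_per_value=4):
    """Bytes needed for one dense feature grid of shape (batch, channels, r, r, r)."""
    return batch_size * channels * resolution ** 3 * bytes_per_value

# channels=64 is a hypothetical layer width, not a value from the slides.
low = dense_voxel_bytes(64, channels=64)
high = dense_voxel_bytes(128, channels=64)
growth = high / low

print(f"64^3 grid:  {low / 2**20:.0f} MiB")
print(f"128^3 grid: {high / 2**20:.0f} MiB")
print(f"growth factor: {growth:.0f}x")
```

Doubling the resolution multiplies the memory of every such tensor by eight, which is roughly in line with the jump from 11 GB to 83 GB reported on the slide.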

Point-Based Models: Sparsity Overheads * DGCNN PointCNN SpiderCNN Ours ' 95.1 Runtime (%) 57.4 51.8 51.5 45.3 36.3 27.0 + ) 15.6 12.2 PointNet [CVPR’17] 4.9 2.9 0.0 PointCNN [NeurIPS’18] Irregular Access Dynamic Kernel Actual Computation DGCNN [SIGGRAPH’19]

Point-Voxel Convolution (PVConv)
Voxel-Based Feature Aggregation (Coarse-Grained): Normalize, Voxelize, Convolve, Devoxelize
Point-Based Feature Transformation (Fine-Grained): Multi-Layer Perceptron
Fuse: combine the outputs of the two branches
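
The two-branch data flow can be sketched in plain Python. This is a minimal illustration under several assumptions, not the authors' implementation: features are scalars, the learned 3D convolution is replaced by a 3x3x3 mean filter, the Multi-Layer Perceptron is replaced by one shared affine map with ReLU, devoxelization uses a nearest-voxel lookup, and fusion is written as addition. All function names are hypothetical.

```python
from collections import defaultdict

def normalize(coords):
    """Center the points and scale them into [0, 1]^3."""
    n = len(coords)
    mean = [sum(p[d] for p in coords) / n for d in range(3)]
    centered = [[p[d] - mean[d] for d in range(3)] for p in coords]
    radius = max(sum(c * c for c in p) ** 0.5 for p in centered) or 1.0
    # Dividing by 2 * radius fits the cloud in [-0.5, 0.5]; then shift to [0, 1].
    return [[c / (2 * radius) + 0.5 for c in p] for p in centered]

def voxel_index(p, r):
    """Integer cell of a normalized point on an r x r x r grid."""
    return tuple(min(int(c * r), r - 1) for c in p)

def voxelize(coords, feats, r):
    """Average the features of all points that fall into the same voxel."""
    sums, counts = defaultdict(float), defaultdict(int)
    for p, f in zip(coords, feats):
        idx = voxel_index(p, r)
        sums[idx] += f
        counts[idx] += 1
    return {idx: sums[idx] / counts[idx] for idx in sums}

def convolve(grid):
    """Stand-in for a learned 3D convolution: mean over occupied 3x3x3 neighbors."""
    out = {}
    for (x, y, z) in grid:
        vals = [grid[(x + dx, y + dy, z + dz)]
                for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
                if (x + dx, y + dy, z + dz) in grid]
        out[(x, y, z)] = sum(vals) / len(vals)
    return out

def devoxelize(grid, coords, r):
    """Nearest-voxel lookup: copy each voxel feature back to the points inside it."""
    return [grid[voxel_index(p, r)] for p in coords]

def point_branch(feats, weight=0.5, bias=0.0):
    """Stand-in for the shared MLP: the same affine map plus ReLU for every point."""
    return [max(0.0, weight * f + bias) for f in feats]

def pvconv(coords, feats, r=2):
    norm = normalize(coords)
    voxel_feats = devoxelize(convolve(voxelize(norm, feats, r)), norm, r)
    point_feats = point_branch(feats)
    # Fuse: element-wise addition of the coarse and fine features (an assumption).
    return [v + p for v, p in zip(voxel_feats, point_feats)]

coords = [[0, 0, 0], [0.1, 0, 0], [1, 1, 1], [1, 0.9, 1]]
feats = [1.0, 3.0, 10.0, 20.0]
fused = pvconv(coords, feats)
print(fused)
```

On this four-point example the voxel branch returns the same coarse value for every point, while the point branch keeps the per-point differences, so the fused output still distinguishes individual points.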

Point-Voxel Convolution (PVConv)
[Figure: features from the voxel-based branch and features from the point-based branch.]

Results: 3D Part Segmentation (ShapeNet)
[Plots: Mean IoU vs. GPU Latency (ms) and Mean IoU vs. GPU Memory (GB) for PVCNN, PointCNN, DGCNN, RSNet, 3D-UNet, SpiderCNN, PointNet++, and PointNet.]
2.7x speedup, 1.5x memory reduction

Results: 3D Part Segmentation (ShapeNet)
[Bar chart: Objects per Second for PointNet (83.7 mIoU) and PVCNN (85.2 mIoU) on Jetson Nano, Jetson TX2, and Jetson AGX Xavier; bar values: 139.9, 76.0, 42.6, 20.3, 19.9, 8.2.]

Results: 3D Semantic Segmentation (S3DIS)
[Plots: Mean IoU vs. GPU Latency (ms) and Mean IoU vs. GPU Memory (GB) for PVCNN, PVCNN++, 3D-UNet, PointCNN, RSNet, DGCNN, and PointNet.]
6.9x speedup, 5.7x memory reduction

Results: 3D Semantic Segmentation (S3DIS)
[Figure: qualitative comparison of Input Scene, PointNet, PVCNN (1.8x faster), and Ground Truth.]

Results: 3D Object Detection (KITTI)
                    GPU Latency      GPU Memory      Pedestrian    Cyclist       Car
F-PointNet++        105.2 ms         2.0 GB          61.6          62.4          72.8
PVCNN (efficient)   58.9 ms (1.8x)   1.4 GB (1.4x)   60.7 (-0.9)   63.6 (+1.2)   73.0 (+0.2)
PVCNN (complete)    69.6 ms (1.5x)   1.4 GB (1.4x)   64.9 (+3.3)   65.9 (+3.5)   73.1 (+0.3)
                    Faster           Lower           More Accurate
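
The parenthesized ratios and deltas on this slide can be recomputed from the raw numbers. The short check below uses only values from the slide; the dictionary layout and variable names are illustrative choices.

```python
# (latency in ms, memory in GB, pedestrian, cyclist, car), as listed on the slide
baseline = (105.2, 2.0, 61.6, 62.4, 72.8)           # F-PointNet++
models = {
    "PVCNN (efficient)": (58.9, 1.4, 60.7, 63.6, 73.0),
    "PVCNN (complete)": (69.6, 1.4, 64.9, 65.9, 73.1),
}

summary = {}
for name, (lat, mem, ped, cyc, car) in models.items():
    summary[name] = {
        "speedup": round(baseline[0] / lat, 1),       # higher is faster
        "memory_reduction": round(baseline[1] / mem, 1),
        "accuracy_delta": tuple(
            round(new - old, 1) for new, old in zip((ped, cyc, car), baseline[2:])
        ),
    }
    print(name, summary[name])
```

The computed values match the annotations: 1.8x and 1.5x lower latency, 1.4x lower memory for both variants, and accuracy changes of (-0.9, +1.2, +0.2) and (+3.3, +3.5, +0.3).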

Results: 3D Object Detection (KITTI)
[Figure: F-PointNet++ (10 FPS) vs. PVCNN (17 FPS, 1.8x faster).]

Point-Voxel CNN for Efficient 3D Deep Learning
3D Part Segmentation: 2.7x measured speedup, 1.5x memory reduction
3D Semantic Segmentation: 6.9x measured speedup, 5.7x memory reduction
3D Object Detection: 1.8x measured speedup, 1.4x memory reduction
Gold Medal in Lyft Challenge on 3D Object Detection for Autonomous Vehicles
Poster: 10:45-12:45 PM @ East Exhibition Hall B + C #112
GitHub: https://github.com/mit-han-lab/pvcnn
Project Page: http://pvcnn.mit.edu
