 
Autonomous driving visual perception on the DRIVE PX2
Dr. Antonio Espinosa, http://grupsderecerca.uab.cat/hpca4se/en/content/gpu
Dr. Antonio M. López, www.cvc.uab.es/~antonio
http://adas.cvc.uab.es/elektra/
Our background: camera-based ADAS
Our current research:
Stereo Vision for Depth Computation
[Figure: stereo example with a "10 meters" distance label]
Disparity: distance between the same point in the left and right images.
Higher disparity = objects are closer.
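The inverse relation between disparity and distance can be made concrete with the standard rectified-stereo formula Z = f × B / d. The sketch below is only an illustration: the focal length (1000 px) and baseline (0.5 m) are hypothetical values, not parameters of the camera rig used in the slides.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity (e.g. sky).
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 1000 px focal length, 0.5 m baseline.
for d in (5, 50, 100):
    print(f"disparity {d:3d} px -> {depth_from_disparity(d, 1000.0, 0.5):.1f} m")
```

With these assumed parameters, a disparity of 50 px corresponds to 10 m, and doubling the disparity halves the distance.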
Optimize Stereo Matching: Semi-Global Matching (SGM)
Pipeline (HOST (CPU) → DEVICE (GPU) → HOST (CPU)): Input: Left and Right Images → Matching Cost → … → Smoothed Cost → Output: Disparity Image
Total Computation Work ∝ (Height × Width × MaxDisp × Path Directions)
Image Resolution    Max. Disparity: 256
640 x 480           0.63 B ops.
1280 x 480          1.26 B ops.
1280 x 960          2.51 B ops.
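The operation counts on this slide can be checked against the work formula. The sketch below assumes 8 path directions, which the slide does not state for these figures; it is the value that reproduces them with a maximum disparity of 256. The function name is illustrative.

```python
def sgm_work(width, height, max_disp, path_directions):
    """Operation count estimate: Height x Width x MaxDisp x Path Directions."""
    return width * height * max_disp * path_directions


# Assumption: 8 path directions, maximum disparity 256.
for w, h in ((640, 480), (1280, 480), (1280, 960)):
    ops = sgm_work(w, h, 256, 8)
    print(f"{w} x {h}: {ops / 1e9:.2f} B ops")
```

Under that assumption the first two resolutions match the slide (0.63 and 1.26 B ops); the third evaluates to about 2.517 B, which the slide shows as 2.51.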
GPU Implementation: Proposal
First level parallelism (big granularity) …
Second level parallelism (medium granularity)
Third level parallelism (fine-grained): collaborative work
Dependency: serialized
[Figure: SGM stencil diagram; labels: x, y, d, C, L]
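The serialized dependency comes from the SGM path recurrence: the aggregated cost L at a pixel depends on L at the previous pixel along the path, while the disparities at one pixel can be processed together. The following is a minimal CPU sketch of the standard SGM recurrence for a single path, not the GPU kernel from the slides; the penalty values `p1` and `p2` are hypothetical.

```python
def aggregate_path(costs, p1=10, p2=120):
    """Aggregate matching costs along one path.

    costs[x][d] is the matching cost C at path position x and disparity d.
    Returns L with the same shape, using the standard SGM recurrence.
    """
    n_disp = len(costs[0])
    aggregated = [list(costs[0])]  # first pixel: L = C
    for x in range(1, len(costs)):  # serialized: needs L at x - 1
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):  # independent per disparity
            best = min(
                prev[d],
                prev[d - 1] + p1 if d > 0 else float("inf"),
                prev[d + 1] + p1 if d + 1 < n_disp else float("inf"),
                prev_min + p2,
            )
            # Subtracting prev_min keeps the values bounded along the path.
            row.append(costs[x][d] + best - prev_min)
        aggregated.append(row)
    return aggregated


print(aggregate_path([[0, 5, 5], [5, 0, 5]], p1=1, p2=3))
```

The outer loop over `x` is the part that cannot be parallelized within one path; the inner loop over `d` is the kind of work that threads can share.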
GPU Implementation: Performance & Energy Efficiency
[Figure: Tegra X1 performance (frames/second, fps) and energy efficiency versus number of path directions (2, 4, 8), for 640x480, 1280x480 and 1280x960 with D = 128; the real-time threshold is marked]
• Tegra X1 achieves real-time performance (20 FPS) with more than 2x the efficiency of Titan X
• Newer GPUs, with higher memory bandwidth, will achieve faster solutions
GPU Implementation: Results
Stixel World: Compact representation of the world
[Figure: Stereo Images → Stereo + Horizon Line + Road Slope → Stixels; labels: Sky, Obj., Grnd, Horizon, slope]
• Stereo of 1280 x 960 = 1,228,800 pixels => too much data to process
• Medium-level representation with only relevant information
• Fixed-width stixels, variable number of stixels per column
• Stixel = Stick + Pixel
Stixel World: a Mid-level Image Representation
• Sky: far pixels, near 0 disparity
• Object: constant disparity
• Ground: close to the expected model
Find best configuration:
• Computed independently for each column
• Enforces constraints: no sky below horizon, no neighboring objects at the same distance…
• Combinatorial explosion (of possible segments): dynamic programming technique
Total Computation Work ∝ (Width × Height^2)
Image Resolution    Computation
640 x 480           147 M ops.
1280 x 480          294 M ops.
1280 x 960          1179 M ops.
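To illustrate how dynamic programming avoids enumerating every possible segmentation of a column, here is a deliberately simplified sketch. It is not the stixel cost model from the slides: it has no sky, ground or object classes and no horizon constraint. It only splits one column into segments of roughly constant disparity, with a hypothetical per-segment penalty (`segment_penalty`). The double loop over segment ends and starts is what gives Height^2 work per column.

```python
def segment_column(disparities, segment_penalty=4.0):
    """Split one image column into piecewise-constant disparity segments.

    Returns a list of half-open (start, end) row ranges with minimum total
    cost, where each segment costs its squared error plus a fixed penalty.
    """
    h = len(disparities)
    # Prefix sums give the squared error of any segment in O(1).
    s1 = [0.0] * (h + 1)
    s2 = [0.0] * (h + 1)
    for i, d in enumerate(disparities):
        s1[i + 1] = s1[i] + d
        s2[i + 1] = s2[i] + d * d

    def segment_cost(a, b):
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a) + segment_penalty

    best = [0.0] + [float("inf")] * h  # best[e]: min cost of rows [0, e)
    back = [0] * (h + 1)               # back[e]: start of the last segment
    for end in range(1, h + 1):        # h steps ...
        for start in range(end):       # ... each scanning up to h candidates
            cost = best[start] + segment_cost(start, end)
            if cost < best[end]:
                best[end] = cost
                back[end] = start

    segments = []
    end = h
    while end > 0:
        segments.append((back[end], end))
        end = back[end]
    return segments[::-1]


print(segment_column([2, 2, 2, 9, 9, 9, 9, 0, 0]))
```

Because each column is solved on its own, the per-column function can be run for all columns independently, which is what makes the column-level parallelism possible.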
GPU Implementation: Proposal
First level parallelism (big granularity):
• Independent / task level: typical CPU parallelization
• Each image column is processed in parallel by a CTA
• CTA = Cooperative Thread Array
[Figure: stereo disparity image of height h with one CTA per column; caption: Second level parallelism (fine-grained)]
GPU Implementation: Second Level Parallelism
Second level parallelism (fine-grained):
• Extra parallelism level needed for efficient GPU use
• Sequentially perform h (image height) steps
• CTA threads collaborate, sharing information at each step
• Decreasing parallelism: each step uses one thread less
Computational Analysis
Thread Parallelism          h × w
Compute Work per thread     h
Total Global Data Reads     h^2 × w
Total Global Data Stores    h × w
[Figure: column C_i processed by a CTA in step 1, step 2, step 3, ….., step h]
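The effect of the decreasing parallelism can be estimated with a small count. The sketch assumes that a CTA starts with h active threads and loses exactly one per step (h, h - 1, ..., 1); the slide states the "one thread less" rule but not the starting count, so that part is an assumption.

```python
def active_thread_steps(h):
    """Thread-steps doing useful work when step k uses h - k threads (k = 0..h-1)."""
    return sum(h - step for step in range(h))


def utilization(h):
    """Fraction of the h x h thread-step slots that are actually used."""
    return active_thread_steps(h) / (h * h)


for h in (240, 480, 960):
    print(h, active_thread_steps(h), round(utilization(h), 3))
```

Under this assumption the useful work per column is h(h + 1)/2 thread-steps, so on average about half of the threads in a CTA are busy.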
GPU Implementation: Performance & Energy Efficiency
[Figure: bar charts of GPU performance (frames per second) and energy efficiency (frames per second / Watt) for Tegra X1 and Titan X; data labels below]
Image Resolution    Tegra X1 (fps)    Titan X (fps)    Tegra X1 (fps/W)    Titan X (fps/W)
1280 x 240          86.8              1000             8.68                4.00
640 x 480           45.7              581.0            4.57                2.32
1280 x 480          22.3              373.0            2.23                1.49
• Real-time performance on an energy-efficient GPU: NVIDIA DRIVE PX
• NVIDIA DRIVE PX has better energy efficiency than high-end GPUs
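The two charts are linked by efficiency = frames per second / Watt, so the data labels can be cross-checked. The sketch below takes the labels from the figure and assumes they pair up in legend order (1280 x 240, 640 x 480, 1280 x 480); that pairing is inferred from the values, not stated on the slide.

```python
# Data labels read from the charts, in assumed legend order:
# 1280 x 240, 640 x 480, 1280 x 480.
FPS = {"Tegra X1": [86.8, 45.7, 22.3], "Titan X": [1000.0, 581.0, 373.0]}
FPS_PER_WATT = {"Tegra X1": [8.68, 4.57, 2.23], "Titan X": [4.00, 2.32, 1.49]}


def implied_power(gpu):
    """Power in Watts implied by each (fps, fps/W) pair."""
    return [round(f / e, 1) for f, e in zip(FPS[gpu], FPS_PER_WATT[gpu])]


def efficiency_ratio():
    """Tegra X1 efficiency divided by Titan X efficiency, per resolution."""
    pairs = zip(FPS_PER_WATT["Tegra X1"], FPS_PER_WATT["Titan X"])
    return [round(a / b, 2) for a, b in pairs]


print(implied_power("Tegra X1"), implied_power("Titan X"))
print(efficiency_ratio())
```

Under that pairing the labels are consistent with roughly 10 W for the Tegra X1 and roughly 250 W for the Titan X, and the Tegra X1 is between about 1.5x and 2.2x more efficient depending on resolution.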
GPU Implementation: Result (using SYNTHIA)
Image generator to acquire thousands of precise data samples with several kinds of ground truth.
• RGB & per pixel: depth, semantic class, instance ID, optical flow
• Covering popular Cityscapes classes (see www.synthia-dataset.net)
What is this for? Teaching cars to perceive the environment using machine learning techniques (e.g. deep learning, reinforcement learning).