V-PANE: Virtual Perspectives Augmenting Natural Experiences (GTC 2017)
SLIDE 1

V-PANE Virtual Perspectives Augmenting Natural Experiences

Kerry Moffitt, Scientist

Kerry.Moffitt@Raytheon.com

The views, opinions and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. Distribution Statement “A” (Approved for Public Release, Distribution Unlimited). DARPA DISTAR #27938

GTC 2017

SLIDE 2

Contents

  • Background: DARPA, GXV-T
  • V-PANE Overview and Architecture
  • Geometry
  • Video
  • Ongoing Work
SLIDE 3

DARPA Ground X-Vehicle Technologies (GXV-T)

Goal: Improved vehicle survivability and mobility

Development Areas:

  • Increased agility
  • Enhanced mobility
  • Crew augmentation
  • Signature management

Source: http://www.darpa.mil/program/ground-x-vehicle-technologies

SLIDE 4

Objective: Develop the next generation of ground-vehicle human-machine interfaces that fuse real-time sensor feeds with video data projected onto a 3D geometric model of the environment. The program aims to develop a fully user-controlled, multiple-perspective, live virtual representation of the vehicle's surroundings.

V-PANE Overview

[Concept figure callouts: IED; Boomerang shot detection; networked IED report; ATAK route plan; secondary view & touchscreen; synthetic 1st-person view]

Popular press: Wired Magazine, National Defense, American Security Today, Product Design and Development

SLIDE 5

High-Level V-PANE Vision

[Block diagram: an LWIR/color video image array, lidar point clouds, Boomerang shot detection, static imagery & maps, a path plan, and (future) additional onboard sensors feed a 'live' 3D model on the Ground X-Vehicle; LWIR and video imagery is projected onto the 3D geometry; a 3D view renderer drives the gunner view & controls, the driver/commander view & controls, and an SA system (e.g., ATAK).]

'Live' 3D Model

  • 1. The real-time fusion of:
    A. Lidar point clouds into a 3D model,
    B. Multiple 2D video streams onto that model, and
    C. Other 2D or 3D threat, position, or mapping data
  • 2. The real-time rendering of that model with video into 2D displays from multiple perspectives
  • 3. The real-time control of the multiple perspectives
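
A minimal host-side sketch of how those three stages could be sequenced, with stub kernels standing in for each stage (fuse_lidar, raycast_view, and project_video are illustrative names, not the V-PANE API; the real system spreads these across GPUs, per the architecture slides below):

```cuda
#include <cuda_runtime.h>

// Stubs standing in for the three real-time stages described above.
__global__ void fuse_lidar()    { /* fold new lidar returns into voxels */ }
__global__ void raycast_view()  { /* find the 3D point behind each pixel */ }
__global__ void project_video() { /* color each point from live video */ }

int main() {
    cudaStream_t fusion, render;
    cudaStreamCreate(&fusion);
    cudaStreamCreate(&render);
    for (int frame = 0; frame < 60; ++frame) {
        // Stage 1: geometry fusion runs on its own stream, at ~1/3 of
        // the display rate (the deck quotes 20 Hz geometry vs. 60 Hz video).
        if (frame % 3 == 0) fuse_lidar<<<4096, 256, 0, fusion>>>();
        // Stages 2-3: each user-controlled perspective is raycast and then
        // colored from the video streams, at the 60 Hz display rate.
        raycast_view<<<8192, 256, 0, render>>>();
        project_video<<<8192, 256, 0, render>>>();
        cudaStreamSynchronize(render);  // present this perspective's frame
    }
    return 0;
}
```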
SLIDE 6

V-PANE User Capabilities

  • 360˚ visualization
  • Arbitrary visualization perspectives
  • Multi-spectral image fusion
  • Location probing: range, bearing, elevation; slope (pitch/roll); position
  • Cue, slew, track a location
  • Fused semantic information: shot reports, blue force locations
  • Real-time route analysis: obstacles, obstructions, slope, routing
  • Object detection: people, vehicles
  • Offline viewing: reconnaissance, after-action reports
  • Include pre-existing data
  • Dynamic controls

SLIDE 7

Sensor Array

SLIDE 8

Real-Time Scanning, Modeling, Rendering

  • Lidar + Video + IMU = The World
SLIDE 9

Real-Time Scanning, Modeling, Rendering

SLIDE 10

V-PANE Workstation Architecture

[Hardware diagram:]

  • Sensors: 2x Velodyne HDL-32E lidar (LAN); 1 NovAtel GPS and 1 VectorNav IMU (USB); 5 BMD HD cameras and 2 LWIR NTSC cameras (via 2x BMD Quad 2 video grabbers); 1 IOI 4KSDI
  • GPUs: P100 (fusion, on CPU1); P100 (raycasting) and P100 (video frusta projection), on CPU2; M6000 (video cache, rendering); K80 (compression)
  • Recording to SSD

SLIDE 11

V-PANE Data Pipeline

[Data-flow diagram: processing and recording in a single server; maximum-load analysis with PEX 8747 PCIe switches. Link types distinguished in the diagram: inter-GPGPU (P2P), inter-GPGPU (non-P2P), CPU-driven, QPI, DisplayPort.]

Rates labeled on the diagram (grouping approximate):

  • IMU/GPS (1 NovAtel, 1 VectorNav): position and orientation at 800 Hz; raw IMU 80 KB/s (2 x 40 KB/s)
  • Lidar (2x Velodyne HDL-32E): 1,400k pts/s; raw lidar 3 MB/s; processed lidar 18 MB/s
  • Cameras (5 BMD HD, 2 LWIR NTSC, via 2x BMD Q2 video grabbers): live video 2.9 GB/s to the M6000 video cache
  • P100 fusion (CPU1) and P100 raycasting: voxels 5 GB/s P2P, with 1 GB/s and 100 MB/s voxel flows on other links
  • P100 raycasting to P100 video frusta projection (CPU2): pixel positions 3 GB/s; depths, indices 4 GB/s
  • Video frusta projection: video 3.3 GB/s
  • K80 compression: HD video 500 MB/s in; compressed video 500 MB/s as labeled
  • SSD recording: video and voxels 600 MB/s, with a 1.4 GB/s video-and-voxels flow upstream

NB:
  1. P2P links require no transfer to or from a CPU.
  2. Non-P2P links count twice.
  3. The QPI link hits both CPUs.
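
As a sanity check on the 500 MB/s per-stream HD video figure (my arithmetic, assuming uncompressed 1080p60 at 4 bytes per pixel, which the slide does not state):

\[
1920 \times 1080 \times 4\,\mathrm{B} \times 60\,\mathrm{Hz} \approx 498\,\mathrm{MB/s}
\]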

SLIDE 12

Lidar Projection: Geometry Fusion

[Diagram with callouts (1)-(4)]

  • Lidar lasers (1) reflect back from surfaces in the world (2), sampling depth
  • During each fusion update, every voxel reverse-projects (3) to the lidar focal point to determine the distance from the voxel to the surface
  • Depth samples are stored as a rectangular array with a fixed distance between samples (4) to optimize lookup (this requires resampling the fixed-angle-delta lidar data)
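
A minimal CUDA sketch of that per-voxel update, in the style of truncated signed-distance fusion (all names, the planar depth-array parameterization, and the truncation band are illustrative assumptions, not V-PANE's implementation):

```cuda
#include <cuda_runtime.h>

// One thread per voxel. Each voxel reverse-projects toward the lidar focal
// point, looks up the resampled depth sample along that direction, and
// updates a running signed distance to the observed surface.
__global__ void fuse_depth(float* sdf, float* weight,
                           const float* depth,      // resampled depth array
                           int nx, int ny, int nz,  // voxel grid dimensions
                           float voxel_m,           // voxel edge length (m)
                           float3 lidar_pos,        // lidar focal point (m)
                           int dw, int dh,          // depth array size
                           float sample_m)          // fixed sample spacing (m)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nx * ny * nz) return;

    // Voxel center in world coordinates (grid origin at 0 for brevity).
    float3 p = make_float3(((i % nx) + 0.5f) * voxel_m,
                           (((i / nx) % ny) + 0.5f) * voxel_m,
                           ((i / (nx * ny)) + 0.5f) * voxel_m);

    // Distance from the lidar to the voxel along the reverse-projected ray.
    float3 d = make_float3(p.x - lidar_pos.x, p.y - lidar_pos.y,
                           p.z - lidar_pos.z);
    float range = sqrtf(d.x * d.x + d.y * d.y + d.z * d.z);

    // Fixed-spacing lookup into the rectangular depth array: a simple planar
    // parameterization stands in for the real resampling scheme here.
    int u = (int)(p.x / sample_m), v = (int)(p.y / sample_m);
    if (u < 0 || u >= dw || v < 0 || v >= dh) return;
    float surf = depth[v * dw + u];        // measured range to the surface
    if (surf <= 0.0f) return;              // no lidar return for this sample

    // Signed distance from voxel to surface, truncated near the surface.
    const float trunc = 0.3f;              // illustrative truncation band (m)
    float sd = fminf(fmaxf(surf - range, -trunc), trunc);

    // Weighted running average across fusion updates.
    float w = weight[i];
    sdf[i] = (sdf[i] * w + sd) / (w + 1.0f);
    weight[i] = w + 1.0f;
}
```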

SLIDE 13

Beam Modeling

[Figure: a 1.33° beam divergence spans 2.32 m at 100 m range]
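
That spot size follows directly from the divergence angle (my arithmetic, reading 1.33° as the full beam angle):

\[
100\,\mathrm{m} \times \tan(1.33^\circ) \approx 100 \times 0.0232 \approx 2.32\,\mathrm{m}
\]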

SLIDE 14

Wobbler

  • Fill in the vertical gaps between lasers
SLIDE 15

Ray Casting

  • Over pixels in the current view:
    – Project into the voxel array
    – Find the nearest zero-crossing

SLIDE 16

Raycast: Determine 3D Point per Pixel

[Diagram with callouts (1)-(3)]

  • For every pixel to be rendered on screen (1), project from the virtual camera through the pixel (2) into voxel space, to find the intersection point with the nearest surface along that ray (3)
  • Output is a 3-space point per pixel to be rendered
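
A minimal CUDA sketch of that per-pixel march, reusing the signed-distance grid from the fusion sketch above (ray setup, step size, and the no-hit convention are illustrative assumptions):

```cuda
#include <cuda_runtime.h>

// One thread per screen pixel: step along the pixel's ray through the voxel
// grid and report the first zero-crossing of the signed distance field.
__global__ void raycast(const float* sdf, int nx, int ny, int nz,
                        float voxel_m,
                        float3 cam_pos,          // virtual camera position
                        const float3* ray_dirs,  // unit ray per pixel (2)
                        float3* hits,            // output 3-space point (3)
                        int n_pixels, float max_range_m)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_pixels) return;

    float3 dir = ray_dirs[i];
    float prev = 1.0f;                     // sign of the previous sample
    for (float t = 0.0f; t < max_range_m; t += 0.5f * voxel_m) {
        float3 p = make_float3(cam_pos.x + t * dir.x,
                               cam_pos.y + t * dir.y,
                               cam_pos.z + t * dir.z);
        int x = (int)(p.x / voxel_m), y = (int)(p.y / voxel_m),
            z = (int)(p.z / voxel_m);
        if (x < 0 || y < 0 || z < 0 || x >= nx || y >= ny || z >= nz)
            continue;                      // outside the voxel array

        float d = sdf[(z * ny + y) * nx + x];
        if (prev > 0.0f && d <= 0.0f) {    // crossed the nearest surface
            hits[i] = p;                   // surface point for pixel i
            return;
        }
        prev = d;
    }
    hits[i] = make_float3(0.f, 0.f, 0.f);  // no hit (e.g., sky)
}
```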

SLIDE 17

Video Projection: Color Index per Pixel

[Diagram with callouts (1)-(3)]

  • For every pixel to be rendered on screen (1), reverse-project from the 3D point to a real-world camera frame (2), to find the intersection point with the image captured in that frame (3)
  • Any given scene may involve 100s of frames
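
A minimal sketch of one such reverse-projection under a pinhole camera model (the CameraFrame layout and intrinsics are assumptions for illustration; lens undistortion and occlusion testing, both discussed elsewhere in this deck, are omitted):

```cuda
#include <cuda_runtime.h>

struct CameraFrame {
    float R[9];                 // world-to-camera rotation (row-major)
    float3 t;                   // world-to-camera translation
    float fx, fy, cx, cy;       // pinhole intrinsics (pixels)
    int w, h;                   // captured image size
};

// Reverse-project a world-space point into one captured frame; returns the
// pixel index to sample, or -1 if the point falls outside that frame.
__device__ int project_to_frame(const CameraFrame& c, float3 p)
{
    // World point into this frame's camera coordinates: x_cam = R * p + t.
    float x = c.R[0]*p.x + c.R[1]*p.y + c.R[2]*p.z + c.t.x;
    float y = c.R[3]*p.x + c.R[4]*p.y + c.R[5]*p.z + c.t.y;
    float z = c.R[6]*p.x + c.R[7]*p.y + c.R[8]*p.z + c.t.z;
    if (z <= 0.0f) return -1;             // behind the camera

    // Perspective divide plus intrinsics give the pixel coordinate.
    int u = (int)(c.fx * x / z + c.cx);
    int v = (int)(c.fy * y / z + c.cy);
    if (u < 0 || u >= c.w || v < 0 || v >= c.h) return -1;
    return v * c.w + u;                   // index into that frame's image
}
```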

SLIDE 18

V-PANE – Data Processing Frequency and Latency

[Timeline diagram across the grabber GPU and GPGPUs 1-3: video capture feeds geometry fusion, a voxel copy/transfer, ray casting, video projection, and rendering.]

Update frequency:
  • Geometry: 20 Hz
  • Video capture: 60 Hz
  • Rendering: 60 Hz

Stage budgets, as labeled: geometry fusion < 50 ms; video capture, voxel copy/transfer, ray cast + video project, and render each < 16 ms; one further < 100 ms label on the video path.

Latency:
  • Geometry: < 148 ms
  • Video: < 132 ms

SLIDE 19

Profiling Geometry Fusion and Ray Casting

Geometry fusion cycle time: < 50 ms
Ray casting: < 16 ms

SLIDE 20

V-PANE Workstation Architecture

[Workstation architecture diagram repeated from Slide 10: lidar, IMU/GPS, and cameras into the dual-CPU server with P100 fusion, P100 raycasting, P100 video frusta projection, M6000 video cache/rendering, and K80 compression to SSD.]

SLIDE 21

Image Processing in V-PANE

[Pipeline diagram spanning CPU and GPU, with host, CUDA, and OpenGL lanes (approximate stage order):]

  • From live cameras: copy frame → YUV to pinned host memory → cudaMemcpy → color convert → undistort → [cache] → fusion → projection → build frame → render
  • From disk: copy frame → DXT5 to pinned host memory → upload → decompress → undistort → [cache] → fusion → projection → build frame → render
  • Compression: copy bits → compress → download
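
A minimal sketch of the "pinned host memory → cudaMemcpy → color convert" front of the live-camera path (the YUYV layout, the BT.601-style constants, and the sizes are illustrative assumptions):

```cuda
#include <cuda_runtime.h>
#include <stdint.h>

// Convert one interleaved YUYV (YUV 4:2:2) pixel pair to two RGBA pixels.
__global__ void yuyv_to_rgba(const uint8_t* yuyv, uchar4* rgba, int n_pairs)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_pairs) return;
    const uint8_t* p = yuyv + 4 * i;       // Y0 U Y1 V
    float u = p[1] - 128.f, v = p[3] - 128.f;
    for (int k = 0; k < 2; ++k) {          // both pixels share U and V
        float y = p[2 * k];
        float r = fminf(fmaxf(y + 1.402f * v, 0.f), 255.f);
        float g = fminf(fmaxf(y - 0.344f * u - 0.714f * v, 0.f), 255.f);
        float b = fminf(fmaxf(y + 1.772f * u, 0.f), 255.f);
        rgba[2 * i + k] = make_uchar4((unsigned char)r, (unsigned char)g,
                                      (unsigned char)b, 255);
    }
}

int main() {
    const int w = 1920, h = 1080, bytes = w * h * 2;  // YUYV: 2 B/pixel
    uint8_t* host;                         // pinned so the copy can be async
    cudaHostAlloc((void**)&host, bytes, cudaHostAllocDefault);
    uint8_t* dev_yuyv; uchar4* dev_rgba;
    cudaMalloc((void**)&dev_yuyv, bytes);
    cudaMalloc((void**)&dev_rgba, w * h * sizeof(uchar4));
    cudaStream_t s; cudaStreamCreate(&s);

    // Per frame: the grabber fills `host`, then the upload and conversion
    // run on their own stream instead of blocking the render thread.
    cudaMemcpyAsync(dev_yuyv, host, bytes, cudaMemcpyHostToDevice, s);
    yuyv_to_rgba<<<(w * h / 2 + 255) / 256, 256, 0, s>>>(dev_yuyv, dev_rgba,
                                                         w * h / 2);
    cudaStreamSynchronize(s);
    return 0;
}
```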

SLIDE 22

Thread Activity (Video I/O)

[Thread timeline over one 16.6 ms display interval (approximate stage order):]

  • Video load/receive threads: enqueue incoming frames
  • Main thread (M6000): dequeue → decompress → process → upload → render at VBLANK
  • Compression thread (K80): dequeue → process → upload → compress → store
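
A minimal sketch of the enqueue/dequeue handoff implied above (a plain mutex-and-condition-variable queue; the slide does not specify the actual queue design):

```cuda
#include <condition_variable>
#include <mutex>
#include <queue>

// Producer/consumer frame queue: capture threads enqueue, the render or
// compression thread dequeues once per cycle.
template <typename Frame>
class FrameQueue {
    std::queue<Frame> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void enqueue(Frame f) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(f)); }
        cv_.notify_one();
    }
    Frame dequeue() {             // blocks until a frame is available
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty(); });
        Frame f = std::move(q_.front());
        q_.pop();
        return f;
    }
};
```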

SLIDE 23

Video Latency: Best Observed Case

[Timeline, 0-100 ms, single 1080p60 stream, 17 ms cycle: event in world → camera capture → grabber capture (captured signal; output signal; after capture, render can upload) → host processing (upload, wait for VBLANK, draw, swap, DMA) → display hardware (DMA, swap) → visible on display.]

* Total latency and host processing latency are observed; hardware latency is inferred.

SLIDE 24

LWIR Integration

SLIDE 25

Ongoing Work: Object Detection

  • Image classification
SLIDE 26

Ongoing Work: Level of Detail

  • Add a second voxel array:
    – 10x range, to 1 km radius
    – 1/10th voxel resolution per dimension
  • Ray caster uses it per pixel when there is no hit in the primary (hi-res) voxels
  • Requires only 0.1% compute to update, given the 100 m lidar range

[Figure: 2,000 m coarse-array extent vs. 200 m primary-array extent]
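
The 0.1% figure follows from the geometry (my arithmetic): lidar returns reach only ~100 m of the 1 km coarse radius, so the fraction of coarse voxels that can receive updates is

\[
\left(\frac{100\,\mathrm{m}}{1000\,\mathrm{m}}\right)^{3} = 10^{-3} = 0.1\%
\]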

SLIDE 27

Ongoing Work and Challenges

  • Voxel cache, voxel LOD
  • GPS/altitude
  • Occlusion testing for video projection
    – Bandwidth vs. render quality
  • Timestamps
    – GPS time comes with the IMU and lidar data, but not with the cameras'
    – Off by 100 ms = 2.5 m of misregistration (at 25 m/s, for example)
  • What if there is video but no geometry? (Skybox)
  • Voxel precision
    – Image quality vs. geometry update rate (20 cm voxels? 10? 5?)

SLIDE 28

The End

Questions?