SLIDE 1 GPU-Based Large-Scale Scientific Visualization
Johanna Beyer, Harvard University Markus Hadwiger, KAUST
Course Website: http://johanna-b.github.io/LargeSciVis2018/index.html
SLIDE 2
Part 3 - GPU-Based Ray-Guided Volume Rendering Algorithms & Efficient Empty Space Skipping
SLIDE 3
- Working set determination on GPU
- Single-pass rendering
- Traversal on GPU
- Virtual texturing
RAY-GUIDED VOLUME RENDERING
SLIDE 4 Examples using octree traversal (kd-restart):
- Gigavoxels [Crassin et al., 2009]
- Gigavoxel isosurface and volume rendering
- Tera-CVR [Engel, 2011]
- Teravoxel volume rendering with dynamic transfer functions
RAY-GUIDED VOLUME RENDERING (2)
SLIDE 5 Examples using virtual texturing instead of tree traversal
- Petascale volume exploration of microscopy streams
[Hadwiger et al., 2012]
- Visualization-driven pipeline, including data construction
- ImageVis3D [Fogal et al., 2013]
- Analysis of different settings (brick size, …)
RAY-GUIDED VOLUME RENDERING (2)
SLIDE 6
Ray-guided Volume Rendering Examples
SLIDE 7
EARLY ‘RAY-GUIDED’ OCTREE RAY-CASTING (1) [Gobbetti et al., The Visual Computer, 2008]
Volume representation Octree
Rendering GPU octree traversal
Working set determination Interleaved occlusion queries
SLIDE 8 Data structure: Octree with ropes
- Pointers to 8 children, 6 neighbors and volume data
- Active subtree stored in spatial index structure and texture pool on GPU
EARLY ‘RAY-GUIDED’ OCTREE RAY-CASTING (1)
Volume representation Octree
[Gobbetti et al.]
SLIDE 9 Rendering:
- Stackless GPU octree traversal (rope tree)
EARLY ‘RAY-GUIDED’ OCTREE RAY-CASTING (2)
Rendering GPU octree traversal
[Gobbetti et al.]
SLIDE 10 Culling: Culling on CPU
- Culling uses global transfer function, iso-value, view frustum
- Only visible nodes of previous rendering pass get refined
- Occlusion queries to check bounding box of node against depth of last sample during raycasting
EARLY ‘RAY-GUIDED’ OCTREE RAY-CASTING (2)
Working set determination Interleaved occlusion queries [Gobbetti et al.]
SLIDE 11
RAY-GUIDED OCTREE RAY-CASTING (1) [Crassin et al., ACM SIGGRAPH i3D, 2009]
Volume representation Octree
Rendering GPU octree traversal
Working set determination Ray-guided
SLIDE 12 Data structure: N^3 tree + multi-resolution volume
- Subtree stored on GPU in node/brick pool
- Node: 1 pointer to children, 1 pointer to volume brick
- Children stored together in node pool
RAY-GUIDED OCTREE RAY-CASTING (1)
Volume representation Octree
[Crassin et al.]
SLIDE 13 Rendering:
- Stackless GPU octree traversal (Kd-restart)
- 3 mipmap levels for correct filtering
- Missing data substituted by lower-res data
RAY-GUIDED OCTREE RAY-CASTING (2)
Rendering GPU octree traversal
[Crassin et al.]
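A minimal sketch of the kd-restart idea named on this slide: instead of keeping a traversal stack, the ray restarts the descent at the root each time it leaves a leaf. The node layout (`make_leaf`, `make_node`), the unit-cube bounds and the `eps` offset are assumptions made for illustration, not the data layout of the cited system.

```python
import math

def make_leaf(brick):
    return {"brick": brick, "children": None}

def make_node(children):
    # children: list of 8 nodes, index = x_bit | (y_bit << 1) | (z_bit << 2)
    return {"brick": None, "children": children}

def find_leaf(root, p):
    """Descend from the root to the leaf containing point p in [0,1)^3."""
    lo, hi = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
    node = root
    while node["children"] is not None:
        idx = 0
        for axis in range(3):
            mid = 0.5 * (lo[axis] + hi[axis])
            if p[axis] >= mid:
                idx |= 1 << axis
                lo[axis] = mid
            else:
                hi[axis] = mid
        node = node["children"][idx]
    return node, lo, hi

def kd_restart(root, origin, direction, t_max, eps=1e-6):
    """Visit leaves front-to-back; every step restarts at the root (no stack)."""
    visited, t = [], 0.0
    while t < t_max:
        # Nudge the sample point slightly past the last exit position.
        p = [o + (t + eps) * d for o, d in zip(origin, direction)]
        if not all(0.0 <= c < 1.0 for c in p):
            break
        leaf, lo, hi = find_leaf(root, p)
        visited.append(leaf["brick"])
        # Distance at which the ray leaves the current leaf's box.
        t_exit = math.inf
        for o, d, l, h in zip(origin, direction, lo, hi):
            if d > 0:
                t_exit = min(t_exit, (h - o) / d)
            elif d < 0:
                t_exit = min(t_exit, (l - o) / d)
        t = t_exit
    return visited

# Root with one refined child (index 0) and seven leaf children.
inner = make_node([make_leaf(f"0.{i}") for i in range(8)])
root = make_node([inner] + [make_leaf(str(i)) for i in range(1, 8)])
path = kd_restart(root, [0.0, 0.1, 0.1], [1.0, 0.0, 0.0], 1.0)
```

Here `path` lists the bricks in the order the ray crosses them. The repeated root descent costs extra look-ups per step, which is the trade-off for needing no per-ray stack.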
SLIDE 14 Culling:
- Multiple render targets write out data usage
- Exploits temporal and spatial coherence
RAY-GUIDED OCTREE RAY-CASTING (2)
Working set determination Ray-guided [Crassin et al.]
SLIDE 15
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING (1) [Hadwiger et al., IEEE SciVis 2012]
Volume representation Multi-resolution grid
Rendering Multi-level virtual texture ray-casting
Working set determination Ray-guided
SLIDE 16 Data structure: Multi-res grid
- On-the-fly reconstruction of bricks
- Stored on disk in 2D multi-resolution grid
- Multi-level multi-res. page table on GPU
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING (1)
Volume representation Multi-resolution grid
[Hadwiger et al.]
SLIDE 17 Rendering:
- Multi-level virtual texture ray-casting
- LOD chosen per individual sample
- Data reconstruction triggered by ray-caster
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING (2)
Rendering Multi-level virtual texture ray-casting
[Hadwiger et al.]
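The address translation behind multi-level virtual texturing can be sketched as a two-step look-up: a page directory locates a page-table block, and the page-table entry locates the brick in the cache. The constants `BRICK` and `PT_BLOCK`, the dictionary layout and the function names are hypothetical and chosen only to keep the sketch short.

```python
BRICK = 32      # voxels per brick edge (assumed)
PT_BLOCK = 4    # page-table entries per edge of one page-table block (assumed)

def translate(voxel, page_directory, page_tables):
    """Map a virtual voxel coordinate to (cache slot, offset), or None if unmapped."""
    brick = tuple(v // BRICK for v in voxel)
    offset = tuple(v % BRICK for v in voxel)
    # Level 1: which page-table block covers this brick?
    pt_id = page_directory.get(tuple(b // PT_BLOCK for b in brick))
    if pt_id is None:
        return None
    # Level 2: which cache slot holds this brick?
    slot = page_tables[pt_id].get(tuple(b % PT_BLOCK for b in brick))
    if slot is None:
        return None
    return slot, offset

def cache_address(slot, offset):
    """Voxel position inside the brick cache texture."""
    return tuple(s * BRICK + o for s, o in zip(slot, offset))

page_directory = {(0, 0, 0): 0}
page_tables = {0: {(1, 0, 0): (2, 0, 0)}}
hit = translate((40, 5, 7), page_directory, page_tables)
miss = translate((200, 0, 0), page_directory, page_tables)
```

A `None` result is the case where a ray-guided renderer would report the brick as missing rather than sample it.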
SLIDE 18 Culling:
- GPU hash table to report missing blocks
- Exploits temporal and spatial coherence
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING (2)
Working set determination Ray-guided [Hadwiger et al.]
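The feedback loop of ray-guided working set determination can be sketched as follows: rays report bricks they needed but did not find, and only those bricks are loaded before the next frame. A Python `set` stands in for the GPU hash table, and `cast_rays`, `render_until_complete` and `load_brick` are hypothetical names for this sketch.

```python
def cast_rays(rays, resident):
    """rays: per-ray lists of brick ids, front-to-back.
    Returns (bricks sampled, bricks reported missing)."""
    sampled, missing = set(), set()
    for ray in rays:
        for brick in ray:
            (sampled if brick in resident else missing).add(brick)
    return sampled, missing

def render_until_complete(rays, resident, load_brick, max_frames=8):
    """Render frames until no ray reports a missing brick."""
    frames = 0
    while frames < max_frames:
        frames += 1
        _, missing = cast_rays(rays, resident)
        if not missing:
            break
        for brick in missing:      # only bricks that rays actually touched
            load_brick(brick)
            resident.add(brick)
    return frames

rays = [[1, 2], [2, 3]]
resident = {1}
loaded = []
frames = render_until_complete(rays, resident, loaded.append)
```

Because the set deduplicates requests, a brick hit by many rays is reported once, which is the role the hash table plays on the slide.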
SLIDE 19
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING - ANALYSIS [Fogal et al., IEEE LDAV 2013]
Volume representation Multi-resolution grid
Rendering (Multi-level) virtual texture ray-casting
Working set determination Ray-guided
SLIDE 20 Implementation differences:
- Lock-free hash table, pagetable lookup only per brick
- Fallback for multi-pass rendering
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING - ANALYSIS
Volume representation Multi-resolution grid Rendering (Multi-level) virtual texture ray-casting
Working set determination Ray-guided [Fogal et al.]
SLIDE 21 Analysis:
- Many detailed performance numbers (see paper)
- Working set size: typically lower than GPU memory
- Brick size: larger on disk (>= 64^3), smaller for rendering (16^3, 32^3)
RAY-GUIDED MULTI-LEVEL PAGETABLE RAY-CASTING - ANALYSIS
Volume representation Multi-resolution grid Rendering (Multi-level) virtual texture ray-casting
Working set determination Ray-guided [Fogal et al.]
SLIDE 22
Scalable Empty-Space Skipping
SLIDE 23 Large volumes, finely detailed structures, many segmented objects
Connectomics: electron microscopy volume
- 21,000 x 25,000 x 2,000 (> 1 teravoxel)
- > 4,000 objects
MOTIVATION
SLIDE 24
MOTIVATION
SLIDE 25
SLIDE 26 No skipping
Figure labels: non-empty space; sampling whole volume
SLIDE 27 No skipping
Look-up overhead: none
Figure labels: look-ups; sampling whole volume; non-empty space
SLIDE 28
Look-up overhead: high
Figure label: look-ups
SLIDE 29 SparseLeap
Look-up overhead: small
Figure label: look-ups
SLIDE 30
SLIDE 31
Octree
SLIDE 32
SparseLeap
SLIDE 33 Track volume occupancy
Extract nested occupancy
Rasterize occupancy
Empty space skipping: Linear list traversal
SPARSELEAP PIPELINE
SLIDE 34 Track volume occupancy
Extract nested occupancy
Rasterize occupancy
Empty space skipping: Linear list traversal
SPARSELEAP PIPELINE
SLIDE 35 Track volume occupancy
Extract nested occupancy
Rasterize occupancy
Empty space skipping: Linear list traversal
SPARSELEAP PIPELINE
smaller boxes
SLIDE 36 Track volume occupancy
Extract nested occupancy
Rasterize occupancy
Empty space skipping: Linear list traversal
SPARSELEAP PIPELINE
SLIDE 37 Track volume occupancy
Extract nested occupancy
Rasterize occupancy
Empty space skipping: Linear list traversal
SPARSELEAP PIPELINE
Ray segment classes along the ray: unknown, empty, non-empty
SLIDE 38 Occupancy classes: empty, non-empty, unknown
Node count in each class over whole subtree
OCCUPANCY HISTOGRAM TREE
SLIDE 39 Occupancy classes: empty, non-empty, unknown*
Node count in each class over whole subtree
OCCUPANCY HISTOGRAM TREE
* enables deferred culling
SLIDE 40 Occupancy classes: empty, non-empty, unknown*
Node count in each class over whole subtree, build bottom-up
OCCUPANCY HISTOGRAM TREE
* enables deferred culling
SLIDE 41 Occupancy classes: empty, non-empty, unknown*
Node count in each class over whole subtree, build bottom-up
OCCUPANCY HISTOGRAM TREE
* enables deferred culling
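The bottom-up build of the occupancy histogram tree can be sketched as below. The sketch assumes that only leaves carry an occupancy class and that an inner node's histogram is the sum of its children's histograms; the dictionary-based node layout and the name `build_histograms` are hypothetical.

```python
from collections import Counter

def build_histograms(node):
    """Store in node['hist'] the node count per occupancy class over the subtree."""
    children = node.get("children")
    if not children:
        node["hist"] = Counter({node["cls"]: 1})
    else:
        node["hist"] = Counter()
        for child in children:          # children first, then the parent: bottom-up
            node["hist"] += build_histograms(child)
    return node["hist"]

def leaf(cls):
    return {"cls": cls}

tree = {"children": [
    {"children": [leaf("empty"), leaf("empty")]},
    {"children": [leaf("non-empty"), leaf("unknown")]},
]}
root_hist = build_histograms(tree)
```

After the build, every node can report how its subtree splits across the three classes without visiting its descendants again.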
SLIDE 42 Traverse histogram tree top-down
Pick majority class in each node
OCCUPANCY GEOMETRY
SLIDE 43 Traverse histogram tree top-down
Pick majority class in each node
Emit box on class change
OCCUPANCY GEOMETRY
SLIDE 44 Traverse histogram tree top-down
Pick majority class in each node
Emit box on class change
OCCUPANCY GEOMETRY
SLIDE 45 Traverse histogram tree top-down
Pick majority class in each node
Emit box on class change
OCCUPANCY GEOMETRY
SLIDE 46 Traverse histogram tree top-down
Pick majority class in each node
Emit box on class change
OCCUPANCY GEOMETRY
SLIDE 47 Traverse histogram tree top-down
Pick majority class in each node
Emit box on class change
OCCUPANCY GEOMETRY
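The three steps above (traverse top-down, pick the majority class, emit a box on class change) can be sketched as a short recursion. Boxes are 1D intervals here for brevity, the histograms are written out by hand, and the names `majority` and `extract_boxes` are hypothetical; ties are broken by dictionary order, which is an arbitrary choice of this sketch.

```python
def majority(hist):
    """Class with the largest node count in this subtree."""
    return max(hist, key=hist.get)

def extract_boxes(node, parent_cls=None, out=None):
    """Emit (box, class) whenever a node's majority class differs from its parent's."""
    out = [] if out is None else out
    cls = majority(node["hist"])
    if cls != parent_cls:
        out.append((node["box"], cls))
    for child in node.get("children", []):
        extract_boxes(child, cls, out)
    return out

tree = {
    "box": (0, 8), "hist": {"empty": 3, "non-empty": 2},
    "children": [
        {"box": (0, 4), "hist": {"empty": 2}, "children": [
            {"box": (0, 2), "hist": {"empty": 1}},
            {"box": (2, 4), "hist": {"empty": 1}},
        ]},
        {"box": (4, 8), "hist": {"non-empty": 2, "empty": 1}, "children": [
            {"box": (4, 5), "hist": {"non-empty": 1}},
            {"box": (5, 6), "hist": {"non-empty": 1}},
            {"box": (6, 8), "hist": {"empty": 1}},
        ]},
    ],
}
boxes = extract_boxes(tree)
```

The result is a small set of nested boxes: one empty box for the whole domain, a non-empty box inside it, and an empty box inside that. Seven tree nodes that agree with their parent emit nothing.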
SLIDE 48 extracted geometry
OCCUPANCY GEOMETRY
SLIDE 49 extracted geometry
OCCUPANCY GEOMETRY
SLIDE 50 smaller boxes
flattened
extracted geometry
OCCUPANCY GEOMETRY
SLIDE 51
subdivision
COMPARISON
SLIDE 52
geometry
COMPARISON
SLIDE 53
RASTERIZATION: OVERVIEW
SLIDE 54
Occupancy geometry: rasterize front-to-back, merge consecutive segments
RASTERIZATION: OVERVIEW
SLIDE 55 Ray segment list: per-pixel linked list (screen pixels)
Occupancy geometry: rasterize front-to-back into ray segment lists, merge consecutive segments
RASTERIZATION: OVERVIEW
Segment classes: non-empty, empty, unknown
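A rough sketch of how one pixel's ray segment list could be assembled with merging of consecutive segments. It assumes rasterization has already produced, for this pixel, a front-to-back list of (entry depth, class) events; that event format and the name `build_segment_list` are simplifications introduced here, not the rasterization scheme of the paper.

```python
def build_segment_list(events, t_exit):
    """events: (entry depth, occupancy class), sorted front-to-back for one pixel.
    Returns (t_in, t_out, class) segments with equal-class neighbours merged."""
    segments = []
    for i, (t, cls) in enumerate(events):
        t_end = events[i + 1][0] if i + 1 < len(events) else t_exit
        if segments and segments[-1][2] == cls:
            # Same class as the previous segment: extend it instead of appending.
            segments[-1] = (segments[-1][0], t_end, cls)
        elif t_end > t:
            segments.append((t, t_end, cls))
    return segments

events = [(0.0, "empty"), (0.2, "empty"), (0.5, "non-empty"),
          (0.7, "non-empty"), (0.9, "empty")]
segments = build_segment_list(events, t_exit=1.0)
```

Merging keeps each per-pixel list short, so the list length tracks the number of class changes along the ray rather than the number of boxes crossed.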
SLIDE 56 Linear traversal of ray segment list
Deferred culling for large volumes: Occupancy class unknown
RAY-CASTING
Ray segment classes along the ray: unknown, empty, non-empty
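Linear traversal of a ray segment list can be sketched as a single loop that leaps over empty segments and samples the rest. Depths are integers (for example voxel units) to avoid floating-point drift. How unknown segments are handled is an assumption of this sketch: they are sampled conservatively and recorded so that a later pass could classify them.

```python
def traverse(segments, step):
    """segments: (t_in, t_out, class) in front-to-back order, integer depths.
    Returns (sample depths, unknown segments noted for deferred culling)."""
    samples, deferred = [], []
    for t_in, t_out, cls in segments:    # one linear pass, no tree look-ups
        if cls == "empty":
            continue                     # leap over the whole segment
        if cls == "unknown":
            deferred.append((t_in, t_out))
        samples.extend(range(t_in, t_out, step))
    return samples, deferred

segments = [(0, 50, "empty"), (50, 90, "non-empty"), (90, 100, "unknown")]
samples, deferred = traverse(segments, step=10)
```

The skipping decision is made once per segment rather than once per sample, which matches the slide's point that the look-up overhead stays small.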
SLIDE 57 The occupancy class unknown causes
DEFERRED CULLING
unknown empty non-empty
SLIDE 58 RESULTS: DEPTH COMPLEXITY
Figure axis annotation: more sparse to less sparse
SLIDE 59 RESULTS: DEPTH COMPLEXITY
Figure axis annotation: more sparse to less sparse
SLIDE 60 Plot (Sparse Volume): fps vs. block size (32^3, 16^3, 8^3, 4^3); axis annotation: more sparse to less sparse
Series: no skipping, with and without ERT
RESULTS: PERFORMANCE
SLIDE 61 Plot (Sparse Volume): fps vs. block size (32^3, 16^3, 8^3, 4^3); axis annotation: more sparse to less sparse
Series: no skipping, Octree, each with and without ERT
RESULTS: PERFORMANCE
SLIDE 62 Plot (Sparse Volume): fps vs. block size (32^3, 16^3, 8^3, 4^3); axis annotation: more sparse to less sparse
Series: no skipping, Octree, SparseLeap, each with and without ERT
RESULTS: PERFORMANCE
SLIDE 63 Plots (Sparse Volume, Dense Volume): fps vs. block size (32^3, 16^3, 8^3, 4^3); axis annotation: more sparse to less sparse
Series: no skipping, Octree, SparseLeap, each with and without ERT
RESULTS: PERFORMANCE
SLIDE 64 Plots (Sparse Volume, Dense Volume): fps vs. block size (32^3, 16^3, 8^3, 4^3); axis annotation: more sparse to less sparse
Series: no skipping, Octree, SparseLeap, each with and without ERT
RESULTS: PERFORMANCE
SLIDE 65 Plots (Sparse Volume, Dense Volume): fps vs. block size (32^3, 16^3, 8^3, 4^3); axis annotation: more sparse to less sparse
Series: no skipping, Octree, SparseLeap, each with and without ERT
RESULTS: PERFORMANCE
SLIDE 66
SLIDE 67
SLIDE 68
SLIDE 69
SLIDE 70 Cost of empty space skipping moved out of ray-casting loop
Attractive alternative for complex volumes
Memory consumption (GPU):
- Occupancy geometry: very low; much lower than octree storage
- Lists: depends on screen resolution and average depth complexity
SUMMARY
SLIDE 71
Scalable Culling for Large Segmentation Volumes
SLIDE 72 LARGE SEGMENTATION VOLUMES
Raw image volume Image + Label volumes
SLIDE 73 Visual Queries Fast Volume Rendering with Empty Space Skipping
MOTIVATION – INTERACTIVE VIS APPLICATIONS
[SparseLeap. Hadwiger et al., SciVis 2018] [ConnectomeExplorer. Beyer et al., SciVis 2013]
SLIDE 74
EXAMPLE: CULLING FOR EMPTY SPACE SKIPPING
Figure panels: Raw image volume; Single label within volume; Volume blocks after culling (<0.1% of volume blocks)
SLIDE 75
CHALLENGES
- Large label volumes, stored as up to 64-bit integer data
- > 250 GB, > 13 million labels (24 bit data)
- Discrete labels
SLIDE 76 OUR APPROACH FOR SCALABLE CULLING
Pre-processing: Label volume; Data structure: Label List Tree
Interactive: User; Culling Query; Spatial Query (Distance); Optimized Culling; Culling Result
SLIDE 77
Data Structure: Label List Tree
SLIDE 78
- Which labels are present in a volume block?
- Store a list (or set) of labels per volume block
LABEL LISTS
Volume Block Label List
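As a small illustration of per-block label lists, the sketch below collects the set of labels present in each block of a label image. It works on a 2D slice to stay short; the function name `build_label_lists` and the block size are hypothetical.

```python
def build_label_lists(labels, block):
    """labels: 2D list of integer labels; returns {block coordinate: set of labels}."""
    lists = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            lists.setdefault((x // block, y // block), set()).add(lab)
    return lists

label_slice = [
    [0, 0, 7, 7],
    [0, 0, 7, 9],
    [0, 0, 0, 0],
    [3, 0, 0, 0],
]
label_lists = build_label_lists(label_slice, block=2)
```

Each set answers the slide's question directly: a membership test tells whether a given label occurs anywhere in that block.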
SLIDE 79 Label Volume
LABEL LISTS
SLIDE 80 HYBRID LABEL LIST ENCODING
Encoding        Data Structure       Data Access Time   Culling
Deterministic   Roaring Bitmap [1]   Logarithmic        Exact
Probabilistic   Bloom Filter [2]     Constant           Conservative
Best representation chosen based on:
- Memory size
- Expected run time query performance
- User preferences
[1] Better bitmap performance with roaring bitmaps. Chambi et al., 2016.
[2] Space/time trade-offs in hash coding with allowable errors. Bloom, 1970.
SLIDE 81 Roaring bitmaps: bit string split into buckets; each bucket stored as Bitmap (dense), RLE (runs), or Sorted list (sparse)
LABEL LIST ENCODING - DETERMINISTIC
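To illustrate the per-bucket choice between the three deterministic encodings, the sketch below splits labels into buckets by their upper 16 bits and picks the smallest representation for each. The byte estimates (2 bytes per sorted-list entry, 8192 bytes per bitmap, 2 + 4 bytes per run) follow the commonly described Roaring layout and should be read as assumptions; `choose_containers` is a hypothetical name, not part of any Roaring library.

```python
def count_runs(values):
    """Number of maximal runs of consecutive integers in a sorted list."""
    runs, prev = 0, None
    for v in values:
        if prev is None or v != prev + 1:
            runs += 1
        prev = v
    return runs

def choose_containers(labels):
    """Return {bucket key: name of the smallest encoding for that bucket}."""
    buckets = {}
    for lab in sorted(set(labels)):
        buckets.setdefault(lab >> 16, []).append(lab & 0xFFFF)
    choice = {}
    for key, low in buckets.items():
        sizes = {
            "sorted list": 2 * len(low),        # sparse: 16 bits per label
            "bitmap": 8192,                     # dense: 65536 bits
            "rle": 2 + 4 * count_runs(low),     # runs: (start, length) pairs
        }
        choice[key] = min(sizes, key=sizes.get)
    return choice

labels = ([5, 900, 70000]
          + list(range(131072, 131072 + 10000))      # one long run
          + list(range(196608, 196608 + 60000, 2)))  # dense, but no runs
choice = choose_containers(labels)
```

Sparse buckets end up as sorted lists, the long run as RLE, and the dense alternating bucket as a bitmap, which mirrors the three cases named on the slide.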
SLIDE 82 Bloom filter: labels mapped by hash functions into a bit array (bit string)
LABEL LIST ENCODING - PROBABILISTIC
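A minimal Bloom filter sketch shows why the probabilistic encoding has constant access time and conservative culling: a query tests a fixed number of bits, and it can report a label that was never added but never misses one that was. The class name, the sizes and the use of `hashlib.blake2b` as hash family are choices made for this sketch.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, label):
        # Derive num_hashes bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(f"{i}:{label}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, label):
        for p in self._positions(label):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, label):
        # True may be a false positive; False is always correct.
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(label))

block_labels = BloomFilter(num_bits=256, num_hashes=3)
for label in (7, 9, 1234):
    block_labels.add(label)
```

For culling this means a block may occasionally be kept although it does not contain a queried label, but a block that does contain it is never dropped.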
SLIDE 83 Label Volume Resolution-adjusted Resolution-independent
MULTI-RESOLUTION LABEL LIST TREE
Spatial Queries Rendering
SLIDE 84
Optimized Culling
SLIDE 85
- Culling input: Culling Query, set of labels we are interested in
- Culling output: List of volume blocks that contain labels from query
CULLING
Figure labels: Culling query; Volume blocks; Culling result
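With exact label sets per block, the culling input and output described above reduce to a set-intersection test per block. The sketch uses plain Python sets; `cull` and the example data are hypothetical.

```python
def cull(label_lists, query):
    """Return the blocks whose label list shares at least one label with the query."""
    return sorted(block for block, labels in label_lists.items() if labels & query)

label_lists = {(0, 0): {0}, (1, 0): {7, 9}, (0, 1): {0, 3}}
result = cull(label_lists, query={3, 9})
```

This flat version touches every block's label list, which is the cost the hierarchical scheme on the following slides aims to reduce.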
SLIDE 86 Culling query Label lists
HIERARCHICAL CULLING
SLIDE 87 Query Label list Query Label list Query empty
HIERARCHICAL QUERY PRUNING
SLIDE 88 Query Label list
HIERARCHICAL QUERY PRUNING
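One way to read hierarchical query pruning is sketched below: the query is intersected with a node's label list, only the surviving labels are passed down, and a subtree is skipped as soon as the query becomes empty. The sketch assumes each inner node stores the union of its children's labels; the node layout and the name `cull_hierarchical` are hypothetical.

```python
def cull_hierarchical(node, query, out=None):
    """Collect leaf blocks containing query labels, pruning the query on the way down."""
    out = [] if out is None else out
    pruned = query & node["labels"]
    if not pruned:
        return out                      # query empty: skip the whole subtree
    children = node.get("children")
    if not children:
        out.append(node["block"])
    else:
        for child in children:
            cull_hierarchical(child, pruned, out)
    return out

a = {"block": "a", "labels": {1, 2}}
b = {"block": "b", "labels": {3}}
c = {"block": "c", "labels": {4}}
d = {"block": "d", "labels": {2, 5}}
tree = {"labels": {1, 2, 3, 4, 5}, "children": [
    {"labels": {1, 2, 3}, "children": [a, b]},
    {"labels": {2, 4, 5}, "children": [c, d]},
]}
visible = cull_hierarchical(tree, {2, 3})
```

In the second subtree the query shrinks from {2, 3} to {2}, so each test further down compares fewer labels.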
SLIDE 89 Label list Roaring Bloom filter Bloom filter Query
QUERY-ADAPTIVE LABEL LIST REQUESTS
SLIDE 90
Results
SLIDE 91
SLIDE 92
- 3 neuroscience volumes
- 2 phantom datasets
- 16 - 24 bit label data
- 4,000 - 13 million labels
- 4 GB - 1.5 TB data size
RESULTS - DATASETS
SLIDE 93 RESULTS – MEMORY CONSUMPTION OF LABEL LISTS
Label list size (log-scale)
SLIDE 94 RESULTS – CULLING PERFORMANCE
Plot: culling time in ms (log-scale, 1 to 10,000) per dataset (MC 1, KESM, PS2K, MC 2)
Series: Hybrid (ours), Standard culling (visible blocks)
SLIDE 95 RESULTS – CULLING PERFORMANCE
Plot: label data touched, size in kb (log-scale, 1 to 10,000) per dataset (MC 1, KESM, PS2K, MC 2)
Series: Hybrid (ours), Standard culling (visible blocks)
SLIDE 96 Our method
- 1. Novel hybrid data structure: compact storage of integer label lists
- 2. Hierarchical culling algorithm: fast, memory-efficient culling
SUMMARY
SLIDE 97
Questions?
SLIDE 98 GPU-Based Large-Scale Scientific Visualization
Johanna Beyer, Harvard University Markus Hadwiger, KAUST
Course Website: http://johanna-b.github.io/LargeSciVis2018/index.html