Analysis of Large-Scale Scalar Data Using Hixels, Joshua A. Levine

LDAV 2011 Analysis of Large-Scale Scalar Data Using Hixels Joshua A. Levine², in collaboration with D. Thompson¹, J.C. Bennett¹, P.-T. Bremer³, A. Gyulassy¹, P.P. Pébay⁴, V. Pascucci¹ (affiliations 1 to 4 shown as logos)

HPC Has Led to Increases in Both Data Size and Complexity • “Hero” runs • Increased spatial resolution • Increased number of variables • Uncertainty Quantification (UQ) • Ensembles of runs • Polynomial Chaos • Stochastic Simulations • Many analysis methods do not scale with size & complexity of the data Images courtesy of: National Energy Research Scientific Computing Center, Los Alamos National Laboratory, Argonne National Laboratory, and Oak Ridge Leadership Computing Facility.

Hixels: A Unified Data Representation • A hixel is a point with an associated histogram of scalar values • Hixel samples may represent: • Spatial down-sampling • Ensemble values • Random variables • Trade data size/complexity for uncertainty (Figure: histogram h(f) plotted against scalar value f)

1D Example of Hixels (Block Compression)
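A minimal sketch of 1D block compression, assuming equal-width bins over a fixed value range and non-overlapping blocks. The function name `block_compress_1d` and its parameters are hypothetical and are not taken from the slides.

```python
import numpy as np

def block_compress_1d(values, block_size, n_bins, value_range):
    """Turn a 1D scalar array into hixels: one histogram per block of samples."""
    values = np.asarray(values, dtype=float)
    n_blocks = len(values) // block_size  # an incomplete trailing block is dropped
    blocks = values[: n_blocks * block_size].reshape(n_blocks, block_size)
    edges = np.linspace(value_range[0], value_range[1], n_bins + 1)
    # Each row of `hixels` is the histogram h(f) of one block of samples.
    hixels = np.stack([np.histogram(b, bins=edges)[0] for b in blocks])
    return hixels, edges

# 64 samples of a sine wave compressed into 8 hixels with 4 bins each.
x = np.linspace(0.0, 2.0 * np.pi, 64)
hixels, edges = block_compress_1d(np.sin(x), block_size=8, n_bins=4,
                                  value_range=(-1.0, 1.0))
```

Each hixel keeps the distribution of the 8 values it replaces, so storage shrinks while the spread of values inside the block is retained as uncertainty.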

Motivation: Feature-Based Analysis • Characterize and define features • Segment the domain by function behavior • Answer questions: • How many features are there? • What is the behavior of other variables within these features? • How do you define a good threshold value on which to segment the domain? Data courtesy of: Dr. Jacqueline Chen, SNL

Goal: Extend Topological Methods • What structures are present? • How persistent are they? • How do we visualize features? • Our Contributions: 1. Sampled topology 2. Topological analysis of statistically associated buckets 3. Visualizing fuzzy isosurfaces

Sampled Topology: Algorithm 1. Sample the hixels to construct a scalar field V_i 2. Compute the Morse complex for V_i a) Identify basins around minima & arcs between adjacent basins b) Encode arc locations in a binary field C_i • Boundaries = 1, Rest = 0 3. Construct aggregate A as mean of the C_i’s 4. Visualize variability of arc locations Assumption: hixels are independent
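The steps above can be sketched in 1D. This is an illustration under stated assumptions, not the authors' implementation: a steepest-descent basin labeling stands in for the Morse complex computation, bin centers stand in for sampled values, and all function names are hypothetical.

```python
import numpy as np

def sample_field(hixels, centers, rng):
    """Step 1: draw one value per hixel, independently, to get a field V_i."""
    probs = hixels / hixels.sum(axis=1, keepdims=True)
    return np.array([rng.choice(centers, p=p) for p in probs])

def basin_labels_1d(field):
    """Step 2a (1D stand-in): label each sample by the minimum reached by descent."""
    n = len(field)
    labels = np.empty(n, dtype=int)
    for start in range(n):
        i = start
        while True:
            nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
            if not nbrs:
                break
            # Lowest neighbor; ties are broken toward the lower index.
            best = min(nbrs, key=lambda j: (field[j], j))
            if field[best] < field[i]:
                i = best
            else:
                break
        labels[start] = i
    return labels

def boundary_field(labels):
    """Step 2b: binary field C_i, 1 at the first sample of each new basin."""
    c = np.zeros(len(labels))
    c[1:] = labels[1:] != labels[:-1]
    return c

def aggregate_segmentation(hixels, centers, n_runs, seed=0):
    """Step 3: aggregate A is the mean of the C_i over all sampled fields."""
    rng = np.random.default_rng(seed)
    total = np.zeros(hixels.shape[0])
    for _ in range(n_runs):
        v = sample_field(hixels, centers, rng)
        total += boundary_field(basin_labels_1d(v))
    return total / n_runs

# Degenerate hixels (all mass in one bin) make every sampled field identical.
centers = np.array([0.125, 0.375, 0.625, 0.875])
hixels = np.array([[0, 0, 0, 4], [4, 0, 0, 0], [0, 0, 0, 4],
                   [4, 0, 0, 0], [0, 0, 0, 4]])
A = aggregate_segmentation(hixels, centers, n_runs=16)
```

With histograms that spread mass over several bins, boundary locations differ between runs and A takes fractional values between 0 and 1, which is the variability that step 4 visualizes.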

Aggregate Segmentation on Temporal Jet (Figure: aggregate A, color scale 0 to 1, for 1, 16, 64, 256, and 16384 runs at persistence p = 0, p = 0.008, and p = 0.128)

Convergence of Sampled Topology

Varying Block Size & Persistence (Figure: aggregate A, color scale 0 to 1, for block sizes 1x1, 2x2, 4x4, 8x8, and 16x16; run counts 1, 512, 2048, 8196, and 16384; persistence p = 0, 0.004, 0.016, 0.064, and 0.256)

Topological Analysis of Statistically Associated Buckets: Algorithm • Aimed at recovering prominent features from ensemble data • Exploit dependencies between runs • Identify regions in space & scalar values consistent with positive association • Perform topological segmentation on these regions individually 1. Compute buckets 2. Compute contingency statistics 3. Identify sheets 4. Perform topological analysis on individual sheets

Computing Buckets • Bins with high probability are associated with peaks in the histogram • Identify peaks + range of function values around each peak • Topological segmentation on histogram • Use areal (hypervolume) persistence • Weight of interval = area of the histogram • Merge until the probability of the smallest bucket is above a particular threshold
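A rough sketch of the bucketing idea, assuming a greedy merge of adjacent intervals by area as a simplification of areal (hypervolume) persistence. The valley rule, the choice of which neighbor to merge into, and the names `initial_buckets` and `merge_buckets` are assumptions made for illustration.

```python
import numpy as np

def initial_buckets(hist):
    """Split the bins at interior valleys so each bucket surrounds one peak."""
    n = len(hist)
    cuts = [0]
    for i in range(1, n - 1):
        # Valley: strictly lower than the left bin and no higher than the right.
        if hist[i] < hist[i - 1] and hist[i] <= hist[i + 1]:
            cuts.append(i)
    cuts.append(n)
    # Buckets are half-open bin ranges [start, end).
    return [(cuts[k], cuts[k + 1]) for k in range(len(cuts) - 1)]

def merge_buckets(hist, min_prob):
    """Merge the smallest-area bucket into a neighbor until all reach min_prob."""
    prob = np.asarray(hist, dtype=float)
    prob = prob / prob.sum()
    buckets = initial_buckets(prob)
    while len(buckets) > 1:
        areas = [prob[s:e].sum() for s, e in buckets]
        k = int(np.argmin(areas))
        if areas[k] >= min_prob:
            break
        # Assumed rule: merge into the adjacent bucket with the smaller area.
        if k == 0:
            j = 1
        elif k == len(buckets) - 1:
            j = k - 1
        else:
            j = k - 1 if areas[k - 1] <= areas[k + 1] else k + 1
        lo, hi = min(k, j), max(k, j)
        buckets[lo:hi + 1] = [(buckets[lo][0], buckets[hi][1])]
    return buckets

hist = [1, 6, 2, 3, 1, 0, 1, 9, 3]
coarse = merge_buckets(hist, min_prob=0.25)  # the small middle peak is absorbed
fine = merge_buckets(hist, min_prob=0.20)    # all three peaks survive
```

Raising `min_prob` plays the role of raising the persistence threshold: low-weight peaks are folded into their neighbors and the bucket count drops.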

Persistence Simplification of Buckets (Figure: Persistence Pairs)

Persistence Simplification of Buckets

Effect of Persistence on Bucket Count (Figure: Number of Buckets versus Persistence Threshold (p), for p = 16, 32, 64, 128, 256, and 512)

Contingency Tables on Bucketed Hixels (Figure: hixels h1, h2, and h3 with buckets a to d, e to g, and h to j)

h1 - h2:
     e  f  g
  a  4  2  0
  b  2  3  1
  c  0  5  1
  d  6  0  0

h1 - h3:
     h  i  j
  a  5  1  0
  b  1  4  1
  c  2  4  0
  d  0  1  5
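A small sketch of how such a table could be tallied, assuming each run contributes one pair (bucket index in hixel X, bucket index in hixel Y). The function name and the sample labels are hypothetical and do not reproduce the counts on the slide.

```python
import numpy as np

def contingency_table(labels_x, labels_y, n_x, n_y):
    """Count how often bucket i of hixel X co-occurs with bucket j of hixel Y."""
    table = np.zeros((n_x, n_y), dtype=int)
    # np.add.at accumulates repeated index pairs, unlike plain fancy assignment.
    np.add.at(table, (np.asarray(labels_x), np.asarray(labels_y)), 1)
    return table

# Six runs; hixel X has 3 buckets and hixel Y has 2 buckets.
labels_x = [0, 0, 1, 2, 2, 2]
labels_y = [0, 1, 1, 0, 0, 1]
table = contingency_table(labels_x, labels_y, n_x=3, n_y=2)
```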

Pointwise Mutual Information (PMI) Encodes Association Between Hixels Goal: Identify buckets that co-occur more frequently than if statistically independent. $\operatorname{pmi}(x, y) := \log \frac{p_{X,Y}(x, y)}{p_X(x)\, p_Y(y)}$, where pmi(x, y) = 0 => x and y are independent (Figure: hixels h1, h2, and h3 with buckets a to d, e to g, and h to j)
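As a worked example, the definition can be applied to the h1 - h2 counts from the contingency-table slide (rows a to d, columns e to g). This is a sketch: the function name `pmi` and the choice to report never-observed pairs as negative infinity are assumptions.

```python
import numpy as np

def pmi(table):
    """Pointwise mutual information for every cell of a contingency table."""
    joint = np.asarray(table, dtype=float)
    joint = joint / joint.sum()                # p_{X,Y}(x, y)
    px = joint.sum(axis=1, keepdims=True)      # p_X(x), row marginals
    py = joint.sum(axis=0, keepdims=True)      # p_Y(y), column marginals
    expected = px * py                         # joint probability if independent
    out = np.full(joint.shape, -np.inf)        # pairs never observed together
    mask = joint > 0
    out[mask] = np.log(joint[mask] / expected[mask])
    return out

# h1 - h2 counts: rows a, b, c, d and columns e, f, g.
h1_h2 = [[4, 2, 0],
         [2, 3, 1],
         [0, 5, 1],
         [6, 0, 0]]
scores = pmi(h1_h2)
positive = scores > 0  # bucket pairs that co-occur more often than by chance
```

For this table the positive pairs are (a, e), (d, e), (b, f), (c, f), (b, g), and (c, g). For instance, pmi(d, e) = log((6/24) / ((6/24)(12/24))) = log 2.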

Positive PMI Constructs Sheets of Statistically Associated Buckets Before: Bucketed Hixels

Positive PMI Constructs Sheets of Statistically Associated Buckets After: Sheets Connecting Buckets
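One way to read the before/after pictures is that a sheet is a connected component of buckets linked by positive PMI. That reading is an assumption here, and the union-find sketch below (with the hypothetical names `find` and `build_sheets`) only illustrates it. The links are the positive-PMI pairs worked out by hand from the h1 - h2 contingency table.

```python
def find(parent, x):
    """Return the representative of x's set, compressing the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def build_sheets(nodes, positive_links):
    """Group buckets into sheets: connected components under positive-PMI links."""
    parent = {n: n for n in nodes}
    for u, v in positive_links:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
    sheets = {}
    for n in nodes:
        sheets.setdefault(find(parent, n), []).append(n)
    return sorted(sorted(s) for s in sheets.values())

# Buckets are named "hixel:bucket".
nodes = ["h1:a", "h1:b", "h1:c", "h1:d", "h2:e", "h2:f", "h2:g"]
links = [("h1:a", "h2:e"), ("h1:d", "h2:e"),
         ("h1:b", "h2:f"), ("h1:c", "h2:f"),
         ("h1:b", "h2:g"), ("h1:c", "h2:g")]
sheets = build_sheets(nodes, links)
```

Here the seven buckets fall into two sheets, one joining a, d, and e and one joining b, c, f, and g. Each sheet can then be segmented topologically on its own.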

An Ensemble of Mixed Distributions • 512 x 512 hixels, 128 bins each • 3200 samples from a Poisson distribution • λ is 100 at 5 source points in a circle • λ decreases to 12 with distance from source points (Figure: Mean Poisson Surface) • 9600 samples from a Gaussian distribution • μ & σ are min & max at 4 points in a circle • μ & σ vary with distance from source points (Figure: Mean Gaussian Surface)
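A sketch of how a single hixel of such an ensemble might be generated. The sample counts and bin count follow the slide. The unit-width bins on [0, 128), the Gaussian parameters, and the function name `mixed_hixel` are assumptions made for illustration.

```python
import numpy as np

def mixed_hixel(lam, mu, sigma, rng, n_poisson=3200, n_gauss=9600, n_bins=128):
    """Histogram of pooled Poisson and Gaussian samples for one hixel."""
    samples = np.concatenate([
        rng.poisson(lam, size=n_poisson),
        rng.normal(mu, sigma, size=n_gauss),
    ])
    # Assumed binning: unit-width bins on [0, n_bins); samples outside are dropped.
    hist, _ = np.histogram(samples, bins=n_bins, range=(0.0, float(n_bins)))
    return hist

rng = np.random.default_rng(0)
# lam=100 matches a Poisson source point; mu and sigma are hypothetical values.
hist = mixed_hixel(lam=100.0, mu=40.0, sigma=5.0, rng=rng)
```

The resulting histogram is bimodal, with a tall Gaussian peak near 40 and a lower, broader Poisson peak near 100, so a single summary value such as the mean describes neither population.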

An Ensemble of Mixed Distributions (Figure panels: Mean Poisson Surface, Mean Gaussian Surface, and Mean Surface (Yellow) for Combined Samples)

“Simple” Topological Tests Fail! • Probability that each hixel corresponds to • Minimum ~ 20% • Maximum ~ 20% • Saddle ~ 7% • Regular point ~ 53% (Figure axis: Sample Frequency)

Sheets Isolate Prominent Features (Figure panels: Basins of Minima and Basins of Maxima)

Sheets for Lifted Ethylene Jet (Figure: Buckets per hixel)

Visualizing Fuzzy Isosurfaces: Algorithm 1. Compute likelihood function g for isovalue k (Equation: piecewise definition of g in terms of a and b, with separate cases for b = 0, a = 0, and otherwise) 2. Volume render g • Provides a fuzzy description of the likelihood of where an isosurface exists (Figure: hixel histogram h(f) plotted against f)
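To illustrate the idea of a per-hixel crossing likelihood, the sketch below uses an assumed stand-in measure rather than the definition of g given on the slide. It scores a hixel by how evenly its histogram mass is split around the isovalue k. The function name and the names `below` and `above` are hypothetical.

```python
import numpy as np

def crossing_likelihood(hixels, edges, k):
    """Assumed stand-in score in [0, 1]: 0 if all mass lies on one side of k."""
    hixels = np.asarray(hixels, dtype=float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    total = hixels.sum(axis=-1)
    below = hixels[..., centers < k].sum(axis=-1) / total  # mass under k
    above = 1.0 - below                                     # mass at or over k
    return 2.0 * np.minimum(below, above)

edges = np.linspace(0.0, 1.0, 5)  # 4 bins on [0, 1]
hixels = np.array([[4, 0, 0, 0],   # entirely below k
                   [2, 0, 0, 2],   # evenly split around k
                   [0, 0, 1, 3],   # entirely above k
                   [1, 1, 1, 1]])  # uniform
g = crossing_likelihood(hixels, edges, k=0.5)
```

The resulting array can be handed to a volume renderer with opacity mapped to the score, so regions where the isosurface position is uncertain appear as a diffuse band rather than a crisp surface.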

Comparison to Downsampling (Figure: block sizes 4^3, 8^3, 16^3, 32^3, and 64^3; rows: Fuzzy iso g, Mean, and Lower left)

Fuzzy Isosurface of Temporal Jet (Figure: g for block sizes 2^3, 8^3, and 32^3) Likelihood that isovalue k = 0.506 passes through a hixel

Conclusions and Summary • Unified representation of large scalar fields from various modalities • 3 proof-of-concept applications • Sampled topology • Topological analysis of statistically associated buckets • Visualizing fuzzy isosurfaces

Future Work • Larger ensembles/larger data • Performance/scaling • Infer sheets from multivariate hixels • Issues to study • What is preserved by hixels vs. resolution loss • Identify appropriate number of bins/hixel • Persistence thresholds for bucketing algorithm • Balance data storage vs. feature preservation • What topological features can/cannot be preserved by hixelation

Acknowledgement Contact: Joshua A. Levine jlevine@sci.utah.edu This work was supported by the Department of Energy Office of Advanced Scientific Computing Research, award number DE-SC0001922. Sandia is a multi-program laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under Contract DE-AC04-94AL85000. This work was also performed under the auspices of the US Department of Energy (DOE) by the Lawrence Livermore National Laboratory under contract nos. DE-AC52-07NA27344, LLNL-JRNL-412904L and by the University of Utah under contract DE-FC02-06ER25781. We are grateful to Dr. Jacqueline Chen for the combustion data sets and M. Eduard Gröller, Georg Glaeser, and Johannes Kastner for the stag beetle dataset.
