Active Data Mining of Correspondence for Qualitative Assessment of Scientific Computations (PowerPoint PPT Presentation)



SLIDE 1

Active Data Mining of Correspondence for Qualitative Assessment of Scientific Computations

Chris Bailey-Kellogg

Purdue Computer Sciences http://www.cs.purdue.edu/homes/cbk/

Naren Ramakrishnan

Virginia Tech Computer Science http://people.cs.vt.edu/~ramakris/

SLIDE 2

Data-Driven Characterization of Scientific Computations

  • Choice of solver depends on problem characteristics (e.g. matrix sensitivity) and algorithm performance (e.g. convergence).

  • Empirical characterization (rather than analytical) is appropriate under imperfect domain knowledge or a lack of theory: low-level computational experiments → high-level properties.

  • Example: a spectral portrait illustrates eigenstructure under perturbations of different magnitudes.

[Figure: spectral portrait; level curves labeled 2–10 over the complex plane]

Eigenvalues inside a given curve are indistinguishable under perturbation of that magnitude; this suggests the numerical precision necessary.

SLIDE 3

Active Data Mining with SAL

  • Spatial aggregation (bottom-up): uniform operators and data types for extracting multi-layer structures in spatial data.

  • Ambiguity-directed sampling (top-down): focus data collection on difficult choice points.
  • Underlying domain knowledge: continuity, locality.

[Diagram: SAL pipeline. Input Field → Aggregate (spatial neighborhood graph) → Classify (equivalence classes) → Redescribe (higher-level objects) → Abstract Description; Ambiguities feed back via Sample, Interpolate, and Localize to lower-level objects]

SLIDE 4

Simple example: Input points (values not shown) → Aggregate (localize computation) → Classify (group connected points with similar-enough value).

[Figure: the point set before aggregation, after aggregation, and after classification]
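The Aggregate/Classify steps above can be sketched with a brute-force neighborhood test plus union-find; the `radius` and `tol` parameters here are hypothetical stand-ins for SAL's locality and similarity predicates, and the sample points are made up:

```python
import numpy as np

def classify(points, values, radius, tol):
    """Group points that are within `radius` of each other (aggregation
    neighborhood) and whose values differ by at most `tol` (classification
    predicate), using union-find with path compression."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            close = np.linalg.norm(points[i] - points[j]) <= radius
            similar = abs(values[i] - values[j]) <= tol
            if close and similar:
                parent[find(i)] = find(j)

    return [find(i) for i in range(n)]

pts = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
vals = np.array([1.0, 1.1, 1.0, 3.0])
labels = classify(pts, vals, radius=1.5, tol=0.5)
# points 0 and 1 are close and similar; 2 and 3 are close but dissimilar
```

The quadratic neighbor scan stands in for the localized computation a real neighborhood graph (e.g. Delaunay) would provide.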

SLIDE 5

Correspondence Extension to SAL

  • Key idea: identify mutually-reinforcing relationships among features of spatial objects, in order to combat noise and sparsity.
  • SAL is particularly conducive: aggregation composes hierarchical spatial objects.

SLIDE 6
  • Mechanism:
  • 1. Establish analogy as a relation among lower-level constituents of higher-level objects. Ex: adjacent points of neighboring iso-contours.

[Figure: adjacent points matched across neighboring iso-contours]

  • 2. Abstract the lower-level analogy into a higher-level correspondence. Ex: parameterized curve deformation.
  • Bridges the lower-/higher-level gap: the analogy’s meaning is derived from higher-level context; abstraction enables computation of global properties (containment, breaks, overall quality).
  • Directly usable in ambiguity-directed sampling to address difficulties in correspondence.

SLIDE 7

Application 1: Matrix Spectral Portrait Analysis

The spectral portrait for matrix A plots the complex map

$$P(z) = \log_{10}\left(\|A\|_2\,\|(A - zI)^{-1}\|_2\right)$$

[Figure: spectral portrait; level curves labeled 2–10 over the complex plane]

Singularities occur at the eigenvalues; level curves capture “equivalent” eigenvalues with respect to perturbations (i.e. the curve at level k contains the eigenvalues of all perturbed matrices A + E with $\|E\|_2 \le 10^{-k}\|A\|_2$). Perturbation-equivalence indicates sensitivity to numerical error. Ex: eigenvalues 2 & 3 are most sensitive, then 4, then 1.
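For concreteness, the portrait can be evaluated pointwise using the identity ‖(A − zI)⁻¹‖₂ = 1/σ_min(A − zI). This is an illustrative sketch (the function, grid, and test matrix are our own choices, not the tool used in the talk):

```python
import numpy as np

def spectral_portrait(A, re, im):
    """Evaluate P(z) = log10(||A||_2 * ||(A - zI)^{-1}||_2) on a grid.

    ||(A - zI)^{-1}||_2 equals 1 / (smallest singular value of A - zI),
    so each grid point costs one small SVD."""
    n = A.shape[0]
    norm_a = np.linalg.norm(A, 2)
    P = np.empty((len(im), len(re)))
    for i, y in enumerate(im):
        for j, x in enumerate(re):
            smin = np.linalg.svd(A - complex(x, y) * np.eye(n),
                                 compute_uv=False)[-1]
            P[i, j] = np.log10(norm_a / smin)
    return P

# portrait of a 2x2 Jordan block with eigenvalue 1: P peaks near z = 1
A = np.array([[1.0, 1.0], [0.0, 1.0]])
re = np.linspace(0.05, 1.95, 20)
im = np.linspace(-0.95, 0.95, 20)
P = spectral_portrait(A, re, im)
```

The grid deliberately avoids z = 1 exactly, where P is infinite.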

SLIDE 8

Correspondence-Based Merge Identification

Approach: compute merge tree, indicating perturbation levels at which eigenvalues become indistinguishable, by finding correspondences among curves.

  • 1. Sample perturbation levels on a regular grid; interpolate iso-curves.

[Figure: sampled grid and interpolated iso-curves]

SLIDE 9
  • 2. Aggregate curve points in a Delaunay triangulation.

[Figure: Delaunay triangulation of the curve points]

  • 3. Analogy: cross-curve edges in triangulation.

[Figure: cross-curve edges highlighted in the triangulation]
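Steps 2–3 might look like the following sketch, which triangulates all curve points with SciPy and keeps only the edges whose endpoints lie on different curves; the concentric-circle data is hypothetical:

```python
import numpy as np
from scipy.spatial import Delaunay

def cross_curve_edges(points, curve_id):
    """Delaunay-triangulate the curve points, then keep edges whose
    endpoints belong to different curves: candidate 'analogy' relations
    between neighboring iso-curves."""
    tri = Delaunay(points)
    edges = set()
    for simplex in tri.simplices:
        for a, b in [(0, 1), (1, 2), (0, 2)]:
            i, j = sorted((simplex[a], simplex[b]))
            if curve_id[i] != curve_id[j]:
                edges.add((i, j))
    return sorted(edges)

# two concentric "curves" sampled at 8 points each (hypothetical data)
theta = np.linspace(0, 2 * np.pi, 8, endpoint=False)
inner = np.c_[np.cos(theta), np.sin(theta)]
outer = 2 * np.c_[np.cos(theta), np.sin(theta)]
points = np.vstack([inner, outer])
curve_id = np.array([0] * 8 + [1] * 8)
edges = cross_curve_edges(points, curve_id)
```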

SLIDE 10
  • 4. Correspondence abstraction: merge events in tree.

[Figure: merge trees over the four eigenvalues, annotated with perturbation levels]

  • 5. Evaluate confidence in a correspondence: fraction of points matched, angular separation between “separating” samples.
  • 6. Sample to ensure curve locations are adequately constrained by separating samples (so the curves couldn’t have merged at a smaller perturbation level).
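Step 4's abstraction of pairwise merges into tree events can be sketched with union-find over level-sorted candidate merges; the `(level, i, j)` tuples below are hypothetical stand-ins for what the curve correspondences would produce:

```python
def merge_tree(n_eigs, merges):
    """Build merge events from (level, i, j) tuples, processed in order
    of perturbation level: record a merge only when eigenvalues i and j
    are still in different groups."""
    parent = list(range(n_eigs))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    events = []
    for level, i, j in sorted(merges):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            events.append((level, i, j))
    return events

# eigenvalues 1 and 2 (0-indexed) merge first, then 3 joins, then 0;
# the (8, 1, 3) candidate is redundant and gets dropped
events = merge_tree(4, [(5, 1, 2), (7, 2, 3), (9, 0, 1), (8, 1, 3)])
```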

SLIDE 11

Results

  • (2n − 3)!! possible binary merge trees; most are never explicitly considered (they would have low confidence).
  • Initial grid: one sample between each pair of eigenvalues, one unit larger than the bounding box.
  • Subsample or expand the grid when merge events are poorly separated.
  • Tested on a variety of polynomial companion matrices: different numbers / spacings of roots.
  • High-confidence model selection after 1–3 subsamples, 1–3 grid expansions.
  • Substantially less computation than “one-size-fits-all”; provides a confidence metric and explainability.

SLIDE 12

Application 2: Matrix Jordan Form

Jordan form analysis:

  • Input: matrix A of dimension n, with r ≤ n independent eigenvectors whose eigenvalues λᵢ have multiplicity ρᵢ.
  • Jordan decomposition into r upper-triangular “blocks”:

$$B^{-1}AB = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_r \end{pmatrix}, \qquad J_i = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix}$$

  • Typical algorithms are numerically unstable.
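The instability is easy to demonstrate: a perturbation of size δ in one entry of a ρ-fold Jordan block moves the eigenvalues by about δ^(1/ρ). A minimal numpy sketch (the block and δ are our illustrative choices):

```python
import numpy as np

# 3x3 Jordan block with eigenvalue 7: one eigenvector, multiplicity 3
J = 7 * np.eye(3) + np.diag([1.0, 1.0], k=1)

# perturbing the bottom-left entry by delta gives exact eigenvalues
# 7 + delta**(1/3) * (the three cube roots of unity): a 1e-9 change
# moves the eigenvalues by about 1e-3, six orders of magnitude larger
delta = 1e-9
Jp = J.copy()
Jp[2, 0] = delta
spread = np.abs(np.linalg.eigvals(Jp) - 7).max()
```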
SLIDE 13

Graphical Analysis of Jordan Form

  • Infer multiplicity by eigenvalue perturbations: perturbing by δ yields computed values

$$\lambda_i + |\delta|^{1/\rho_i}\, e^{i\phi/\rho_i}$$

  • As the phase φ of the perturbation δ ranges over multiples of π, the computed values are vertices of a regular 2ρᵢ-gon, centered on λᵢ, with diameter determined by |δ|^{1/ρᵢ}.

  • Ex: 8-by-8 Brunet matrix with structure (−1)¹(−2)¹(7)³(7)³, focusing on the Jordan block for the first (7)³:

[Figure: computed eigenvalues near 7, spread at radius ~10⁻³ in the complex plane]
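The 2ρ-gon prediction can be checked numerically on a single Jordan block; the matrix, eigenvalue, and perturbation magnitude below are our own illustrative choices:

```python
import numpy as np

# one rho-fold Jordan block with eigenvalue lam
rho, lam, d = 3, 7.0, 1e-6
J = lam * np.eye(rho) + np.diag(np.ones(rho - 1), k=1)

computed = []
for phi in (0.0, np.pi):                   # phase of delta: multiples of pi
    Jp = J.astype(complex)
    Jp[rho - 1, 0] = d * np.exp(1j * phi)  # perturb one entry by |delta| = d
    computed.extend(np.linalg.eigvals(Jp))

# the 2*rho computed eigenvalues sit at radius |delta|**(1/rho) around lam,
# in 2*rho evenly spaced directions: the vertices of a regular 2*rho-gon
offsets = np.array(computed) - lam
```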

SLIDE 14

Correspondence-Based Symmetry Analysis

Approach: compute the Jordan structure by identifying symmetry of the portrait (i.e. auto-correspondence), abstracted as a rotation by π/ρ around the eigenvalue.

  • 1. Sample points by random normwise perturbations at the magnitude(s) of interest.

[Figure: sampled perturbed eigenvalues scattered around 7]

  • 2. Aggregate triples into triangles.
SLIDE 15
  • 3. Analogy among triangle vertices by congruence (computed via geometric hashing).

[Figure: congruent triangles identified among the samples]

  • 4. Correspondence as a rotation (x, y, θ) overlaying vertices of congruent triangles.

[Figure: overlaid congruent triangles]
Eigenvalue = (7.00, 0.00); rotation = 60.46°; ρ = 3. Eigenvalue = (7.00, 0.00); rotation = 60.13°; ρ = 3.
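A simplified stand-in for steps 3–4: rather than geometric hashing over triangles, this sketch estimates the symmetry order directly from the angular gaps between sample directions around a known center (the hexagonal sample data is hypothetical):

```python
import numpy as np

def symmetry_order(points, center):
    """Estimate rotational symmetry about `center`: the typical angular
    gap between sorted sample directions approximates the rotation angle
    theta, and the relation theta = pi / rho gives rho."""
    ang = np.sort(np.angle(points - center) % (2 * np.pi))
    gaps = np.diff(np.append(ang, ang[0] + 2 * np.pi))  # include wraparound
    theta = np.median(gaps)
    return np.pi / theta

# hexagon of perturbed-eigenvalue samples around 7 (hypothetical data)
pts = 7 + 0.01 * np.exp(1j * np.pi / 3 * np.arange(6))
est_rho = symmetry_order(pts, 7.0)  # close to 3
```

The median gap makes the estimate robust to a few noisy samples, loosely mirroring the talk's confidence-weighted correspondence.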

SLIDE 16
  • 5. Evaluate confidence in a correspondence: distance between points and their partners, regularity of the polygon’s sides.
  • 6. Sample when the entropy of the candidate models is high.
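Step 6's entropy criterion might be computed as the Shannon entropy of the normalized confidences over candidate models; the confidence values below are made up for illustration:

```python
import numpy as np

def model_entropy(confidences):
    """Shannon entropy (bits) of the normalized confidence distribution
    over candidate models; high entropy = ambiguity, so sample more."""
    p = np.asarray(confidences, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                       # 0 * log(0) contributes nothing
    return float(-(p * np.log2(p)).sum())

# one dominant model -> low entropy; three equal models -> high entropy
low = model_entropy([0.98, 0.01, 0.01])
high = model_entropy([1.0, 1.0, 1.0])
```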
SLIDE 17

Results

  • 10 matrices; 4–10 perturbation levels; 6–8 samples each round.
  • Vary the number of models generated by varying the congruence tolerance.
  • Three sample-collection policies:
  • 1. Collect at the same level: 1.0–2.7 rounds.
  • 2. Collect at the next higher level: better when policy 1 uses a low level.
  • 3. Collect at the same level until models begin “hallucinating”: better for Brunet-type matrices.
  • Symmetry quickly eliminates bad models.
  • No real advantage to varying the perturbation level (estimates are independent, irrespective of level).
  • A small amount of computation suffices for a high-confidence assessment of the Jordan form.

SLIDE 18

Some Related Work

  • F. Chaitin-Chatelin and V. Frayssé: graphical analysis of scientific computations (spectral portraits).
  • A. Edelman and Y. Ma: Jordan perturbation phenomena.
  • X. Huang and F. Zhao: correspondence in weather-data iso-contours.
  • Lots of work in vision on computing and tracking correspondence.
  • D.A. Cohn, Z. Ghahramani, and M.I. Jordan: active learning.
SLIDE 19

Discussion

  • The correspondence mechanism within Spatial Aggregation leverages hierarchical spatial objects and relationships.
  • First systematic algorithms for performing complete imagistic analyses (not relying on human visual inspection) of matrix eigenstructure.
  • Efficient, focused sampling and iterative model evaluation until high confidence is obtained.
  • Overcome noise and sparsity by utilizing locality and continuity to identify mutually-reinforcing interpretations.
  • Many thanks to the reviewers!
  • Funding: CBK (NSF IIS-0237654) and NR (NSF EIA-9974956, EIA-9984317, and EIA-0103660).