Eulerian Tour Algorithms for Data Visualization and the PairViz Package
Catherine Hurley (NUI Maynooth), R.W. Oldford (U. Waterloo)
July 8 2009, UseR!


SLIDE 1

EULERIAN TOUR ALGORITHMS

FOR DATA VISUALIZATION AND THE PAIRVIZ PACKAGE

Catherine Hurley, NUI Maynooth

R.W. Oldford, U. Waterloo

July 8 2009 UseR!

Monday 13 July 2009

SLIDE 2

Graphics: Effect Ordering

  • Packages: seriation, gclus, corrgram
  • Example: PCP (parallel coordinate plot) of the flea data

[Figure: two PCPs of the flea data]

Standard order: Tars1 Tars2 Aede1 Aede2 Head Aede3

Correlation order: Tars2 Aede1 Aede2 Aede3 Tars1 Head

SLIDE 3

PairViz: relationship ordering

  • Statistical graphics are about comparisons between variables, cases, groups, models

[Figure: "Flea data: correlation order". Axis sequence: Aede3 Aede2 Aede1 Tars2 Aede2 Aede1 Aede3 Tars2 Tars1 Head Tars2 Tars1 Aede1 Head Aede2 Tars1 Aede3 Head]

SLIDE 4

A graph model

  • Build a graph where nodes are statistical objects
  • Edges are relationships
  • Example:

Node      Vis        Edge          Vis
Group     Boxplot    Two groups    CI for mean diff
Var       Hist       Two vars      Scatterplot
2 vars    Scat       4-d space     Dynamic scat
Model     Resid      2 Models      PCP

[Figure: example graph on nodes A B C D E F]

SLIDE 5

Example: planned comparisons

  • Mice in 5 diet groups, response is lifetime
  • Nodes are treatments, edges are planned comparisons
  • Weights are p-values

[Figure: graph on diets N/N85 N/R40 N/R50 NP R/R50 lopro; shown values 0.0083, 0.0147, 0.3111]

[Figure: "Planned comparisons of diets". Panels: Lifetime, Differences. Diet sequence: N/R50 N/N85 NP lopro N/R50 N/R40 R/R50 N/R50]

Reducing calories and protein increases lifetime

SLIDE 6

Graph Traversal

  • Traverse all nodes: hamiltonian path
  • Traverse all edges: eulerian path
  • Use gclus, seriation: hamiltonian paths on complete graphs
  • PairViz: eulerian paths

[Figure: three graphs. Open hamiltonian path (nodes A to H); Closed hamiltonian path (nodes A to H); Closed eulerian path on K7 (nodes A to G)]
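
The two kinds of traversal can be told apart with a small check. The following is a minimal sketch, not code from gclus, seriation or PairViz; the function names are made up for illustration. A hamiltonian path must visit every node once, an eulerian path must use every edge once.

```python
from itertools import combinations

def complete_graph(nodes):
    """Edge set of the complete graph on `nodes`, each edge a frozenset pair."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}

def is_hamiltonian_path(path, nodes, edges, closed=False):
    """True if `path` visits every node exactly once along edges of the graph."""
    if closed and path[0] != path[-1]:
        return False
    seq = path[:-1] if closed else path   # a closed path repeats its start node
    if sorted(seq) != sorted(nodes):
        return False
    return all(frozenset(step) in edges for step in zip(path, path[1:]))

def is_eulerian_path(path, edges, closed=False):
    """True if `path` traverses every edge of the graph exactly once."""
    if closed and path[0] != path[-1]:
        return False
    steps = [frozenset(step) for step in zip(path, path[1:])]
    return len(steps) == len(edges) and set(steps) == edges

k3 = complete_graph("ABC")
print(is_hamiltonian_path(list("ABC"), list("ABC"), k3))   # open hamiltonian path
print(is_eulerian_path(list("ABCA"), k3, closed=True))     # closed eulerian path
```

On K3 the closed hamiltonian path A B C A is also a closed eulerian path; on larger complete graphs the two differ, since an eulerian path has to revisit nodes.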

SLIDE 7

Graph Structures

  • Complete graph: all comparisons are interesting
  • Edge-weighted graphs: low weight edges are more interesting
  • Bipartite graph: eg only treatment-control comparisons are of interest

[Figure: flea data in eulerian order] Weight edges by 1-corr, eulerian follows low weight edges

[Figure: bipartite graph on X1 X2 X3 and Y1 Y2]
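
The 1-corr weighting can be sketched as below. This is an illustration only: the function name is hypothetical and the correlation values are made up, not taken from the flea data.

```python
from itertools import combinations

def correlation_edge_weights(names, corr):
    """Map each variable pair to the edge weight 1 - corr."""
    return {
        (names[i], names[j]): 1.0 - corr[i][j]
        for i, j in combinations(range(len(names)), 2)
    }

# Made-up correlations for three variables, for illustration only.
names = ["X1", "X2", "X3"]
corr = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
weights = correlation_edge_weights(names, corr)

# Pairs from lowest weight (highest correlation) to highest weight.
ranked = sorted(weights, key=weights.get)
print(ranked)
```

Highly correlated pairs get low weights, so a traversal that prefers low weight edges shows them first.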

SLIDE 8
Graph Structures- cont’d

  • Hypercube graph: for model selection, each node in G is a predictor subset, edge: add/drop predictor
  • Line graph: transform G to L(G), eg each node in G is a var, each node in L(G) is a var pair, edge is a 3-d transition

[Figure: "Cube for factorial experiment", nodes 000 001 010 011 100 101 110 111]

[Figure: graph G on A B C D and line graph L(G) on AB AC AD BC BD CD]
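
Both structures are easy to build directly. The sketch below uses hypothetical helper names and plain tuples for edges; it is not the representation used by the graph or igraph packages.

```python
from itertools import combinations, product

def hypercube_edges(n):
    """Edges of Q_n: nodes are 0/1 strings, joined when they differ in one position."""
    nodes = ["".join(bits) for bits in product("01", repeat=n)]
    return [
        (u, v)
        for u, v in combinations(nodes, 2)
        if sum(a != b for a, b in zip(u, v)) == 1
    ]

def line_graph_edges(edges):
    """Edges of L(G): nodes are the edges of G, joined when they share an endpoint."""
    return [
        (e, f)
        for e, f in combinations(edges, 2)
        if set(e) & set(f)
    ]

cube = hypercube_edges(3)              # nodes 000 ... 111
k4 = list(combinations("ABCD", 2))     # variable pairs AB, AC, AD, BC, BD, CD
transitions = line_graph_edges(k4)     # var pairs that share a variable
print(len(cube), len(transitions))
```

In L(G) the pairs AB and AC are joined because they share A, which corresponds to a transition through the 3-d space of A, B and C; AB and CD share nothing and are not joined.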

SLIDE 9

Algorithms- Complete graph

  • Closed eulerian path exists when each node has an even number of edges: ie for K2n+1
  • Hamiltonian decomposition of graph
      • into hamiltonian cycles: eulerian for K2n+1
      • into hamiltonian paths: approx eulerian for K2n
  • Classical algorithm: hpaths
  • WHam: weighted_hpaths: pick best for H1, best orientation and order for others.
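
A hamiltonian decomposition can be sketched with the well-known zigzag construction. This is an assumption about what a classical approach looks like, not the package's hpaths code, and the function names are hypothetical. K2n splits into n hamiltonian paths; joining each path through one extra hub node gives n hamiltonian cycles on K2n+1, and their concatenation is a closed eulerian path.

```python
def zigzag_paths(m):
    """Split K_m (m even, nodes 0..m-1) into m/2 edge-disjoint hamiltonian paths."""
    if m % 2:
        raise ValueError("m must be even")
    half = m // 2
    offsets = [0]
    for k in range(1, half):
        offsets += [k, -k]        # zigzag: +1, -1, +2, -2, ...
    offsets.append(half)
    return [[(start + d) % m for d in offsets] for start in range(half)]

def eulerian_from_paths(m):
    """Closed eulerian path on K_(m+1): join each zigzag path through hub node m."""
    tour = [m]
    for path in zigzag_paths(m):
        tour += path + [m]
    return tour

print(zigzag_paths(4))
print(eulerian_from_paths(6))     # closed eulerian path on K7, nodes 0..6
```

For K2n without the hub, concatenating the paths needs a few repeated edges, which is why the result is only approximately eulerian.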

[Figure: three panels, each on nodes 1 to 7]

SLIDE 10

Algorithms-Complete graph cont’d

  • Recursive algorithm: eseq:
  • Start with eulerian on Kn, append edges to get eulerian on Kn+2
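
One possible way to do the append step is sketched below. It is an assumption, not the package's eseq: the order in which the new edges are appended may differ, and the function names are hypothetical. Given a closed eulerian path on Kn that ends at node 0, the new nodes a = n and b = n+1 are threaded alternately through the old nodes.

```python
def extend_tour(tour, n):
    """Extend a closed eulerian path on K_n (n odd, nodes 0..n-1, ending at 0)
    to one on K_(n+2) by appending every edge that touches new nodes n and n+1."""
    a, b = n, n + 1
    extra = []
    for v in range(1, n):
        extra += [a if v % 2 else b, v]   # reach v from a (odd v) or from b (even v)
    extra += [a, b, 0]                    # last old node to a, then a-b, then b-0
    return tour + extra

def recursive_eulerian(n):
    """Closed eulerian path on K_n for odd n, grown from the single node 0."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    tour, size = [0], 1
    while size < n:
        tour = extend_tour(tour, size)
        size += 2
    return tour

print(recursive_eulerian(5))
```

Because each step only appends, the path on Kn is a prefix of the path on Kn+2, so edges among low-numbered nodes always come first.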

[Figure: graph on nodes 1 to 7]

SLIDE 11

Algorithms- general

  • Eulerian graph: connected, all nodes have even number of edges
  • Otherwise, add edges, pairing up odd nodes. Chinese postman does this in optimal way
  • Classical algorithms (Hierholzer, Fleury)
  • Our version GrEul (etour) follows weight increasing edges
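
The steps above can be sketched as follows. This is not GrEul or etour: it is a plain Hierholzer traversal that tries the lowest weight unused edge first, and the pairing of odd nodes is arbitrary rather than the optimal Chinese postman pairing. The function names and the infinite weight given to added edges are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def eulerize(edges, extra_weight=float("inf")):
    """Pair up odd-degree nodes with added edges so that every degree is even.
    Pairing follows sorted node order; it is not the optimal postman pairing."""
    degree = Counter()
    for u, v, _ in edges:
        degree[u] += 1
        degree[v] += 1
    odd = sorted(node for node in degree if degree[node] % 2)
    added = [(odd[i], odd[i + 1], extra_weight) for i in range(0, len(odd), 2)]
    return edges + added

def hierholzer(edges, start):
    """Closed eulerian path of a connected, even-degree multigraph,
    trying the lowest weight unused edge first at each node."""
    adj = defaultdict(list)
    for idx, (u, v, w) in enumerate(edges):
        adj[u].append((w, v, idx))
        adj[v].append((w, u, idx))
    for nbrs in adj.values():
        nbrs.sort(key=lambda item: item[0], reverse=True)  # pop() gives lowest weight
    used = [False] * len(edges)
    stack, tour = [start], []
    while stack:
        node = stack[-1]
        while adj[node] and used[adj[node][-1][2]]:
            adj[node].pop()               # edge already walked from its other end
        if adj[node]:
            _, nxt, idx = adj[node].pop()
            used[idx] = True
            stack.append(nxt)
        else:
            tour.append(stack.pop())      # no unused edges left here: emit the node
    return tour[::-1]

path_graph = [("A", "B", 0.2), ("B", "C", 0.5), ("C", "D", 0.1)]
even_graph = eulerize(path_graph)         # adds an edge between odd nodes A and D
print(hierholzer(even_graph, "A"))
```

The result is always a valid closed eulerian path, but the edges need not appear in strictly increasing weight order.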

[Figure: graph on diets N/N85 N/R40 N/R50 NP R/R50 lopro; shown values 0.0083, 0.0147, 0.3111]

SLIDE 12

Algorithms comparison

Complete graph, no weights

[Figure: three panels. Panel titles: Etour 9; Eseq 9; hpaths 9. Annotations: prefers low vertices; prefers low edges; 4 hamiltonians]

SLIDE 13

Algorithms: complete, weighted

Eurodist: 21 European cities

[Figure: three panels. Panel titles: "Algorithm eseq: Eurodist edge weights"; "Weighted etour on Eurodist"; "Weighted hamiltonians on Eurodist". Annotations: ignores weights; starts in Geneva; hamiltonian decomp, with increasing path lengths]

SLIDE 14

Example: model selection

Mammal sleep data: Y = log brain wt. Predictors: A = non dreaming sleep, B = dreaming sleep, C = log body wt, D = life span

[Figure: hypercube graph on models A B C D AB AC AD BC BD CD ABC ABD ACD BCD ABCD]

  • Hypercube graph represents possible moves in a stepwise regression algorithm

  • Graph Qn is hamiltonian, and eulerian for even n
  • Edge weights: change in SSE
  • Eulerian starting with full model
  • All models with C are good
  • Bar chart: change in SSE
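
The claim that Qn is hamiltonian can be illustrated with a reflected Gray code, which lists all predictor subsets so that neighbours differ by one predictor. This is a sketch with hypothetical function names, not the traversal used on the slide. The empty string stands for the model with no predictors, which the slide's listing does not show.

```python
def gray_code_models(predictors):
    """Hamiltonian cycle on Q_n: list all predictor subsets so that consecutive
    models (and the last and first) differ by adding or dropping one predictor."""
    n = len(predictors)
    models = []
    for i in range(2 ** n):
        code = i ^ (i >> 1)               # reflected binary Gray code
        models.append("".join(p for k, p in enumerate(predictors) if (code >> k) & 1))
    return models

def hypercube_is_eulerian(n):
    """Every node of Q_n has n edges, so Q_n is eulerian exactly when n is even."""
    return n > 0 and n % 2 == 0

models = gray_code_models("ABCD")
print(models)
print(hypercube_is_eulerian(4))
```

With four predictors every model has four neighbours, an even number, so a closed eulerian path covers every add/drop move exactly once.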

[Figure: "Sleep data: Model residuals". Models in eulerian order: ABCD BCD CD ACD ABCD ABC BC C AC ABC AB A AD ABD BD D AD ACD AC A D CD C B BD BCD BC B AB ABD ABCD]

SLIDE 15

More variables

Sleep data: 10 vars (nodes), 45 edges. Eulerian has length 50: each node of K10 has 9 edges, an odd number, so 5 edges are added to pair up the 10 odd nodes.

[Figure: "Eulerian on scagnostics: Outlying". Sequence shown: GP Bd L Br Bd SW PS TS SE PS TS D L P L PS Br P TS Bd TS PS P D D Br P D]

Using outlying index from scagnostics package for eulerian traversal. Zoom on first half of display.

SLIDE 16

More variables-cont’d

  • Reduce the graph
  • NN graph: eliminate edges with outlier index < .2
  • Reduces graph from 10 to 5 nodes, and 45 to 5 edges
  • Other nodes have no edges
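
The reduction amounts to thresholding the edge weights and dropping nodes left without edges. A minimal sketch follows; the function name is hypothetical and the scores are made up, not the scagnostics values for the sleep data.

```python
def reduce_graph(weights, threshold):
    """Keep edges whose index is at least `threshold`; return the surviving
    edges and the nodes that still have at least one edge."""
    kept = {pair: w for pair, w in weights.items() if w >= threshold}
    nodes = sorted({node for pair in kept for node in pair})
    return kept, nodes

# Made-up outlying scores for a few variable pairs, for illustration only.
scores = {
    ("GP", "L"): 0.45,
    ("L", "Bd"): 0.30,
    ("Bd", "SW"): 0.25,
    ("GP", "D"): 0.05,
    ("P", "TS"): 0.10,
}
kept, nodes = reduce_graph(scores, 0.2)
print(nodes)
```

The reduced graph is then traversed in the same way as the full one, giving a much shorter display.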

[Figure: "NN Eulerian on scagnostics: Outlying". Sequence shown: GP L Bd SW L Br GP. Reduced graph on nodes SW Bd Br L GP]

SLIDE 17

IN CONCLUSION..

  • PairViz package: relationship ordering for data visualisation
  • Current version: algorithms presented here
  • Thanks to graph, igraph
  • Work in progress: ordering dynamic visualisations via ggobi, with Adrian Waddell, UW
