CS-5630 / CS-6630 Visualization Graphs Alexander Lex - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CS-5630 / CS-6630 Visualization Graphs

Alexander Lex alex@sci.utah.edu

[xkcd]

slide-2
SLIDE 2

Applications of Graphs

Without graphs, there would be none of these:

slide-3
SLIDE 3

Michal 2000

slide-4
SLIDE 4

www.itechnews.net

slide-5
SLIDE 5

Graph Visualization Case Study

slide-6
SLIDE 6

Graph Theory Fundamentals

Network, Tree, Bipartite Graph, Hypergraph

slide-7
SLIDE 7

Königsberg Bridge Problem (1736)

Want to make $1 million? Find an O(n^k) algorithm to find Hamiltonian Paths (path that visits each vertex exactly once) - example of P vs. NP problem.

slide-8
SLIDE 8

Graph Terms (1)

A graph G(V,E) consists of a set of vertices V (also called nodes) and a set of edges E connecting these vertices.
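
As a minimal sketch of this definition, V and E can be written down directly as Python sets and turned into an adjacency list. The vertex names and the helper `adjacency` are illustrative assumptions, and the sketch assumes an undirected graph without loops.

```python
# Hypothetical example graph; the vertex names are illustrative only.
V = {"A", "B", "C", "D"}
# Each undirected edge is an unordered pair of two distinct vertices.
E = {frozenset(("A", "B")), frozenset(("A", "C")), frozenset(("C", "D"))}


def adjacency(vertices, edges):
    """Build an adjacency list (vertex -> set of neighbors) from V and E."""
    adj = {v: set() for v in vertices}
    for edge in edges:
        u, w = tuple(edge)  # assumes no loops, so every edge has two endpoints
        adj[u].add(w)
        adj[w].add(u)
    return adj


adj = adjacency(V, E)
print(sorted(adj["A"]))  # neighbors of A
```

The adjacency list is only one possible encoding; an edge list or a matrix carries the same information.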

slide-9
SLIDE 9

Graph Terms (2)

A simple graph G(V,E) is a graph which contains no multi-edges and no loops.

Not a simple graph!
 → A general graph

slide-10
SLIDE 10

Graph Terms (3)

A directed graph (digraph) is a graph that distinguishes between the edges (A, B) and (B, A).
A hypergraph is a graph with edges connecting any number of vertices.

Hypergraph Example

slide-11
SLIDE 11

Graph Terms (4)

Independent Set
 G contains no edges
Clique
 G contains all possible edges

slide-12
SLIDE 12

Graph Terms (5)

Path
 G contains only edges that can be consecutively traversed
Tree
 G contains no cycles
Network
 G contains cycles

slide-13
SLIDE 13

Graph Terms (6)

Unconnected graph
 An edge traversal starting from a given vertex cannot reach every other vertex.
Articulation point
 A vertex which, if deleted from the graph, would break up the graph into multiple sub-graphs.

[Figure: unconnected graph; articulation point (red)]

slide-14
SLIDE 14

Graph Terms (7)

Biconnected graph
 A graph without articulation points.
Bipartite graph
 The vertices can be partitioned into two independent sets.
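
One common way to test the bipartite property is to try to 2-color the graph with a breadth-first traversal: if two adjacent vertices ever receive the same color, no partition into two independent sets exists. The sketch below assumes an undirected graph stored as an adjacency list; the function name `is_bipartite` and the two sample graphs are illustrative.

```python
from collections import deque


def is_bipartite(adj):
    """Return True if the vertices can be split into two independent sets."""
    color = {}
    for start in adj:  # handle every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]  # neighbor goes into the other set
                    queue.append(w)
                elif color[w] == color[u]:
                    return False  # an edge inside one set: not bipartite
    return True


# Hypothetical sample graphs: a 4-cycle and a triangle.
square = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(is_bipartite(square), is_bipartite(triangle))
```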

slide-15
SLIDE 15

Tree

A graph with no cycles, or:
A collection of nodes that contains a root node and 0 to n subtrees
Subtrees are connected to the root by an edge

root

T1 T2 T3 Tn …

slide-16
SLIDE 16

Ordered Tree

[Figure: two orderings of a tree with nodes A to I]

slide-17
SLIDE 17

Binary Trees

A binary tree either contains no nodes, or is comprised of three disjoint sets of nodes:
 a root node,
 a binary tree called its left subtree, and
 a binary tree called its right subtree

[Figure: root with left subtree (LT) and right subtree (RT)]

slide-18
SLIDE 18

Different Kinds of Graphs

Network, Tree, Bipartite Graph, Hypergraph

Over 1000 different graph classes

A. Brandstädt et al. 1999

slide-19
SLIDE 19

Graph Measures

Node degree deg(x)
 The number of edges incident to this node. For directed graphs, indeg/outdeg are considered separately.
Diameter of graph G
 The longest shortest path within G.
PageRank
 Counts the number and quality of links

[Wikipedia]
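
Degree and diameter can be computed with a few lines of code. The sketch below assumes a connected, unweighted, undirected graph stored as an adjacency list, and measures the diameter by running a breadth-first search from every vertex; the function names and the sample path graph are illustrative.

```python
from collections import deque


def degree(adj, x):
    """Number of edges incident to x (undirected, no loops assumed)."""
    return len(adj[x])


def bfs_distances(adj, source):
    """Hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


# Hypothetical sample: the path A - B - C - D.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(degree(path, "B"), diameter(path))
```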

slide-20
SLIDE 20

Graph Algorithms (1)

Traversal: Breadth First Search, Depth First Search

BFS
  • generates neighborhoods
  • hierarchy gets wide rather than deep
  • solves single-source shortest paths (SSSP) in unweighted graphs

DFS
  • classical way-finding/back-tracking strategy
  • tree serialization
  • topological ordering
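
The two traversals differ only in which discovered vertex is expanded next: BFS uses a queue and visits a node's whole neighborhood first, while DFS follows one branch to the end before backtracking. The following is a minimal sketch on a hypothetical tree stored as an adjacency list; the function names are illustrative.

```python
from collections import deque


def bfs_order(adj, source):
    """Visit vertices level by level (neighborhood by neighborhood)."""
    visited = {source}
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order


def dfs_order(adj, source, visited=None):
    """Follow each branch to its end before backtracking (pre-order)."""
    if visited is None:
        visited = set()
    visited.add(source)
    order = [source]
    for w in adj[source]:
        if w not in visited:
            order.extend(dfs_order(adj, w, visited))
    return order


# Hypothetical sample tree rooted at A.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}
print(bfs_order(tree, "A"))
print(dfs_order(tree, "A"))
```

The recursive DFS is fine for small examples; for very deep graphs an explicit stack avoids Python's recursion limit.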

slide-21
SLIDE 21

Hard Graph Algorithms (NP-Complete)

  • Longest path
  • Largest clique
  • Maximum independent set (set of vertices in a graph, no two of which are adjacent)
  • Maximum cut (separation of vertices in two sets that cuts most edges)
  • Hamiltonian path/cycle (path that visits all vertices once)
  • Coloring / chromatic number (colors for vertices where no adjacent vertices have the same color)
  • Minimum degree spanning tree

slide-22
SLIDE 22

Graph and Tree Visualization

slide-23
SLIDE 23

Setting the Stage

GRAPH DATA, GOAL / TASK, Visualization, Interaction, GRAPHICAL REPRESENTATION

How to decide which representation to use for which type of graph in order to achieve which kind of goal?

slide-24
SLIDE 24

Different Kinds of Tasks/Goals

Two principal types of tasks: attribute-based (ABT) and topology-based (TBT)
 Localize – find a single or multiple nodes/edges that fulfill a given property

  • ABT: Find the edge(s) with the maximum edge weight.
  • TBT: Find all adjacent nodes of a given node.


Quantify – count or estimate a numerical property of the graph

  • ABT: Give the number of all nodes.
  • TBT: Give the indegree (the number of incoming edges) of a node.


Sort/Order – enumerate the nodes/edges according to a given criterion

  • ABT: Sort all edges according to their weight.
  • TBT: Traverse the graph starting from a given node.

List adapted from Schulz 2010
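
To make the task types concrete, here is a small sketch that answers one example of each on a hypothetical weighted, directed edge list. The data and helper names are assumptions for illustration only.

```python
# Hypothetical directed, weighted edges: (source, target, weight).
edges = [("A", "B", 0.4), ("A", "C", 0.9), ("B", "C", 0.9), ("C", "D", 0.1)]

# Localize, ABT: find the edge(s) with the maximum edge weight.
max_weight = max(w for _, _, w in edges)
heaviest = [(u, v) for u, v, w in edges if w == max_weight]


# Localize, TBT: find all adjacent nodes of a given node.
def neighbors(node):
    out_nbrs = {v for u, v, _ in edges if u == node}
    in_nbrs = {u for u, v, _ in edges if v == node}
    return sorted(out_nbrs | in_nbrs)


# Quantify, ABT: give the number of all nodes.
nodes = sorted({n for u, v, _ in edges for n in (u, v)})


# Quantify, TBT: give the indegree of a node.
def indegree(node):
    return sum(1 for _, v, _ in edges if v == node)


# Sort/Order, ABT: sort all edges according to their weight.
by_weight = sorted(edges, key=lambda e: e[2])

print(heaviest, neighbors("C"), len(nodes), indegree("C"), by_weight[0])
```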

slide-25
SLIDE 25

Three Types of Graph Representations

Matrix, Explicit (Node-Link), Implicit

slide-26
SLIDE 26

Explicit Graph Representations

Node-link diagrams: vertex = point, edge = line/arc

A C B D E

Free Styled Fixed

HJ Schulz 2006

slide-27
SLIDE 27

Criteria for Good Node-Link Layout

  • Minimized edge crossings
  • Minimized distance of neighboring nodes
  • Minimized drawing area
  • Uniform edge length
  • Minimized edge bends
  • Maximized angular distance between different edges
  • Aspect ratio about 1 (not too long and not too wide)
  • Symmetry: similar graph structures should look similar

List adapted from Battista et al. 1999

slide-28
SLIDE 28

Conflicting Criteria

Minimum number of edge crossings vs. uniform edge length
Space utilization vs. symmetry

Schulz 2004

slide-29
SLIDE 29

Force Directed Layouts

Physics model:
 edges = springs,
 vertices = repulsive magnets
In practice: damping
Computationally expensive: O(n^3)
Limit (interactive): ~1000 nodes

Spring coil (pulling nodes together)
Expander (pushing nodes apart)
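
The physics model can be sketched in a few lines. The version below is not the algorithm of any particular tool. It assumes Fruchterman-Reingold-style force terms (repulsion k^2/d between every pair of vertices, attraction d^2/k along every edge), a fixed damping factor, and a cap on the step length. All parameter names and defaults are illustrative. Each iteration costs O(n^2) for the pairwise repulsion, which is where most of the computational expense comes from.

```python
import math
import random


def force_layout(nodes, edges, iterations=300, k=1.0,
                 damping=0.05, max_step=0.1, seed=0):
    """Return {node: [x, y]} after a simple damped spring simulation."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for v in nodes}
    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between all pairs of vertices (the "magnets").
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                force[u][0] += dx / d * f
                force[u][1] += dy / d * f
                force[v][0] -= dx / d * f
                force[v][1] -= dy / d * f
        # Attraction along edges (the "springs").
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            force[u][0] -= dx / d * f
            force[u][1] -= dy / d * f
            force[v][0] += dx / d * f
            force[v][1] += dy / d * f
        # Damped update with a capped step length to keep the system stable.
        for v in nodes:
            sx = force[v][0] * damping
            sy = force[v][1] * damping
            length = math.hypot(sx, sy)
            if length > max_step:
                sx *= max_step / length
                sy *= max_step / length
            pos[v][0] += sx
            pos[v][1] += sy
    return pos


# Hypothetical sample: the path A - B - C.
layout = force_layout(["A", "B", "C"], [("A", "B"), ("B", "C")])
for name, (x, y) in layout.items():
    print(name, round(x, 2), round(y, 2))
```

With these force terms, two vertices joined by a single edge settle at distance k, so k acts as the preferred edge length.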

slide-30
SLIDE 30

[van Ham et al. 2009]

Giant Hairball

slide-31
SLIDE 31

Address Computational Scalability: Multilevel Approaches

[Schulz 2004]

Legend: real vertex, virtual vertex, internal spring, external spring, virtual spring, Metanode A, Metanode B, Metanode C

slide-32
SLIDE 32

Abstraction/Aggregation

750 nodes, 30k nodes, 18 nodes, 90 nodes

cytoscape.org

slide-33
SLIDE 33

Collapsible Force Layout

Supernodes: aggregates of nodes
Manual or algorithmic clustering

slide-34
SLIDE 34

HOLA: Human-like Orthogonal Layout

Study how humans lay out a graph
Try to emulate the layout

Left: human, middle: conventional algorithm, right: new algorithm

[Kieffer et al, InfoVis 2015]

slide-35
SLIDE 35
slide-36
SLIDE 36

Styled / Restricted Layouts

Circular Layout
Node ordering
Edge Clutter

  • ca. 3% of all possible edges
  • ca. 6.3% of all possible edges
slide-37
SLIDE 37

Example: MizBee

[Meyer et al. 2009]

slide-38
SLIDE 38

Reduce Clutter: Edge Bundling

Holten et al. 2006

slide-39
SLIDE 39

Hierarchical Edge Bundling

Bundling Strength

Holten et al. 2006

slide-40
SLIDE 40

Fixed Layouts

Can’t vary position of nodes
Edge routing important

slide-41
SLIDE 41

Bundling Strength

Michael Bostock

mbostock.github.com/d3/talk/20111116/bundle.html

slide-42
SLIDE 42

Explicit Tree Visualization

Reingold–Tilford layout

http://billmill.org/pymag-trees/

slide-43
SLIDE 43

Tree Interaction, Tree Comparison

slide-44
SLIDE 44

Multivariate Graphs

slide-45
SLIDE 45

Node Attributes

Coloring
Glyphs

=> Limited in scalability
slide-46
SLIDE 46

Small Multiples

Cerebral [Barsky, 2008]
Each dimension in its own window
slide-47
SLIDE 47

Data-driven node positioning

GraphDice
Nodes are laid out according to attribute values

[Bezerianos et al, 2010]

slide-48
SLIDE 48

Path Extraction & Multiple Views

slide-49
SLIDE 49

Experimental Data and Pathways

Pathways cannot account for variation found in real-world data
Branches can be (in)activated due to mutation, changed gene expression, modulation due to drug treatment, etc.

[Partl, BioVis ‘12]

slide-50
SLIDE 50

Many Node Attributes

[Figure: pathway with nodes A to G]

Node   Sample 1    Sample 2   Sample 3   …
A      0.55        0.12       0.33       …
B      0.95        0.42       0.65       …
C      0.83        0.16       0.38       …
…      …           …          …

Node   Sample 1    Sample 2   Sample 3   …
A      low         normal     high       …
B      low         low        very low   …
C      very high   high       normal     …
…      …           …          …

How to visualize experimental data on pathways?

slide-51
SLIDE 51

Good Old Color Coding

[Figure: pathway with nodes A to F, each node color-coded by its numeric values]

[Lindroos2002]

slide-52
SLIDE 52

Challenge: Data Scale & Heterogeneity

Large number of experiments
 Large datasets have more than 500 experiments
Multiple groups/conditions
Different types of data

slide-53
SLIDE 53

Challenge: Supporting Multiple Tasks

Two central tasks:
  • Explore topology of pathway
  • Explore the attributes of the nodes (experimental data)

Need to support both!

slide-54
SLIDE 54

Concept

[Figure: Pathway View (nodes A to F) next to the enRoute View, with columns for Group 1 Dataset 1, Group 2 Dataset 1, Group 1 Dataset 2, and a Non-Genetic Dataset]

slide-55
SLIDE 55

enRoute

slide-56
SLIDE 56

Video

slide-57
SLIDE 57
slide-58
SLIDE 58

Case Study: CCLE Data


slide-59
SLIDE 59
slide-60
SLIDE 60

Design Critique

slide-61
SLIDE 61

Connected China

http://china.fathom.info/
https://goo.gl/YXkWYX

slide-62
SLIDE 62

Matrix Representations

slide-63
SLIDE 63

Matrix Representations

Instead of node link diagram, use adjacency matrix

[Figure: node-link diagram of nodes A to E next to its adjacency matrix with rows and columns A to E]
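
As a rough sketch, an adjacency matrix can be built from an edge list as follows. The edges below are hypothetical (the ones drawn on the slide are not recoverable from the text), and `adjacency_matrix` is an illustrative helper name.

```python
def adjacency_matrix(nodes, edges, directed=False):
    """Return an n x n 0/1 matrix; cell [i][j] is 1 if an edge i -> j exists."""
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            matrix[index[v]][index[u]] = 1  # undirected: mirror the entry
    return matrix


nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("D", "E")]  # hypothetical edges
matrix = adjacency_matrix(nodes, edges)

print(" ", *nodes)
for label, row in zip(nodes, matrix):
    print(label, *row)
```

Reading along a row or column lists a node's neighbors directly, while following a path means jumping between rows.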

slide-64
SLIDE 64

Matrix Representations

Examples:

HJ Schulz 2007

slide-65
SLIDE 65

Matrix Representations

Well suited for neighborhood-related TBTs
Not suited for path-related TBTs

van Ham et al. 2009, Shen et al. 2007

slide-66
SLIDE 66

McGuffin 2012

slide-67
SLIDE 67

Order Critical!

slide-68
SLIDE 68

Matrix Representations

Pros:

  • can represent all graph classes except for hypergraphs
  • puts focus on the edge set, not so much on the node set
  • simple grid -> no elaborate layout or rendering needed
  • well suited for ABT on edges via coloring of the matrix cells
  • well suited for neighborhood-related TBTs via traversing rows/columns

Cons:

  • quadratic screen space requirement (any possible edge takes up space)
  • not suited for path-related TBTs

slide-69
SLIDE 69

Special Case: Genealogy

slide-70
SLIDE 70

Hybrid Explicit/Matrix

NodeTrix
 [Henry et al. 2007]

slide-71
SLIDE 71

Implicit Layouts

Matrix, Explicit (Node-Link), Implicit

slide-72
SLIDE 72

Explicit vs. Implicit Tree Vis

Schulz 2011

slide-73
SLIDE 73

Tree Maps

Johnson and Shneiderman 1991

slide-74
SLIDE 74

Zoomable Treemap

slide-75
SLIDE 75

Example: Interactive TreeMap of a Million Items

Fekete et al. 2002

slide-76
SLIDE 76

Sunburst: Radial Layout

[Sunburst by John Stasko, Implementation in Caleydo by Christian Partl]

slide-77
SLIDE 77

Others

Icicle Plot

slide-78
SLIDE 78

Implicit Representations

Pros:

  • space-efficient because of the lack of explicitly drawn edges: scale well up to very large graphs
  • in most cases well suited for ABTs on the node set
  • depending on the spatial encoding also useful for TBTs

Cons:

  • can only represent trees
  • since the node positions are used to represent edges, they can no longer be freely arranged (e.g., to reflect geographical positions)
  • useless to pursue any task on the edges
  • spatial relations such as overlap or inclusion lead to occlusion
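
To illustrate how an implicit layout encodes parent-child edges through position alone, here is a minimal sketch of a slice-and-dice treemap, the basic scheme in which a node's rectangle is split among its children in proportion to their size, alternating the split direction at each level. The tree data, the dictionary keys (`name`, `value`, `children`), and the function names are assumptions for illustration.

```python
def size(node):
    """Size of a subtree: a leaf's own value, or the sum over its children."""
    children = node.get("children")
    if not children:
        return node["value"]
    return sum(size(child) for child in children)


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return a list of (name, x, y, width, height) rectangles."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    if children:
        total = size(node)
        offset = 0.0  # fraction of the parent rectangle already used
        for child in children:
            share = size(child) / total
            if depth % 2 == 0:  # even depth: split along the x axis
                slice_and_dice(child, x + offset * w, y, share * w, h,
                               depth + 1, out)
            else:               # odd depth: split along the y axis
                slice_and_dice(child, x, y + offset * h, w, share * h,
                               depth + 1, out)
            offset += share
    return out


# Hypothetical sample hierarchy.
tree = {"name": "root", "children": [
    {"name": "a", "value": 1},
    {"name": "b", "children": [
        {"name": "b1", "value": 2},
        {"name": "b2", "value": 1},
    ]},
]}
rects = slice_and_dice(tree, 0, 0, 4, 2)
for rect in rects:
    print(rect)
```

Containment is the only cue for the hierarchy here: a child is whatever lies inside its parent's rectangle, which is why the nodes cannot be repositioned freely.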

slide-79
SLIDE 79

Tree Visualization Reference

slide-80
SLIDE 80

Summary

Munzner 2014

slide-81
SLIDE 81

Graph Tools & Applications

slide-82
SLIDE 82

Gephi

http://gephi.org

slide-83
SLIDE 83

Cytoscape

Open source platform for complex network analysis

http://www.cytoscape.org/

slide-84
SLIDE 84

Cytoscape Web


http://cytoscapeweb.cytoscape.org/

slide-85
SLIDE 85

NetworkX


https://networkx.github.io/