Part I: Introductory Materials Introduction to Graph Theory Dr. - - PowerPoint PPT Presentation




SLIDE 1

Part I: Introductory Materials

Introduction to Graph Theory

  • Dr. Nagiza F. Samatova

Department of Computer Science North Carolina State University and Computer Science and Mathematics Division Oak Ridge National Laboratory

SLIDE 2

Graphs

Figure: a graph with 7 nodes and 16 edges, with its edges and nodes (vertices) labeled.

$$G = (V, E), \qquad V = \{v_1, v_2, \ldots, v_n\}, \qquad E = \{\, e_k = (v_i, v_j) \mid v_i, v_j \in V,\ k = 1, \ldots, m \,\}$$

Undirected: $(v_i, v_j) = (v_j, v_i)$

Directed: $(v_i, v_j) \neq (v_j, v_i)$
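The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `make_graph` and the three-vertex example are hypothetical and do not come from the slides. Undirected edges are stored as `frozenset` pairs so that (v_i, v_j) and (v_j, v_i) compare equal, while directed edges stay ordered tuples.

```python
def make_graph(vertices, edges, directed=False):
    """Return (V, E) for G = (V, E). Hypothetical helper, for illustration only."""
    V = set(vertices)
    if directed:
        # Ordered pairs: (u, v) and (v, u) are different edges.
        E = {(u, v) for u, v in edges}
    else:
        # Unordered pairs: (u, v) and (v, u) collapse into one edge.
        E = {frozenset((u, v)) for u, v in edges}
    return V, E


pairs = [(1, 2), (2, 1), (2, 3)]
V, E_undirected = make_graph([1, 2, 3], pairs)
_, E_directed = make_graph([1, 2, 3], pairs, directed=True)
print(len(E_undirected))  # 2
print(len(E_directed))    # 3
```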

SLIDE 3

Types of Graphs

  • Undirected vs. Directed
  • Attributed/Labeled (e.g., vertex, edge) vs. Unlabeled
  • Weighted vs. Unweighted
  • General vs. Bipartite (Multipartite)
  • Trees (no cycles)
  • Hypergraphs
  • Simple vs. w/ loops vs. w/ multi-edges
SLIDE 4

Labeled Graphs and Induced Subgraphs

Figure: Labeled graph w/ loops

Bold: A subgraph induced by vertices b, c and d
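As a rough sketch of what "induced" means: the induced subgraph keeps exactly those edges whose two endpoints both lie in the chosen vertex set, loops included. The edge list below is a made-up example, not the graph drawn on the slide, and `induced_subgraph` is a hypothetical helper name.

```python
def induced_subgraph(edges, keep):
    """Keep every edge (loops included) whose endpoints are both in `keep`."""
    keep = set(keep)
    return {(u, v) for (u, v) in edges if u in keep and v in keep}


# Hypothetical labeled graph with two loops, (a, a) and (d, d).
edges = {("a", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("a", "a"), ("d", "d")}
sub = induced_subgraph(edges, {"b", "c", "d"})
print(sorted(sub))
```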

SLIDE 5

Graph Isomorphism

Which graphs are isomorphic? (A) (B) (C)

C
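For very small graphs, isomorphism can be checked by brute force: try every bijection between the two vertex sets and see whether it maps one edge set exactly onto the other. The sketch below assumes simple undirected graphs without loops; the function name and the example graphs are illustrative, and the running time is factorial in the number of vertices, so it is only useful for toy inputs.

```python
from itertools import permutations


def are_isomorphic(vertices_a, edges_a, vertices_b, edges_b):
    """Brute-force isomorphism test for simple undirected graphs."""
    va, vb = list(vertices_a), list(vertices_b)
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    if len(va) != len(vb) or len(ea) != len(eb):
        return False
    for perm in permutations(vb):
        mapping = dict(zip(va, perm))
        # Relabel the edges of graph A and compare with the edges of graph B.
        if {frozenset(mapping[x] for x in e) for e in ea} == eb:
            return True
    return False


star = [(1, 2), (1, 3), (1, 4)]
path = [("a", "b"), ("b", "c"), ("c", "d")]
relabeled_path = [(1, 3), (3, 2), (2, 4)]
print(are_isomorphic([1, 2, 3, 4], star, "abcd", path))            # False
print(are_isomorphic([1, 2, 3, 4], relabeled_path, "abcd", path))  # True
```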

SLIDE 6

Graph Automorphism

Which graphs are automorphic? An automorphism is an isomorphism from a graph onto itself; in a labeled graph it must also preserve the labels. (A) (B) (C)

B

SLIDE 7

Vertex degree, in-degree, out-degree

Figure: a directed edge drawn from its tail (t) to its head (h).

  • In-degree of the vertex is the number of in-coming edges
  • Out-degree of the vertex is the number of out-going edges
  • Degree of the vertex is the number of edges (both in- & out-degree)
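The three degree definitions can be computed directly from a list of directed edges. In this minimal sketch each edge is assumed to be a `(tail, head)` pair; the function name and the example edges are hypothetical.

```python
from collections import Counter


def degrees(directed_edges):
    """Return (in_degree, out_degree, degree) counters for (tail, head) edges."""
    out_deg = Counter(tail for tail, head in directed_edges)
    in_deg = Counter(head for tail, head in directed_edges)
    total = in_deg + out_deg  # degree = in-degree + out-degree
    return in_deg, out_deg, total


in_deg, out_deg, deg = degrees([("a", "b"), ("a", "c"), ("b", "c")])
print(in_deg["c"], out_deg["a"], deg["b"])  # 2 2 2
```

A vertex that never appears as a head (or tail) simply has count 0 in the corresponding `Counter`.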
SLIDE 8

Graph Representation and Formats

  • Adjacency Matrix (vertex vs. vertex)
  • Incidence Matrix (vertex vs. edge)
  • Sparse vs. Dense Matrices
  • DIMACS file format
  • In R: igraph object
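To make the "vertex vs. edge" view from the list above concrete, here is a small sketch of an incidence matrix for an undirected graph: one row per vertex, one column per edge, with a 1 wherever the vertex is an endpoint of the edge. The helper name and the three-vertex path are assumptions made for illustration; the slides themselves point to R's igraph for real work.

```python
def incidence_matrix(vertices, edges):
    """Rows are vertices, columns are edges; entry is 1 if the vertex touches the edge."""
    row = {v: i for i, v in enumerate(vertices)}
    M = [[0] * len(edges) for _ in vertices]
    for j, (u, v) in enumerate(edges):
        M[row[u]][j] = 1
        M[row[v]][j] = 1
    return M


M = incidence_matrix(["a", "b", "c"], [("a", "b"), ("b", "c")])
for r in M:
    print(r)
# [1, 0]
# [1, 1]
# [0, 1]
```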
SLIDE 9

Adjacency Matrix Representation

Figure: two drawings of a graph on the eight vertices A(1), A(2), A(3), A(4), B(5), B(6), B(7), B(8), each paired with an 8×8 adjacency matrix whose rows and columns are indexed by those vertices. Every row contains four 1-entries (individual entries not reproduced here).

Representation is NOT unique. Algorithms can be order-sensitive.

Src: “Introduction to Data Mining” by Kumar et al
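A small sketch of why the representation is not unique: the same undirected graph yields different adjacency matrices under different vertex orderings. The function name and the three-vertex path are illustrative assumptions, not the eight-vertex example from the figure.

```python
def adjacency_matrix(order, edges):
    """Build a symmetric 0/1 adjacency matrix using the given vertex order."""
    index = {v: i for i, v in enumerate(order)}
    n = len(order)
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[index[u]][index[v]] = 1
        A[index[v]][index[u]] = 1
    return A


edges = [("a", "b"), ("b", "c")]
A1 = adjacency_matrix(["a", "b", "c"], edges)
A2 = adjacency_matrix(["b", "a", "c"], edges)
print(A1)  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(A2)  # [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
```

Any algorithm that reads the matrix row by row may therefore behave differently on `A1` and `A2`, even though both describe the same graph.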

SLIDE 10

Families of Graphs

  • Cliques
  • Path and simple path
  • Cycle
  • Tree
  • Connected graphs

Read the book chapter for definitions and examples.

SLIDE 11

Complete Graph, or Clique

Each pair of vertices is connected by an edge.

SLIDE 12

The CLIQUE Problem

Figure: Maximum Clique of Size 5

  • Clique: a complete subgraph
  • Maximal Clique: a clique that cannot be enlarged by adding any more vertices
  • Maximum Clique: the largest maximal clique in the graph

$$\mathrm{CLIQUE} = \{\, \langle G, k \rangle \mid G \text{ has a clique of size } k \,\}$$
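The three definitions above can be told apart on a toy graph. The sketch below assumes the graph is given as a dictionary mapping each vertex to its set of neighbors; the function names and the five-vertex example are hypothetical, and `maximum_clique` is a plain exhaustive search that is only practical for very small graphs.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """True if every pair of `nodes` is joined by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def is_maximal_clique(adj, nodes):
    """True if `nodes` is a clique and no outside vertex is adjacent to all of it."""
    nodes = set(nodes)
    if not is_clique(adj, nodes):
        return False
    return not any(nodes <= adj[w] for w in adj if w not in nodes)


def maximum_clique(adj):
    """Exhaustive search from the largest candidate size downwards."""
    for k in range(len(adj), 0, -1):
        for cand in combinations(adj, k):
            if is_clique(adj, cand):
                return set(cand)
    return set()


# Triangle 1-2-3 with a tail 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(is_clique(adj, {1, 2}), is_maximal_clique(adj, {1, 2}))  # True False
print(is_maximal_clique(adj, {3, 4}))                          # True
print(maximum_clique(adj))                                     # {1, 2, 3}
```

Here {3, 4} is maximal but not maximum, while {1, 2} is a clique that is neither.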

SLIDE 13

Does this graph contain a 4-clique?

Indeed it does!

But, if it had not, what evidence would have been needed?
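One way to read this question: a "yes" answer has a short piece of evidence (the k vertices themselves, which can be checked pair by pair), whereas the obvious evidence for a "no" answer is a check of every k-subset of vertices. The sketch below contrasts the two under that reading; the function names and example graphs are hypothetical, and the graph is assumed to be a dictionary from vertex to neighbor set.

```python
from itertools import combinations
from math import comb


def verify_clique_certificate(adj, certificate, k):
    """Check a claimed k-clique: k distinct vertices, all pairs adjacent."""
    nodes = set(certificate)
    return len(nodes) == k and all(v in adj[u] for u, v in combinations(nodes, 2))


def has_no_k_clique_exhaustive(adj, k):
    """Return (no_clique_found, number_of_subsets_checked)."""
    checked = 0
    for cand in combinations(adj, k):
        checked += 1
        if all(v in adj[u] for u, v in combinations(cand, 2)):
            return False, checked
    return True, checked


k4 = {i: set(range(4)) - {i} for i in range(4)}                    # complete graph on 4 vertices
c5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}        # 5-cycle, no triangle
print(verify_clique_certificate(k4, [0, 1, 2, 3], 4))  # True
print(has_no_k_clique_exhaustive(c5, 3), comb(5, 3))   # (True, 10) 10
```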

SLIDE 14

Problem: Decision, Optimization or Search

Formulate each version for the CLIQUE problem (self-reduction):

  • Decision: “Yes”/“No” answer for a given parameter k
  • Optimization: max/min
  • Search: actual solution
  • Enumeration: all solutions

  • Which problem is harder to solve?
  • If we solve the Decision problem, can we use it for the others?
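The self-reduction idea can be sketched as follows: given only a decision procedure for CLIQUE, the optimization version asks it for increasing k, and the search version deletes vertices one at a time, keeping a deletion whenever a clique of the optimal size still exists. In this sketch `has_clique` is a brute-force stand-in for the decision oracle, and all names and the example graph are assumptions made for illustration.

```python
from itertools import combinations


def has_clique(adj, k):
    """Decision version: does the graph contain a clique of size k? (brute force)"""
    if k <= 0:
        return True
    return any(
        all(v in adj[u] for u, v in combinations(cand, 2))
        for cand in combinations(adj, k)
    )


def max_clique_size(adj):
    """Optimization version, using only calls to the decision procedure."""
    k = 0
    while k < len(adj) and has_clique(adj, k + 1):
        k += 1
    return k


def find_max_clique(adj):
    """Search version: delete every vertex whose removal keeps a k-clique alive."""
    k = max_clique_size(adj)
    adj = {u: set(nb) for u, nb in adj.items()}
    for v in list(adj):
        reduced = {u: nb - {v} for u, nb in adj.items() if u != v}
        if has_clique(reduced, k):
            adj = reduced
    return set(adj)


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(max_clique_size(adj))  # 3
print(find_max_clique(adj))  # {1, 2, 3}
```

The search step makes at most one oracle call per vertex, so a polynomial-time decision procedure would give polynomial-time optimization and search as well.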

SLIDE 15

Refresher: Class P and Class NP

Definition: P (NP) is the class of languages/problems that are decidable in polynomial time on a (non-)deterministic single-tape Turing machine.

$$P = \bigcup_k \mathrm{DTIME}(n^k), \qquad NP = \bigcup_k \mathrm{NTIME}(n^k)$$

What does NP stand for? Not “non-polynomial” but “non-deterministic polynomial”; equivalently, NP is the class of polynomially verifiable problems.

SLIDE 16

P vs. NP

The Classic Complexity Theory View:

  • P: “easy”
  • NP: “hard”
  • …, $\Sigma_2^P$, …, PSPACE: “forget about it”

“About ten years ago some computer scientists came by and said they heard we have some really cool problems. They showed that the problems are NP-complete and went away!”

SLIDE 17

Classical Graph Theory Problems

CSC505: Algorithms, CSC707: Complexity Theory, CSC5??: Graph Theory

  • Longest Path
  • Maximum Clique
  • Minimum Vertex Cover
  • Hamiltonian Path/Cycle
  • Traveling Salesman (TSP)
  • Maximum Independent Set
  • Minimum Dominating Set
  • Graph/Subgraph Isomorphism
  • Maximum Common Subgraph

NP-hard Problems

SLIDE 18

Graph Mining Problems

CSC 422/522 and Our Book

  • Clustering + Maximal Clique Enumeration
  • Classification
  • Association Rule Mining + Frequent Subgraph Mining
  • Anomaly Detection
  • Similarity/Dissimilarity/Distance Measures
  • Graph-based Dimension Reduction
  • Link Analysis

Many graph mining problems have to deal with classical graph problems as part of their data mining pipelines.
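As an example of a classical problem sitting inside a mining task, maximal clique enumeration (listed above next to clustering) is often done with the Bron–Kerbosch algorithm. The slides do not name a specific method, so the basic, non-pivoting version below is a swapped-in illustration; the function name and the example graph are assumptions, and the graph is a dictionary from vertex to neighbor set.

```python
def bron_kerbosch(adj, r=None, p=None, x=None):
    """Yield every maximal clique of an undirected graph (basic Bron-Kerbosch).

    r: current clique, p: candidates that can extend it, x: already-explored vertices.
    """
    r = set() if r is None else r
    p = set(adj) if p is None else p
    x = set() if x is None else x
    if not p and not x:
        yield set(r)  # nothing can extend r, so it is maximal
        return
    for v in list(p):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p = p - {v}
        x = x | {v}


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
cliques = sorted(sorted(c) for c in bron_kerbosch(adj))
print(cliques)  # [[1, 2, 3], [3, 4], [4, 5]]
```

The number of maximal cliques can grow exponentially with the number of vertices, so enumeration is expensive in the worst case regardless of the algorithm used.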

SLIDE 19

Dealing with Computational Intractability

  • Exact Algorithms:
      – Small graph problems
      – Small parameters to graph problems
      – Special classes of graphs (e.g., bounded tree-width)
  • Approximation Polynomial-Time Algorithms ($O(n^c)$)
      – Guaranteed error-bar on the solution
  • Heuristic Polynomial-Time Algorithms (our focus)
      – No guarantee on the quality of the solution
      – Low degree polynomial solutions
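To illustrate the heuristic option in the list above, here is a simple greedy sketch for the clique problem: scan vertices from highest to lowest degree and add a vertex whenever it is adjacent to everything chosen so far. It runs in low-degree polynomial time and always returns a maximal clique, but it carries no guarantee of finding a maximum one. The function name and the example graph are hypothetical.

```python
def greedy_clique(adj):
    """Greedy heuristic: returns a maximal (not necessarily maximum) clique."""
    clique = set()
    # Visit vertices by decreasing degree; ties keep the dictionary order.
    for v in sorted(adj, key=lambda u: len(adj[u]), reverse=True):
        if clique <= adj[v]:  # v is adjacent to every vertex already chosen
            clique.add(v)
    return clique


adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(greedy_clique(adj))  # {1, 2, 3}
```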