54 Years of Graph Isomorphism Testing
Brendan McKay, Australian National University


SLIDE 1

54 Years of Graph Isomorphism Testing

Brendan McKay, Australian National University

isomorphism — 1

SLIDE 2

isomorphism — 2

SLIDES 3–5

The concept of isomorphism

We have some objects consisting of some finite sets and some relations on them. An isomorphism between two objects is a bijection between their sets so that the relations are preserved. Hopefully, isomorphism is an equivalence relation on the set of all objects.

[Figure: two labelled 6-vertex graphs with an isomorphism between them]

isomorphism — 3
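The definition above can be made concrete with a tiny brute-force check (an illustration only, not how the programs discussed later work): try every bijection between the vertex sets and test whether it maps edges to edges.

```python
from itertools import permutations

def are_isomorphic(edges1, edges2, n):
    """Brute-force isomorphism test for two graphs on vertices 0..n-1.
    Tries every bijection, so it is exponential; fine only for tiny graphs."""
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    if len(e1) != len(e2):
        return False
    for p in permutations(range(n)):
        # Relabel every edge of the first graph and compare edge sets.
        if {frozenset((p[u], p[v])) for u, v in e1} == e2:
            return True
    return False

# Two labellings of a 6-cycle, and a 6-vertex path for contrast.
cycle_a = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
cycle_b = [(0, 2), (2, 4), (4, 1), (1, 3), (3, 5), (5, 0)]
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
```

The fast programs in this talk answer such questions without ever enumerating all n! bijections.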

SLIDE 6

The concept of isomorphism (continued)

[Figure: two graphs]

Are these isomorphic? The computer answers in about 5 microseconds.

isomorphism — 4

SLIDES 7–8

Ubiquity of graph isomorphism

Very many isomorphism problems can be efficiently expressed as graph isomorphism problems. For convenience, we use coloured graphs, in which the vertices (and possibly the edges) are coloured. Isomorphisms must preserve vertex and edge colour.

Say two matrices are isomorphic if one can be obtained from the other by permuting rows and columns. Example:

[ 1 3 5 4 ]
[ 1 5 4 3 ]
[ 3 4 5 1 ]

[Figure: the matrix encoded as a coloured graph, with entry colours = 4, = 5, = 3, = 1]

isomorphism — 5
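As a small illustration (mine, not from the slides), this notion of matrix isomorphism can be checked directly by trying all row and column permutations, which is feasible only for tiny matrices:

```python
from itertools import permutations

def matrices_isomorphic(A, B):
    """Check whether B can be obtained from A by permuting rows and columns.
    Brute force over both permutation groups; illustration only."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        return False
    rows, cols = len(A), len(A[0])
    for rp in permutations(range(rows)):
        for cp in permutations(range(cols)):
            if all(A[rp[i]][cp[j]] == B[i][j]
                   for i in range(rows) for j in range(cols)):
                return True
    return False

A = [[1, 3, 5, 4],
     [1, 5, 4, 3],
     [3, 4, 5, 1]]
B = [[3, 1, 5, 4],      # A with its first two columns swapped
     [5, 1, 4, 3],
     [4, 3, 5, 1]]
C = [[1, 3, 5, 4],      # one entry changed, so not isomorphic to A
     [1, 5, 4, 3],
     [3, 4, 5, 2]]
```

The graph encoding mentioned above replaces this factorial search with a single graph isomorphism test.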

SLIDES 9–10

Hadamard equivalence

Say isomorphism of two ±1 matrices allows permutation and negation of rows and columns.

[Figure: an example ±1 matrix]

isomorphism — 6

SLIDES 11–13

Isotopy

Say isomorphism of two matrices allows permutation of rows, columns and symbols. Example (a Latin square):

[ 1 3 2 ]
[ 2 1 3 ]
[ 3 2 1 ]

[Figure: the rows, columns and symbols 1, 2, 3 of the Latin square encoded as vertices of a coloured graph]

isomorphism — 7
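Isotopy, too, has a direct brute-force formulation (my sketch, not the talk's method): try all permutations of rows, columns and symbols.

```python
from itertools import permutations

def isotopic(A, B, symbols):
    """Check whether B is reachable from A by permuting rows, columns and
    symbols. Brute force over all three groups; tiny cases only."""
    n, m = len(A), len(A[0])
    if (n, m) != (len(B), len(B[0])):
        return False
    for rp in permutations(range(n)):
        for cp in permutations(range(m)):
            for sp in permutations(symbols):
                rename = dict(zip(symbols, sp))
                if all(rename[A[rp[i]][cp[j]]] == B[i][j]
                       for i in range(n) for j in range(m)):
                    return True
    return False

L = [[1, 3, 2],
     [2, 1, 3],
     [3, 2, 1]]
# L with symbols renamed 1->2, 2->3, 3->1: isotopic to L.
M = [[2, 1, 3],
     [3, 2, 1],
     [1, 3, 2]]
# Not a Latin square at all, so not isotopic to L.
N = [[1, 1, 1],
     [2, 2, 2],
     [3, 3, 3]]
```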

SLIDES 14–15

Permutation equivalence of linear codes

Suppose isomorphism of two 0–1 matrices means that their columns can be permuted so that the rows generate the same vector space over GF(2). Example generator matrix:

[ 0 1 1 0 1 0 0 0 0 ]
[ 0 1 1 0 1 1 1 0 0 ]
[ 1 0 0 1 0 0 1 1 1 ]
[ 1 0 1 0 1 1 0 1 1 ]
[ 0 0 1 0 0 0 0 1 0 ]

Nobody knows how to make an equivalent graph problem of size which is polynomial in the size of the generator matrix.

isomorphism — 8

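The "same row space over GF(2)" part of this definition is itself easy, even though the column-permutation part is hard to encode as a graph: the reduced row echelon form over GF(2) is a canonical form of a code, so two generator matrices span the same code exactly when their reduced forms agree. A sketch with rows encoded as Python int bit masks (function names are mine):

```python
def row_space_gf2(rows):
    """Canonical basis (reduced row echelon form) of the GF(2) span of `rows`,
    each row a Python int bit mask. Equal spans give equal outputs."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)          # clear b's pivot bit from r if set
        if r:
            # Back-reduce the existing basis by the new pivot, then insert.
            basis = [min(b, b ^ r) for b in basis]
            basis.append(r)
    return sorted(basis, reverse=True)

# The generator matrix above, rows as 9-bit masks.
G = [int(s, 2) for s in (
    "011010000",
    "011011100",
    "100100111",
    "101011011",
    "001000010",
)]
# Row operations and reordering preserve the span:
G2 = [G[1], G[0] ^ G[1], G[2] ^ G[4], G[3], G[4]]
```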

SLIDE 16

Theoretical complexity

The graph isomorphism problem is:
GI: Given two graphs, determine whether they are isomorphic.

  • The fastest known algorithm for GI runs in quasipolynomial time e^{O((log n)^b)} for a constant b (Babai, 2016); for graphs of maximum degree d the best is n^{O((log d)^c)} time for a constant c (Grohe, Neuen & Schweitzer, 2018).
  • GI is not known to be NP-complete (but there are strong indications that it isn't).
  • GI is not known to be in co-NP.
  • There are polynomial-time algorithms for many special classes of graphs (bounded degree, bounded genus, classes with excluded minors or topological subgraphs). Few of these are practical, planar graphs being a notable exception.

isomorphism — 9

SLIDES 17–18

Automorphisms

Isomorphisms from an object to itself are automorphisms. The set of automorphisms forms a group under composition called the automorphism group.

[Figure: a graph on vertices 1–8 whose automorphism group is]
(1)
(1 4)(2 3)(5 8)(6 7)
(1 8)(2 7)(3 6)(4 5)
(1 5)(2 6)(3 7)(4 8)

isomorphism — 10
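We can verify that the four permutations listed really do form a group under composition (they are a Klein four-group); this check is my illustration, not part of the slides:

```python
def compose(p, q):
    """Composition p∘q of permutations given as tuples: apply q, then p."""
    return tuple(p[q[i]] for i in range(len(p)))

def perm_from_cycles(cycles, n):
    """Build a permutation on 0..n-1 from cycles written with 1-based points."""
    p = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a - 1] = b - 1
    return tuple(p)

n = 8
auts = [
    perm_from_cycles([], n),                                # (1)
    perm_from_cycles([(1, 4), (2, 3), (5, 8), (6, 7)], n),
    perm_from_cycles([(1, 8), (2, 7), (3, 6), (4, 5)], n),
    perm_from_cycles([(1, 5), (2, 6), (3, 7), (4, 8)], n),
]
# Closure: composing any two of the four yields one of the four.
closed = all(compose(p, q) in auts for p in auts for q in auts)
```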

SLIDES 19–20

Canonical labelling

Given a class of objects, and a definition of "isomorphism", we can choose one member of each isomorphism class and call it the canonical member of the class. The process of finding the canonical member of the isomorphism class containing a given object is called "canonical labelling". Two labelled objects which are isomorphic become identical when they are canonically labelled.

Since identity of objects is usually far easier to check than isomorphism, canonical labelling is the preferred method of reducing a collection of objects to one member of each isomorphism class: canonically label each object, then use a sorting or hashing algorithm to detect isomorphs.

isomorphism — 11
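The label-then-hash workflow can be sketched with a deliberately naive canonical form, the greatest relabelled edge list over all labellings (exponential, purely for illustration; the real programs compute an equivalent object far faster):

```python
from itertools import permutations

def canonical_form(edges, n):
    """Brute-force canonical form: the lexicographically greatest relabelled
    edge list over all n! labellings of the vertices 0..n-1."""
    e = [frozenset(x) for x in edges]
    best = None
    for p in permutations(range(n)):
        relab = tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in e))
        if best is None or relab > best:
            best = relab
    return best

graphs = [
    [(0, 1), (1, 2), (2, 0)],   # a triangle
    [(2, 0), (0, 1), (1, 2)],   # the same triangle, listed differently
    [(0, 1), (1, 2)],           # a path
]
# Canonically label each graph, then hash to detect isomorphs.
classes = {}
for g in graphs:
    classes.setdefault(canonical_form(g, 3), []).append(g)
```

The dictionary ends up with one key per isomorphism class, exactly the reduction described above.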

SLIDES 21–22

Canonical labelling (continued)

A possible definition of the canonical member of an isomorphism class would be the member which maximises some linear representation (such as a list of edges).

{{0, 2}, {0, 3}, {1, 2}, {1, 3}, {2, 4}, {2, 5}, {3, 6}, {4, 6}} < {{0, 2}, {0, 3}, {1, 2}, {1, 4}, {2, 4}, {2, 5}, {3, 6}, {4, 6}}

In practice, most programs compute a canonical labelling which is easy for computers to find rather than easy for humans to define.

isomorphism — 12

SLIDE 23

Some history

Actual programs for graph isomorphism began to appear in the early 1960s. The main applications were chemistry, physics, linguistics, and combinatorial generation.

Unger (1964)

isomorphism — 13

SLIDE 24

Sussenguth (1964)
Böhm and Santolini (1964)

isomorphism — 14

SLIDE 25

Morgan (1965)
Nagle (1966)

isomorphism — 15

SLIDE 26

The refinement-individualization tree

Although hints of it appeared earlier, the first clear-cut definition of the refinement-individualization tree appeared in 1967. Parris and Read explored it in breadth-first order.

Parris and Read (1967)

isomorphism — 16

SLIDE 27

Systematic use of symmetries to prune the search tree may have been first introduced by this 1974 paper. It didn't use partition refinement. Other early examples: Tinhofer (1975), Beyer and Proskurowski (1976), McKay (1976).

Arlazarov, Zuev, Uskov and Faradzhev (1974)

isomorphism — 17

SLIDE 28

Quantity vs quality

During those years, the subject became so popular that it was known as a "disease". Most programs were fairly useless and many were just wrong. Production of wrong programs is still very popular.

Read and Corneil (1977)

isomorphism — 18

SLIDE 29

nauty was originally called GLABC and was written in a mixture of Fortran and Assembler starting in 1976. The C edition called nauty came about 1981. Important advances in the next 15 years: Kocay (1985), Kirk (1985), Leon (1991).

McKay (1977)

isomorphism — 19

SLIDE 30

Current software for graph isomorphism

The most successful programs still supported are:

  • nauty (McKay, 1976–) canonical label and automorphism group
  • VF2 (Cordella, Foggia, Sansone and Vento, 1999–) comparison of two graphs
  • saucy (Darga, Sakallah and Markov, 2004–) automorphism group
  • bliss (Junttila and Kaski, 2010–) canonical label and automorphism group
  • Traces (Piperno, 2008–) canonical label and automorphism group
  • conauto (López-Presa, Anta and Chiroque, 2009–) automorphism group and comparison of two graphs
  • vsep (Stoichev, 1997–) canonical label and automorphism group

Many similar principles appear in these programs.

isomorphism — 20

SLIDE 31

The individualization-refinement paradigm

All of the currently best programs use an individualization-refinement paradigm. A key concept is partition refinement ("partition" = "colouring"). An equitable partition is one where every two vertices of the same colour are adjacent to the same number of vertices of each colour.

[Figure: two examples of equitable partitions]

isomorphism — 21
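The definition of an equitable partition translates directly into a checkable condition: every vertex of a given colour must see the same multiset of neighbour colours. A small sketch (my helper names, not from any of the programs above):

```python
def is_equitable(adj, colour):
    """Check that `colour` is equitable on the graph `adj` (vertex -> set of
    neighbours): any two same-coloured vertices have the same number of
    neighbours of every colour."""
    profiles = {}
    for v in adj:
        counts = {}
        for w in adj[v]:
            counts[colour[w]] = counts.get(colour[w], 0) + 1
        profiles.setdefault(colour[v], set()).add(tuple(sorted(counts.items())))
    # Equitable iff each colour class has a single neighbourhood profile.
    return all(len(s) == 1 for s in profiles.values())

# On a 6-cycle, colouring every vertex alike is equitable (each vertex sees
# two neighbours of that colour); singling out one vertex is not.
cyc = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
```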

SLIDES 32–33

Refinement of a partition means to subdivide its cells. Given a partition (colouring) π, there is a unique equitable partition that is a refinement of π and has the least number of colours.

[Figure: refinement steps, starting from the initial colouring and counting green, pink and yellow neighbours in turn]

isomorphism — 22
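That coarsest equitable refinement can be computed by the classical colour-refinement iteration (also known as 1-dimensional Weisfeiler-Leman): repeatedly split colour classes by the multiset of neighbour colours until nothing changes. A compact sketch, not tuned like the real refiners:

```python
def refine(adj, colour):
    """Coarsest equitable partition refining `colour` on the graph `adj`
    (vertex -> set of neighbours), via colour refinement."""
    colour = dict(colour)
    while True:
        # Signature of v: its colour plus the multiset of neighbour colours.
        sig = {v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: palette[sig[v]] for v in adj}
        if len(set(new.values())) == len(set(colour.values())):
            return new        # no cell split: the partition is equitable
        colour = new

# On a 4-vertex path, refinement separates end vertices from middle vertices.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
final = refine(path, {v: 0 for v in path})
```

Each pass only subdivides cells, so the loop terminates after at most n splits.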

SLIDES 34–38

Individualization-refinement tree

The nodes of the tree correspond to equitable partitions. The root of the tree corresponds to the initial colouring, refined. If a node corresponds to a discrete partition (each vertex with a different colour), it has no children and is a leaf. Otherwise we choose a colour used more than once (the target cell), individualize one of those vertices by giving it a new unique colour, and refine to get a child.

Each leaf lists the vertices in some order (since colours have a predefined order), so it corresponds to a labelling of the graph. If we define an order on labelled graphs, such as lexicographic order, then the greatest labelled graph corresponding to a leaf is a canonical graph.

isomorphism — 23
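The whole scheme fits in a short sketch: refine, pick a target cell, individualize each of its vertices in turn, recurse, and keep the greatest labelled graph seen at a leaf. This toy version (my code, assuming the colour-refinement step described above) explores the entire tree, with none of the invariant or automorphism pruning that makes the real programs fast:

```python
def refine(adj, colour):
    """Coarsest equitable refinement of `colour` (colour refinement)."""
    colour = dict(colour)
    while True:
        sig = {v: (colour[v], tuple(sorted(colour[w] for w in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: palette[sig[v]] for v in adj}
        if len(set(new.values())) == len(set(colour.values())):
            return new
        colour = new

def canonical_graph(adj):
    """Unpruned individualization-refinement: return the greatest labelled
    graph (as a sorted edge tuple) over all leaves of the IR tree."""
    best = [None]

    def leaf_value(colour):
        # Discrete colouring: colour order defines the vertex labelling.
        order = sorted(adj, key=lambda v: colour[v])
        pos = {v: i for i, v in enumerate(order)}
        return tuple(sorted(tuple(sorted((pos[u], pos[v])))
                            for u in adj for v in adj[u] if u < v))

    def descend(colour):
        colour = refine(adj, colour)
        cells = {}
        for v in sorted(adj):
            cells.setdefault(colour[v], []).append(v)
        target = None
        for c in sorted(cells):          # first non-singleton cell by colour
            if len(cells[c]) > 1:
                target = cells[c]
                break
        if target is None:               # discrete partition: a leaf
            val = leaf_value(colour)
            if best[0] is None or val > best[0]:
                best[0] = val
            return
        fresh = max(colour.values()) + 1
        for v in target:                 # individualize each target vertex
            child = dict(colour)
            child[v] = fresh
            descend(child)

    descend({v: 0 for v in adj})
    return best[0]

# Two labellings of a 4-cycle, and a 4-vertex path for contrast.
c4a = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
c4b = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

All choices are made by colour only, so relabelling the input permutes the leaves without changing the maximum, which is exactly why the result is canonical.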

SLIDE 39

Individualization-refinement tree

[Figure: the individualization-refinement tree for an 8-vertex graph. The root partition [ 1 4 5 8 | 2 3 6 7 ] is individualized at vertices 1, 4, 5, 8 in turn and refined, down to discrete leaf partitions such as [ 1 | 4 | 5 | 8 | 6 | 3 | 7 | 2 ]; comparing leaves yields the automorphisms (1), (3 6)(4 5) and (1 4)(2 3)(5 8)(6 7).]

isomorphism — 24

SLIDES 40–43

Node invariants

A node invariant is a value φ(ν) attached to each node in the tree that depends only on the combinatorial properties of ν and its ancestors in the search tree, satisfying some technical conditions. For example,

φ(ν) = ( c(ν′), c(ν′′), …, c(ν) )

is a node invariant with lexicographic ordering, where ν′, ν′′, …, ν is the path from the root of the tree to ν, and c(·) is the number of colours.

Let φ* be the greatest node invariant of a leaf. Then the lexicographically greatest labelled graph corresponding to a leaf ν with φ(ν) = φ* is a canonical graph. This allows parts of the tree which cannot contain a canonical graph to be pruned (branch-and-bound).

isomorphism — 25

SLIDES 44–45

Automorphism group handling

Automorphisms can be discovered by several means:

  • by comparing the labelled graphs corresponding to two leaves (all the programs)
  • by monitoring the effect of refinement closely (saucy and Traces)
  • by using the properties of equitable partitions (nauty and Traces)
  • by being provided by the user (Traces)

In order to perform automorphism-based pruning of the search tree, we need to efficiently determine if two sequences of vertices are equivalent under the group generated by the automorphisms found so far.

isomorphism — 26

SLIDE 46

Pruning operations

Node invariants, together with automorphisms, allow us to remove parts of the search tree without generating them.

  1. If ν1, ν2 are nodes at the same level in the search tree and φ(ν1) > φ(ν2), then no canonical labelling is descended from ν2.
  2. If ν1, ν2 are nodes at the same level in the search tree and φ(ν1) ≠ φ(ν2), then no labelled graph descended from ν1 is the same as one descended from ν2.
  3. If ν1, ν2 are nodes with ν2 = ν1^g for an automorphism g, then g maps the subtree descended from ν1 onto the subtree descended from ν2. So any labelled graph descended from ν2 is equal to some labelled graph descended from ν1.

isomorphism — 27

SLIDE 47

Variations between programs

The competing programs vary from each other in ways that include:

  1. Data structures
  2. Strength of the refinement procedure
  3. Order of traversal of the search tree
  4. Means of discovering automorphisms
  5. Processing of automorphisms

isomorphism — 28

SLIDES 48–51

Stronger refinement — nauty "invariants"

Sometimes refinement to an equitable partition is insufficient to separate vertices with clearly different combinatorial properties. Those properties can be used to make the refinement stronger. For example, we can count the number of 3-cycles and 4-cycles through each vertex and then refine:

[Figure: a graph whose partition is refined using 3-cycle and 4-cycle counts]

isomorphism — 29

SLIDES 52–56

Tree traversal order

nauty order: depth-first search.
Traces order: breadth-first search.

The problem with BFS is that automorphisms are discovered at leaves.

isomorphism — 30

SLIDES 57–59

Tree traversal order — a Traces innovation

Traces: experimental paths. Experimental paths allow automorphism detection during BFS.

isomorphism — 31

SLIDES 60–62

Sparse automorphism detection — a saucy innovation

[Figure: an equitable partition containing vertices v, w, x, y (no other blue or green vertices); first v is individualized, then w]

This discovers the automorphism (v w)(x y) without comparing leaves of the search tree.

isomorphism — 32

SLIDE 63

Johnson graphs in nauty

J(n, k) has all the k-subsets of an n-set as vertices, with two k-subsets being adjacent if their intersection has k−1 elements.

[Plot: nauty running time in seconds against number of vertices; experimentally Θ(n^{3/2})]

isomorphism — 33
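Constructing J(n, k) from the definition is straightforward; this small sketch (mine, for illustration) builds the graph and exposes the counts one would expect: C(n, k) vertices, each of degree k(n−k).

```python
from itertools import combinations

def johnson_graph(n, k):
    """Build J(n, k): vertices are the k-subsets of {0,...,n-1}; two subsets
    are adjacent when their intersection has k-1 elements."""
    verts = [frozenset(c) for c in combinations(range(n), k)]
    edges = {(u, v) for u, v in combinations(verts, 2) if len(u & v) == k - 1}
    return verts, edges

# J(5, 2) has C(5,2) = 10 vertices, each of degree 2*(5-2) = 6,
# hence 10*6/2 = 30 edges. (Its complement is the Petersen graph.)
verts, edges = johnson_graph(5, 2)
```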

SLIDE 64

[Plots: automorphism group and canonical label running times for bliss, saucy, conauto, nauty and Traces on random graphs with p = 1/2]

isomorphism — 34

SLIDE 65

[Plots: automorphism group and canonical label running times for bliss, saucy, conauto, nauty and Traces on miscellaneous vertex-transitive graphs]

isomorphism — 35

SLIDE 66

[Plots: automorphism group and canonical label running times (timeout: 600 secs) for bliss, saucy, conauto, nauty and Traces on large strongly-regular graphs]

isomorphism — 36

SLIDE 67

[Plots: automorphism group and canonical label running times (timeout: 600 secs) for bliss, saucy, conauto, nauty and Traces on Hadamard matrix graphs]

isomorphism — 37

SLIDE 68

[Plots: automorphism group and canonical label running times for bliss, saucy, conauto, nauty and Traces on Cai–Fürer–Immerman graphs]

isomorphism — 38

SLIDES 69–72

The bad news

Daniel Neuen and Pascal Schweitzer (2017) proved that no algorithm using the individualization-refinement paradigm takes less than exponential time. The result remains true if stronger refinement is used (k-dimensional Weisfeiler-Leman refinement) and the algorithm is presented in advance with the full automorphism group.

However, no algorithms performing better in practice are known. So it seems that the theoretical and practical galaxies are separating.

isomorphism — 39