slide-1
SLIDE 1

Greedy MaxCut Algorithms and their Information Content

Yatao Bian, Alexey Gronskiy and Joachim M. Buhmann

Machine Learning Institute, ETH Zurich

April 27, 2015


slide-2
SLIDE 2

Contents

  • Greedy MaxCut Algorithms
  • Approximation Set Coding (ASC)
  • Applying ASC: Count the Approximation Sets
  • Applying ASC: Experiments and Analysis

slide-8
SLIDE 8

MaxCut

MaxCut: a classical NP-hard problem

  • G = (V, E), vertex set V, edge set E, weights wij ≥ 0
  • cut c := (S, V\S), cut space C (|C| = 2^(n−1) − 1)
  • cut value: cut(c, G) := Σ_{i∈S, j∈V\S} wij

[Figure: triangle graph on vertices x, y, z with edge weights 1, 3, 2; one cut has value 3 = 1 + 2, while the max cut has value 5 = 2 + 3]
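
To make the cut value concrete, here is a minimal Python sketch; the dictionary encoding of the weighted graph is our own choice for illustration, not from the slides:

```python
# Minimal sketch: cut value cut(c, G) = sum of w_ij over i in S, j in V\S.
def cut_value(weights, S):
    """weights: {(i, j): w_ij}; S: one side of the cut c = (S, V\\S)."""
    return sum(w for (i, j), w in weights.items() if (i in S) != (j in S))

# Triangle from the slide figure: w(x,y) = 1, w(y,z) = 3, w(x,z) = 2
w = {("x", "y"): 1, ("y", "z"): 3, ("x", "z"): 2}
print(cut_value(w, {"x"}))  # 3 = 1 + 2
print(cut_value(w, {"z"}))  # 5 = 2 + 3, the max cut
```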

slide-9
SLIDE 9

Greedy Algorithms for MaxCut

  Name                                      Greedy Heuristic   Sorting   Init. Vertices
  Deterministic Double Greedy (D2Greedy)    Double             –         –
  SG (Sahni & Gonzalez)                     Double             –         ✓
  SG3 (variant of SG)                       Double             ✓         ✓
  Edge Contraction (EC)                     Backward           –         –

slide-20
SLIDE 20

Double Greedy Taxonomy

Deterministic Double Greedy (D2Greedy)

Require: graph G = (V, E)
Ensure: cut and the cut value
 1: init. 2 solutions S := ∅, T := V
 2: for each vertex vi ∈ V do    // in random order
 3:   ai := gain of adding vi to S
 4:   bi := gain of removing vi from T
 5:   if ai ≥ bi then
 6:     add vi to S
 7:   else
 8:     remove vi from T
 9:   end if
10: end for
11: return cut (S, V\S) and the cut value

  • works on 2 solutions simultaneously
  • for each vertex, decides whether it should be added to S or removed from T (see the sketch below)

Differences between the double greedy algorithms:
  D2Greedy → select the first 2 vertices → SG
  SG → sort the candidates → SG3
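
A runnable Python sketch of D2Greedy under a stated assumption: the gains are taken as the standard double-greedy marginals ai = f(S ∪ {vi}) − f(S) and bi = f(T \ {vi}) − f(T) with f the cut value, since the slides leave the exact gain definitions implicit:

```python
# Sketch of D2Greedy (deterministic double greedy) for MaxCut.
# Assumption: gains are marginal cut-value changes, a_i = f(S+v) - f(S) and
# b_i = f(T-v) - f(T), with f(X) = cut value of (X, V\X).
def cut_value(weights, S):
    return sum(w for (i, j), w in weights.items() if (i in S) != (j in S))

def d2greedy(weights, V, order=None):
    f = lambda X: cut_value(weights, X)
    S, T = set(), set(V)
    for v in (order if order is not None else sorted(V)):  # slides: random order
        a = f(S | {v}) - f(S)      # gain of adding v to S
        b = f(T - {v}) - f(T)      # gain of removing v from T
        if a >= b:
            S.add(v)
        else:
            T.remove(v)
    return S, f(S)                 # on termination S == T; the cut is (S, V\S)

w = {("x", "y"): 1, ("y", "z"): 3, ("x", "z"): 2}
S, val = d2greedy(w, {"x", "y", "z"})
print(S, val)                      # a greedy cut; not necessarily the max cut
```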

slide-31
SLIDE 31

Backward Greedy – Edge Contraction Algorithm

Edge Contraction (EC)

Require: graph G = (V, E)
Ensure: cut, cut value
1: repeat
2:   find the lightest edge (x, y) in G
3:   contract x, y into a super vertex v
4:   set the edge weights connecting v
5: until 2 "super" vertices left
6: return the 2 super vertices

  • contracts the lightest edge in each step (see the sketch below)

[Figure: contracting the lightest edge (x, y) of the triangle with weights 1, 3, 2 merges x and y into a super vertex v; the new weight is w(v, z) = 2 + 3 = 5]

Backward greedy: EC tries to remove the lightest edge from the cut set in each step.
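
A minimal Python sketch of EC; the merge rule (summing the weights of parallel edges after a contraction, as in the slide's figure) is our assumption, since the slides only say "set the edge weights connecting v":

```python
# Sketch of Edge Contraction (EC). Assumption: after contracting (x, y),
# parallel edges to a common neighbor are merged by summing their weights.
def edge_contraction(weights, V):
    supers = {v: {v} for v in V}            # super vertex -> original vertices
    w = {frozenset(e): wt for e, wt in weights.items()}
    while len(supers) > 2:
        e = min(w, key=w.get)               # lightest edge in the current graph
        x, y = sorted(e)
        supers[x] |= supers.pop(y)          # contract y into x
        del w[e]
        for f in [f for f in w if y in f]:  # redirect y's edges to x
            (z,) = f - {y}
            w[frozenset({x, z})] = w.get(frozenset({x, z}), 0) + w.pop(f)
    S, T = supers.values()                  # the 2 super vertices define the cut
    val = sum(wt for (i, j), wt in weights.items() if (i in S) != (j in S))
    return S, T, val

w = {("x", "y"): 1, ("y", "z"): 3, ("x", "z"): 2}
print(edge_contraction(w, {"x", "y", "z"}))  # cut ({'x','y'}, {'z'}) with value 5
```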

slide-36
SLIDE 36

Glance of Approximation Set Coding (ASC)

How to measure the robustness of these algorithms facing noise?

  • ASC: an analogy to Shannon's communication theory
    learning procedure ⇔ communication process [Buhmann 2010]
  • 2-instance scenario: training G′, test G′′ (noisy instances of a "master" graph G)

[Figure: a "master" graph G generates the two noisy instances G′ and G′′]

  • Models/algorithms should generalize well from G′ to G′′

slide-44
SLIDE 44

Approximate Solving and Algorithmic Approx. Set

  • Empirical risk minimizer: c⊥(G) := arg min_c R(c, G); under noise, in general c⊥(G′) ≠ c⊥(G′′)
  • γ-approximation set (solutions at most γ away from c⊥): Cγ(G) := {c ∈ C : R(c, G) − R(c⊥, G) ≤ γ}; γ: resolution

[Figure: approximation set Cγ(G) around the minimizer c⊥ at resolution γ]

  • Flow of a contractive algorithm A: the sequence of available solution sets in each step t
  • Algorithmic t-approximation set [Gronskiy and Buhmann 2014]: C_t^A(G)
  • ր step t ⇔ ց resolution γ (see the sketch below)
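
A brute-force sketch of Cγ(G) for MaxCut by enumerating the 2^(n−1) − 1 cuts; taking R(c, G) = −cut(c, G) is our assumption, so that the minimization notation matches cut maximization:

```python
# Brute-force γ-approximation set. Assumption: R(c, G) = -cut(c, G), so the
# empirical risk minimizer c⊥ is the max cut. Exponential; tiny graphs only.
from itertools import combinations

def cut_value(weights, S):
    return sum(w for (i, j), w in weights.items() if (i in S) != (j in S))

def gamma_approx_set(weights, V, gamma):
    V = sorted(V)
    rest = V[1:]
    # fix V[0] in S to enumerate each cut (S, V\S) once: 2^(n-1) - 1 cuts
    cuts = [frozenset([V[0], *c]) for r in range(len(rest))
            for c in combinations(rest, r)]
    risk = {S: -cut_value(weights, S) for S in cuts}
    best = min(risk.values())            # R(c⊥, G)
    return {S for S in cuts if risk[S] - best <= gamma}

w = {("x", "y"): 1, ("y", "z"): 3, ("x", "z"): 2}
print(gamma_approx_set(w, {"x", "y", "z"}, gamma=0))  # just the max cut {x, y}
print(gamma_approx_set(w, {"x", "y", "z"}, gamma=1))  # cuts within 1 of optimal
```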

slide-49
SLIDE 49

Analogy of Communication System

(Not going into detail here)

Analogical mutual information in step t:

  I_t^A := E_{G′,G′′} [ log ( |C| · |ΔC_t^A(G′, G′′)| / ( |C_t^A(G′)| · |C_t^A(G′′)| ) ) ]

  where ΔC_t^A(G′, G′′) = C_t^A(G′) ∩ C_t^A(G′′)

Information content of A: channel capacity I^A := max_t I_t^A

slide-55
SLIDE 55

Information Content of an Algorithm A

[Figure: inputs G′, G′′ drawn from P(G) are processed by algorithm A into A(G′), A(G′′); the optimum is c⊥(G)]

Stepwise mutual information:

  I_t^A := E [ log ( |C| · |C_t^A(G′) ∩ C_t^A(G′′)| / ( |C_t^A(G′)| · |C_t^A(G′′)| ) ) ]

ր step t ⇔ ց resolution γ: less informative but more robust

Information content of A: channel capacity I^A := max_t I_t^A (see the sketch below)
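
A sketch of how I_t^A and the capacity could be estimated once the three set cardinalities have been counted per step; the per-step counts below are hypothetical, and averaging over sampled instance pairs in place of the expectation is our assumption:

```python
# Sketch: stepwise information and information content from counted set sizes.
# Assumption: E_{G',G''} is estimated by averaging over sampled instance
# pairs; the counts in the example are hypothetical.
from math import log2

def stepwise_information(n_cuts, sizes):
    """sizes[t] = (|C_t(G')|, |C_t(G'')|, |C_t(G') ∩ C_t(G'')|)."""
    return [log2(n_cuts * inter / (s1 * s2)) if inter else float("-inf")
            for (s1, s2, inter) in sizes]

def information_content(n_cuts, sampled_sizes):
    curves = [stepwise_information(n_cuts, s) for s in sampled_sizes]
    I_t = [sum(c[t] for c in curves) / len(curves) for t in range(len(curves[0]))]
    return max(I_t), I_t               # channel capacity I^A = max_t I_t^A

# n = 4 vertices: |C| = 2^(n-1) - 1 = 7. Hypothetical counts for one pair;
# for a double greedy with k unlabeled vertices, |C_t(G')| = |C_t(G'')| = 2^k.
pair = [(8, 8, 8), (4, 4, 3), (2, 2, 1), (1, 1, 0)]
I, curve = information_content(7, [pair])
print(curve)   # rises, peaks at an intermediate t*, drops once the overlap vanishes
print(I)       # ≈ 0.81 bits at the peak
```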

slide-60
SLIDE 60

Counting – Double Greedy Algorithms

Counting methods are similar for the double greedy algorithms (D2Greedy, SG, SG3)

  • e.g. SG3: if k vertices are still unlabeled in step t, then |C_t^A(G′)| = |C_t^A(G′′)| = 2^k
  • for |C_t^A(G′) ∩ C_t^A(G′′)|, we propose a polynomial-time counting algorithm and prove its correctness (not going into detail here)

slide-65
SLIDE 65

Counting – Edge Contraction Algorithm

  • In step t there are k "super" vertices, giving |C_t^A(G′)| = |C_t^A(G′′)| = 2^(k−1) − 1
  • We propose a polynomial-time algorithm (and prove its correctness) to exactly count |C_t^A(G′) ∩ C_t^A(G′′)|
  • It involves computing the maximum number of common super vertices between the 2 super vertex sets (details in the paper)

slide-69
SLIDE 69

Noise Model: Gaussian Edge Weights

Master Graph G
  Gaussian distributed edge weights: Wij ∼ N(µ, σm²), µ = 600, σm = 50
  Negative edge weights are set to µ.

[Figure: master graph G with Gaussian weights]

Noisy Graphs G′, G′′
  G′, G′′ are obtained by adding Gaussian distributed noise.
  Negative edge weights are set to 0.
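
A sketch of the two-instance generation for the Gaussian model; numpy and the symmetric-matrix encoding are our choices, and σ is the free noise level (the later experiments use e.g. σ = 125):

```python
# Sketch of the Gaussian edge weights noise model (numpy is our choice).
# Master: W_ij ~ N(mu, sigma_m^2), negatives set to mu.
# Instances G', G'': add N(0, sigma^2) noise, negatives set to 0.
import numpy as np

def symmetrize(M):
    M = np.triu(M, 1)              # keep strict upper triangle, zero diagonal
    return M + M.T

def master_graph(n, rng, mu=600.0, sigma_m=50.0):
    W = symmetrize(rng.normal(mu, sigma_m, size=(n, n)))
    W[W < 0] = mu                  # slides: negative edge weights set to mu
    return W

def noisy_instance(W, sigma, rng):
    G = W + symmetrize(rng.normal(0.0, sigma, size=W.shape))
    G[G < 0] = 0.0                 # slides: negative edge weights set to 0
    return G

rng = np.random.default_rng(0)
W = master_graph(10, rng)
G1, G2 = (noisy_instance(W, 125.0, rng) for _ in range(2))
```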

slide-75
SLIDE 75

Noise Model: Edge Reversal

Master Graph G
  1. Start from an approximately bipartite graph G′b, with light edges and heavy edges
  2. Randomly flip edges in G′b ⇒ G; flipping turns heavy into light and light into heavy; 1{flip eij} ∼ Ber(pm), pm = 0.2

[Figure: approximate bipartite graph G′b, showing heavy edges and light edges]

Noisy Graphs G′, G′′
  • Flip G ⇒ G′ and G′′; the probability of flipping an edge is Bernoulli: 1{flip eij} ∼ Ber(p), where p is the noise level
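
A sketch of the edge reversal model under stated assumptions: "heavy" and "light" are two fixed weight levels (unspecified in the slides), and placing heavy edges across the two parts of the toy G′b is our own construction:

```python
# Sketch of the edge reversal noise model. Assumption: "heavy" and "light"
# are two fixed weight levels; the slides do not specify the actual values.
import numpy as np

HEAVY, LIGHT = 10.0, 1.0

def flip_edges(W, p, rng):
    """Independently swap heavy <-> light with probability p per edge."""
    G = W.copy()
    n = len(G)
    for i in range(n):
        for j in range(i + 1, n):
            if G[i, j] > 0 and rng.random() < p:
                G[i, j] = G[j, i] = LIGHT if G[i, j] == HEAVY else HEAVY
    return G

rng = np.random.default_rng(0)
# toy approximately bipartite G'_b on parts {0, 1} and {2, 3} (our construction)
W = np.array([[0 if i == j else (HEAVY if (i < 2) != (j < 2) else LIGHT)
               for j in range(4)] for i in range(4)], dtype=float)
G_master = flip_edges(W, 0.2, rng)   # slides: p_m = 0.2
G1, G2 = flip_edges(G_master, 0.65, rng), flip_edges(G_master, 0.65, rng)  # noise level p
```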

slide-78
SLIDE 78

Stepwise Information I_t^A

  I_t^A := E_{G′,G′′} [ log ( |C| · |ΔC_t^A(G′, G′′)| / ( |C_t^A(G′)| · |C_t^A(G′′)| ) ) ]

[Figures: stepwise information curves for the Gaussian model (σ = 125) and the edge reversal model (p = 0.65)]

  • I_t^A behavior: increases initially ⇒ reaches the optimal step t* ⇒ decreases ⇒ vanishes
  • consistent with the analysis: increasing t trades off robustness against informativeness

slide-81
SLIDE 81

Information Content I^A

  I^A := max_t I_t^A   (channel capacity)

[Figures: information content under the Gaussian edge weights model and the edge reversal model]

  • All algorithms reach maximal information content in the noise-free limit (G′ = G′′): p = 0 or 1 in the edge reversal model, σ = 0 in the Gaussian model
  • One node transmits about 1 bit of information

slide-84
SLIDE 84

Effect of Greedy Heuristics

Backward greedy vs. double greedy

[Figures: Gaussian edge weights model and edge reversal model]

  • Delayed decision making of backward greedy
  • EC preserves consistent solutions by contracting the lightest edge (which has low probability of being included in the cut)

slide-87
SLIDE 87

Effect of Greedy Techniques

[Figures: Gaussian edge weights model and edge reversal model]

  • Initializing the first 2 vertices (D2Greedy ⇒ SG): information content decreases, due to early decision making
  • Sorting candidates (SG ⇒ SG3): information content decreases, due to early decision making

slide-89
SLIDE 89

Discussion

  • Observation: different greedy heuristics (backward, double) and different processing techniques (sorting candidates, initializing the first 2 vertices) sensitively influence the information content of A.
  • Conjecture: backward greedy (delayed decision making) ≻ double greedy, for different noise models and noise levels.

slide-90
SLIDE 90

Thank you!

Qs?


slide-91
SLIDE 91

Supplement: Analogy of Communication System

Imaginary communication system:

  • message: permutations σs ∈ Σ on the data space
  • encoder: encode σs using C_t^A(σs ◦ G′) (codebook vector)
  • channel: noisy instances G′, G′′
  • decoder: maximize the overlap of approximation sets:
    σ̂ := arg max_{σ∈Σ} |C_t^A(σ ◦ G′′) ∩ C_t^A(σs ◦ G′)|

Analogical mutual information in step t:

  I_t^A(σs; σ̂) := E_{G′,G′′} [ log ( |C| · |C_t^A(G′) ∩ C_t^A(G′′)| / ( |C_t^A(G′)| · |C_t^A(G′′)| ) ) ]

Channel capacity I^A := max_t I_t^A (information content of A)