Multidimensional scaling and flat split systems (Monika Balvočiūtė)


slide-1
SLIDE 1

Multidimensional scaling and flat split systems

Monika Balvočiūtė

joint work with David Bryant
University of Otago
6th Nov 2014

1 / 28

slide-2
SLIDE 2

Splits and Split systems

A split S = A|B is a bipartition of a set of taxa X into two non-empty subsets A and B such that X = A ∪ B and A ∩ B = ∅. A split system 𝒮 is a set of splits over some set of taxa X.
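As a concrete illustration of the definition, here is a minimal Python sketch. The helper name `make_split` and the choice of representing a split as an unordered pair of frozensets are assumptions made for this example, not something taken from the slides.

```python
def make_split(part, taxa):
    """Return the split A|B of `taxa` with A = part.

    The split is stored as a frozenset of two frozensets, so A|B and B|A
    compare equal (a split is an unordered bipartition).
    """
    taxa = frozenset(taxa)
    side_a = frozenset(part)
    side_b = taxa - side_a
    # Both sides must be non-empty and A must be a subset of the taxa.
    if not side_a or not side_b or not side_a <= taxa:
        raise ValueError("both sides must be non-empty subsets of the taxa")
    return frozenset({side_a, side_b})


taxa = {"a", "b", "c", "d"}

# A split system is simply a set of splits over the same taxa.
split_system = {
    make_split({"a"}, taxa),
    make_split({"a", "b"}, taxa),
    make_split({"c", "d"}, taxa),  # the same split as ab|cd, so it is not stored twice
}
```

Storing each split as an unordered pair means that the set `split_system` automatically ignores the order in which the two sides are written.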

2 / 28

slide-3
SLIDE 3

Equivalent representations of flat split systems

Oriented matroid splits Flat split system Planar split network

[Figure: three drawings of one split system on the taxa a, b, c, d; one drawing includes a line labelled ℓ∞.]

3 / 28

slide-4
SLIDE 4

FlatNJ – computing planar split networks

  • Compute building blocks
  • Identify neighbors
  • Agglomerate
  • Reverse agglomeration
  • Weight and filter

M. Balvočiūtė, A. Spillner and V. Moulton, FlatNJ:..., Syst. Biol. 2014, 63(3): 383–96

4 / 28

slide-5
SLIDE 5

Neighbors

e and f are neighbors.

[Figure: diagrams on the taxa a, b, c, d showing the positions of e and f.]

5 / 28

slide-6
SLIDE 6

Not Neighbors

b and f are not neighbors.

[Figure: diagrams on the taxa a, c, d, e showing the positions of b and f.]

6 / 28

slide-7
SLIDE 7

Agglomeration

[Figure: the diagrams on a, b, c, d with e and f drawn separately, then with e and f merged into the single taxon e,f.]

7 / 28

slide-9
SLIDE 9

Agglomeration

[Figure: in the next step a is merged with e,f, giving the taxon a,e,f alongside b, c, d.]

7 / 28

slide-11
SLIDE 11

Reversing agglomeration

[Figure: the agglomeration is undone one step at a time: {b, c, d, a,e,f} → {b, c, d, a, e,f} → {b, c, d, a, e, f}.]

8 / 28

slide-15
SLIDE 15

Q: When does it fail?

A: When there are no neighbours.

9 / 28

slide-16
SLIDE 16

Affine splits

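Split: a line ℓ_S in ℝ² − X
Split system: an arrangement of lines 𝒜 in ℝ² − X

[Figure: left panel "Split", a line separating the point set X into parts A and B; right panel "Split system", an arrangement of lines dividing X.]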
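To make the correspondence between lines and splits concrete, the sketch below computes the split that a single line induces on labelled points in the plane. The function name `split_from_line` and the line parametrisation (a normal vector and an offset) are assumptions chosen for this illustration.

```python
def split_from_line(points, normal, offset):
    """Split labelled points by the line {p : normal . p = offset}.

    `points` maps a taxon label to its (x, y) position. The line must avoid
    every point, matching the requirement that it lies in R^2 - X.
    """
    nx, ny = normal
    side_a, side_b = set(), set()
    for label, (x, y) in points.items():
        value = nx * x + ny * y - offset  # signed position relative to the line
        if value == 0:
            raise ValueError(f"line passes through point {label!r}")
        (side_a if value < 0 else side_b).add(label)
    if not side_a or not side_b:
        raise ValueError("line does not separate the points")
    return frozenset({frozenset(side_a), frozenset(side_b)})


points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (0.0, 1.0)}

# The vertical line x = 0.5 separates {a, d} from {b, c}.
vertical = split_from_line(points, normal=(1.0, 0.0), offset=0.5)
```

Applying the same function to every line of an arrangement yields the corresponding split system.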

10 / 28

slide-17
SLIDE 17

Neighbours in affine split systems

[Figure: two line arrangements on the points a, b, e, d, g, one labelled "Neighbours" and the other "Not neighbours".]

11 / 28

slide-19
SLIDE 19

For example

[Figure: Input ⇒ Output, both drawn on the taxa a, b, c, d, e, f.]

12 / 28

slide-22
SLIDE 22

Multidimensional scaling (MDS)

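Plot points in a low-dimensional (e.g. two-dimensional) space based on their pairwise distances.

$$
\begin{array}{c|ccccc}
 & 1 & 2 & 3 & \cdots & n \\ \hline
1 & & d_{12} & d_{13} & \cdots & d_{1n} \\
2 & d_{12} & & d_{23} & \cdots & d_{2n} \\
3 & d_{13} & d_{23} & & \cdots & d_{3n} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
n & d_{1n} & d_{2n} & d_{3n} & \cdots &
\end{array}
$$

⇒ [Figure: the points 1, 2, 3, ..., n plotted in the low-dimensional space.]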

Minimize the difference between input and output distances.

13 / 28

slide-24
SLIDE 24

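MDS Stress

$$\sum_{i}\sum_{j \neq i} (d_{ij} - \delta_{ij})^2$$

$$\sum_{i}\sum_{j \neq i} (d_{ij}^2 - \delta_{ij}^2)^2$$

$$\frac{\sum_{i}\sum_{j \neq i} (d_{ij} - \delta_{ij})^2}{\sum_{i}\sum_{j \neq i} d_{ij}^2}$$

$$\frac{\sum_{i}\sum_{j \neq i} w_{ij}\,(d_{ij} - \delta_{ij})^2}{\sum_{i}\sum_{j \neq i} w_{ij}\, d_{ij}^2}$$

... → min

d_ij – actual distance; δ_ij – plotted distance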
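The four stress functions on this slide can be evaluated directly from two distance matrices. The following is a minimal sketch under stated assumptions: the sums run over all ordered pairs with j ≠ i, the dictionary keys (`raw`, `squared`, `normalised`, `weighted`) are labels invented for this example, and unit weights are used when no weights w_ij are given.

```python
import math


def plotted_distances(coords):
    """Pairwise Euclidean distances (the delta_ij) between plotted points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def stress_variants(d, delta, w=None):
    """Evaluate the four stress functions for actual distances d and plotted distances delta."""
    n = len(d)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    if w is None:
        w = [[1.0] * n for _ in range(n)]  # assumption: unit weights by default
    raw = sum((d[i][j] - delta[i][j]) ** 2 for i, j in pairs)
    squared = sum((d[i][j] ** 2 - delta[i][j] ** 2) ** 2 for i, j in pairs)
    normalised = raw / sum(d[i][j] ** 2 for i, j in pairs)
    weighted = (
        sum(w[i][j] * (d[i][j] - delta[i][j]) ** 2 for i, j in pairs)
        / sum(w[i][j] * d[i][j] ** 2 for i, j in pairs)
    )
    return {"raw": raw, "squared": squared, "normalised": normalised, "weighted": weighted}


# Actual distances of a 3-4-5 right triangle.
d = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]

# A plot that reproduces the distances exactly, and one that does not.
exact = stress_variants(d, plotted_distances([(0, 0), (3, 0), (0, 4)]))
distorted = stress_variants(d, plotted_distances([(0, 0), (3, 0), (0, 3)]))
```

For the exact plot every variant is zero up to rounding, while the distorted plot gives strictly positive values.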

14 / 28

slide-30
SLIDE 30

MDS

  • S. L. France & J. D. Carroll, Two-Way Multidimensional Scaling: A Review, IEEE Trans. Syst., Man, Cybern., Syst. 2011, 41(5): 644–61

15 / 28

slide-31
SLIDE 31

Agglomerative approach to MDS

  • Take pairwise distance matrix
  • Identify neighbours
  • Agglomerate
  • Reverse

16 / 28

slide-37
SLIDE 37

Agglomeration

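[Figure: points g, a, d, e, b, with a and b merged into a new point c; the distances d1, d2, d3 and dm are marked.]

$$d_m = \sqrt{\frac{2d_1^2 + 2d_2^2 - d_3^2}{4}}$$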
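The expression for d_m has the form of the length of a triangle's median (Apollonius's theorem). The short derivation below is a sketch under assumptions: the points are taken to be Euclidean, d_1 and d_2 are read as the distances from some point p to a and b, d_3 as the distance between a and b, and d_m as the distance from p to the midpoint c of a and b. This labelling is inferred from the matrix form of the update rather than stated on the slide. Placing c at the origin, so that a = u and b = −u:

$$
\begin{aligned}
d_1^2 + d_2^2 &= \lVert p - u \rVert^2 + \lVert p + u \rVert^2 = 2\lVert p \rVert^2 + 2\lVert u \rVert^2, \\
d_m &= \lVert p \rVert, \qquad d_3 = 2\lVert u \rVert, \\
d_1^2 + d_2^2 &= 2 d_m^2 + \tfrac{1}{2} d_3^2, \\
d_m &= \sqrt{\frac{2 d_1^2 + 2 d_2^2 - d_3^2}{4}}.
\end{aligned}
$$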

17 / 28

slide-42
SLIDE 42

Agglomeration

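Distance matrix before agglomerating a and b:

$$
\begin{array}{c|cccccc}
 & 1 & 2 & \cdots & m & a & b \\ \hline
1 & & d_{12} & \cdots & d_{1m} & d_{a1} & d_{b1} \\
2 & d_{12} & & \cdots & d_{2m} & d_{a2} & d_{b2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
m & d_{1m} & d_{2m} & \cdots & & d_{am} & d_{bm} \\
a & d_{a1} & d_{a2} & \cdots & d_{am} & & d_{ab} \\
b & d_{b1} & d_{b2} & \cdots & d_{bm} & d_{ab} &
\end{array}
$$

⇓

Distance matrix after a and b are replaced by c:

$$
\begin{array}{c|ccccc}
 & 1 & 2 & \cdots & m & c \\ \hline
1 & & d_{12} & \cdots & d_{1m} & d_{c1} \\
2 & d_{12} & & \cdots & d_{2m} & d_{c2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
m & d_{1m} & d_{2m} & \cdots & & d_{cm} \\
c & d_{c1} & d_{c2} & \cdots & d_{cm} &
\end{array}
\qquad
d_{ci} = \sqrt{\frac{2d_{ai}^2 + 2d_{bi}^2 - d_{ab}^2}{4}}, \quad i = 1, \ldots, m
$$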
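The matrix update can be sketched in a few lines of Python. The function name `agglomerate`, the dict-of-dicts layout for the distance matrix, and the clamping of a negative value under the square root (which can only occur for non-Euclidean input) are assumptions made for this illustration.

```python
import math


def agglomerate(dist, a, b, merged):
    """Replace points a and b by one point `merged` in a symmetric distance table.

    `dist[s][t]` is the distance between s and t. The new distances follow
    d_ci = sqrt((2*d_ai^2 + 2*d_bi^2 - d_ab^2) / 4).
    """
    others = [t for t in dist if t not in (a, b)]
    # Distances among the untouched points are copied unchanged.
    new = {s: {t: dist[s][t] for t in others if t != s} for s in others}
    new[merged] = {}
    for t in others:
        sq = (2 * dist[a][t] ** 2 + 2 * dist[b][t] ** 2 - dist[a][b] ** 2) / 4
        d_ct = math.sqrt(max(sq, 0.0))  # assumption: clamp negatives from non-Euclidean input
        new[merged][t] = new[t][merged] = d_ct
    return new


# Euclidean example: a and b are symmetric about the origin, so c is the origin.
coords = {"a": (-1.0, 0.0), "b": (1.0, 0.0), "g": (0.0, 2.0), "d": (3.0, 0.0)}
dist = {s: {t: math.dist(coords[s], coords[t]) for t in coords if t != s} for s in coords}

reduced = agglomerate(dist, "a", "b", "c")
```

In this example the new distances from c to g and d agree (up to rounding) with the distances from the origin, which is what the midpoint interpretation predicts.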

18 / 28

slide-46
SLIDE 46

Expansion

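[Figure: the merged point c is expanded back into a and b among the points g, d, e; the figure also marks points a′ and b′.]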

19 / 28

slide-50
SLIDE 50

Expansion

We know:

  • c = {a, b}
  • d_ag, d_bg
  • d_ad, d_bd
  • d_ae, d_be

We don’t know:

  • Actual dimension

[Figure: points g, d, e and c with the distances d_ce, d_cg, d_cd marked, together with a and b.]

20 / 28

slide-56
SLIDE 56

Expansion

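[Figure: points g, d, e and c, with a and b placed opposite each other (a = −b); the plotted distances δ_ae, δ_be, δ_ag, δ_bg, δ_ad, δ_bd, δ_ab are marked.]

δ_ab ∼ d_ab
δ_ag ∼ d_ag, δ_bg ∼ d_bg
δ_ad ∼ d_ad, δ_bd ∼ d_bd
δ_ae ∼ d_ae, δ_be ∼ d_be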

21 / 28

slide-59
SLIDE 59

Expansion [minimizing stress function]

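We have m points and want to separate a and b:

$$\sum_{i=1}^{m}\left[(\delta_{ai} - d_{ai})^2 + (\delta_{bi} - d_{bi})^2\right] + (\delta_{ab} - d_{ab})^2 \to \min$$

Substitute distances (δ’s) with coordinates (remember that a = −b):

$$\sum_{i=1}^{m}\left[\left(\sqrt{(x_i - x_a)^2 + (y_i - y_a)^2} - d_{ai}\right)^2 + \left(\sqrt{(x_i + x_a)^2 + (y_i + y_a)^2} - d_{bi}\right)^2\right] + \left(2\sqrt{x_a^2 + y_a^2} - d_{ab}\right)^2 \to \min$$

And that is hard.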

22 / 28

slide-62
SLIDE 62

Expansion (Solution no.1)

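[Figure: a geometric construction around c using the points g, d, e; the labels ad, bd, ae, be, ag, bg appear next to d, e and g, and an inset shows c, a point n with the distances d_an and d_bn, and the points a, b, a′, b′.]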

23 / 28

slide-63
SLIDE 63

Expansion (Solution no.2)

Solve numerically.
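The slide does not say which numerical method is used. As one possible illustration, and not necessarily the approach taken in the talk, the sketch below minimises the expansion stress over the position of a with a simple derivative-free compass search, assuming the coordinate origin is at the merged point c so that b = −a. The names `expansion_stress` and `compass_search`, the starting point and the step sizes are all choices made for this example.

```python
import math


def expansion_stress(pa, others, d_a, d_b, d_ab):
    """Stress of placing a at pa and b at -pa (origin at the merged point c)."""
    xa, ya = pa
    total = (2 * math.hypot(xa, ya) - d_ab) ** 2
    for label, (xi, yi) in others.items():
        total += (math.hypot(xi - xa, yi - ya) - d_a[label]) ** 2
        total += (math.hypot(xi + xa, yi + ya) - d_b[label]) ** 2
    return total


def compass_search(f, start, step=0.5, tol=1e-9):
    """Local minimiser: try moves along each axis, halve the step when none helps."""
    best, best_val = tuple(start), f(start)
    while step > tol:
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            cand = (best[0] + dx, best[1] + dy)
            val = f(cand)
            if val < best_val:
                best, best_val, improved = cand, val, True
        if not improved:
            step /= 2
    return best, best_val


# Synthetic data: plotted positions of g, d, e and exact target distances
# generated from a known position of a (and b = -a).
others = {"g": (3.0, 0.0), "d": (0.0, 4.0), "e": (-2.0, -1.0)}
true_a = (1.0, 0.5)
true_b = (-true_a[0], -true_a[1])
d_a = {k: math.dist(p, true_a) for k, p in others.items()}
d_b = {k: math.dist(p, true_b) for k, p in others.items()}
d_ab = math.dist(true_a, true_b)


def f(pa):
    return expansion_stress(pa, others, d_a, d_b, d_ab)


start = (0.5, 0.0)
best, best_val = compass_search(f, start)
```

Compass search only guarantees that the stress never increases from the starting value; like any local method it may stop in a local minimum, so in practice several starting points would be tried.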

24 / 28

slide-65
SLIDE 65

How to evaluate what we get?

Compute overall stress. Compare neighbourhoods (n nearest neighbours).
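Comparing neighbourhoods can be made quantitative in several ways; the slide does not specify one. The sketch below uses one simple choice as an assumption: for each point, take its n nearest neighbours under the actual distances and under the plotted distances, and report the average fraction they share. The function names and the tie-breaking by label are hypothetical.

```python
def nearest_neighbours(dist, label, n):
    """The n points closest to `label`; ties are broken by label (an assumption)."""
    others = sorted((t for t in dist if t != label), key=lambda t: (dist[label][t], t))
    return set(others[:n])


def neighbourhood_overlap(d, delta, n):
    """Average fraction of n nearest neighbours shared between d and delta."""
    scores = [
        len(nearest_neighbours(d, t, n) & nearest_neighbours(delta, t, n)) / n
        for t in d
    ]
    return sum(scores) / len(scores)


# Actual distances.
d = {
    "a": {"b": 1, "c": 2, "d": 3},
    "b": {"a": 1, "c": 1.5, "d": 2.5},
    "c": {"a": 2, "b": 1.5, "d": 1},
    "d": {"a": 3, "b": 2.5, "c": 1},
}
# Plotted distances in which c has moved close to b and away from d.
delta = {
    "a": {"b": 1, "c": 2, "d": 3},
    "b": {"a": 1, "c": 0.5, "d": 2.5},
    "c": {"a": 2, "b": 0.5, "d": 2},
    "d": {"a": 3, "b": 2.5, "c": 2},
}

same = neighbourhood_overlap(d, d, 1)         # identical matrices
changed = neighbourhood_overlap(d, delta, 1)  # b and c change their nearest neighbour
```

A value of 1 means every neighbourhood is preserved; in the second comparison two of the four points keep their nearest neighbour, so the score is one half.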

25 / 28

slide-66
SLIDE 66

How to select neighbours?

  • Minimum/maximum distance
  • Minimum/maximum variance

26 / 28

slide-67
SLIDE 67

Thanks to

27 / 28

slide-68
SLIDE 68

Thank you for your attention!

28 / 28