Distributed and Non-Distributed Computational Models – Ami Paz, IRIF – PowerPoint PPT Presentation



SLIDE 1

Distributed and Non-Distributed Computational Models

Ami Paz IRIF – CNRS and Paris Diderot University

SLIDE 2

Message Passing Models

1. Local

2. Congest

3. Clique

SLIDE 3

Message Passing Models

 A graph 𝐺 = (𝑉, 𝐸) representing the network’s topology
 𝑛 unbounded processors, located on the nodes
 Communicating on the edges
 Synchronous network
 Compute / verify graph parameters
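The shared round structure of these models can be sketched in code. The following is a hypothetical illustration, not part of the slides; `run_rounds`, `make_msgs`, and `update` are invented names. In every synchronous round, each node sends a message over each incident edge based on its current state, then updates its state from the messages it received.

```python
# Hypothetical sketch of a synchronous message-passing network.
# adj[v] lists the neighbors of node v.

def run_rounds(adj, state, make_msgs, update, rounds):
    """Run `rounds` synchronous rounds.
    make_msgs(v, state_v, nbrs) -> {neighbor: message}
    update(v, state_v, inbox)   -> new state_v
    """
    for _ in range(rounds):
        # all nodes send simultaneously, based on their current state
        out = {v: make_msgs(v, state[v], adj[v]) for v in adj}
        # all messages are delivered, then all nodes update
        for v in adj:
            inbox = {u: out[u][v] for u in adj[v]}
            state[v] = update(v, state[v], inbox)
    return state

# Example task: every node learns the minimum ID in the network.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
ids = {0: 7, 1: 3, 2: 9, 3: 1, 4: 5}
result = run_rounds(path, dict(ids),
                    lambda v, s, nbrs: {u: s for u in nbrs},
                    lambda v, s, inbox: min([s] + list(inbox.values())),
                    rounds=4)  # the diameter of the path
```

After a number of rounds equal to the diameter, every node holds the global minimum — the kind of "compute a graph parameter" task the models are about.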

SLIDE 4

The Local Model

 Unbounded messages
 Solving local tasks:
   Coloring
   MST
   MIS
 Anything solvable in O(𝐷) rounds

(figure: the 1-hop and 2-hop environments of a node)

SLIDE 5

Two Examples

 Triangle detection
   Easy, in one round
   Send all your neighbors your list of neighbors
 Computing the diameter 𝐷
   Takes Θ(𝐷) rounds
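The one-round triangle-detection algorithm can be simulated centrally. This is a minimal sketch of my own (`detect_triangles` is an invented name), assuming the graph is given as a dict of neighbor sets:

```python
# Hypothetical simulation of the one-round Local algorithm:
# every node sends its full neighbor list to all its neighbors,
# then checks whether two of its own neighbors are adjacent.

def detect_triangles(adj):
    # Round 1: node v receives adj[u] from every neighbor u.
    received = {v: {u: adj[u] for u in adj[v]} for v in adj}
    # v detects a triangle iff some neighbor u is itself adjacent
    # to another neighbor w of v.
    return {v: any(w in received[v][u]
                   for u in adj[v] for w in adj[v] if w != u)
            for v in adj}

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

Note the messages here are whole neighbor lists, which is fine in Local but exactly what Congest forbids.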

SLIDE 6

Diameter Lower Bound

 Computing 𝐷 takes Ω(𝐷) rounds

 Indistinguishability argument


𝐸 = π‘œ/2 𝐸 = π‘œ βˆ’ 1

SLIDE 7

Diameter Lower Bound

 Computing 𝐷 takes Ω(𝐷) rounds

 Indistinguishability argument


View after 𝑛/2 βˆ’ 1 rounds (identical in both networks)

Cannot distinguish

𝐸 = π‘œ/2 𝐸 = π‘œ βˆ’ 1

SLIDE 8

The Congest Model

 Bounded message size; typically 𝑏 = O(log 𝑛)
 All Local lower bounds still hold
 Some Local algorithms still work
   But not all!

(figure: a bottleneck edge in the network)

SLIDE 9

Congest – Typical Lower Bound [HW12]

 Communication complexity problem
 Inputs encoded by a graph
 Split the graph between Alice and Bob
 CC lower bounds imply message lower bounds

(figure: Alice’s and Bob’s sides of the graph, joined by a bottleneck)

 Disjointness on Θ(𝑛²) bits. Diam 2 or 3?
   Diam 2 – disjoint
   Diam 3 – not disjoint
 Ω(𝑛) rounds are needed

SLIDE 10

Congest – Another Lower Bound

(figure: Alice’s side of the network, behind a bottleneck)

 Ω(βˆšπ‘›/𝑏) lower bound
 Verification: MST, bipartiteness, cycle, connectivity…
 Approximation: MST, min cut, shortest s-t path…

SLIDE 11

So Far:

 Local model:
   Unbounded messages
   Everything is solvable in O(𝐷) rounds
 Congest model:
   Message = O(log 𝑛) bits
   Lower bounds of Ω(βˆšπ‘› + 𝐷)
   Tight for many problems
 Question: is the Ω(βˆšπ‘›) due to congestion?

SLIDE 12

The Clique Model

 All-to-all message passing – a clique network
 Diameter of 1
 No distance – only congestion
 MST in O(logβˆ— 𝑛) rounds [GP16]
 Fast triangle detection, diameter, APSP, …

SLIDE 13

Clique – Lower Bound?

 Diam = 1
 Larger set – more outgoing edges
 No nontrivial lower bound is known
 Simple counting argument [DKO14]:
   Many functions need 𝑛 βˆ’ 5 log 𝑛 rounds

SLIDE 14

Parallel Systems

SLIDE 15

Parallel Systems

 π‘œ synchronous processors, 𝑙 inputs to each  Connected by a communication graph  Typical graphs:

 Clique  Cycle  T

  • rus (Grid)

 Known topology, known identities  Bounded message size  Bounded memory  Bounded computational power

SLIDE 16

Parallel vs. Congest

 Parallel is more restrictive:
   Bounded memory
   Bounded computational power
 Different focus:
   Specific communication graphs
   Algebraic questions vs. graph parameters

SLIDE 17

Circuits

SLIDE 18

Circuits

 Algebraic computation model
 A computation graph (circuit) composed of:
   Inputs, output, and operation gates
 Represent many algorithms:
   Matrix multiplication, determinant, permanent
 Complexity measures:
   Depth, number of gates, fan-in, fan-out

(figure: a circuit with +, *, ∧ and ∨ gates)

SLIDE 19


Circuit Families

 Arithmetic circuits
 Boolean circuits
 Boolean circuits augmented with:
   mod π‘š gates
   Threshold gates
   …

(figure: a circuit with ∧ gates and a mod 3 gate)

SLIDE 20

Circuit Lower Bounds

 What can be computed in constant depth?
 Counting argument:
   Many functions cannot be computed using Boolean circuits
   … or even using augmented circuits
 But:
   No explicit function is known

SLIDE 21

Circuits ⇔ Clique

SLIDE 22

Clique vs. Circuits

 Clique can simulate circuits [DKO14]
   Each node simulates a set of gates in a layer
   Circuit’s depth = # of rounds

(figure: a layered circuit with ∧, ∨ and mod 3 gates)

SLIDE 23

Clique vs. Circuits

 Main idea:
   Simulate each layer of the circuit in O(1) rounds

(figure: layer-by-layer simulation of a circuit with ∧, ∨ and mod 3 gates)
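The simulation idea can be sketched as follows. This is a hypothetical toy version, not the actual [DKO14] construction: gates are grouped into layers, each clique node evaluates its assigned share of the current layer, and the new values are then shared by all-to-all communication, so the number of simulated layers equals the circuit's depth.

```python
# Toy sketch: each layer is a list of (name, op, a, b) gates reading
# named wires; node p evaluates the gates g with g % n_nodes == p.
# In the clique, sharing the new values takes O(1) rounds per layer.

def clique_simulate(layers, inputs, n_nodes):
    values = dict(inputs)                     # wire name -> value, known to all
    for layer in layers:
        new = {}
        for p in range(n_nodes):              # node p's share of this layer
            for g, (name, op, a, b) in enumerate(layer):
                if g % n_nodes == p:
                    new[name] = op(values[a], values[b])
        values.update(new)                    # all-to-all exchange
    return values, len(layers)                # layers simulated = circuit depth

OR = lambda p, q: p | q
AND = lambda p, q: p & q
NAND = lambda p, q: 1 - (p & q)
# depth-2 circuit computing x XOR y
xor_layers = [[('a', OR, 'x', 'y'), ('b', NAND, 'x', 'y')],
              [('out', AND, 'a', 'b')]]
```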

SLIDE 27

Clique vs. Circuits

 Clique can simulate circuits
   Non-constant rounds lower bound for the Clique β‡’ non-constant depth lower bound for circuits
 There is also a reduction in the other direction [DKO14]
   A circuit can simulate the Clique

SLIDE 28

Parallel ⇔ Clique

SLIDE 29

Matrix Multiplication

 Base for many algebraic problems
 Thoroughly studied in parallel computing
 Several algorithms:
   Different topologies, input / output partitions

(figure: matrix product 𝑄 = 𝑆 β‹… 𝑇)

SLIDE 30

Matrix Multiplication

 This talk:
   The 3D algorithm [ABG+95]
   For 𝑛 Γ— 𝑛 matrices and 𝑛 processors
   Adaptation of the parallel algorithm to the Clique [CHK+16]

SLIDE 31

Matrix Multiplication

 Parallel 3D algorithm β‡’
   Clique matrix multiplication in O(𝑛^(1/3)) rounds
 Implies triangle detection, diameter, APSP, …
   In similar time [CHK+16]

SLIDE 32

Fast Matrix Multiplication

 Standard matrix multiplication:
   Compute 𝑛² entries, each needs 𝑛 multiplications
   Total: Θ(𝑛³) time
 There exist faster algorithms:
   Strassen: O(𝑛^2.807) [1969]
   Coppersmith–Winograd: O(𝑛^2.376) [1990]
   …
   Le Gall: O(𝑛^2.373) [2014]
 Can be implemented in the Clique
   Distributed matrix multiplication in O(𝑛^0.158) rounds
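Strassen's recursion, the first of the sub-cubic algorithms listed above, fits in a short sketch (my own implementation, assuming the dimension is a power of two): seven recursive products replace the eight of naive blocked multiplication, giving O(𝑛^logβ‚‚7) β‰ˆ O(𝑛^2.807).

```python
# Strassen's algorithm for n-by-n matrices, n a power of two.

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    quad = lambda M, r, c: [row[c*h:c*h + h] for row in M[r*h:r*h + h]]
    add = lambda X, Y: [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    sub = lambda X, Y: [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, 1), quad(A, 1, 0), quad(A, 1, 1)
    B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, 1), quad(B, 1, 0), quad(B, 1, 1)
    # seven recursive products instead of eight
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot
```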

SLIDE 33

Some Results & Conclusion

SLIDE 34

Triangle Detection in the Clique

  • 1. Combinatorial algorithm:

 Ο π‘œ

1 3

rounds [DLP12]

  • 2. Reduction from circuits for matrix multiplication:



π‘œπœ•βˆ’2 β‰ˆ Ο π‘œ0.373 rounds, randomized [DKO14]

  • 3. Using a technique from parallel matrix multiplication:

 O π‘œ1βˆ’

2 πœ• β‰ˆ Ο π‘œ0.158 rounds [CHK+16]

 2,3 Imply similar complexities for:

 APSP, diameter, girth


Sequential matrix multiplication: O(𝑛^Ο‰) operations

SLIDE 35

Conclusion

 Several models:
   Message passing
     Local, Congest and Clique
   Parallel systems
   Circuits
     Arithmetic, Boolean, augmented
 Many connections and similarities
 Approach different questions
 Using different techniques


Thank You!