SLIDE 1

ER-DCOPS: A FRAMEWORK FOR DCOP WITH UNCERTAINTY IN CONSTRAINT UTILITIES

Tiep Le, Ferdinando Fioretto, William Yeoh, Tran Cao Son, Enrico Pontelli
Computer Science Department, New Mexico State University

1

SLIDE 2

OUTLINE

  • BACKGROUND & MOTIVATION
  • ER-DCOP
  • ER-DPOP ALGORITHM
  • EXPERIMENTAL RESULTS
  • CONCLUSION

2

SLIDE 3

OUTLINE

  • BACKGROUND & MOTIVATION
  • ER-DCOP
  • ER-DPOP ALGORITHM
  • EXPERIMENTAL RESULTS
  • CONCLUSION

3

SLIDE 4

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>

4

x3 x1 x2

SLIDE 5

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>

5

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1}

SLIDE 6

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>

6

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1} f13 f23

SLIDE 7

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>

7

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1} f13 f23

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

SLIDE 8

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>

8

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1} f13 f23

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

Worker x1 owns variable x1; worker x2 owns variable x2; assistant robot x3 owns variable x3.

SLIDE 9

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>
  • Goal: find the assignment to all variables that maximizes the aggregate utility.

9

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1} f13 f23

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

Worker x1 owns variable x1; worker x2 owns variable x2; assistant robot x3 owns variable x3.
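To make the running example concrete, the following is a minimal, centralized sketch in Python (my illustration, not the distributed algorithm) that enumerates all assignments of this DCOP and picks the one with the largest aggregate utility, using the f13 and f23 tables above.

from itertools import product

# Domains from the slides: D1 = D2 = {0}, D3 = {0, 1}.
domains = {"x1": [0], "x2": [0], "x3": [0, 1]}

# Utility tables from the slides, keyed by (value of first variable, value of x3).
f13 = {(0, 0): 50, (0, 1): 30}   # U13(x1, x3)
f23 = {(0, 0): 40, (0, 1): 50}   # U23(x2, x3)

def aggregate_utility(a):
    # Sum of the two constraint utilities for a complete assignment a.
    return f13[(a["x1"], a["x3"])] + f23[(a["x2"], a["x3"])]

names = list(domains)
best = max(
    (dict(zip(names, values)) for values in product(*domains.values())),
    key=aggregate_utility,
)
print(best, aggregate_utility(best))   # {'x1': 0, 'x2': 0, 'x3': 0} with utility 90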

SLIDE 10

MOTIVATION

  • In real-world applications, the utilities are stochastic.

10

f23 with a stochastic outcome:
(x2=0, x3=0): Fail → U23 = 0, Success → U23 = 40
(x2=0, x3=1): Fail → U23 = 0, Success → U23 = 50

SLIDE 11

UR-DCOP

  • In real-world applications, the utilities are stochastic.
  • Stochastic utilities can be sampled from a known probability distribution space.

11

f23 with a stochastic outcome:
(x2=0, x3=0): Fail → U23 = 0, Success → U23 = 40
(x2=0, x3=1): Fail → U23 = 0, Success → U23 = 50

Beliefs (P(Fail), P(Success)):
Good: (50%, 50%) for x3 = 0, (50%, 50%) for x3 = 1
Bad: (90%, 10%) for x3 = 0, (90%, 10%) for x3 = 1

SLIDE 12

MOTIVATION

  • In real-world applications, the utilities are stochastic.
  • Stochastic utilities can be sampled from a known probability distribution space.

  • Expected-regret

12

f23 with a stochastic outcome:
(x2=0, x3=0): Fail → U23 = 0, Success → U23 = 40
(x2=0, x3=1): Fail → U23 = 0, Success → U23 = 50

Beliefs (P(Fail), P(Success)):
Good: (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
Bad: (90%, 10%) for x3 = 0, (50%, 50%) for x3 = 1

SLIDE 13

MOTIVATION

  • In real-world applications, the utilities are stochastic.
  • Stochastic utilities can be sampled from a known probability distribution space.

13

ER-DCOP framework!

f23 with a stochastic outcome:
(x2=0, x3=0): Fail → U23 = 0, Success → U23 = 40
(x2=0, x3=1): Fail → U23 = 0, Success → U23 = 50

Beliefs (P(Fail), P(Success)):
Good: (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
Bad: (90%, 10%) for x3 = 0, (50%, 50%) for x3 = 1

SLIDE 14

OUTLINE

  • BACKGROUND
  • ER-DCOP
  • ER-DPOP ALGORITHM
  • EXPERIMENTAL RESULTS
  • CONCLUSION

14

SLIDE 15

EXPECTED REGRET-DCOP (ER-DCOP)

15

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

SLIDE 16

EXPECTED REGRET-DCOP (ER-DCOP)

16

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

f23 with a stochastic outcome:
(x2=0, x3=0): Fail → U23 = 0, Success → U23 = 40
(x2=0, x3=1): Fail → U23 = 0, Success → U23 = 50

SLIDE 17

EXPECTED REGRET-DCOP (ER-DCOP)

17

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

f23 with random variable r2:
(x2=0, x3=0): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 40
(x2=0, x3=1): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 50

SLIDE 18

EXPECTED REGRET-DCOP (ER-DCOP)

18

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

f23 with random variable r2:
(x2=0, x3=0): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 40
(x2=0, x3=1): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 50

Beliefs for r2 (P(Fail), P(Success)):
Good: (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
Bad: (90%, 10%) for x3 = 0, (50%, 50%) for x3 = 1

Joint belief probabilities: Good: 12%, Bad: 88%

SLIDE 19

EXPECTED REGRET-DCOP (ER-DCOP)

19

f23 with random variable r2:
(x2=0, x3=0): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 40
(x2=0, x3=1): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 50
Beliefs for r2 (P(Fail), P(Success)): Good: (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1; Bad: (90%, 10%) for x3 = 0, (50%, 50%) for x3 = 1

f13 with random variable r1:
(x1=0, x3=0): r1 = 0 (Fail) → U13 = 0, r1 = 1 (Success) → U13 = 50
(x1=0, x3=1): r1 = 0 (Fail) → U13 = 0, r1 = 1 (Success) → U13 = 30
Beliefs for r1 (P(Fail), P(Success)): Good: (10%, 90%) for x3 = 0, (30%, 70%) for x3 = 1; Bad: (30%, 70%) for x3 = 0, (50%, 50%) for x3 = 1

  • ER-DCOP P = <Χ, D, A, α, R, S, F>
SLIDE 20

EXPECTED REGRET-DCOP (ER-DCOP)

20

f23 with random variable r2: (x2=0, x3=0): Fail → 0, Success → 40; (x2=0, x3=1): Fail → 0, Success → 50; beliefs under Good (P(Fail), P(Success)): (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
f13 with random variable r1: (x1=0, x3=0): Fail → 0, Success → 50; (x1=0, x3=1): Fail → 0, Success → 30; beliefs under Good (P(Fail), P(Success)): (10%, 90%) for x3 = 0, (30%, 70%) for x3 = 1

  • ER-DCOP P = <Χ, D, A, α, R, S, F>
  • Each random variable (r1, r2) has a belief.
  • A joint belief specifies a belief for every random variable.
SLIDE 21

EXPECTED REGRET-DCOP (ER-DCOP)

21

f23 with random variable r2: (x2=0, x3=0): Fail → 0, Success → 40; (x2=0, x3=1): Fail → 0, Success → 50; beliefs under Good (P(Fail), P(Success)): (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
f13 with random variable r1: (x1=0, x3=0): Fail → 0, Success → 50; (x1=0, x3=1): Fail → 0, Success → 30; beliefs under Good (P(Fail), P(Success)): (10%, 90%) for x3 = 0, (30%, 70%) for x3 = 1

  • ER-DCOP P = <Χ, D, A, α, R, S, F>
  • Using Expected Utility (EU)

EU of f13 under the Good belief: 45 for x3 = 0, 21 for x3 = 1; EU of f23 under the Good belief: 20 for x3 = 0, 40 for x3 = 1

(here, only one joint belief, good weather, is considered)
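The EU values above can be checked directly: the expected utility of a constraint, for a fixed value of x3 and a fixed belief, is the probability-weighted sum of its utilities. A small Python sketch (the dictionaries simply transcribe the tables above; an illustration, not the ER-DPOP code):

# Utility per outcome of the random variable (Fail contributes 0), indexed by x3.
util_f13 = {0: {"Fail": 0, "Success": 50}, 1: {"Fail": 0, "Success": 30}}
util_f23 = {0: {"Fail": 0, "Success": 40}, 1: {"Fail": 0, "Success": 50}}

# Outcome probabilities under the good-weather belief, indexed by x3.
belief_good_r1 = {0: {"Fail": 0.1, "Success": 0.9}, 1: {"Fail": 0.3, "Success": 0.7}}
belief_good_r2 = {0: {"Fail": 0.5, "Success": 0.5}, 1: {"Fail": 0.2, "Success": 0.8}}

def expected_utility(util, belief, x3):
    # Probability-weighted sum over the outcomes of the random variable.
    return round(sum(belief[x3][o] * util[x3][o] for o in util[x3]), 2)

for x3 in (0, 1):
    print("x3 =", x3,
          "f13:", expected_utility(util_f13, belief_good_r1, x3),
          "f23:", expected_utility(util_f23, belief_good_r2, x3))
# x3 = 0: f13 -> 45.0, f23 -> 20.0;  x3 = 1: f13 -> 21.0, f23 -> 40.0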

SLIDE 22

EXPECTED REGRET-DCOP (ER-DCOP)

22

EU under bad weather: (x1=0, x2=0, x3=0) → 39; (x1=0, x2=0, x3=1) → 40

EU under good weather: (x1=0, x2=0, x3=0) → 65; (x1=0, x2=0, x3=1) → 61

Belief space: Good: 12%, Bad: 88%

Optimal assignment under bad weather: (0, 0, 1), with EU = 40. Optimal assignment under good weather: (0, 0, 0), with EU = 65.

SLIDE 23

EXPECTED REGRET-DCOP (ER-DCOP)

23

Regret under bad weather: (x1=0, x2=0, x3=0) → 40 - 39 = 1; (x1=0, x2=0, x3=1) → 40 - 40 = 0
Regret under good weather: (x1=0, x2=0, x3=0) → 65 - 65 = 0; (x1=0, x2=0, x3=1) → 65 - 61 = 4

Belief space: Good: 12%, Bad: 88%

Assignment x1 = x2 = x3 = 0 has a regret of 1 under bad weather and a regret of 0 under good weather.

SLIDE 24

EXPECTED REGRET-DCOP (ER-DCOP)

24

Regret under bad weather: 1 for (0, 0, 0), 0 for (0, 0, 1); regret under good weather: 0 for (0, 0, 0), 4 for (0, 0, 1)

Expected-Regret (ER), with belief space Good: 12%, Bad: 88%:
(x1=0, x2=0, x3=0): ER = 12% * 0 + 88% * 1 = 0.88
(x1=0, x2=0, x3=1): ER = 12% * 4 + 88% * 0 = 0.48
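The regret and expected-regret arithmetic of the last two slides can be reproduced in a few lines; a sketch in Python with the values copied from the slides (illustration only):

# EU of each complete assignment (x1 = x2 = 0, so only x3 varies) per joint belief.
eu = {"good": {0: 65, 1: 61}, "bad": {0: 39, 1: 40}}
belief_prob = {"good": 0.12, "bad": 0.88}

# Regret under a belief: best EU achievable under that belief minus the assignment's EU.
regret = {b: {x3: max(eu[b].values()) - eu[b][x3] for x3 in eu[b]} for b in eu}

# Expected regret: regrets weighted by the probability of each joint belief.
expected_regret = {x3: round(sum(belief_prob[b] * regret[b][x3] for b in eu), 2)
                   for x3 in (0, 1)}

print(regret)            # good: {0: 0, 1: 4}; bad: {0: 1, 1: 0}
print(expected_regret)   # {0: 0.88, 1: 0.48} -> minimizing ER selects x3 = 1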

SLIDE 25

EXPECTED REGRET-DCOP (ER-DCOP)

25

EU under bad weather: f13: 35 for x3 = 0, 15 for x3 = 1; f23: 4 for x3 = 0, 25 for x3 = 1
EU under good weather: f13: 45 for x3 = 0, 21 for x3 = 1; f23: 20 for x3 = 0, 40 for x3 = 1

Expected-Regret (ER), with belief space Good: 12%, Bad: 88%:
(x1=0, x2=0, x3=0): ER = 0.88
(x1=0, x2=0, x3=1): ER = 0.48

The ER-DCOP solution is the assignment that minimizes the expected regret: here x1 = x2 = 0, x3 = 1.

SLIDE 26

OUTLINE

  • BACKGROUND
  • ER-DCOP
  • ER-DPOP ALGORITHM
  • EXPERIMENTAL RESULTS
  • CONCLUSION

26

SLIDE 27

ER-DPOP

  • Phase 1: Generation of the pseudo-tree

27

x3 x1 x2

SLIDE 28

ER-DPOP

  • Phase 2: Resolution of subproblems

28

x3 x1 x2

f13 with random variable r1:
(x1=0, x3=0): r1 = 0 (Fail) → U13 = 0, r1 = 1 (Success) → U13 = 50
(x1=0, x3=1): r1 = 0 (Fail) → U13 = 0, r1 = 1 (Success) → U13 = 30
Beliefs for r1 (P(Fail), P(Success)): Good: (10%, 90%) for x3 = 0, (30%, 70%) for x3 = 1; Bad: (30%, 70%) for x3 = 0, (50%, 50%) for x3 = 1

Projected table at x3 (EU = Expected Utility): EU(Good) = 45 for x3 = 0, 21 for x3 = 1; EU(Bad) = 35 for x3 = 0, 15 for x3 = 1

SLIDE 29

ER-DPOP

  • Phase 2: Resolution of subproblems

29

x3 x1 x2

f23 with random variable r2:
(x2=0, x3=0): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 40
(x2=0, x3=1): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 50
Beliefs for r2 (P(Fail), P(Success)): Good: (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1; Bad: (90%, 10%) for x3 = 0, (50%, 50%) for x3 = 1

Projected table at x3 (EU = Expected Utility): EU(Good) = 20 for x3 = 0, 40 for x3 = 1; EU(Bad) = 4 for x3 = 0, 25 for x3 = 1

SLIDE 30

ER-DPOP

  • Phase 2: Resolution of subproblems

30

x3 x1 x2

Combined table at x3 (EU = Expected Utility):
EU(Good): 45 + 20 = 65 for x3 = 0, 21 + 40 = 61 for x3 = 1
EU(Bad): 35 + 4 = 39 for x3 = 0, 15 + 25 = 40 for x3 = 1
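Phase 2, as shown on the last three slides, amounts to projecting each constraint's random variable out (one EU value per joint belief and per value of x3) and then summing the resulting tables pointwise. A compact Python sketch of that computation with the slides' numbers (the data layout is my own, not ER-DPOP's message format):

constraints = {
    "f13": {"util": {0: (0, 50), 1: (0, 30)},   # (Fail, Success) utilities per value of x3
            "belief": {"good": {0: (0.1, 0.9), 1: (0.3, 0.7)},
                       "bad":  {0: (0.3, 0.7), 1: (0.5, 0.5)}}},
    "f23": {"util": {0: (0, 40), 1: (0, 50)},
            "belief": {"good": {0: (0.5, 0.5), 1: (0.2, 0.8)},
                       "bad":  {0: (0.9, 0.1), 1: (0.5, 0.5)}}},
}

def project(c):
    # Eliminate the random variable: EU per joint belief and per value of x3.
    return {b: {x3: round(sum(p * u for p, u in zip(probs[x3], c["util"][x3])), 2)
                for x3 in c["util"]}
            for b, probs in c["belief"].items()}

tables = {name: project(c) for name, c in constraints.items()}
combined = {b: {x3: tables["f13"][b][x3] + tables["f23"][b][x3] for x3 in (0, 1)}
            for b in ("good", "bad")}
print(tables)     # f13: good 45/21, bad 35/15; f23: good 20/40, bad 4/25
print(combined)   # good: {0: 65.0, 1: 61.0}; bad: {0: 39.0, 1: 40.0}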

SLIDE 31

ER-DPOP

  • Phase 3: Resolution of the main problems
  • Generate DCOP with expected-regret as utilities
  • Use DPOP [Petcu et al. AAAI2007] to solve that DCOP

31

EU tables from Phase 2:
f13 at x3: EU(Good) = 45, 21 and EU(Bad) = 35, 15 for x3 = 0, 1
f23 at x3: EU(Good) = 20, 40 and EU(Bad) = 4, 25 for x3 = 0, 1

Expected-regret table for f13:
(x1=0, x3=0): 12% * (45 - 45) + 88% * (15 - 35) = -17.6
(x1=0, x3=1): 12% * (45 - 21) + 88% * (15 - 15) = 2.88

Expected-regret table for f23:
(x2=0, x3=0): 12% * (20 - 20) + 88% * (25 - 4) = 18.48
(x2=0, x3=1): 12% * (20 - 40) + 88% * (25 - 25) = -2.4
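The expected-regret utilities in this table are consistent with the following reading: for each joint belief, the reference is the constraint's EU at the value of x3 that is optimal under that belief (x3 = 0 under good weather, x3 = 1 under bad weather, from the combined Phase 2 table), and the differences are weighted by the belief probabilities. A Python sketch of that computation (the exact rule used inside ER-DPOP is my interpretation of the slide's numbers):

# Per-constraint EU tables from Phase 2, indexed by joint belief and by x3.
eu = {"f13": {"good": {0: 45, 1: 21}, "bad": {0: 35, 1: 15}},
      "f23": {"good": {0: 20, 1: 40}, "bad": {0: 4,  1: 25}}}
belief_prob = {"good": 0.12, "bad": 0.88}

# Value of x3 that is optimal under each joint belief (from the combined table).
best_x3 = {"good": 0, "bad": 1}

def expected_regret_table(name):
    # Expected-regret utility of each value of x3 for one constraint.
    return {x3: round(sum(belief_prob[b] * (eu[name][b][best_x3[b]] - eu[name][b][x3])
                          for b in belief_prob), 2)
            for x3 in (0, 1)}

for name in eu:
    print(name, expected_regret_table(name))
# f13: {0: -17.6, 1: 2.88}; f23: {0: 18.48, 1: -2.4}
# Summing the two tables gives 0.88 for x3 = 0 and 0.48 for x3 = 1,
# matching the expected regrets of the complete assignments on slide 24.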

SLIDE 32

ER-DPOP IMPLEMENTATIONS

  • GPU-ER-DPOP (GPU-based ER-DPOP)
  • Utilizes the parallelism offered by the Graphics Processing Unit (GPU) to speed up computations in ER-DPOP
  • ASP-ER-DPOP (ASP-based ER-DPOP)
  • Prunes the search space using logic-programming-based inference rules in Answer Set Programming (ASP)

32

SLIDE 33

RELATED WORK

  • UR-DCOP (F. Wu et al. AAAI 2014)
  • Beliefs of random variables are independent of the values of decision variables;
  • The belief space does not exhibit a probabilistic model;
  • Minimizes the worst-case loss (regret) over the belief space.

33

SLIDE 34

OUTLINE

  • BACKGROUND
  • ER-DCOP
  • ER-DPOP ALGORITHM
  • EXPERIMENTAL RESULTS
  • CONCLUSION

34

SLIDE 35

EXPERIMENTAL RESULTS

  • Algorithms:
  • GPU-ER-DPOP
  • ASP-ER-DPOP
  • FRODO-ER (solves the subproblems in Phase 2 sequentially)
  • Domains:
  • Random Graph (varying |X|, |D|, constraint density p1, constraint tightness p2, or belief space’s size)

  • Power Network Problems (varying Topology or |D|)

35

SLIDE 36

EXPERIMENTAL RESULTS

|X|   ASP-ER-DPOP   GPU-ER-DPOP   FRODO-ER
 8        3.1           0.1          0.3
13        9.4           0.2         61.1
18       44.1           N/A          N/A
23      120.8           N/A          N/A

Random Graphs; runtime in seconds; N/A: not available

36

|D|   ASP-ER-DPOP   GPU-ER-DPOP   FRODO-ER
 4        4.5           0.1          1.8
 6        8.9           0.1         33.6
 8       22.2           1.2        143.2
10       80.4           4.8          N/A
12      121.2          15.4          N/A

SLIDE 37

EXPERIMENTAL RESULTS

Power Network Problems

37

[Three plots: Simulated Runtime (ms, log scale) vs. Domain Size, comparing ASP-ER and Frodo-ER on power networks of increasing size: |A| = 13, |X| = 74, |F| = 51; |A| = 37, |X| = 218, |F| = 147; |A| = 124, |X| = 748, |F| = 497.]

SLIDE 38

EXPERIMENTAL RESULTS

  • Compare the actual regret between:
  • ER-DCOP
  • UR-DCOP (F. Wu et al. AAAI 2014)
  • Beliefs of random variables are independent of the values of decision variables;
  • The belief space does not exhibit a probabilistic model;
  • Minimizes the worst-case loss (regret) over the belief space.
  • Domain:
  • Random Graph
  • UR-DCOP instances are augmented with a probability for each joint belief according to a normal distribution.

38

SLIDE 39

EXPERIMENTAL RESULTS

Belief Space Size   Better   Worse   Equal
 5                   45%      20%     35%
10                   36%      28%     36%
15                   47%      20%     33%

Compare Actual Regret ER-DCOP solution vs UR-DCOP solution

39

SLIDE 40

CONCLUSION

  • ER-DCOPs model DCOPs with uncertainty in constraint utilities.
  • ER-DPOP is a complete, distributed algorithm for solving ER-DCOPs.
  • GPU-ER-DPOP harnesses the parallelism offered by GPUs.
  • ASP-ER-DPOP exploits logic-programming-based inference rules to prune the search space.
  • The ER-DCOP solution outperforms the UR-DCOP solution in terms of actual regret (when the belief space exhibits a normal distribution).

40

SLIDE 41

THANK YOU FOR YOUR ATTENTION!

41

SLIDE 42

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

42

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

SLIDE 43

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

43

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

Projection of f13 onto x3: Umax = 50 for x3 = 0, 30 for x3 = 1

SLIDE 44

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

44

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

Projection of f13 onto x3: Umax = 50 for x3 = 0, 30 for x3 = 1
Projection of f23 onto x3: Umax = 40 for x3 = 0, 50 for x3 = 1
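The UTIL phase shown here is two projections followed by a join at x3; a minimal centralized Python illustration (not the distributed implementation):

# Utility tables from the slide, keyed by (value of the other variable, value of x3).
f13 = {(0, 0): 50, (0, 1): 30}
f23 = {(0, 0): 40, (0, 1): 50}

def project_onto_x3(table, other_domain=(0,)):
    # UTIL message: for each value of x3, the best utility over the other variable.
    return {x3: max(table[(v, x3)] for v in other_domain) for x3 in (0, 1)}

util_from_f13 = project_onto_x3(f13)   # {0: 50, 1: 30}
util_from_f23 = project_onto_x3(f23)   # {0: 40, 1: 50}

# x3 joins the incoming UTIL messages and picks its best value.
joined = {x3: util_from_f13[x3] + util_from_f23[x3] for x3 in (0, 1)}
print(joined)                          # {0: 90, 1: 80}
print(max(joined, key=joined.get))     # 0, i.e., x3 = 0 with aggregate utility 90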

SLIDE 45

ANSWER SET PROGRAMMING (ASP)

  • Π = { rule | rule’s form: c ← a1, …, am, not b1, …, not bn }
  • The answer sets of an ASP program which encodes a problem P represent solutions for P.

45

SLIDE 46

BENEFITS: RULE vs TABLE

Dx1 = Dx2 = {0, 1}; U(X1, X2) = X1 + X2.

Explicit (table) representation:
x1  x2  U12
 0   0   0
 0   1   1
 1   0   1
 1   1   2

Implicit (rule) representation:
domain_x1(0..1). domain_x2(0..1).
utility1_2(U,X1,X2) ← domain_x1(X1), domain_x2(X2), U = X1 + X2.

46
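The same contrast can be sketched outside ASP: an explicit table stores one utility entry per value combination, while a rule computes the utility on demand and stays the same size as the domains grow. A rough Python analogue (my illustration; it is not part of the slides' ASP encoding):

from itertools import product

DX1 = DX2 = (0, 1)

# Explicit (table) representation: one stored entry per combination of values.
utility_table = {(x1, x2): x1 + x2 for x1, x2 in product(DX1, DX2)}

# Implicit (rule) representation: the utility is defined once and computed when needed.
def utility_rule(x1, x2):
    return x1 + x2

# Both representations agree on every entry of the example.
assert all(utility_table[(x1, x2)] == utility_rule(x1, x2)
           for x1, x2 in product(DX1, DX2))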

SLIDE 47

BENEFITS: OPTIMIZED ASP SOLVER (GROUNDING)

Dx1 = Dx2 = {0, 1}; the message U(X1, X2) = 0 if X1 = X2 = 0, otherwise -∞.

x1  x2  U12
 0   0    0
 0   1   -∞
 1   0   -∞
 1   1   -∞

Rule representation:
domain_x1(0..1). domain_x2(0..1).
utility1_2(0,X1,X2) ← domain_x1(X1), domain_x2(X2), X1 = 0, X2 = 0.

47

SLIDE 48

EXPECTED REGRET-DCOP (ER-DCOP)

48

f23 with random variable r2: (x2=0, x3=0): Fail → 0, Success → 40; (x2=0, x3=1): Fail → 0, Success → 50; beliefs under Good (P(Fail), P(Success)): (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
f13 with random variable r1: (x1=0, x3=0): Fail → 0, Success → 50; (x1=0, x3=1): Fail → 0, Success → 30; beliefs under Good (P(Fail), P(Success)): (10%, 90%) for x3 = 0, (30%, 70%) for x3 = 1
Joint belief probabilities: Good: 12%, Bad: 88%

  • ER-DCOP P = <Χ, D, A, α, R, S, F>
  • The conditional probability distribution of a random variable is a belief of the random variable.

SLIDE 49

GPU-ER-DPOP

  • Specifying a DCOP using ASP:
  • var(x3). dom(x3, 0..1).
  • constraint(u1_3). scope(u1_3,x1,x3). util1_3(5,0,0). (facts or rules)
  • agent(a3). owner(a3,x3).
  • 3 phases, as in DPOP.
  • Information about children and ancestors. In x2:
  • ancestor(x1).
  • children(x3).

x1 x3 x2

49

SLIDE 50

ASP-DPOP

Agent Controller in Agent 2 (figure):

Program Πa2:
table_row_a2(V0+V1,X1) ← x1_cons_x2(V0,X1,X2), table_max_a3(V1,X1,X2).
table_max_a2(U,X1) ← U = #max[ table_row_a2(V,X1)=V ].
0 { row(U,X2) } ← table_max_a2(U,X1), solution(a1,x1,X1), x1_cons_x2(V0,X1,X2), table_max_a3(V1,X1,X2), U == V0+V1.
← not 1 { row(U,X2) } 1.
solution(a2,x2,X2) ← row(U,X2).

From agent a3: table_max_a3(16,0,0). table_max_a3(25,0,1). table_max_a3(25,1,0). table_max_a3(40,1,1). table_info(a3,a2,x2,0,1). table_info(a3,a1,x1,0,1).
To agent a1: table_max_a2(33,0). table_max_a2(45,1). table_info(a2,a1,x1,0,1).
From agent a1: solution(a1,x1,1).
To agent a3: solution(a2,x2,0). solution(a1,x1,1).

50

SLIDE 51

EXPECTED REGRET-DCOP (ER-DCOP)

51

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

f23 with random variable r2:
(x2=0, x3=0): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 40
(x2=0, x3=1): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 50
Beliefs for r2 under Good (P(Fail), P(Success)): (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1

SLIDE 52

EXPECTED REGRET-DCOP (ER-DCOP)

52

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

f23 with random variable r2:
(x2=0, x3=0): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 40
(x2=0, x3=1): r2 = 0 (Fail) → U23 = 0, r2 = 1 (Success) → U23 = 50
Beliefs for r2 (P(Fail), P(Success)): Good: (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1; Bad: (90%, 10%) for x3 = 0, (50%, 50%) for x3 = 1

Joint belief probabilities: Good: 12%, Bad: 88%

SLIDE 53

EXPECTED REGRET-DCOP (ER-DCOP)

53

f23 with random variable r2: (x2=0, x3=0): Fail → 0, Success → 40; (x2=0, x3=1): Fail → 0, Success → 50; beliefs under Good (P(Fail), P(Success)): (50%, 50%) for x3 = 0, (20%, 80%) for x3 = 1
f13 with random variable r1: (x1=0, x3=0): Fail → 0, Success → 50; (x1=0, x3=1): Fail → 0, Success → 30; beliefs under Good (P(Fail), P(Success)): (10%, 90%) for x3 = 0, (30%, 70%) for x3 = 1

  • ER-DCOP P = <Χ, D, A, α, R, S, F>
  • EU = Expected Utility

EU of f13 under the Good belief: 45 for x3 = 0, 21 for x3 = 1; EU of f23 under the Good belief: 20 for x3 = 0, 40 for x3 = 1

SLIDE 54

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>
  • Goal: find the assignment to all variables that maximizes the aggregate utility.

54

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1}

SLIDE 55

MOTIVATION

  • In real-world applications, the utilities are stochastic.
  • Stochastic utilities can be sampled from a known probability distribution space.

55

ER-DCOP framework

SLIDE 56

DISTRIBUTED CONSTRAINT OPTIMIZATION PROBLEMS

  • DCOP P = <Χ, D, F, A, α>

56

x3 x1 x2

D1 = D2 = {0} D3 = {0, 1} f13 f23

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

SLIDE 57

DPOP [1]

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

57

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

[1] A. Petcu et al., IJCAI 2005

SLIDE 58

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

58

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

Projection of f13 onto x3: Umax = 50 for x3 = 0, 30 for x3 = 1

SLIDE 59

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

59

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

Projection of f23 onto x3: Umax = 40 for x3 = 0, 50 for x3 = 1
Projection of f13 onto x3: Umax = 50 for x3 = 0, 30 for x3 = 1

SLIDE 60

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

60

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

Projection of f23 onto x3: Umax = 40 for x3 = 0, 50 for x3 = 1
Projection of f13 onto x3: Umax = 50 for x3 = 0, 30 for x3 = 1
Combined at x3: Umax = 50 + 40 = 90 for x3 = 0, 30 + 50 = 80 for x3 = 1

SLIDE 61

DPOP

  • 3 phases: Pseudo-tree Generation, UTIL Propagation, and VALUE Propagation.

61

f13: (x1=0, x3=0) → U13 = 50; (x1=0, x3=1) → U13 = 30
f23: (x2=0, x3=0) → U23 = 40; (x2=0, x3=1) → U23 = 50

x3 x1 x2

Projection of f23 onto x3: Umax = 40 for x3 = 0, 50 for x3 = 1
Projection of f13 onto x3: Umax = 50 for x3 = 0, 30 for x3 = 1
Combined at x3: Umax = 90 for x3 = 0, 80 for x3 = 1