

SLIDE 1

Stochastic approximation based methods for computing the optimal thresholds in remote-state estimation with packet drops

Jhelum Chakravorty
Joint work with Jayakumar Subramanian and Aditya Mahajan

McGill University

American Control Conference, May 24, 2017

SLIDE 2

Motivation

Sequential transmission of data; zero delay in reconstruction.


SLIDE 6

Motivation

Applications: smart grids; environmental monitoring and sensor networks; Internet of Things.

Salient features:
  Sensing is cheap.
  Transmission is expensive.
  The size of the data packet is not critical.

SLIDE 7

Motivation

We study a stylized model: a characterization of the fundamental trade-off between estimation accuracy and transmission cost!

SLIDE 8

The remote-state estimation setup

[Block diagram: Markov process $X_t$ → Transmitter → $U_t$ → Erasure channel → $Y_t$ → Receiver → $\hat X_t$, with ACK/NACK feedback from the receiver to the transmitter.]

Source model: $X_{t+1} = a X_t + W_t$, $W_t$ i.i.d.; $a, X_t, W_t \in \mathbb{R}$; pdf of $W_t$: $\varphi(\cdot)$, Gaussian.

Transmitter: $U_t = f_t(X_{0:t}, Y_{0:t-1}) \in \{0, 1\}$.

Receiver: $\hat X_t = g_t(Y_{0:t})$.

Channel model: $S_t$ i.i.d.; $S_t = 1$: channel ON (w.p. $1 - \varepsilon$), $S_t = 0$: channel OFF (w.p. $\varepsilon$); packets are dropped with probability $\varepsilon$.

SLIDE 11

The remote-state estimation setup

Transmitter: $U_t = f_t(X_{0:t}, Y_{0:t-1})$, and

$Y_t = \begin{cases} X_t, & \text{if } U_t S_t = 1 \\ E, & \text{if } U_t S_t = 0 \end{cases}$

where $E$ denotes an erasure.

Receiver: $\hat X_t = g_t(Y_{0:t})$; per-step distortion $d(X_t - \hat X_t) = (X_t - \hat X_t)^2$.

Communication strategies: transmission strategy $f = \{f_t\}_{t=0}^{\infty}$ and estimation strategy $g = \{g_t\}_{t=0}^{\infty}$.
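The whole loop above (autoregressive source, binary transmitter, i.i.d. erasure channel, estimator) is easy to simulate. A minimal sketch; the threshold form of the transmitter and the predict-or-use estimator anticipate the structural results later in the talk, and the parameter values ($a = 0.9$, $\varepsilon = 0.3$, $k = 1.0$, unit-variance Gaussian noise) are illustrative choices, not taken from the slides:

```python
import random

def simulate(a=0.9, eps=0.3, k=1.0, T=10_000, seed=0):
    """Simulate the remote-state estimation loop for a fixed threshold k.

    Returns empirical per-step distortion and transmission rate.
    All parameter values are illustrative.
    """
    rng = random.Random(seed)
    x, xhat = 0.0, 0.0                       # X_0 = 0, hat X_{-1} = 0
    dist, ntx = 0.0, 0
    for _ in range(T):
        u = 1 if abs(x - a * xhat) >= k else 0   # threshold transmitter
        s = 1 if rng.random() > eps else 0       # i.i.d. erasure channel
        xhat = x if u * s == 1 else a * xhat     # receive, or predict on erasure
        dist += (x - xhat) ** 2                  # per-step distortion d(.) = (.)^2
        ntx += u
        x = a * x + rng.gauss(0.0, 1.0)          # source update X_{t+1} = aX_t + W_t
    return dist / T, ntx / T
```

Lowering the threshold trades transmissions for distortion, which is exactly the trade-off the talk quantifies.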

SLIDE 12

The optimization problem

Discounted setup, $\beta \in (0, 1)$:

$D_\beta(f, g) := (1 - \beta)\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{\infty} \beta^t d(X_t - \hat X_t) \,\Big|\, X_0 = 0 \Big]$

$N_\beta(f, g) := (1 - \beta)\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{\infty} \beta^t U_t \,\Big|\, X_0 = 0 \Big]$

Long-term average setup, $\beta = 1$:

$D_1(f, g) := \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{T-1} d(X_t - \hat X_t) \,\Big|\, X_0 = 0 \Big]$

$N_1(f, g) := \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{T-1} U_t \,\Big|\, X_0 = 0 \Big]$
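For a fixed strategy pair, the discounted performance measures can be estimated by truncated Monte Carlo. A sketch for the threshold transmitter and predict-or-use estimator; the truncation horizon, run count and parameter values are ad hoc choices made here (truncating the infinite sum at $T$ is accurate up to $\beta^T$):

```python
import random

def discounted_costs(a=0.9, eps=0.3, k=1.0, beta=0.9, T=300, runs=500, seed=0):
    """Monte Carlo estimates of D_beta(f, g) and N_beta(f, g) for a fixed
    threshold-k transmitter. All parameter values are illustrative."""
    rng = random.Random(seed)
    D_sum = N_sum = 0.0
    for _ in range(runs):
        x, xhat, disc = 0.0, 0.0, 1.0            # X_0 = 0; disc = beta^t
        for _ in range(T):
            u = 1 if abs(x - a * xhat) >= k else 0
            s = 1 if rng.random() > eps else 0
            xhat = x if u * s == 1 else a * xhat
            D_sum += disc * (x - xhat) ** 2      # beta^t d(X_t - hat X_t)
            N_sum += disc * u                    # beta^t U_t
            disc *= beta
            x = a * x + rng.gauss(0.0, 1.0)
    return (1 - beta) * D_sum / runs, (1 - beta) * N_sum / runs
```

The $(1-\beta)$ normalization makes $N_\beta$ interpretable as a per-step transmission rate, so it lies in $[0, 1]$.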
SLIDE 13

The optimization problem

Constrained performance: the distortion-transmission function

$D^*_\beta(\alpha) := D_\beta(f^*, g^*) := \inf_{(f,g):\, N_\beta(f,g) \le \alpha} D_\beta(f, g), \quad \beta \in (0, 1]$

Minimize the expected distortion subject to the expected number of transmissions being at most $\alpha$.

SLIDE 14

The optimization problem

Costly performance: Lagrange relaxation

$C^*_\beta(\lambda) := \inf_{(f,g)} D_\beta(f, g) + \lambda N_\beta(f, g), \quad \beta \in (0, 1]$

SLIDE 15

Decentralized control systems

Team: multiple decision makers acting to achieve a common goal.


SLIDE 17

Decentralized control systems

Pioneers: theory of teams
  Economics: Marschak, 1955; Radner, 1962
  Systems and control: Witsenhausen, 1971; Ho and Chu, 1972

Remote-state estimation as a team problem
  No packet drops: Marschak, 1954; Kushner, 1964; Åström and Bernhardsson, 2002; Xu and Hespanha, 2004; Imer and Başar, 2005; Lipsa and Martins, 2011; Molin and Hirche, 2012; Nayyar, Başar, Teneketzis and Veeravalli, 2013; D. Shi, L. Shi and Chen, 2015
  With packet drops: Ren, Wu, Johansson, G. Shi and L. Shi, 2016; Chen, Wang, D. Shi and L. Shi, 2017
  With noise: Gao, Akyol and Başar, 2015–2017

SLIDE 18

Remote-state estimation - steps towards the optimal solution

1. Establish the structure of optimal strategies (transmission and estimation).
2. Compute the optimal strategies and their performance.


SLIDE 21

Step 1 - Structure of optimal strategies: Lipsa-Martins 2011 & Molin-Hirche 2012 - no packet drops

Optimal estimator (time homogeneous!):

$\hat X_t = g^*_t(Y_t) = g^*(Y_t) = \begin{cases} Y_t, & \text{if } Y_t \ne E \\ a \hat X_{t-1}, & \text{if } Y_t = E \end{cases}$

Optimal transmitter: $X_t \in \mathbb{R}$; $U_t$ is a threshold-based action:

$U_t = f^*_t(X_t, U_{0:t-1}) = f^*(X_t) = \begin{cases} 1, & \text{if } |X_t - a \hat X_{t-1}| \ge k \\ 0, & \text{if } |X_t - a \hat X_{t-1}| < k \end{cases}$

Similar structural results hold for the channel with packet drops.
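The structural results translate directly into code. A sketch, with an erasure encoded as None and an illustrative $a = 0.9$ (the function names are mine, mirroring the $f^*$, $g^*$ notation):

```python
def g_star(y, xhat_prev, a=0.9):
    """Optimal-structure estimator: use the received value; on an erasure
    (encoded here as None), predict from the previous estimate."""
    return y if y is not None else a * xhat_prev

def f_star(x, xhat_prev, k, a=0.9):
    """Threshold transmitter: transmit iff the innovation |x - a*xhat| >= k."""
    return 1 if abs(x - a * xhat_prev) >= k else 0
```

The key point of Step 1 is that these two time-homogeneous maps replace arbitrary history-dependent strategies $f_t(X_{0:t}, Y_{0:t-1})$ and $g_t(Y_{0:t})$ without loss of optimality.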


SLIDE 24

Step 2 - The error process $E_t$

Let $\tau$ denote the time at which a packet was last received successfully (under the threshold strategy $f^{(k)}$). Define

$E_t := X_t - a^{t-\tau} X_\tau, \qquad \hat E_t := \hat X_t - a^{t-\tau} X_\tau$

so that $d(X_t - \hat X_t) = d(E_t - \hat E_t)$. The error evolves as

$E_{t+1} = \begin{cases} a E_t + W_t, & \text{if } Y_t = E \text{ (packet not received)} \\ W_t, & \text{if } Y_t \ne E \text{ (packet received)} \end{cases}$
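The recursion can be sanity-checked against the direct definition $E_t = X_t - a^{t-\tau} X_\tau$ by simulating an arbitrary reception pattern (the 50% reception probability, horizon and parameters here are arbitrary choices for the check):

```python
import random

def check_error_recursion(a=0.9, T=200, seed=1):
    """Verify E_{t+1} = aE_t + W_t (drop) / W_t (reception) against the
    direct definition E_t = X_t - a^(t - tau) X_tau along a random path."""
    rng = random.Random(seed)
    x, tau_x, steps, e = 0.0, 0.0, 0, 0.0   # steps = t - tau; X_0 = 0
    for _ in range(T):
        w = rng.gauss(0.0, 1.0)
        received = rng.random() < 0.5        # arbitrary reception pattern
        x_next = a * x + w                   # source update
        if received:                         # tau <- t, so X_tau = current x
            tau_x, steps_next, e_next = x, 1, w
        else:                                # packet dropped: error compounds
            steps_next, e_next = steps + 1, a * e + w
        direct = x_next - (a ** steps_next) * tau_x   # definition of E_{t+1}
        assert abs(direct - e_next) < 1e-9
        x, steps, e = x_next, steps_next, e_next
    return True
```

This regeneration at every successful reception is what makes $E_t$ a renewal (regenerative) process, which Step 2 exploits.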

SLIDE 25

Performance evaluation - JC-AM TAC '17, NecSys '16

Threshold strategy:

$f^{(k)}(e) = \begin{cases} 1, & \text{if } |e| \ge k \\ 0, & \text{if } |e| < k \end{cases}$

Costs accumulated until the first successful reception (at time $\tau^{(k)}$):

$L^{(k)}_\beta(0) := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}-1} \beta^t d(E_t) \,\Big|\, E_0 = 0 \Big]$

$M^{(k)}_\beta(0) := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}-1} \beta^t \,\Big|\, E_0 = 0 \Big]$

$K^{(k)}_\beta(0) := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}} \beta^t U_t \,\Big|\, E_0 = 0 \Big]$
SLIDE 26

Performance evaluation - JC-AM TAC '17, NecSys '16

$E_t$ is a regenerative process. Renewal relationships:

$D^{(k)}_\beta(0) := D_\beta(f^{(k)}, g^*) = \frac{L^{(k)}_\beta(0)}{M^{(k)}_\beta(0)}, \qquad N^{(k)}_\beta(0) := N_\beta(f^{(k)}, g^*) = \frac{K^{(k)}_\beta(0)}{M^{(k)}_\beta(0)}$
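The renewal relationships suggest the estimator used later in the talk: simulate independent episodes from $E_0 = 0$ until the first successful reception, average the per-episode sums, then take ratios. A sketch with illustrative parameter values:

```python
import random

def rmc_estimates(a=0.9, eps=0.3, k=1.0, beta=0.9, episodes=2000, seed=0):
    """Renewal Monte Carlo estimates of (D, N) via the ratios L/M and K/M.
    Each episode runs from E_0 = 0 until the first successful reception.
    All parameter values are illustrative."""
    rng = random.Random(seed)
    L = M = K = 0.0
    for _ in range(episodes):
        e, disc = 0.0, 1.0                   # disc = beta^t
        while True:
            u = 1 if abs(e) >= k else 0
            if u == 1 and rng.random() > eps:
                K += disc                    # K sums t = 0 .. tau (incl. success)
                break
            L += disc * e * e                # L, M sum t = 0 .. tau - 1
            M += disc
            K += disc * u                    # failed transmissions also count
            e = a * e + rng.gauss(0.0, 1.0)  # error dynamics on a drop
            disc *= beta
    L, M, K = L / episodes, M / episodes, K / episodes
    return L / M, K / M                      # (D, N) via renewal relationships
```

Because episodes are independent draws of one renewal cycle, the estimator avoids the long-sample-path variance of plain Monte Carlo.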


SLIDE 29

Computation of D, N

$L^{(k)}_\beta(e) = \begin{cases} \varepsilon \Big[ d(e) + \beta \int_{\mathbb{R}} \varphi(n - ae)\, L^{(k)}_\beta(n)\, dn \Big], & \text{if } |e| \ge k \\[4pt] d(e) + \beta \int_{\mathbb{R}} \varphi(n - ae)\, L^{(k)}_\beta(n)\, dn, & \text{if } |e| < k \end{cases}$

$M^{(k)}_\beta(e)$ and $K^{(k)}_\beta(e)$ are defined in a similar way.

$\varepsilon = 0$: Fredholm integral equation of the second kind; a bisection method computes the optimal threshold.
$\varepsilon \ne 0$: Fredholm-like equation with a discontinuous kernel and infinite limits; analytical methods are difficult.
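For intuition, the $\varepsilon \ne 0$ equation can still be attacked numerically by truncating the domain and iterating the fixed-point map $L \leftarrow c \cdot (d + \beta \Phi L)$, which is a contraction for $\beta < 1$. This brute-force sketch (the truncation interval, grid, quadrature rule and iteration count are ad hoc choices made here) illustrates the kind of computation the simulation-based approach later in the talk avoids:

```python
import math

def solve_L(a=0.9, eps=0.3, k=1.0, beta=0.9, B=6.0, n=81, iters=150):
    """Fixed-point iteration for the Fredholm-like equation for L_beta^(k)
    on a truncated grid [-B, B], with trapezoid quadrature."""
    h = 2 * B / (n - 1)
    grid = [-B + i * h for i in range(n)]
    w = [h * (0.5 if i in (0, n - 1) else 1.0) for i in range(n)]  # trapezoid weights
    phi = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)  # N(0,1) pdf
    d = [e * e for e in grid]                           # d(e) = e^2
    c = [eps if abs(e) >= k else 1.0 for e in grid]     # epsilon factor iff |e| >= k
    kern = [[w[j] * phi(grid[j] - a * grid[i]) for j in range(n)] for i in range(n)]
    L = [0.0] * n
    for _ in range(iters):
        L = [c[i] * (d[i] + beta * sum(kij * Lj for kij, Lj in zip(kern[i], L)))
             for i in range(n)]
    return grid, L
```

The discontinuity of the kernel at $|e| = k$ and the need to truncate $\mathbb{R}$ are exactly the numerical difficulties the slide points out.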


SLIDE 32

Optimality condition (JC & AM: TAC '17, NecSys '16)

$D^{(k)}_\beta$, $N^{(k)}_\beta$, $C^{(k)}_\beta$ are differentiable in $k$.

Theorem (costly communication). If $(k, \lambda)$ satisfies $\partial_k D^{(k)}_\beta + \lambda\, \partial_k N^{(k)}_\beta = 0$, then $(f^{(k)}, g^*)$ is optimal for costly communication with cost $\lambda$. Moreover, $C^*_\beta(\lambda) := C_\beta(f^{(k)}, g^*; \lambda)$ is continuous, increasing and concave in $\lambda$.

Theorem (constrained communication). Let $k^*_\beta(\alpha) := \{k : N^{(k)}_\beta(0) = \alpha\}$. Then $(f^{(k^*_\beta(\alpha))}, g^*)$ is optimal for the constrained optimization problem with constraint $\alpha \in (0, 1)$. Moreover, $D^*_\beta(\alpha) := D_\beta(f^{(k)}, g^*)$ is continuous, decreasing and convex in $\alpha$.
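Since $N^{(k)}_\beta(0)$ is decreasing in $k$, the condition $N^{(k)}_\beta(0) = \alpha$ from the constrained theorem can be solved by bisection on a simulated estimate. A sketch: the slides mention bisection only for the $\varepsilon = 0$ analytical case, so pairing it with a Renewal-Monte-Carlo estimate, and the bracket and sample sizes, are my own choices here:

```python
import random

def N_hat(k, a=0.9, eps=0.3, beta=0.9, episodes=1000, seed=0):
    """Renewal Monte Carlo estimate of N_beta^(k)(0) = K/M."""
    rng = random.Random(seed)
    M = K = 0.0
    for _ in range(episodes):
        e, disc = 0.0, 1.0
        while True:
            u = 1 if abs(e) >= k else 0
            if u == 1 and rng.random() > eps:
                K += disc                    # successful transmission at tau
                break
            M += disc
            K += disc * u
            e = a * e + rng.gauss(0.0, 1.0)
            disc *= beta
    return K / M

def k_star(alpha, lo=0.01, hi=6.0, iters=20):
    """Bisection for N^(k) = alpha; a fixed seed (common random numbers)
    keeps the estimate nearly monotone in k."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if N_hat(mid) > alpha:
            lo = mid                         # transmitting too often: raise k
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The bracket assumes $N$ exceeds $\alpha$ at the low end and falls below it at the high end, which holds for $\alpha \in (0, 1)$ with these illustrative dynamics.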

SLIDE 33

Main results


SLIDE 36

Computation of optimal thresholds

Difficulty: numerically compute $L^{(k)}_\beta$, $M^{(k)}_\beta$ and $K^{(k)}_\beta$; use the renewal relationships to compute $C^{(k)}_\beta$ and $D^{(k)}_\beta$. This requires solving a Fredholm-like integral equation, which is computationally difficult.

Simulation-based approach: two main approaches - Monte Carlo (MC) and Temporal Difference (TD).
  MC: high variance (one long sample path); low bias.
  TD: low variance (bootstrapping); high bias.

Exploit the regenerative property of the underlying (error) state process: Renewal Monte Carlo (RMC) - low variance (independent sample paths between renewals) and low bias (since it is MC).

SLIDE 37

Computation of optimal thresholds

Key idea: Renewal Monte Carlo
  Pick a threshold $k$; compute sample values of $L$, $M$, $K$ until the first successful reception.
  Use sample averages to estimate $L^{(k)}_\beta$, $M^{(k)}_\beta$, $K^{(k)}_\beta$.
  Use stochastic approximation techniques to compute the optimal $k$.


SLIDE 39

Computation of optimal thresholds

Key steps of the algorithms:

Noisy policy evaluation: MC until a successful reception constitutes one episode; sample averages over a few episodes give $\hat L$, $\hat M$, $\hat K$ and hence $\hat C$ and $\hat D$.

Policy improvement (smoothed functional):

$\hat k_{i+1} = \hat k_i - \gamma_i\, \frac{\eta}{2\tilde\beta} \Big[ \hat C(\hat k_i + \tilde\beta \eta) - \hat C(\hat k_i - \tilde\beta \eta) \Big]$

Policy improvement (Robbins-Monro):

$\hat k_{i+1} = \hat k_i - \gamma_i (\alpha \hat M - \hat K)$
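The Robbins-Monro variant can be sketched end to end: one renewal episode per iteration supplies the noisy estimates $\hat M$ and $\hat K$. Undiscounted episode counts are used here, and the step size $\gamma_i = 1/i$, the projection interval and all parameter values are illustrative choices, not from the talk:

```python
import random

def robbins_monro(alpha=0.3, a=0.9, eps=0.3, k0=1.0, iters=2000, seed=0):
    """Robbins-Monro iteration k_{i+1} = k_i - gamma_i (alpha*M_hat - K_hat).
    At a fixed point E[alpha*M - K] = 0, i.e. N^(k) = E[K]/E[M] = alpha."""
    rng = random.Random(seed)
    k = k0
    for i in range(1, iters + 1):
        e, M, K = 0.0, 0.0, 0.0
        while True:                          # one episode: E_0 = 0 to first success
            u = 1 if abs(e) >= k else 0
            if u == 1 and rng.random() > eps:
                K += 1.0                     # successful transmission at tau
                break
            M += 1.0
            K += u
            e = a * e + rng.gauss(0.0, 1.0)
        k = k - (1.0 / i) * (alpha * M - K)  # RM step with gamma_i = 1/i
        k = min(10.0, max(1e-3, k))          # ad hoc projection for stability
    return k
```

The sign works out naturally: when the episode transmits more than the budget allows ($K > \alpha M$), the threshold is pushed up, reducing future transmissions.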

SLIDE 40

Validation of simulation results

Results are validated by comparison with the analytical results for the no-packet-drop case (JC-AM, TAC '17).
  Costly performance: error in $k^*$: $10^{-2}$–$10^{-3}$; error in $C^*$: $10^{-4}$–$10^{-5}$.
  Constrained performance: error in $k^*$: $10^{-3}$; error in $D^*$: $10^{-3}$–$10^{-5}$.

SLIDE 41

Optimal thresholds from simulations

Costly performance:

[Figure: threshold vs. iterations for $\lambda = 100, 300, 500$. Costly communication: $\beta = 0.9$, $\varepsilon = 0.3$.]

SLIDE 42

Optimal thresholds from simulations

Constrained performance:

[Figure: threshold vs. iterations for $\alpha = 0.1, 0.3, 0.5$. Constrained communication using RM: $\beta = 0.9$, $\varepsilon = 0.3$.]

SLIDE 43

Optimal trade-off between distortion and communication cost

[Figure: $C^*_{0.9}(\lambda)$ vs. $\lambda$ for $\varepsilon \in \{0, 0.3, 0.7\}$. Costly communication: $\beta = 0.9$.]

SLIDE 44

Optimal trade-off between distortion and communication cost

[Figure: $D^*_{0.9}(\alpha)$ vs. $\alpha$ for $\varepsilon \in \{0, 0.3, 0.7\}$. Constrained communication: $\beta = 0.9$.]

SLIDE 45

Future work

Markovian erasure channel: thresholds at time $t$ become functions of the channel state at $t - 1$.

Higher dimensions: does $X_t \in \mathbb{R}^m$ being ASU imply that $A X_t + W_t$ is ASU? Requires a notion of stochastic dominance in higher dimensions.

SLIDE 46

Thank you!