

SLIDE 1

Stochastic approximation based methods for computing the optimal thresholds in remote-state estimation with packet drops

Jhelum Chakravorty
Joint work with Jayakumar Subramanian and Aditya Mahajan

McGill University

American Control Conference, May 24, 2017

SLIDE 2

Motivation

Sequential transmission of data; zero delay in reconstruction.


SLIDE 6

Motivation

Applications: smart grids; environmental monitoring and sensor networks; Internet of Things.

Salient features:
  Sensing is cheap.
  Transmission is expensive.
  The size of the data packet is not critical.

SLIDE 7

Motivation

We study a stylized model: a characterization of the fundamental trade-off between estimation accuracy and transmission cost!

SLIDE 8

The remote-state estimation setup

[Block diagram: Markov process $X_t$ → Transmitter → $U_t$ → Erasure channel → $Y_t$ → Receiver → $\hat X_t$, with ACK/NACK feedback from the receiver to the transmitter.]

Source model: $X_{t+1} = a X_t + W_t$, $W_t$ i.i.d.; $a, X_t, W_t \in \mathbb{R}$; pdf of $W_t$: $\varphi(\cdot)$, Gaussian.

Transmitter: $U_t = f_t(X_{0:t}, Y_{0:t-1}) \in \{0, 1\}$.

Receiver: $\hat X_t = g_t(Y_{0:t})$.

Channel model: $S_t$ i.i.d.; $S_t = 1$: channel ON (w.p. $1 - \varepsilon$), $S_t = 0$: channel OFF (w.p. $\varepsilon$); packets are dropped with probability $\varepsilon$.

SLIDE 11

The remote-state estimation setup

Transmitter: $U_t = f_t(X_{0:t}, Y_{0:t-1})$, and

$Y_t = \begin{cases} X_t, & \text{if } U_t S_t = 1 \\ E, & \text{if } U_t S_t = 0 \end{cases}$

where $E$ denotes an erasure.

Receiver: $\hat X_t = g_t(Y_{0:t})$; per-step distortion $d(X_t - \hat X_t) = (X_t - \hat X_t)^2$.

Communication strategies: transmission strategy $f = \{f_t\}_{t=0}^{\infty}$ and estimation strategy $g = \{g_t\}_{t=0}^{\infty}$.
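The whole loop above (autoregressive source, binary transmitter, i.i.d. erasure channel, estimator) is easy to simulate. A minimal sketch; the threshold form of the transmitter and the predict-or-use estimator anticipate the structural results later in the talk, and the parameter values ($a = 0.9$, $\varepsilon = 0.3$, $k = 1.0$, unit-variance Gaussian noise) are illustrative choices, not taken from the slides:

```python
import random

def simulate(a=0.9, eps=0.3, k=1.0, T=10_000, seed=0):
    """Simulate the remote-state estimation loop for a fixed threshold k.

    Returns empirical per-step distortion and transmission rate.
    All parameter values are illustrative.
    """
    rng = random.Random(seed)
    x, xhat = 0.0, 0.0                       # X_0 = 0, hat X_{-1} = 0
    dist, ntx = 0.0, 0
    for _ in range(T):
        u = 1 if abs(x - a * xhat) >= k else 0   # threshold transmitter
        s = 1 if rng.random() > eps else 0       # i.i.d. erasure channel
        xhat = x if u * s == 1 else a * xhat     # receive, or predict on erasure
        dist += (x - xhat) ** 2                  # per-step distortion d(.) = (.)^2
        ntx += u
        x = a * x + rng.gauss(0.0, 1.0)          # source update X_{t+1} = aX_t + W_t
    return dist / T, ntx / T
```

Lowering the threshold trades transmissions for distortion, which is exactly the trade-off the talk quantifies.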

SLIDE 12

The optimization problem

Discounted setup, $\beta \in (0, 1)$:

$D_\beta(f, g) := (1 - \beta)\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{\infty} \beta^t d(X_t - \hat X_t) \,\Big|\, X_0 = 0 \Big]$

$N_\beta(f, g) := (1 - \beta)\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{\infty} \beta^t U_t \,\Big|\, X_0 = 0 \Big]$

Long-term average setup, $\beta = 1$:

$D_1(f, g) := \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{T-1} d(X_t - \hat X_t) \,\Big|\, X_0 = 0 \Big]$

$N_1(f, g) := \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E}^{(f,g)}\Big[ \sum_{t=0}^{T-1} U_t \,\Big|\, X_0 = 0 \Big]$
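For a fixed strategy pair, the discounted performance measures can be estimated by truncated Monte Carlo. A sketch for the threshold transmitter and predict-or-use estimator; the truncation horizon, run count and parameter values are ad hoc choices made here (truncating the infinite sum at $T$ is accurate up to $\beta^T$):

```python
import random

def discounted_costs(a=0.9, eps=0.3, k=1.0, beta=0.9, T=300, runs=500, seed=0):
    """Monte Carlo estimates of D_beta(f, g) and N_beta(f, g) for a fixed
    threshold-k transmitter. All parameter values are illustrative."""
    rng = random.Random(seed)
    D_sum = N_sum = 0.0
    for _ in range(runs):
        x, xhat, disc = 0.0, 0.0, 1.0            # X_0 = 0; disc = beta^t
        for _ in range(T):
            u = 1 if abs(x - a * xhat) >= k else 0
            s = 1 if rng.random() > eps else 0
            xhat = x if u * s == 1 else a * xhat
            D_sum += disc * (x - xhat) ** 2      # beta^t d(X_t - hat X_t)
            N_sum += disc * u                    # beta^t U_t
            disc *= beta
            x = a * x + rng.gauss(0.0, 1.0)
    return (1 - beta) * D_sum / runs, (1 - beta) * N_sum / runs
```

The $(1-\beta)$ normalization makes $N_\beta$ interpretable as a per-step transmission rate, so it lies in $[0, 1]$.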
SLIDE 13

The optimization problem

Constrained performance: the distortion-transmission function

$D^*_\beta(\alpha) := D_\beta(f^*, g^*) := \inf_{(f,g):\, N_\beta(f,g) \le \alpha} D_\beta(f, g), \quad \beta \in (0, 1]$

Minimize the expected distortion subject to the expected number of transmissions being at most $\alpha$.

SLIDE 14

The optimization problem

Costly performance: Lagrange relaxation

$C^*_\beta(\lambda) := \inf_{(f,g)} D_\beta(f, g) + \lambda N_\beta(f, g), \quad \beta \in (0, 1]$

SLIDE 15

Decentralized control systems

Team: multiple decision makers acting to achieve a common goal.


SLIDE 17

Decentralized control systems

Pioneers: theory of teams
  Economics: Marschak, 1955; Radner, 1962
  Systems and control: Witsenhausen, 1971; Ho and Chu, 1972

Remote-state estimation as a team problem
  No packet drops: Marschak, 1954; Kushner, 1964; Åström and Bernhardsson, 2002; Xu and Hespanha, 2004; Imer and Başar, 2005; Lipsa and Martins, 2011; Molin and Hirche, 2012; Nayyar, Başar, Teneketzis and Veeravalli, 2013; D. Shi, L. Shi and Chen, 2015
  With packet drops: Ren, Wu, Johansson, G. Shi and L. Shi, 2016; Chen, Wang, D. Shi and L. Shi, 2017
  With noise: Gao, Akyol and Başar, 2015–2017

SLIDE 18

Remote-state estimation - steps towards the optimal solution

1. Establish the structure of optimal strategies (transmission and estimation).
2. Compute the optimal strategies and their performance.


SLIDE 21

Step 1 - Structure of optimal strategies: Lipsa-Martins 2011 & Molin-Hirche 2012 - no packet drops

Optimal estimator (time homogeneous!):

$\hat X_t = g^*_t(Y_t) = g^*(Y_t) = \begin{cases} Y_t, & \text{if } Y_t \ne E \\ a \hat X_{t-1}, & \text{if } Y_t = E \end{cases}$

Optimal transmitter: $X_t \in \mathbb{R}$; $U_t$ is a threshold-based action:

$U_t = f^*_t(X_t, U_{0:t-1}) = f^*(X_t) = \begin{cases} 1, & \text{if } |X_t - a \hat X_{t-1}| \ge k \\ 0, & \text{if } |X_t - a \hat X_{t-1}| < k \end{cases}$

Similar structural results hold for the channel with packet drops.
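The structural results translate directly into code. A sketch, with an erasure encoded as None and an illustrative $a = 0.9$ (the function names are mine, mirroring the $f^*$, $g^*$ notation):

```python
def g_star(y, xhat_prev, a=0.9):
    """Optimal-structure estimator: use the received value; on an erasure
    (encoded here as None), predict from the previous estimate."""
    return y if y is not None else a * xhat_prev

def f_star(x, xhat_prev, k, a=0.9):
    """Threshold transmitter: transmit iff the innovation |x - a*xhat| >= k."""
    return 1 if abs(x - a * xhat_prev) >= k else 0
```

The key point of Step 1 is that these two time-homogeneous maps replace arbitrary history-dependent strategies $f_t(X_{0:t}, Y_{0:t-1})$ and $g_t(Y_{0:t})$ without loss of optimality.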


SLIDE 24

Step 2 - The error process $E_t$

Let $\tau$ denote the time at which a packet was last received successfully (under the threshold strategy $f^{(k)}$). Define

$E_t := X_t - a^{t-\tau} X_\tau, \qquad \hat E_t := \hat X_t - a^{t-\tau} X_\tau$

so that $d(X_t - \hat X_t) = d(E_t - \hat E_t)$. The error evolves as

$E_{t+1} = \begin{cases} a E_t + W_t, & \text{if } Y_t = E \text{ (packet not received)} \\ W_t, & \text{if } Y_t \ne E \text{ (packet received)} \end{cases}$
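The recursion can be sanity-checked against the direct definition $E_t = X_t - a^{t-\tau} X_\tau$ by simulating an arbitrary reception pattern (the 50% reception probability, horizon and parameters here are arbitrary choices for the check):

```python
import random

def check_error_recursion(a=0.9, T=200, seed=1):
    """Verify E_{t+1} = aE_t + W_t (drop) / W_t (reception) against the
    direct definition E_t = X_t - a^(t - tau) X_tau along a random path."""
    rng = random.Random(seed)
    x, tau_x, steps, e = 0.0, 0.0, 0, 0.0   # steps = t - tau; X_0 = 0
    for _ in range(T):
        w = rng.gauss(0.0, 1.0)
        received = rng.random() < 0.5        # arbitrary reception pattern
        x_next = a * x + w                   # source update
        if received:                         # tau <- t, so X_tau = current x
            tau_x, steps_next, e_next = x, 1, w
        else:                                # packet dropped: error compounds
            steps_next, e_next = steps + 1, a * e + w
        direct = x_next - (a ** steps_next) * tau_x   # definition of E_{t+1}
        assert abs(direct - e_next) < 1e-9
        x, steps, e = x_next, steps_next, e_next
    return True
```

This regeneration at every successful reception is what makes $E_t$ a renewal (regenerative) process, which Step 2 exploits.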

SLIDE 25

Performance evaluation - JC-AM TAC '17, NecSys '16

Threshold strategy:

$f^{(k)}(e) = \begin{cases} 1, & \text{if } |e| \ge k \\ 0, & \text{if } |e| < k \end{cases}$

Costs accumulated until the first successful reception (at time $\tau^{(k)}$):

$L^{(k)}_\beta(0) := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}-1} \beta^t d(E_t) \,\Big|\, E_0 = 0 \Big]$

$M^{(k)}_\beta(0) := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}-1} \beta^t \,\Big|\, E_0 = 0 \Big]$

$K^{(k)}_\beta(0) := \mathbb{E}\Big[ \sum_{t=0}^{\tau^{(k)}} \beta^t U_t \,\Big|\, E_0 = 0 \Big]$
SLIDE 26

Performance evaluation - JC-AM TAC '17, NecSys '16

$E_t$ is a regenerative process. Renewal relationships:

$D^{(k)}_\beta(0) := D_\beta(f^{(k)}, g^*) = \frac{L^{(k)}_\beta(0)}{M^{(k)}_\beta(0)}, \qquad N^{(k)}_\beta(0) := N_\beta(f^{(k)}, g^*) = \frac{K^{(k)}_\beta(0)}{M^{(k)}_\beta(0)}$
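The renewal relationships suggest the estimator used later in the talk: simulate independent episodes from $E_0 = 0$ until the first successful reception, average the per-episode sums, then take ratios. A sketch with illustrative parameter values:

```python
import random

def rmc_estimates(a=0.9, eps=0.3, k=1.0, beta=0.9, episodes=2000, seed=0):
    """Renewal Monte Carlo estimates of (D, N) via the ratios L/M and K/M.
    Each episode runs from E_0 = 0 until the first successful reception.
    All parameter values are illustrative."""
    rng = random.Random(seed)
    L = M = K = 0.0
    for _ in range(episodes):
        e, disc = 0.0, 1.0                   # disc = beta^t
        while True:
            u = 1 if abs(e) >= k else 0
            if u == 1 and rng.random() > eps:
                K += disc                    # K sums t = 0 .. tau (incl. success)
                break
            L += disc * e * e                # L, M sum t = 0 .. tau - 1
            M += disc
            K += disc * u                    # failed transmissions also count
            e = a * e + rng.gauss(0.0, 1.0)  # error dynamics on a drop
            disc *= beta
    L, M, K = L / episodes, M / episodes, K / episodes
    return L / M, K / M                      # (D, N) via renewal relationships
```

Because episodes are independent draws of one renewal cycle, the estimator avoids the long-sample-path variance of plain Monte Carlo.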


SLIDE 29

Computation of D, N

$L^{(k)}_\beta(e) = \begin{cases} \varepsilon \Big[ d(e) + \beta \int_{\mathbb{R}} \varphi(n - ae)\, L^{(k)}_\beta(n)\, dn \Big], & \text{if } |e| \ge k \\[4pt] d(e) + \beta \int_{\mathbb{R}} \varphi(n - ae)\, L^{(k)}_\beta(n)\, dn, & \text{if } |e| < k \end{cases}$

$M^{(k)}_\beta(e)$ and $K^{(k)}_\beta(e)$ are defined in a similar way.

$\varepsilon = 0$: Fredholm integral equation of the second kind; a bisection method computes the optimal threshold.
$\varepsilon \ne 0$: Fredholm-like equation with a discontinuous kernel and infinite limits; analytical methods are difficult.
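For intuition, the $\varepsilon \ne 0$ equation can still be attacked numerically by truncating the domain and iterating the fixed-point map $L \leftarrow c \cdot (d + \beta \Phi L)$, which is a contraction for $\beta < 1$. This brute-force sketch (the truncation interval, grid, quadrature rule and iteration count are ad hoc choices made here) illustrates the kind of computation the simulation-based approach later in the talk avoids:

```python
import math

def solve_L(a=0.9, eps=0.3, k=1.0, beta=0.9, B=6.0, n=81, iters=150):
    """Fixed-point iteration for the Fredholm-like equation for L_beta^(k)
    on a truncated grid [-B, B], with trapezoid quadrature."""
    h = 2 * B / (n - 1)
    grid = [-B + i * h for i in range(n)]
    w = [h * (0.5 if i in (0, n - 1) else 1.0) for i in range(n)]  # trapezoid weights
    phi = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)  # N(0,1) pdf
    d = [e * e for e in grid]                           # d(e) = e^2
    c = [eps if abs(e) >= k else 1.0 for e in grid]     # epsilon factor iff |e| >= k
    kern = [[w[j] * phi(grid[j] - a * grid[i]) for j in range(n)] for i in range(n)]
    L = [0.0] * n
    for _ in range(iters):
        L = [c[i] * (d[i] + beta * sum(kij * Lj for kij, Lj in zip(kern[i], L)))
             for i in range(n)]
    return grid, L
```

The discontinuity of the kernel at $|e| = k$ and the need to truncate $\mathbb{R}$ are exactly the numerical difficulties the slide points out.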


SLIDE 32

Optimality condition (JC & AM: TAC '17, NecSys '16)

$D^{(k)}_\beta$, $N^{(k)}_\beta$, $C^{(k)}_\beta$ are differentiable in $k$.

Theorem (costly communication). If $(k, \lambda)$ satisfies $\partial_k D^{(k)}_\beta + \lambda\, \partial_k N^{(k)}_\beta = 0$, then $(f^{(k)}, g^*)$ is optimal for costly communication with cost $\lambda$. Moreover, $C^*_\beta(\lambda) := C_\beta(f^{(k)}, g^*; \lambda)$ is continuous, increasing and concave in $\lambda$.

Theorem (constrained communication). Let $k^*_\beta(\alpha) := \{k : N^{(k)}_\beta(0) = \alpha\}$. Then $(f^{(k^*_\beta(\alpha))}, g^*)$ is optimal for the constrained optimization problem with constraint $\alpha \in (0, 1)$. Moreover, $D^*_\beta(\alpha) := D_\beta(f^{(k)}, g^*)$ is continuous, decreasing and convex in $\alpha$.
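Since $N^{(k)}_\beta(0)$ is decreasing in $k$, the condition $N^{(k)}_\beta(0) = \alpha$ from the constrained theorem can be solved by bisection on a simulated estimate. A sketch: the slides mention bisection only for the $\varepsilon = 0$ analytical case, so pairing it with a Renewal-Monte-Carlo estimate, and the bracket and sample sizes, are my own choices here:

```python
import random

def N_hat(k, a=0.9, eps=0.3, beta=0.9, episodes=1000, seed=0):
    """Renewal Monte Carlo estimate of N_beta^(k)(0) = K/M."""
    rng = random.Random(seed)
    M = K = 0.0
    for _ in range(episodes):
        e, disc = 0.0, 1.0
        while True:
            u = 1 if abs(e) >= k else 0
            if u == 1 and rng.random() > eps:
                K += disc                    # successful transmission at tau
                break
            M += disc
            K += disc * u
            e = a * e + rng.gauss(0.0, 1.0)
            disc *= beta
    return K / M

def k_star(alpha, lo=0.01, hi=6.0, iters=20):
    """Bisection for N^(k) = alpha; a fixed seed (common random numbers)
    keeps the estimate nearly monotone in k."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if N_hat(mid) > alpha:
            lo = mid                         # transmitting too often: raise k
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The bracket assumes $N$ exceeds $\alpha$ at the low end and falls below it at the high end, which holds for $\alpha \in (0, 1)$ with these illustrative dynamics.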

SLIDE 33

Main results


SLIDE 36

Computation of optimal thresholds

Difficulty: numerically compute $L^{(k)}_\beta$, $M^{(k)}_\beta$ and $K^{(k)}_\beta$; use the renewal relationships to compute $C^{(k)}_\beta$ and $D^{(k)}_\beta$. This requires solving a Fredholm-like integral equation, which is computationally difficult.

Simulation-based approach: two main approaches - Monte Carlo (MC) and Temporal Difference (TD).
  MC: high variance (one long sample path); low bias.
  TD: low variance (bootstrapping); high bias.

Exploit the regenerative property of the underlying (error) state process: Renewal Monte Carlo (RMC) - low variance (independent sample paths between renewals) and low bias (since it is MC).

SLIDE 37

Computation of optimal thresholds

Key idea: Renewal Monte Carlo
  Pick a threshold $k$; compute sample values of $L$, $M$, $K$ until the first successful reception.
  Use sample averages to estimate $L^{(k)}_\beta$, $M^{(k)}_\beta$, $K^{(k)}_\beta$.
  Use stochastic approximation techniques to compute the optimal $k$.


SLIDE 39

Computation of optimal thresholds

Key steps of the algorithms:

Noisy policy evaluation: MC until a successful reception constitutes one episode; sample averages over a few episodes give $\hat L$, $\hat M$, $\hat K$ and hence $\hat C$ and $\hat D$.

Policy improvement (smoothed functional):

$\hat k_{i+1} = \hat k_i - \gamma_i\, \frac{\eta}{2\tilde\beta} \Big[ \hat C(\hat k_i + \tilde\beta \eta) - \hat C(\hat k_i - \tilde\beta \eta) \Big]$

Policy improvement (Robbins-Monro):

$\hat k_{i+1} = \hat k_i - \gamma_i (\alpha \hat M - \hat K)$
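The Robbins-Monro variant can be sketched end to end: one renewal episode per iteration supplies the noisy estimates $\hat M$ and $\hat K$. Undiscounted episode counts are used here, and the step size $\gamma_i = 1/i$, the projection interval and all parameter values are illustrative choices, not from the talk:

```python
import random

def robbins_monro(alpha=0.3, a=0.9, eps=0.3, k0=1.0, iters=2000, seed=0):
    """Robbins-Monro iteration k_{i+1} = k_i - gamma_i (alpha*M_hat - K_hat).
    At a fixed point E[alpha*M - K] = 0, i.e. N^(k) = E[K]/E[M] = alpha."""
    rng = random.Random(seed)
    k = k0
    for i in range(1, iters + 1):
        e, M, K = 0.0, 0.0, 0.0
        while True:                          # one episode: E_0 = 0 to first success
            u = 1 if abs(e) >= k else 0
            if u == 1 and rng.random() > eps:
                K += 1.0                     # successful transmission at tau
                break
            M += 1.0
            K += u
            e = a * e + rng.gauss(0.0, 1.0)
        k = k - (1.0 / i) * (alpha * M - K)  # RM step with gamma_i = 1/i
        k = min(10.0, max(1e-3, k))          # ad hoc projection for stability
    return k
```

The sign works out naturally: when the episode transmits more than the budget allows ($K > \alpha M$), the threshold is pushed up, reducing future transmissions.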

SLIDE 40

Validation of simulation results

Results are validated by comparison with the analytical results for the no-packet-drop case (JC-AM, TAC '17).
  Costly performance: error in $k^*$: $10^{-2}$–$10^{-3}$; error in $C^*$: $10^{-4}$–$10^{-5}$.
  Constrained performance: error in $k^*$: $10^{-3}$; error in $D^*$: $10^{-3}$–$10^{-5}$.

SLIDE 41

Optimal thresholds from simulations

Costly performance:

[Figure: threshold vs. iterations for $\lambda = 100, 300, 500$. Costly communication: $\beta = 0.9$, $\varepsilon = 0.3$.]

SLIDE 42

Optimal thresholds from simulations

Constrained performance:

[Figure: threshold vs. iterations for $\alpha = 0.1, 0.3, 0.5$. Constrained communication using RM: $\beta = 0.9$, $\varepsilon = 0.3$.]

SLIDE 43

Optimal trade-off between distortion and communication cost

[Figure: $C^*_{0.9}(\lambda)$ vs. $\lambda$ for $\varepsilon \in \{0, 0.3, 0.7\}$. Costly communication: $\beta = 0.9$.]

SLIDE 44

Optimal trade-off between distortion and communication cost

[Figure: $D^*_{0.9}(\alpha)$ vs. $\alpha$ for $\varepsilon \in \{0, 0.3, 0.7\}$. Constrained communication: $\beta = 0.9$.]

SLIDE 45

Future work

Markovian erasure channel: thresholds at time $t$ become functions of the channel state at $t - 1$.

Higher dimensions: does $X_t \in \mathbb{R}^m$ being ASU imply that $A X_t + W_t$ is ASU? Requires a notion of stochastic dominance in higher dimensions.

SLIDE 46

Thank you!