 
System Analysis and Optimizations of Human Actuated Dynamical Systems
Sangjae Bae, Sang Min Han, Scott J. Moura
July 5, 2018
S. Bae et al (UC Berkeley) 2018 TBSI ene2XX July 5, 2018
Human Actuated Dynamical Systems (HADS)?
Dynamical systems whose inputs are induced by human behaviors.
In such systems, we cannot directly control human behaviors, and therefore cannot directly control the system inputs either.
Still, these behaviors can be "encouraged" with (price) incentives.
Problem Statement
With the emerging importance of smart cities, we need an effective system modeling framework that accounts for human behaviors that can be encouraged by incentives.
Specifically, we try to answer:
1 How can human behaviors be modeled within dynamical systems?
2 How can human actuators be incentivized toward behaviors that yield a system-wide benefit?
Overview
1 Literature
2 System Modeling and Optimization of HADS
    System modeling with Discrete Choice Model
    Convex Optimization Framework
    Application: Demand Response
3 Appendix: Proofs of Theorem and Propositions
Literature
In the existing literature:
Human behaviors are treated as noise/disturbances [Arnold, 2013, Gray, 2004, Maruyama, 1955].
Human behaviors are used to improve system performance [Leeper, 2012].
Optimal behaviors are found for an individual human actuator [Mnih, 2015].
We add a new perspective: desired human behaviors are encouraged by incentives [Bae, 2018].
Discrete Choice Model (DCM)
[Figure: Example of discrete choices: choosing a laptop]
Discrete Choice Model (DCM)
\[ U_j \doteq \beta_j^\top z_j + \gamma_j^\top w_j + \beta_{0j} + \epsilon_j, \tag{1} \]
where
$U_j$: Utility of the $j$-th alternative, $j \in \{1, \ldots, J\}$
$\beta_j$: Parameters of controlled attributes
$z_j$: Controlled attributes
$\gamma_j$: Parameters of uncontrolled attributes
$w_j$: Uncontrolled attributes
$\beta_{0j}$: Alternative specific constant
$\epsilon_j$: Undefined errors
Probability of Alternatives in DCM
Logit model: Assuming that the undefined errors $\epsilon_j$ are iid with an Extreme Value distribution, the probability of choosing the $j$-th alternative is [Train, 2009]:
\[ \Pr(u(k) = u_j) = \Pr\big(U_j > U_i \;\; \forall\, i \neq j\big) = \frac{e^{V_j}}{\sum_{i=1}^{J} e^{V_i}}, \tag{2} \]
where $V_j \doteq \beta_j^\top z_k + \gamma_j^\top w_k + \beta_{0j}$.
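The logit probability in (2) can be illustrated with a few lines of Python. This is only a minimal sketch: the function names and the numerical parameter values are hypothetical and chosen for illustration, not taken from the slides.

```python
import math

def representative_utility(beta, z, gamma, w, beta0):
    """V_j = beta_j^T z + gamma_j^T w + beta_0j for one alternative."""
    controlled = sum(b * zi for b, zi in zip(beta, z))
    uncontrolled = sum(g * wi for g, wi in zip(gamma, w))
    return controlled + uncontrolled + beta0

def logit_probabilities(utilities):
    """Return e^{V_j} / sum_i e^{V_i} for each alternative j."""
    # Subtracting the maximum before exponentiating avoids overflow
    # and leaves the ratios unchanged.
    v_max = max(utilities)
    weights = [math.exp(v - v_max) for v in utilities]
    total = sum(weights)
    return [wgt / total for wgt in weights]

# Hypothetical two-alternative example: one controlled attribute z
# (an incentive) and one uncontrolled attribute w.
z, w = [2.0], [1.0]
V = [
    representative_utility([0.5], z, [0.1], w, 0.0),   # alternative 1
    representative_utility([-0.5], z, [0.3], w, 0.2),  # alternative 2
]
probs = logit_probabilities(V)
```

With two alternatives this reduces to the binary logit form, where the probability of the first alternative depends only on the utility difference between the two.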
Probability of Alternatives in DCM
[Figure: Binary Logit model example]
Dynamical Systems with DCM
Discrete-time linear system:
\[ x(k+1) = A x(k) + B(k) u(k), \tag{3} \]
[Figure: Block diagram of Dynamical Systems with DCM]
Objective Function
Consider the mean dynamics (with one human actuator):
\[ \bar{x}(k+1) = A \bar{x}(k) + B(k) \frac{\sum_{j=1}^{J} u_j e^{V_j}}{\sum_{i=1}^{J} e^{V_i}}, \tag{4} \]
The objective function:
\[ f(Z) = \sum_{k=1}^{T} \bar{x}(k), \tag{5} \]
Assume $A = a \in \mathbb{R}$ and $B(k) = b(k) \in \mathbb{R}$ for all $k$ (and $\bar{x}(0) = 0$, since no initial-state term appears below):
\[ f(Z) = \sum_{k=1}^{T} \sum_{m=0}^{k-1} a^{k-m-1} b(m) \frac{\sum_{i=1}^{J} e^{V_i} u_i(m)}{\sum_{j=1}^{J} e^{V_j}}, \tag{6} \]
where $V_j \doteq \beta_j^\top z_m + \gamma_j^\top w_m + \beta_{0j}$.
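As a sanity check on how (6) follows from (4) and (5), the sketch below rolls out the scalar mean dynamics step by step and compares the accumulated sum with the closed-form double sum. It assumes a zero initial state, and all numerical values and function names are hypothetical.

```python
import math

def expected_input(u_values, utilities):
    """Expected input sum_j u_j e^{V_j} / sum_i e^{V_i}."""
    v_max = max(utilities)
    weights = [math.exp(v - v_max) for v in utilities]
    return sum(u * wgt for u, wgt in zip(u_values, weights)) / sum(weights)

def objective_by_rollout(a, b, u_bar):
    """Sum x_bar(1..T) using x_bar(k+1) = a x_bar(k) + b(k) u_bar(k), x_bar(0) = 0."""
    x, total = 0.0, 0.0
    for k in range(len(u_bar)):
        x = a * x + b[k] * u_bar[k]  # x now holds x_bar(k+1)
        total += x
    return total

def objective_closed_form(a, b, u_bar):
    """Double sum over k = 1..T and m = 0..k-1 of a^(k-m-1) b(m) u_bar(m)."""
    T = len(u_bar)
    return sum(
        a ** (k - m - 1) * b[m] * u_bar[m]
        for k in range(1, T + 1)
        for m in range(k)
    )

# Hypothetical data: T = 3 steps, J = 2 alternatives with inputs 0 and 1.
a = 0.9
b = [1.0, 2.0, 1.5]
u_values = [0.0, 1.0]
utilities_per_step = [[0.2, -0.1], [1.0, 0.5], [-0.3, 0.4]]
u_bar = [expected_input(u_values, V) for V in utilities_per_step]

f_rollout = objective_by_rollout(a, b, u_bar)
f_closed = objective_closed_form(a, b, u_bar)
```

The two values agree up to floating-point rounding, which is the unrolling of the recursion that takes (4) and (5) to (6).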
Convexity Constraint
Consider the simpler case where
the number of alternatives is $J = 2$, i.e., $u_i(m) \in \{0, 1\}$
the decision variables $z_m$ are scalars, i.e., $z_m \in \mathbb{R}$.
Theorem (Constraint enforcing convexity). Minimizing the objective function $f(Z)$ in (6) with respect to $z_m$ is a convex optimization problem if
\[ z_m (\beta_{m0} - \beta_{m1}) \geq \gamma_{m1} - \gamma_{m0}, \quad u_0(m) = 0, \quad \text{and} \quad u_1(m) = 1, \]
where
$\beta_{m0}$ and $\beta_{m1}$: the parameters of $z_m$ in $U_{m0}$ and $U_{m1}$, respectively
$\gamma_{m0}$ and $\gamma_{m1}$: the parameters of $w_m$ in $U_{m0}$ and $U_{m1}$, respectively.
(Proof in Appendix)
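The role of the constraint can be seen from a short second-derivative calculation. This is a sketch of the intuition, not the appendix proof. The shorthand $s_m$, $\tilde{\beta}_m$, $\tilde{\gamma}_m$ and the function $g$ are introduced here for illustration, and the weights $a^{k-m-1} b(m)$ are assumed to be nonnegative.

```latex
% Binary case with u_0(m) = 0, u_1(m) = 1: the expected input is the
% probability of alternative 1, written with
% s_m = z_m \tilde{\beta}_m + \tilde{\gamma}_m,
% \tilde{\beta}_m = \beta_{m0} - \beta_{m1},
% \tilde{\gamma}_m = \gamma_{m0} - \gamma_{m1}.
\[
  g(s_m) = \frac{e^{V_1}}{e^{V_0} + e^{V_1}} = \frac{1}{1 + e^{s_m}},
  \qquad
  g'(s_m) = -\frac{e^{s_m}}{(1 + e^{s_m})^2},
  \qquad
  g''(s_m) = \frac{e^{s_m}\,(e^{s_m} - 1)}{(1 + e^{s_m})^3}.
\]
% Chain rule in z_m:
\[
  \frac{\partial^2}{\partial z_m^2}\, g(s_m)
  = \tilde{\beta}_m^{\,2}\, g''(s_m) \;\geq\; 0
  \quad\Longleftrightarrow\quad
  s_m \geq 0
  \quad\Longleftrightarrow\quad
  z_m (\beta_{m0} - \beta_{m1}) \geq \gamma_{m1} - \gamma_{m0}.
\]
```

Under the nonnegative-weight assumption, $f(Z)$ is then a nonnegative weighted sum of functions that are convex on the constrained region, which is consistent with the theorem.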
Projected Gradient Descent Algorithm
Proposition (Gradient of expected sum). For $Z \in \mathbb{R}^{T \times L}$, the gradient of $f(Z)$ is
\[ \frac{\partial f(Z)}{\partial Z} = \left[ \frac{\partial f(Z)}{\partial z_0}, \frac{\partial f(Z)}{\partial z_1}, \cdots, \frac{\partial f(Z)}{\partial z_{T-1}} \right]^\top, \tag{7} \]
where
\[ \frac{\partial f(Z)}{\partial z_m} = -b(m) \left( \sum_{i=0}^{T-m-1} a^i \right) \frac{\tilde{\beta}_m e^{z_m \tilde{\beta}_m + \tilde{\gamma}_m}}{\left( 1 + e^{z_m \tilde{\beta}_m + \tilde{\gamma}_m} \right)^2}, \tag{8} \]
where $\tilde{\beta}_m \doteq \beta_{m0} - \beta_{m1}$ and $\tilde{\gamma}_m \doteq \gamma_{m0} - \gamma_{m1}$ for every $m = 0, \ldots, T-1$. (Proof in Appendix)
Projected gradient descent, with the convexity constraint as the feasible set, then finds an optimum.
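The gradient expression (8) can be checked numerically against a central finite difference of the objective (6) in the binary case with $u_0(m) = 0$ and $u_1(m) = 1$. The sketch below uses hypothetical parameter values, and `beta_t` and `gamma_t` are illustrative names for $\tilde{\beta}_m$ and $\tilde{\gamma}_m$.

```python
import math

def choice_probability(z_m, beta_t, gamma_t):
    """P(u = 1) = 1 / (1 + exp(z_m * beta_t + gamma_t)) in the binary case."""
    return 1.0 / (1.0 + math.exp(z_m * beta_t + gamma_t))

def objective(Z, a, b, beta_t, gamma_t):
    """f(Z) from (6) with J = 2, u_0 = 0, u_1 = 1."""
    T = len(Z)
    return sum(
        a ** (k - m - 1) * b[m] * choice_probability(Z[m], beta_t[m], gamma_t[m])
        for k in range(1, T + 1)
        for m in range(k)
    )

def gradient(Z, a, b, beta_t, gamma_t):
    """Analytic gradient following (8)."""
    T = len(Z)
    grad = []
    for m in range(T):
        geometric = sum(a ** i for i in range(T - m))  # sum_{i=0}^{T-m-1} a^i
        e = math.exp(Z[m] * beta_t[m] + gamma_t[m])
        grad.append(-b[m] * geometric * beta_t[m] * e / (1.0 + e) ** 2)
    return grad

# Hypothetical parameters for T = 3 time steps.
a = 0.9
b = [1.0, 2.0, 1.5]
beta_t = [0.8, 0.5, 1.2]
gamma_t = [0.1, -0.2, 0.3]
Z = [0.5, 1.0, -0.4]

analytic = gradient(Z, a, b, beta_t, gamma_t)

# Central finite differences as an independent check.
h = 1e-6
numeric = []
for m in range(len(Z)):
    Z_plus = Z[:m] + [Z[m] + h] + Z[m + 1:]
    Z_minus = Z[:m] + [Z[m] - h] + Z[m + 1:]
    f_plus = objective(Z_plus, a, b, beta_t, gamma_t)
    f_minus = objective(Z_minus, a, b, beta_t, gamma_t)
    numeric.append((f_plus - f_minus) / (2.0 * h))

max_error = max(abs(g1 - g2) for g1, g2 in zip(analytic, numeric))
```

Because each $z_m$ only enters the terms with index $m$, the gradient is separable across time steps, which keeps each descent step cheap.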
Application: Demand Response
Problem: The System Operator (SO) cannot enforce demand response (DR) participation (participate: $u_n(k) = 0$; not participate: $u_n(k) = 1$).
Still, the SO offers price compensations to encourage participation.
We find: the optimal price compensations $z_k$ that minimize power loads.
[Figure: Smart home example]
System Dynamics
Consider the mean dynamics,
\[ \bar{x}(k+1) = \bar{x}(k) + B(k)^\top \bar{u}(k), \tag{9} \]
$\bar{x}(k)$: expected cumulative power consumption $\in \mathbb{R}$,
$B(k)$: the vector of power loads $\in \mathbb{R}^N$,
$\bar{u}(k)$: the vector of expected choices $\in \mathbb{R}^N$.
The choice of each participant is binary: participate ($u_n(k) = 0$) or not participate ($u_n(k) = 1$).
We assume knowledge of each participant's
power load
discrete choice model parameters
Formulating Optimization Problem
\[ \min_{Z \in \mathbb{R}^T} \; \sum_{k=1}^{T} \sum_{m=0}^{k-1} B(m)^\top \bar{u}(m) + \lambda \|Z\|_2 \tag{10} \]
subject to
\[ x(0) = x_0 = 0 \]
\[ B(m) = [\, b_1(m) \;\; b_2(m) \;\; \cdots \;\; b_N(m) \,]^\top \in \mathbb{R}^{N \times 1} \tag{11} \]
\[ \bar{u}(m) = [\, \bar{u}_1(m), \bar{u}_2(m), \cdots, \bar{u}_N(m) \,]^\top \in \mathbb{R}^{N \times 1} \tag{12} \]
\[ \bar{u}_n(m) = \frac{e^{\beta^{(n)}_{m1} z_m + \gamma^{(n)}_{m1}}}{e^{\beta^{(n)}_{m0} z_m + \gamma^{(n)}_{m0}} + e^{\beta^{(n)}_{m1} z_m + \gamma^{(n)}_{m1}}} \tag{13} \]
\[ z_m \big( \beta^{(n)}_{m0} - \beta^{(n)}_{m1} \big) \geq \gamma^{(n)}_{m1} - \gamma^{(n)}_{m0}, \tag{14} \]
where $Z = [z_0, z_1, \cdots, z_{T-1}]^\top$, $b_n(m)$ is the power load of the $n$-th participant at time step $m$, and $\lambda$ is the regularization parameter penalizing price compensations, i.e., control effort.
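A minimal projected gradient descent sketch for a problem of this shape is given below. It rests on several assumptions that are not in the slides: the participants' choice parameters are time-invariant, every difference `beta_t[n]` (standing for $\beta^{(n)}_{m0} - \beta^{(n)}_{m1}$) is positive so that constraint (14) becomes a lower bound on $z_m$, the penalty is implemented as the squared Euclidean norm so that the gradient is smooth, and the step size, iteration count, and data are hypothetical.

```python
import math

def nonparticipation_probability(z, beta_t, gamma_t):
    """Expected choice (13) rewritten as 1 / (1 + exp(z * beta_t + gamma_t))."""
    return 1.0 / (1.0 + math.exp(z * beta_t + gamma_t))

def cost(Z, loads, beta_t, gamma_t, lam):
    """Objective (10) with an assumed squared-norm penalty lam * ||Z||^2."""
    T, N = len(Z), len(beta_t)
    total = 0.0
    for m in range(T):
        weight = T - m  # number of k in 1..T whose inner sum includes m
        step_load = sum(
            loads[m][n] * nonparticipation_probability(Z[m], beta_t[n], gamma_t[n])
            for n in range(N)
        )
        total += weight * step_load
    return total + lam * sum(z * z for z in Z)

def cost_gradient(Z, loads, beta_t, gamma_t, lam):
    """Gradient of the cost above; mirrors (8) with a = 1, summed over participants."""
    T, N = len(Z), len(beta_t)
    grad = []
    for m in range(T):
        d = 0.0
        for n in range(N):
            e = math.exp(Z[m] * beta_t[n] + gamma_t[n])
            d -= loads[m][n] * beta_t[n] * e / (1.0 + e) ** 2
        grad.append((T - m) * d + 2.0 * lam * Z[m])
    return grad

def lower_bound(beta_t, gamma_t):
    """Constraint (14) for all n, assuming beta_t[n] > 0: z >= -gamma_t[n] / beta_t[n]."""
    return max(-g / b for b, g in zip(beta_t, gamma_t))

def projected_gradient_descent(Z0, loads, beta_t, gamma_t, lam, step=0.5, iters=200):
    lb = lower_bound(beta_t, gamma_t)
    Z = [max(z, lb) for z in Z0]  # project the starting point
    for _ in range(iters):
        g = cost_gradient(Z, loads, beta_t, gamma_t, lam)
        # Gradient step followed by projection onto {z >= lb}.
        Z = [max(z - step * gi, lb) for z, gi in zip(Z, g)]
    return Z

# Hypothetical data: T = 3 time steps, N = 2 participants.
loads = [[1.0, 2.0], [1.5, 1.0], [2.0, 0.5]]  # loads[m][n] = b_n(m)
beta_t = [1.0, 0.5]
gamma_t = [-0.5, 0.2]
lam = 0.05

lb = lower_bound(beta_t, gamma_t)
Z_init = [lb] * 3
Z_opt = projected_gradient_descent(Z_init, loads, beta_t, gamma_t, lam)
```

Since the feasible set here is a simple lower bound per time step, the projection is an elementwise maximum. A more general constraint set would need a different projection.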