 
SELECTED REFERENCES - MODELING
Timed Automata and Timed Petri Nets
– Alur, R., and D.L. Dill, “A Theory of Timed Automata,” Theoretical Computer Science, No. 126, pp. 183-235, 1994.
– Cassandras, C.G., and S. Lafortune, “Introduction to Discrete Event Systems,” Springer, 2008.
– Wang, J., Timed Petri Nets - Theory and Application, Kluwer Academic Publishers, Boston, 1998.
Hybrid Systems
– Bemporad, A., and M. Morari, “Control of Systems Integrating Logic Dynamics and Constraints,” Automatica, Vol. 35, No. 3, pp. 407-427, 1999.
– Branicky, M.S., V.S. Borkar, and S.K. Mitter, “A Unified Framework for Hybrid Control: Model and Optimal Control Theory,” IEEE Trans. on Automatic Control, Vol. 43, No. 1, pp. 31-45, 1998.
– Cassandras, C.G., and J. Lygeros, “Stochastic Hybrid Systems,” Taylor and Francis, 2007.
– Grossman, R., A. Nerode, A. Ravn, and H. Rischel (Eds), Hybrid Systems, Springer, New York, 1993.
– Hristu-Varsakelis, D., and W.S. Levine, Handbook of Networked and Embedded Control Systems, Birkhauser, Boston, 2005.
Christos G. Cassandras CODES Lab. - Boston University
CONTROL AND OPTIMIZATION – CHALLENGES
1. SCALABILITY: Distributed Algorithms
2. DECENTRALIZATION
3. COMMUNICATION: Event-driven (asynchronous) Algorithms
4. NON-CONVEXITY: Global optimality, escape local optima
5. EXPLOIT DATA: Data-Driven Algorithms
WHEN CAN WE DECENTRALIZE?
MULTI-AGENT OPTIMIZATION: PROBLEM 1
[Figure: mission space $\Omega$ with agents $a_1, a_2, a_3, a_i$, obstacles $O_1, O_2$ and a point $x$]
- $s_i$: agent state, $i = 1,\dots,N$; $s = [s_1,\dots,s_N]$
- $O_j$: obstacle (constraint)
- $R(x)$: property of point $x$
- $P(x,s)$: reward function
$\max_{s} H(s) = \int_\Omega P(x,s) R(x)\,dx$ subject to $s_i \in F$, $i = 1,\dots,N$
GOAL: Find the best state vector $s = [s_1,\dots,s_N]$ so that agents achieve a maximal reward from interacting with the mission space
MULTI-AGENT OPTIMIZATION: PROBLEM 2
[Figure: mission space $\Omega$ with agents $a_1, a_2, a_3, a_i$, obstacles $O_1, O_2$ and a point $x$]
May also have dynamics:
$\max_{u(t)} J = \int_0^T \int_\Omega P(x, s(u(t))) R(x)\,dx\,dt$
subject to $\dot s_i = f_i(s_i, u_i, t)$, $i = 1,\dots,N$, and $s_i(t) \in F$, $i = 1,\dots,N$
GOAL: Find the best state trajectories $s_i(t)$, $0 \le t \le T$, so that agents achieve a maximal reward from interacting with the mission space
WHEN CAN WE DECENTRALIZE A MULTI-AGENT PROBLEM 1?
$\max_{s} H(s) = \int_\Omega P(x,s) R(x)\,dx$ subject to $s_i \in F$, $i = 1,\dots,N$
Recall: $P(x,s) = 1 - \prod_{i=1}^{N} [1 - \hat p_i(x, s_i)]$, where $\hat p_i(x, s_i) = p_i(x, s_i)$ if $x \in V(s_i)$ and $\hat p_i(x, s_i) = 0$ otherwise
Define agent $i$ NEIGHBORHOOD: $B_i = \{ b_i^k : \| s_i - s_{b_i^k} \| < 2\delta_i,\ k = 1,\dots,a \}$
OBJECTIVE FUNCTION DECOMPOSITION
THEOREM: If $P(x,s) = P(p_1,\dots,p_N)$ is a function of local reward functions $p_i$, then $H(s)$ can be expressed as
$H(s) = H_1(s_i^L) + H_2(\bar s_i)$ for any $i = 1,\dots,N$,
where $s_i^L = [s_i, s_{b_i^1}, \dots, s_{b_i^a}]$ (state of $i$ and its neighbors only) and $\bar s_i = [s_1,\dots,s_{i-1},s_{i+1},\dots,s_N]$ (state of all agents except $i$)
Theorem implies $\frac{\partial H(s)}{\partial s_i} = \frac{\partial H_1(s_i^L)}{\partial s_i}$
Distributed gradient-based algorithm: $s_i^{k+1} = s_i^k + \beta_k \frac{\partial H_1(s_i^L)}{\partial s_i^k}$
OBJECTIVE FUNCTION DECOMPOSITION
- Theorem 1 often applies and is easy to check for the “Problem 1” setting
EXAMPLE: Coverage Control Problems
COVERAGE: PROBLEM FORMULATION
- $N$ mobile sensors, each located at $s_i \in \mathbb{R}^2$
- Data source at $x$ emits signal with energy $E$
- Signal observed by sensor node $i$ (at $s_i$)
[Figure: surface plot of $R(x)$ (Hz/m$^2$) over the mission space]
SENSING MODEL: $p_i(x, s_i) \equiv P[\text{Detected by } i \mid A(x), s_i]$ ($A(x)$ = data source emits at $x$)
Sensing attenuation: $p_i(x, s_i)$ monotonically decreasing in $d_i(x) \equiv \| x - s_i \|$
COVERAGE: PROBLEM FORMULATION
- Joint detection prob. assuming sensor independence ($s = [s_1,\dots,s_N]$: node locations)
Event sensing probability: $P(x,s) = 1 - \prod_{i=1}^{N} [1 - p_i(x, s_i)]$
- OBJECTIVE: Determine locations $s = [s_1,\dots,s_N]$ to maximize total Detection Probability:
$\max_{s} \int_\Omega R(x) P(x,s)\,dx$
Theorem 1 applies
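The joint detection probability above can be evaluated directly once a sensing model is fixed. The sketch below is only illustrative: the exponential attenuation model and the names `sensing_prob`, `joint_detection`, `p0` and `lam` are assumptions, not part of the slides, which only require $p_i$ to decrease with distance.

```python
import math

def sensing_prob(x, s_i, p0=1.0, lam=1.0):
    """Assumed model: p_i(x, s_i) = p0 * exp(-lam * ||x - s_i||)."""
    return p0 * math.exp(-lam * math.dist(x, s_i))

def joint_detection(x, sensors, p0=1.0, lam=1.0):
    """P(x, s) = 1 - prod_i [1 - p_i(x, s_i)], assuming independent sensors."""
    miss = 1.0  # probability that no sensor detects the event at x
    for s_i in sensors:
        miss *= 1.0 - sensing_prob(x, s_i, p0, lam)
    return 1.0 - miss
```

Adding a sensor can only lower the miss probability, so the joint probability never decreases when more nodes are deployed.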
DISTRIBUTED COOPERATIVE SCHEME (CONTINUED)
- Set $H(s_1,\dots,s_N) = \int_\Omega R(x) \left\{ 1 - \prod_{i=1}^{N} [1 - p_i(x)] \right\} dx$
- Maximize $H(s_1,\dots,s_N)$ by forcing nodes to move using gradient information:
$\frac{\partial H}{\partial s_k} = \int_\Omega R(x) \prod_{i=1, i \ne k}^{N} [1 - p_i(x)] \, \frac{\partial p_k(x)}{\partial d_k(x)} \, \frac{s_k - x}{d_k(x)}\,dx$
$s_i^{k+1} = s_i^k + \beta_k \frac{\partial H}{\partial s_i^k}$
Desired displacement = $V \cdot \Delta t$
Cassandras and Li, EJC, 2005
Zhong and Cassandras, IEEE TAC, 2011
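A minimal numerical sketch of this gradient scheme follows. It makes several assumptions that are not in the slides: a unit-square mission space without obstacles, $R(x) = 1$, the sensing model $p_i(x) = P_0 e^{-\lambda d_i(x)}$ (so that $\partial p_k / \partial d_k = -\lambda p_k$), a fixed step size `beta`, and a midpoint grid in place of the integral. All function names are hypothetical.

```python
import math

LAM, P0 = 2.0, 0.9  # assumed attenuation rate and peak detection probability

def p(x, s):
    """Assumed sensing model p_i(x) = P0 * exp(-LAM * d_i(x))."""
    return P0 * math.exp(-LAM * math.dist(x, s))

def grid(n=15):
    """Cell centres of an n-by-n grid on the unit square, plus the cell area."""
    h = 1.0 / n
    pts = [((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n)]
    return pts, h * h

def coverage(sensors, pts, area):
    """H(s): grid approximation of the integral of 1 - prod_i [1 - p_i(x)]."""
    return sum((1.0 - math.prod(1.0 - p(x, s) for s in sensors)) * area for x in pts)

def gradient(k, sensors, pts, area):
    """dH/ds_k from the slide, using dp_k/dd_k = -LAM * p_k for this model."""
    gx = gy = 0.0
    sk = sensors[k]
    for x in pts:
        d = math.dist(x, sk)
        if d == 0.0:
            continue  # direction (s_k - x)/d_k is undefined at the sensor itself
        others = math.prod(1.0 - p(x, s) for i, s in enumerate(sensors) if i != k)
        w = others * (-LAM * p(x, sk)) / d * area
        gx += w * (sk[0] - x[0])
        gy += w * (sk[1] - x[1])
    return gx, gy

def ascend(sensors, steps=60, beta=0.2):
    """Every node moves along its own gradient: s_k <- s_k + beta * dH/ds_k."""
    pts, area = grid()
    for _ in range(steps):
        grads = [gradient(k, sensors, pts, area) for k in range(len(sensors))]
        sensors = [(s[0] + beta * g[0], s[1] + beta * g[1]) for s, g in zip(sensors, grads)]
    return sensors
```

Starting from nodes clustered in one corner, `ascend` spreads them out and raises the value returned by `coverage`. Each node's gradient here still sums over the whole grid; restricting the sum to the node's own neighborhood is the step that makes the scheme distributed.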
DISTRIBUTED COOPERATIVE SCHEME (CONTINUED)
$\frac{\partial H}{\partial s_k} = \int_\Omega R(x) \prod_{i=1, i \ne k}^{N} [1 - p_i(x)] \, \frac{\partial p_k(x)}{\partial d_k(x)} \, \frac{s_k - x}{d_k(x)}\,dx$
… has to be autonomously evaluated by each node so as to determine how to move to next position:
$s_i^{k+1} = s_i^k + \beta_k \frac{\partial H}{\partial s_i^k}$
- Truncated $p_i(x)$ ⇒ $\Omega$ replaced by node neighborhood $\Omega_i$
- Discretize $p_i(x)$ using a local grid
CONTROL AND OPTIMIZATION – CHALLENGES
1. SCALABILITY: Distributed Algorithms
2. DECENTRALIZATION
3. COMMUNICATION: Event-driven (asynchronous) Algorithms
4. NON-CONVEXITY: Global optimality, escape local optima
5. EXPLOIT DATA: Data-Driven Algorithms
EVENT-DRIVEN DISTRIBUTED ALGORITHMS
DISTRIBUTED COOPERATIVE OPTIMIZATION
N system components (processors, agents, vehicles, nodes), one common objective:
$\min_{s_1,\dots,s_N} H(s_1,\dots,s_N)$ s.t. constraints on each $s_i$
Each component solves its own part of the problem:
Node 1: $\min_{s_1} H(s_1,\dots,s_N)$ s.t. constraints on $s_1$
…
Node N: $\min_{s_N} H(s_1,\dots,s_N)$ s.t. constraints on $s_N$
DISTRIBUTED COOPERATIVE OPTIMIZATION
Node $i$ solves $\min_{s_i} H(s_1,\dots,s_N)$ s.t. constraints on $s_i$
Controllable state $s_i$, $i = 1,\dots,N$:
$s_i(k+1) = s_i(k) + \alpha_i d_i(s(k))$
- $\alpha_i$: Step Size
- $d_i(s(k))$: Update Direction, usually $d_i(s(k)) = -\nabla_i H(s(k))$
$d_i(s(k))$ requires knowledge of all $s_1,\dots,s_N$ ⇒ Inter-node communication
SYNCHRONIZED (TIME-DRIVEN) COOPERATION
[Figure: timelines of nodes 1, 2, 3 with common COMMUNICATE + UPDATE instants]
Drawbacks:
- Excessive communication (critical in wireless settings!)
- Faster nodes have to wait for slower ones
- Clock synchronization infeasible
- Bandwidth limitations
- Security risks
ASYNCHRONOUS COOPERATION
[Figure: timelines of nodes 1, 2, 3 with unsynchronized update instants]
- Nodes not synchronized, delayed information used
Update frequency for each node is bounded + technical conditions
⇒ $s_i(k+1) = s_i(k) + \alpha_i d_i(s(k))$ converges
Bertsekas and Tsitsiklis, 1997
ASYNCHRONOUS (EVENT-DRIVEN) COOPERATION
[Figure: timelines of nodes 1, 2, 3 with separate UPDATE and COMMUNICATE instants]
- UPDATE at $i$: locally determined, arbitrary (possibly periodic)
- COMMUNICATE from $i$: only when absolutely necessary
WHEN SHOULD A NODE COMMUNICATE?
Node state at any time $t$: $x_i(t)$
Node state at $t_k$: $s_i(k)$ ⇒ $s_i(k) = x_i(t_k)$
AT UPDATE TIME $t_k$: $s_j^i(k)$: node $j$ state estimated by node $i$
Estimate examples, with $\tau^{ji}(k)$ the time of the most recent state received by node $i$ from node $j$:
- Most recent value: $s_j^i(k) = x_j(\tau^{ji}(k))$
- Linear prediction: $s_j^i(k) = x_j(\tau^{ji}(k)) + \frac{t_k - \tau^{ji}(k)}{\Delta_j} \, \alpha_j \, d_j(x_j(\tau^{ji}(k)))$
WHEN SHOULD A NODE COMMUNICATE?
AT ANY TIME $t$: $x_i^j(t)$: node $i$ state estimated by node $j$
- If node $i$ knows how $j$ estimates its state, then it can evaluate $x_i^j(t)$
- Node $i$ uses
  • its own true state, $x_i(t)$
  • the estimate that $j$ uses, $x_i^j(t)$
  … and evaluates an ERROR FUNCTION $g(x_i(t), x_i^j(t))$
Error Function examples: $\| x_i(t) - x_i^j(t) \|_1$, $\| x_i(t) - x_i^j(t) \|_2$
WHEN SHOULD A NODE COMMUNICATE?
Compare ERROR FUNCTION $g(x_i(t), x_i^j(t))$ to THRESHOLD $\delta_i$
Node $i$ communicates its state to node $j$ only when it detects that its true state $x_i(t)$ deviates from $j$'s estimate of it, $x_i^j(t)$, so that $g(x_i(t), x_i^j(t)) \ge \delta_i$
⇒ Event-Driven Control
[Figure: node $i$ sends $x_i(t)$ to node $j$ when the error function reaches $\delta_i$]
THRESHOLD PROCESS
$\delta_i(k) = K_\delta \| d_i(s^i(k)) \|$ if $k \in C^i$, and $\delta_i(k) = \delta_i(k-1)$ otherwise
$\delta_i(0) = K_\delta \| d_i(s^i(0)) \|$, with $K_\delta > 0$
Update Direction, usually $d_i(s^i(k)) = -\nabla_i H(s^i(k))$
Intuition: near convergence (small $\| d_i(s^i(k)) \|$), better estimates are needed
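The interplay of state update, error function and threshold can be sketched on a toy problem. Everything specific below is an assumption made for illustration: two scalar nodes, the quadratic objective $H = \frac{1}{2}(s_1 - 1)^2 + \frac{1}{2}(s_2 + 1)^2 + \frac{1}{2}(s_1 - s_2)^2$, the "most recent value" estimate, the error function $g = |x_i - x_i^j|$, and the values of `alpha` and `k_delta`. The function `run` is hypothetical.

```python
def run(alpha=0.02, k_delta=0.1, steps=2000, event_driven=True):
    """Two nodes minimise H using estimates of each other's state.

    Returns the final states and the number of messages sent.
    """
    targets = [1.0, -1.0]
    x = [4.0, -3.0]        # true node states
    est = [x[1], x[0]]     # est[i]: node i's estimate of the other node's state
    messages = 0
    for _ in range(steps):
        # d_i = -dH/ds_i, evaluated with the estimate of the other state
        d = [-((x[i] - targets[i]) + (x[i] - est[i])) for i in range(2)]
        delta = [k_delta * abs(d[i]) for i in range(2)]   # threshold process
        for i in range(2):
            x[i] += alpha * d[i]
        for i in range(2):
            j = 1 - i
            err = abs(x[i] - est[j])   # g(x_i, x_i^j): node i knows j's estimate
            if not event_driven or (err >= delta[i] and err > 0.0):
                est[j] = x[i]          # node i communicates its state to node j
                messages += 1
    return x, messages
```

With `event_driven=False` every node communicates at every step, which serves as the synchronous baseline. For this toy problem both variants reach the minimizer $(1/3, -1/3)$, while the event-driven variant sends only a fraction of the messages.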
CONVERGENCE
Asynchronous distributed state update process at each $i$:
$s_i(k+1) = s_i(k) + \alpha \, d_i(s^i(k))$
where $s^i(k)$ collects the estimates of other nodes, evaluated by node $i$
$\delta_i(k) = K_\delta \| d_i(s^i(k)) \|$ if $i$ sends update, and $\delta_i(k) = \delta_i(k-1)$ otherwise
CONVERGENCE (CONTINUED)
ASSUMPTION 1: There exists a positive integer $B$ such that for all $i = 1,\dots,N$ and $k \ge 0$ at least one of the elements of the set $\{k-B+1, k-B+2, \dots, k\}$ belongs to $C^i$.
INTERPRETATION: Each node updates its state at least once during a period in which $B$ state update events take place (no time bound)
ASSUMPTION 2: The objective function $H(s)$, $s \in \mathbb{R}^m$, $m = \sum_{i=1}^{N} n_i$, satisfies:
(a) $H(s) \ge 0$ for all $s \in \mathbb{R}^m$
(b) $H(\cdot)$ continuously differentiable and $\nabla H(\cdot)$ Lipschitz continuous, i.e., there exists $K_1$ such that for all $x, y \in \mathbb{R}^m$: $\| \nabla H(x) - \nabla H(y) \| \le K_1 \| x - y \|$
CONVERGENCE (CONTINUED)
ASSUMPTION 3: There exist positive constants $K_2$, $K_3$ such that for all $i = 1,\dots,N$ and $k \in C^i$:
(a) $d_i(k)' \, \nabla_i H(s^i(k)) \le -\| d_i(k) \|^2 / K_3$
(b) $K_2 \| \nabla_i H(s^i(k)) \| \le \| d_i(k) \|$
NOTE: Very mild condition, immediately satisfied with $K_2 = K_3 = 1$ when we use the usual update direction given by $d_i(k) = -\nabla_i H(s^i(k))$
ASSUMPTION 4: There exists a positive constant $K_4$ such that the ERROR FUNCTION satisfies
$\| x_i(t) - x_i^j(t) \| \le K_4 \, g(x_i(t), x_i^j(t))$
NOTE: Very mild condition, immediately satisfied with $K_4 = 1$ when we use the common choice $g(x_i(t), x_i^j(t)) = \| x_i(t) - x_i^j(t) \|$
CONVERGENCE (CONTINUED)
THEOREM: Under A1-A4, there exist positive constants $\alpha$ and $K_\delta$ such that
$\lim_{k \to \infty} \nabla H(s(k)) = 0$
Zhong and Cassandras, IEEE TAC, 2010
INTERPRETATION:
- Event-driven optimization achievable with reduced communication requirements ⇒ energy savings
- No loss of performance
CONVERGENCE (CONTINUED)
THEOREM: Under A1-A4, there exist positive constants $\alpha$ and $K_\delta$ such that
$\lim_{k \to \infty} \nabla H(s(k)) = 0$
BYPRODUCT OF PROOF: obtaining the largest possible $K_\delta$ and hence the smallest possible number of communication events:
$K_\delta < \frac{1}{(1+B) K_4 m} \left( \frac{2}{K_1 K_3} - \alpha \right)$, $\quad 0 < \alpha < \frac{2}{K_1 K_3}$
($K_\delta$: comm. frequency; $m$: state dim. ~ network dim.)
CONVERGENCE WHEN DELAYS ARE PRESENT
[Figure: error function trajectory $g(x_i, x_i^j)$ against the threshold $\delta_i(k)$ over time, for the NO DELAY case and the DELAY case. Red curve: $g(x_i, x_i^j)$. Black curve: $g(x_i, \tilde x_i^j)$.]
CONVERGENCE WHEN DELAYS ARE PRESENT
Add a boundedness assumption:
ASSUMPTION 5: There exists a non-negative integer $D$ such that if a message is sent before $t_{k-D}$ from node $i$ to node $j$, it will be received before $t_k$.
INTERPRETATION: at most $D$ state update events can occur between a node sending a message and all destination nodes receiving this message.
THEOREM: Under A1-A5, there exist positive constants $\alpha$ and $K_\delta$ such that
$\lim_{k \to \infty} \nabla H(s(k)) = 0$
NOTE: The requirements on $\alpha$ and $K_\delta$ depend on $D$ and they are tighter.
Zhong and Cassandras, IEEE TAC, 2010
SYNCHRONOUS v ASYNCHRONOUS OPTIMAL COVERAGE PERFORMANCE
[Figure, panel 1: SYNCHRONOUS v ASYNCHRONOUS: No. of communication events for a deployment problem with obstacles (Energy savings + Extended lifetime)]
[Figure, panel 2: SYNCHRONOUS v ASYNCHRONOUS: Achieving optimality in a problem with obstacles]
OPTIMAL COVERAGE IN A MAZE
http://www.bu.edu/codes/research/distributed-control/
Zhong and Cassandras, 2008
DEMO: OPTIMAL DISTRIBUTED DEPLOYMENT WITH OBSTACLES – SIMULATED AND REAL
IT IS HARD TO DECENTRALIZE PROBLEM 2 … MORE ON THAT LATER…
CONTROL AND OPTIMIZATION – CHALLENGES
1. SCALABILITY: Distributed Algorithms
2. DECENTRALIZATION
3. COMMUNICATION: Event-driven (asynchronous) Algorithms
4. NON-CONVEXITY: Global optimality, escape local optima
5. EXPLOIT DATA: Data-Driven Algorithms
DATA-DRIVEN + EVENT-DRIVEN ALGORITHMS
DATA-DRIVEN STOCHASTIC OPTIMIZATION
GOAL: $\max_{\theta \in \Theta} E[L(\theta)]$
[Diagram: CONTROL/DECISION (parameterized by $\theta$) and NOISE drive the SYSTEM, whose PERFORMANCE is $E[L(\theta)]$. REAL-TIME DATA $x(t)$ feed a GRADIENT ESTIMATOR that returns $L(\theta)$ and $\nabla L(\theta)$ for the update $\theta_{n+1} = \theta_n + \eta_n \nabla L(\theta_n)$]
MDP: $\max_{u(t,\theta) \in U} E\left[ \int_0^\infty e^{-\beta t} c(x(t,\theta), u(t,\theta))\,dt \right]$
DIFFICULTIES:
- $E[L(\theta)]$ NOT available in closed form
- $\nabla L(\theta)$ not easy to evaluate
- $\nabla L(\theta)$ may not be a good estimate of $\nabla E[L(\theta)]$
Christos G. Cassandras CISE - CODES Lab. - Boston University
DATA-DRIVEN STOCHASTIC OPTIMIZATION IN DES: INFINITESIMAL PERTURBATION ANALYSIS (IPA)
[Diagram: CONTROL/DECISION (parameterized by $\theta$) and NOISE drive the Discrete Event System (DES), whose PERFORMANCE is $E[L(\theta)]$. The sample path $x(t)$ and a model feed IPA, which returns $L(\theta)$ and $\nabla L(\theta)$ for the update $\theta_{n+1} = \theta_n + \eta_n \nabla L(\theta_n)$]
For many (but NOT all) DES:
- Unbiased estimators
- General distributions
- Simple on-line implementation
REAL-TIME STOCHASTIC OPTIMIZATION: HYBRID SYSTEMS
[Diagram: CONTROL/DECISION (parameterized by $\theta$) and NOISE drive the HYBRID SYSTEM, whose PERFORMANCE is $E[L(\theta)]$. The sample path $x(t)$ feeds IPA, which returns $L(\theta)$ and $\nabla L(\theta)$ for the update $\theta_{n+1} = \theta_n + \eta_n \nabla L(\theta_n)$]
A general framework for an IPA theory in Hybrid Systems
PERFORMANCE OPTIMIZATION AND IPA
Performance metric (objective function): $J(\theta; x(\theta,0), T) = E[L(\theta; x(\theta,0), T)]$
IPA goal:
- Obtain unbiased estimates of $\frac{dJ(\theta; x(\theta,0), T)}{d\theta}$, normally $\frac{dL(\theta)}{d\theta}$
- Then: $\theta_{n+1} = \theta_n + \eta_n \frac{dL(\theta_n)}{d\theta}$
NOTATION: $x'(t) = \frac{\partial x(\theta, t)}{\partial \theta}$, $\tau_k' = \frac{\partial \tau_k(\theta)}{\partial \theta}$
THE IPA CALCULUS
IPA: THREE FUNDAMENTAL EQUATIONS
System dynamics over $(\tau_k(\theta), \tau_{k+1}(\theta)]$: $\dot x = f_k(x, \theta, t)$
NOTATION: $x'(t) = \frac{\partial x(\theta, t)}{\partial \theta}$, $\tau_k' = \frac{\partial \tau_k(\theta)}{\partial \theta}$
1. Continuity at events: $x(\tau_k^+) = x(\tau_k^-)$
Take $d/d\theta$: $x'(\tau_k^+) = x'(\tau_k^-) + [f_{k-1}(\tau_k^-) - f_k(\tau_k^+)] \, \tau_k'$
If no continuity, use reset condition: $x'(\tau_k^+) = \frac{d\rho(q, q', x, \upsilon, \delta)}{d\theta}$
IPA: THREE FUNDAMENTAL EQUATIONS
2. Take $d/d\theta$ of system dynamics $\dot x = f_k(x, \theta, t)$ over $(\tau_k(\theta), \tau_{k+1}(\theta)]$:
$\frac{dx'(t)}{dt} = \frac{\partial f_k(t)}{\partial x} x'(t) + \frac{\partial f_k(t)}{\partial \theta}$
Solve over $(\tau_k(\theta), \tau_{k+1}(\theta)]$:
$x'(t) = e^{\int_{\tau_k}^{t} \frac{\partial f_k(u)}{\partial x} du} \left[ \int_{\tau_k}^{t} \frac{\partial f_k(v)}{\partial \theta} e^{-\int_{\tau_k}^{v} \frac{\partial f_k(u)}{\partial x} du} dv + x'(\tau_k^+) \right]$
with the initial condition $x'(\tau_k^+)$ from 1 above
NOTE: If there are no events (pure time-driven system), IPA reduces to this equation
IPA: THREE FUNDAMENTAL EQUATIONS
3. Get $\tau_k'$ depending on the event type:
- Exogenous event: By definition, $\tau_k' = 0$
- Endogenous event: occurs when $g_k(x(\theta, \tau_k), \theta) = 0$
$\tau_k' = -\left[ \frac{\partial g_k}{\partial x} f_k(\tau_k^-) \right]^{-1} \left( \frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x} x'(\tau_k^-) \right)$
- Induced events: $\tau_k' = -\left[ \frac{\partial y_k(\tau_k)}{\partial t} \right]^{-1} y_k'(\tau_k^+)$
IPA: THREE FUNDAMENTAL EQUATIONS
Ignoring resets and induced events:
Recall: $x'(t) = \frac{\partial x(\theta, t)}{\partial \theta}$, $\tau_k' = \frac{\partial \tau_k(\theta)}{\partial \theta}$
1. $x'(\tau_k^+) = x'(\tau_k^-) + [f_{k-1}(\tau_k^-) - f_k(\tau_k^+)] \, \tau_k'$
2. $x'(t) = e^{\int_{\tau_k}^{t} \frac{\partial f_k(u)}{\partial x} du} \left[ \int_{\tau_k}^{t} \frac{\partial f_k(v)}{\partial \theta} e^{-\int_{\tau_k}^{v} \frac{\partial f_k(u)}{\partial x} du} dv + x'(\tau_k^+) \right]$
3. $\tau_k' = 0$ or $\tau_k' = -\left[ \frac{\partial g_k}{\partial x} f_k(\tau_k^-) \right]^{-1} \left( \frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x} x'(\tau_k^-) \right)$
[Figure: sample path of $x'(t)$ around event time $\tau_k$, annotated with the roles of equations 1, 2 and 3]
Cassandras et al, Europ. J. Control, 2010
IPA PROPERTIES
Back to performance metric: $L(\theta) = \sum_{k=0}^{N} \int_{\tau_k}^{\tau_{k+1}} L_k(x, \theta, t)\,dt$
NOTATION: $L_k'(x, \theta, t) = \frac{\partial L_k(x, \theta, t)}{\partial \theta}$
Then: $\frac{dL(\theta)}{d\theta} = \sum_{k=0}^{N} \left[ \tau_{k+1}' L_k(\tau_{k+1}) - \tau_k' L_k(\tau_k) + \int_{\tau_k}^{\tau_{k+1}} L_k'(x, \theta, t)\,dt \right]$
- The first two terms: what happens at event times
- The integral term: what happens between event times
IPA PROPERTY 1: ROBUSTNESS
THEOREM 1: If either 1 or 2 holds, then $dL(\theta)/d\theta$ depends only on information available at event times $\tau_k$:
1. $L(x, \theta, t)$ is independent of $t$ over $[\tau_k(\theta), \tau_{k+1}(\theta)]$ for all $k$
2. $L(x, \theta, t)$ is only a function of $x$ and for all $t$ over $[\tau_k(\theta), \tau_{k+1}(\theta)]$:
$\frac{d}{dt} \frac{\partial L_k}{\partial x} = \frac{d}{dt} \frac{\partial f_k}{\partial x} = \frac{d}{dt} \frac{\partial f_k}{\partial \theta} = 0$
$\frac{dL(\theta)}{d\theta} = \sum_{k=0}^{N} \left[ \tau_{k+1}' L_k(\tau_{k+1}) - \tau_k' L_k(\tau_k) + \int_{\tau_k}^{\tau_{k+1}} L_k'(x, \theta, t)\,dt \right]$
IMPLICATION:
- Performance sensitivities can be obtained from information limited to event times, which is easily observed
- No need to track system in between events!
IPA PROPERTY 1: ROBUSTNESS
EXAMPLE WHERE THEOREM 1 APPLIES (simple tracking problem):
$\min_{\theta, \phi} E\left[ \int_0^T [x(t) - g(\phi)]\,dt \right]$
s.t. $\dot x_k = a_k x_k(t) + u_k(\theta_k) + w_k(t)$, $k = 1,\dots,N$
Here $\frac{\partial L}{\partial x} = 1$, $\frac{\partial f_k}{\partial x_k} = a_k$, $\frac{\partial f_k}{\partial \theta_k} = \frac{du_k}{d\theta_k}$
NOTE: THEOREM 1 provides sufficient conditions only. IPA still depends on info. limited to event times if
$\dot x_k = a_k x_k(t) + u_k(\theta_k, t) + w_k(t)$, $k = 1,\dots,N$
for “nice” functions $u_k(\theta_k, t)$, e.g., $b_k \theta t$
IPA PROPERTY 1: ROBUSTNESS
[Figure: sample path with EVENTS at $\tau_k$ and $\tau_{k+1}$ and dynamics $\dot x = f(x, u, w, t; \theta)$ in between]
Evaluating $x(t; \theta)$ requires full knowledge of $w$ and $f$ values (obvious)
However, $\frac{dx(t; \theta)}{d\theta}$ may be independent of $w$ and $f$ values (NOT obvious)
It often depends only on:
- event times $\tau_k$
- possibly $f(\tau_{k+1}^-)$
IPA PROPERTY 2: DECOMPOSABILITY
THEOREM 2: Suppose an endogenous event occurs at $\tau_k$ with switching function $g(x, \theta)$.
If $f_k(\tau_k^+) = 0$, then $x'(\tau_k^+)$ is independent of $f_{k-1}$.
If, in addition, $\frac{dg}{d\theta} = 0$, then $x'(\tau_k^+) = 0$
IMPLICATION: Performance sensitivities are often reset to 0 ⇒ sample path can be conveniently decomposed
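A small worked instance of the three IPA equations, including the reset described in THEOREM 2, is sketched below. The system is an assumption chosen for illustration and does not come from the slides: a scalar state fills at rate $a$ from $x(0) = 0$ until it reaches the threshold $\theta$ (endogenous event, $g = x - \theta$), then drains at rate $b$ until it hits $0$ (endogenous event, $g = x$) and stays there, with cost $L(\theta) = \int_0^T x(t)\,dt$. In the event-time formula the code uses the dynamics just before the event. The name `ipa` is hypothetical.

```python
def ipa(theta, a=2.0, b=1.0, T=10.0):
    """Return (L, dL/dtheta, final x') for the assumed fill/drain system."""
    tau1 = theta / a            # state reaches theta
    tau2 = tau1 + theta / b     # state returns to 0
    assert tau2 <= T, "horizon must cover the whole busy period"
    L = 0.5 * theta * tau2      # area of the triangular sample path

    x_prime = 0.0               # x(0) = 0 does not depend on theta
    # Event 1, g = x - theta: tau' = -(g_x * f(tau-))^-1 * (g_theta + g_x * x'(tau-))
    tau1_prime = -(1.0 / (1.0 * a)) * (-1.0 + 1.0 * x_prime)
    # Equation 1: jump of x' by [f_before - f_after] * tau'
    x_prime = x_prime + (a - (-b)) * tau1_prime
    # Equation 2: df/dx = df/dtheta = 0 while draining, so x' stays constant.
    # The integrand x is continuous at events, so the boundary terms cancel.
    dL = x_prime * (tau2 - tau1)
    # Event 2, g = x: same formula with g_theta = 0 and f(tau-) = -b
    tau2_prime = -(1.0 / (1.0 * (-b))) * (0.0 + 1.0 * x_prime)
    x_prime = x_prime + ((-b) - 0.0) * tau2_prime   # reset to 0 (THEOREM 2)
    return L, dL, x_prime
```

For this system $L(\theta) = \theta^2 (a + b) / (2ab)$ in closed form, so the IPA value can be checked against $dL/d\theta = \theta (a + b) / (ab)$. Only the event times and the rates at those times enter the computation.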
IPA PROPERTY 3: SCALABILITY
IPA scales with the EVENT SET, not the STATE SPACE!
As a complex system grows with the addition of more states, the number of EVENTS often remains unchanged or increases at a much lower rate.
EXAMPLE: A queueing network may become very large, but the basic events used by IPA are still “arrival” and “departure” at different nodes.
IPA estimators are EVENT-DRIVEN
IPA PROPERTIES
In many cases:
- No need for a detailed model (captured by $f_k$) to describe state behavior in between events
- This explains why simple abstractions of a complex stochastic system can be adequate to perform sensitivity analysis and optimization, as long as event times are accurately observed and local system behavior at these event times can also be measured.
- This is true in abstractions of DES as HS since common performance metrics (e.g., workload) satisfy THEOREM 1
WHAT IS THE RIGHT ABSTRACTION LEVEL?
- TOO FAR… model not detailed enough
- TOO CLOSE… too much undesirable detail
- JUST RIGHT… good model
CREDIT: W.B. Gong
A SMART CITY CPS APPLICATION: ADAPTIVE TRAFFIC LIGHT CONTROL
TRAFFIC LIGHT CONTROL - BACKGROUND
A basic binary switching control (GREEN – RED) problem with a long history…
• Mixed Integer Linear Programming (MILP) [Dujardin et al., 2011]
• Extended Linear Complementarity Problem (ELCP) [DeSchutter, 1999]
• MDP and Reinforcement Learning [Yu et al., 2006]
• Game Theory [Alvarez et al., 2010]
• Evolutionary algorithms [Taale et al., 1998]
• Fuzzy Logic [Murat et al., 2005]
• Expert Systems [Findler and Stapp, 1992]
• Perturbation Analysis
TRAFFIC LIGHT CONTROL - BACKGROUND
• Perturbation Analysis [Panayiotou et al., 2005]
Single Intersection [Geng and Cassandras, 2012]
Use a Hybrid System Model: Stochastic Flow Model (SFM)
[Figure: a vehicle queue modeled as a DES and as an SFM]
Aggregate states into modes and keep only events causing mode transitions
SINGLE-INTERSECTION MODEL
[Figure: intersection with queues 1, 2, 3, 4]
Traffic light control: $\theta = [\theta_1, \theta_2, \theta_3, \theta_4]$
$\theta_n$: GREEN light cycle at queue $n = 1, 2, 3, 4$
OBJECTIVE: Determine $\theta$ to minimize total weighted vehicle queues
$\min_\theta J_T(\theta) = \frac{1}{T} E\left[ \int_0^T \sum_{n=1}^{4} w_n x_n(\theta, t)\,dt \right]$
SINGLE-INTERSECTION MODEL
$\min_\theta J_T(\theta) = \frac{1}{T} E\left[ \int_0^T \sum_{n=1}^{4} w_n x_n(\theta, t)\,dt \right] = \frac{1}{T} E[L_T(\theta)]$
IPA APPROACH:
- Observe events and event times, estimate $\frac{dJ_T(\theta)}{d\theta}$ through $\frac{dL_T(\theta)}{d\theta}$
- Then, with $l$ the iteration index and $\eta_l$ the step size, $\theta_{l+1} = \theta_l - \eta_l \frac{dL_T(\theta_l)}{d\theta}$
HYBRID SYSTEM STATE DYNAMICS
[Figure: GREEN phases of length $\theta_n$ at queue $n$ alternating with GREEN phases of length $\theta_{\bar n}$ at the competing queue $\bar n$]
GREEN light “clock”: $\dot z_n(t) = 1$ if $0 < z_n(t) < \theta_n$ or $z_{\bar n}(t) = \theta_{\bar n}$, and $\dot z_n(t) = 0$ otherwise
$z_n(t^+) = 0$ if $z_n(t) = \theta_n$
Control: GREEN light cycle $\theta_n$
HYBRID SYSTEM STATE DYNAMICS
[RESOURCE DYNAMICS]
$\dot z_n(t) = 1$ if $0 < z_n(t) < \theta_n$ or $z_{\bar n}(t) = \theta_{\bar n}$, and $\dot z_n(t) = 0$ otherwise
$z_n(t^+) = 0$ if $z_n(t) = \theta_n$
Define: $G_n(t) = 1$ if $0 < z_n(t) < \theta_n$ or $z_{\bar n}(t) = \theta_{\bar n}$ (GREEN at queue $n$), and $G_n(t) = 0$ otherwise
[USER DYNAMICS]
Queue content:
$\dot x_n(t) = \alpha_n(t)$ if $G_n(t) = 0$
$\dot x_n(t) = 0$ if $x_n(t) = 0$ and $\alpha_n(t) \le \beta_n(t)$
$\dot x_n(t) = \alpha_n(t) - \beta_n(t)$ otherwise
$\alpha_n(t)$: Vehicle arrival rate process; $\beta_n(t)$: Vehicle departure rate process
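The resource and user dynamics can be simulated with a simple time-stepped sketch. The simplifications below are assumptions for illustration only: two competing queues instead of four, constant arrival and departure rates, unit weights $w_n = 1$, and Euler integration with step `dt`. The function `average_queue` is hypothetical.

```python
def average_queue(theta, alpha=(0.4, 0.3), beta=(1.0, 1.0), T=500.0, dt=0.01):
    """Time-averaged total queue content for GREEN cycle lengths theta = (theta_0, theta_1)."""
    x = [0.0, 0.0]                  # queue contents x_n(t)
    green, clock, cost = 0, 0.0, 0.0
    for _ in range(round(T / dt)):
        for n in range(2):
            if n != green:                                   # RED: arrivals only
                rate = alpha[n]
            elif x[n] <= 0.0 and alpha[n] <= beta[n]:        # empty queue stays empty
                rate = 0.0
            else:                                            # GREEN and non-empty
                rate = alpha[n] - beta[n]
            x[n] = max(0.0, x[n] + rate * dt)
        cost += (x[0] + x[1]) * dt
        clock += dt
        if clock >= theta[green]:   # G2R at the current queue, R2G at the other one
            green, clock = 1 - green, 0.0
    return cost / T
```

In this idealized fluid model there is no switching loss, so shorter GREEN cycles always give smaller average queues. An IPA-driven update of `theta` would replace the brute-force evaluation with derivative estimates collected at the G2R, R2G, S and E events.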
EVENTS IN THE TLC MODEL
- Event G2R: GREEN light switches to RED (endogenous)
- Event R2G: RED light switches to GREEN (endogenous)
- Event E: Non-Empty-Period (NEP) ends (endogenous)
- Event S: Non-Empty-Period (NEP) starts (endogenous or exogenous)
APPLY IPA EQUATIONS FOR $\theta$ AND $s$ VECTORS
FOR EXAMPLE: Endogenous event with $g_k(x(\theta, \tau_k), \theta) = x_n(\theta, t) = 0$
From $\tau_k' = -\left[ \frac{\partial g_k}{\partial x} f_k(\tau_k^-) \right]^{-1} \left( \frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x} x'(\tau_k^-) \right)$:
$\tau_{k,i}' = -\frac{x_{n,i}'(\tau_k^-)}{\alpha_n(\tau_k) - \beta_n(\tau_k)}$
From $x'(\tau_k^+) = x'(\tau_k^-) + [f_{k-1}(\tau_k^-) - f_k(\tau_k^+)] \, \tau_k'$:
$x_{n,i}'(\tau_k^+) = x_{n,i}'(\tau_k^-) - [\alpha_n(\tau_k) - \beta_n(\tau_k)] \frac{x_{n,i}'(\tau_k^-)}{\alpha_n(\tau_k) - \beta_n(\tau_k)} = 0$
Perturbation in queue $n$ RESET to 0 when NEP ends
COST DERIVATIVE IN $m$th NEP
$L_{n,m}(\theta) = \int_{\xi_{n,m}(\theta)}^{\eta_{n,m}(\theta)} x_n(\theta, t)\,dt$
NOTES:
- Need only TIMERS, COUNTERS and state derivatives
- Scalable in number of EVENTS, not states!
TYPICAL SIMULATION RESULTS
[Figure: simulation results showing a 9-fold cost reduction, and adaptivity when the traffic pattern changes]
IT IS HARD TO DECENTRALIZE PROBLEM 2 …
MULTI-AGENT OPTIMIZATION: PROBLEM 2
[Figure: mission space $\Omega$ with agents $a_1, a_2, a_3, a_i$, obstacles $O_1, O_2$ and a point $x$]
May also have dynamics:
$\max_{u(t)} J = \int_0^T \int_\Omega P(x, s(u(t))) R(x)\,dx\,dt$
subject to $\dot s_i = f_i(s_i, u_i, t)$, $i = 1,\dots,N$, and $s_i(t) \in F$, $i = 1,\dots,N$
GOAL: Find the best state trajectories $s_i(t)$, $0 \le t \le T$, so that agents achieve a maximal reward from interacting with the mission space
PERSISTENT MONITORING PROBLEM
GOAL: Find the best state trajectories $s_i(t)$, $0 \le t \le T$, so that agents achieve a maximal reward from interacting with the mission space
$\max_{u(t)} J = \int_0^T \int_\Omega P(x, s(u(t))) R(x)\,dx\,dt$
Need three model elements:
1. ENVIRONMENT MODEL
2. SENSING MODEL (how agents interact with environment)
3. AGENT MODEL: $\dot s_i = f_i(s_i, u_i, t)$, $i = 1,\dots,N$
PERSISTENT MONITORING PROBLEM
Start with 1-dimensional mission space $\Omega = [0, L]$
AGENT DYNAMICS: $\dot s_j = u_j$, $|u_j(t)| \le 1$
Analysis still holds for: $\dot s_j = g_j(s_j) + b u_j$, $|u_j(t)| \le 1$
PERSISTENT MONITORING PROBLEM
SENSING MODEL: $p(x, s)$: Probability agent at $s$ senses point $x$
[Figure: sensing probability $p(x, s)$ around the agent position $s(t)$, evaluated at a point $x$]
PERSISTENT MONITORING PROBLEM
[Figure: agent at $s(t)$ and a point $x$ of the mission space]
ENVIRONMENT MODEL: Associate to $x$ an Uncertainty Function $R(x, t)$
Use: $\dot R(x, t) = 0$ if $R(x, t) = 0$ and $A(x) \le B p(x, s(t))$, and $\dot R(x, t) = A(x) - B p(x, s(t))$ otherwise
If $x$ is a known “target”: $\dot R_x(t) = f_x(R_x, s, t) + \text{noise}$
PERSISTENT MONITORING PROBLEM
Partition mission space $\Omega = [0, L]$ into $M$ intervals, each represented by a point $\alpha_i$, $i = 1,\dots,M$
For each interval $i = 1,\dots,M$ define Uncertainty Function $R_i(t)$:
$\dot R_i(t) = 0$ if $R_i(t) = 0$ and $A_i \le B P_i(s(t))$, and $\dot R_i(t) = A_i - B P_i(s(t))$ otherwise
$P_i(s) = 1 - \prod_{j=1}^{N} [1 - p_i(s_j)]$, with $p_i(s_j) \equiv p(\alpha_i, s_j)$
where $P_i(s)$ = joint prob. $i$ is sensed by agents located at $s = [s_1,\dots,s_N]$
PERSISTENT MONITORING (PM) WITH KNOWN TARGETS
$\min_{u_1,\dots,u_N} J = \frac{1}{T} \int_0^T \sum_{i=1}^{M} R_i(t)\,dt$
s.t.
$\dot s_j = u_j$, $|u_j(t)| \le 1$, $0 \le a \le s_j(t) \le b \le L$
$\dot R_i(t) = 0$ if $R_i(t) = 0$ and $A_i \le B P_i(s(t))$, and $\dot R_i(t) = A_i - B P_i(s(t))$ otherwise
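To make the cost and the uncertainty dynamics concrete, the sketch below simulates a single agent in one dimension. The specifics are assumptions for illustration, not taken from the slides: three targets, a triangular sensing function $p(x, s) = \max(0, 1 - |x - s| / r)$, identical rates $A_i = A$, Euler integration, and two naive policies (sweep back and forth at unit speed, or stay put). Neither policy is claimed to be optimal, and `monitoring_cost` is a hypothetical name.

```python
def monitoring_cost(sweep=True, targets=(2.0, 5.0, 8.0), A=0.1, B=1.0,
                    r=1.5, L=10.0, T=200.0, dt=0.01):
    """Average total uncertainty J = (1/T) * integral of sum_i R_i(t) dt."""
    s = L / 2.0                        # agent position s(t)
    u = 1.0 if sweep else 0.0          # control, |u| <= 1
    R = [1.0] * len(targets)           # uncertainty states R_i(t)
    J = 0.0
    for _ in range(round(T / dt)):
        s += u * dt
        if s >= L:                     # turn around at the right end
            s, u = L, -1.0
        elif s <= 0.0:                 # turn around at the left end
            s, u = 0.0, 1.0
        for i, x in enumerate(targets):
            p = max(0.0, 1.0 - abs(x - s) / r)   # one agent, so P_i(s) = p_i(s)
            rate = A - B * p
            if R[i] <= 0.0 and rate <= 0.0:
                R[i] = 0.0                        # R_i stays at zero while sensed enough
            else:
                R[i] = max(0.0, R[i] + rate * dt)
        J += sum(R) * dt
    return J / T
```

With these numbers the sweeping agent keeps every $R_i$ bounded, whereas the stationary agent lets the two unobserved targets grow without bound over the horizon, so its cost is much larger.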
PERSISTENT MONITORING WITH KNOWN TARGETS
[Figure: Agent-Target Interaction Network (time-varying) and Agent Network (time-varying)]
Hard to decentralize a controller that involves time-varying agent-environment interactions
THREE TYPES OF NEIGHBORHOODS
[Figure: agents $A_1,\dots,A_5$ and targets $T_1,\dots,T_5$, with one neighborhood type marked “(conventional)”]
PM WITH KNOWN TARGETS – 1D CASE
We have shown that:
1. Optimal Trajectories are bounded: $x_1 \le s_j^*(t) \le x_M$, $j = 1,\dots,N$
2. Existence of finite dwell times at target on optimal trajectories. Under certain conditions: $s_j^*(t) = x_k$ and $u_j^*(t) = 0$ for $t \in [t_1, t_2]$
3. Under the constraint $s_j(t) < s_{j+1}(t)$, on an optimal trajectory: $s_j(t) \ne s_{j+1}(t)$
Zhou et al, IEEE CDC, 2016
OPTIMAL CONTROL SOLUTION
Optimal trajectory is fully characterized by TWO parameter vectors:
- Switching points: $\theta_j = [\theta_{j1},\dots,\theta_{jS}]$, $j = 1,\dots,N$
- Waiting times at switching points: $w_j = [w_{j1},\dots,w_{jS}]$, $j = 1,\dots,N$, $w_{jk} \ge 0$
$J(\theta, w) = \frac{1}{T} \sum_{k=0}^{K} \int_{\tau_k(\theta, w)}^{\tau_{k+1}(\theta, w)} \sum_{i=1}^{M} R_i(t)\,dt$
$\tau_k$: $k$th event time
⇒ Under optimal control, this is a HYBRID SYSTEM
HYBRID SYSTEM EVENTS
Type 1: switches in $\dot R_i(t)$
Type 2: switches in agent sensing
Type 3: switches in $\dot s_j(t)$
Type 4: changes in neighbor sets
HYBRID SYSTEM EVENTS: EXAMPLE
$\dot R_i(t) = 0$ if $R_i(t) = 0$ and $A_i \le B P_i(s(t))$, and $\dot R_i(t) = A_i - B P_i(s(t))$ otherwise
$\dot s_j = u_j$, $|u_j(t)| \le 1$
A simple example: 1 agent, 1 target
[Figure: hybrid automaton. For each agent velocity $\dot s_j = -1$, $\dot s_j = 0$ and $\dot s_j = 1$ there are two modes, $\dot R_i = A_i - B_i P_i(s(t))$ and $\dot R_i = 0$. Transitions labeled “Event type 1” ($R \downarrow 0$, $R \uparrow 0$) switch between the two $R_i$ modes; transitions labeled “Event type 2” switch the control among $u = -1$, $u = 0$ and $u = 1$.]
IPA GRADIENTS
Objective function gradient:
$\nabla J(\theta, w) = \frac{1}{T} \sum_{k=0}^{K} \int_{\tau_k(\theta, w)}^{\tau_{k+1}(\theta, w)} \sum_{i=1}^{M} \nabla R_i(t)\,dt$
where $\nabla R_i(t) = \left[ \frac{\partial R_i(t)}{\partial \theta}, \frac{\partial R_i(t)}{\partial w} \right]^T$ is obtained using the IPA Calculus
⇒ $\nabla R_i(t)$ is updated on an EVENT-DRIVEN basis ($\tau_k$: $k$th event time)
1. $x'(\tau_k^+) = x'(\tau_k^-) + [f_{k-1}(\tau_k^-) - f_k(\tau_k^+)] \, \tau_k'$
2. $x'(t) = e^{\int_{\tau_k}^{t} \frac{\partial f_k(u)}{\partial x} du} \left[ \int_{\tau_k}^{t} \frac{\partial f_k(v)}{\partial \theta} e^{-\int_{\tau_k}^{v} \frac{\partial f_k(u)}{\partial x} du} dv + x'(\tau_k^+) \right]$
3. $\tau_k' = 0$ or $\tau_k' = -\left[ \frac{\partial g_k}{\partial x} f_k(\tau_k^-) \right]^{-1} \left( \frac{\partial g_k}{\partial \theta} + \frac{\partial g_k}{\partial x} x'(\tau_k^-) \right)$