
Optimal decentralized control of coupled subsystems with control sharing

Aditya Mahajan, McGill University. IEEE Conference on Decision and Control, 2011.


  1. Optimal decentralized control of coupled subsystems with control sharing. Aditya Mahajan, McGill University. IEEE Conference on Decision and Control, 2011.

  2. Notation. Random variables: $X$, realizations: $x$, state spaces: $\mathcal{X}$. $a^i_t$ means that variable $a$ belongs to subsystem $i$ at time $t$. $a_{1:t} = (a_1, a_2, \dots, a_t)$ and $\mathbf{a} = (a^1, a^2, \dots, a^n)$.

  3. System model.
     - Control-coupled subsystems: $x^i_{t+1} = f^i_t(x^i_t, \mathbf{u}_t, w^i_t)$.
     - Controller with control sharing: $u^i_t = g^i_t(x^i_{1:t}, \mathbf{u}_{1:t-1})$.
     - Objective: $\min_{\text{all policies } \mathbf{g}} \mathbb{E}\big[\sum_{t=1}^{T} c_t(\mathbf{x}_t, \mathbf{u}_t)\big]$.
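The system model can be illustrated with a small rollout. This is a minimal sketch, not the author's code: the names `simulate`, `f_example`, and `g_example`, the binary state and noise, and the particular dynamics are all assumptions chosen for illustration. The only feature taken from the slide is the information pattern: each controller sees its own state history and the shared history of all past controls.

```python
import random

def simulate(f, g, x1, T, seed=0):
    """Roll out n control-coupled subsystems under control sharing.

    f(i, x_i, u, w_i): next local state of subsystem i; u is the tuple of
        ALL control actions at the current time (the only coupling).
    g(i, x_hist_i, u_hist): control of subsystem i from its own state
        history x^i_{1:t} and the shared control history u_{1:t-1}.
    """
    rng = random.Random(seed)
    n = len(x1)
    x_hist = [[x] for x in x1]   # x_hist[i] holds x^i_{1:t}
    u_hist = []                  # u_{1:t-1}, identical at every controller
    for _ in range(T):
        u = tuple(g(i, x_hist[i], u_hist) for i in range(n))
        w = [rng.randint(0, 1) for _ in range(n)]  # hypothetical binary noise
        for i in range(n):
            x_hist[i].append(f(i, x_hist[i][-1], u, w[i]))
        u_hist.append(u)
    return x_hist, u_hist

# Hypothetical binary example: each state is pushed by the sum of all controls.
def f_example(i, x, u, w):
    return (x + sum(u) + w) % 2

def g_example(i, x_hist, u_hist):
    return x_hist[-1]  # uses only the current local state

x_hist, u_hist = simulate(f_example, g_example, x1=(0, 1), T=4)
```

Note that `g` never receives another subsystem's state; any information about it arrives only through `u_hist`.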

  4. Some applications.
     - Feedback communication systems (physical layer): point-to-point real-time source coding, multi-terminal source coding with feedback, some classes of multiple access channels with feedback.
     - Queueing networks (media access layer): multi-access broadcast, some classes of decentralized scheduling and routing.
     - Cellular networks: paging and registration in cellular networks.

  5. Conceptual difficulties.
     - The system has a non-classical information structure.
     - Data at each controller is increasing with time: $u^i_t = g^i_t(x^i_{1:t}, \mathbf{u}_{1:t-1})$. Is part of this data redundant? Can part of this data be compressed to a sufficient statistic?
     - Multi-stage decision making: How does the current control action affect future estimation? What information does controller $i$ communicate to controller $j$ via its control action?

  6. Literature overview.
     - Control sharing info-structure (Bismut, 1972; Sandell and Athans, 1974): Considered the LQG version of the problem. Exploit the fact that the action space is continuous and compact to embed the observations in control. Reduces to one-step delayed sharing pattern.
     - Other non-classical info-structures with sharing:
       - Delayed (observation) sharing: Witsenhausen, 1971; Varaiya and Walrand, 1979; Nayyar, Mahajan, and Teneketzis, 2011.
       - Delayed state sharing: Aicardi, Davoli, and Minciardi, 1987.
       - Periodic sharing: Ooi, Verbout, Ludwig, and Wornell, 1997.
       - Belief sharing: Yüksel, 2009.
       - Partial history sharing: Mahajan, Nayyar, and Teneketzis, 2008.

  7. Outline of the results.
     - First structural result (based on person-by-person optimality): $x^i_{1:t-1}$ is redundant for optimal performance. Without loss of optimality, $u^i_t = g^i_t(x^i_t, \mathbf{u}_{1:t-1})$.
     - Second structural result (based on the common information approach of MNT 2008): Define $\Pi^i_t(x) = \mathbb{P}(X^i_t = x \mid \mathbf{U}_{1:t-1})$ and $\boldsymbol{\Pi}_t = (\Pi^1_t, \dots, \Pi^n_t)$. $\boldsymbol{\pi}_t$ is a sufficient statistic of $\mathbf{u}_{1:t-1}$ for optimal performance. Without loss of optimality, $u^i_t = g^i_t(x^i_t, \boldsymbol{\pi}_t)$.
     - Dynamic programming decomposition.

  8. Structural result based on person-by-person optimality.
     - Main lemma: The state processes are conditionally independent given the past control actions: $\mathbb{P}(\mathbf{X}_{1:t} = \mathbf{x}_{1:t} \mid \mathbf{U}_{1:t}) = \prod_{i=1}^{n} \mathbb{P}(X^i_{1:t} = x^i_{1:t} \mid \mathbf{U}_{1:t})$.
     - Implications: Fix $\mathbf{g}^{-i}$ and consider optimal design of $g^i$. Let $R^i_t = (X^i_t, \mathbf{U}_{1:t-1})$. Then $\{R^i_t, t = 1, \dots\}$ is a controlled MDP with control action $u^i_t$, that is, $\mathbb{P}(r^i_{t+1} \mid r^i_{1:t}, u^i_{1:t}) = \mathbb{P}(r^i_{t+1} \mid r^i_t, u^i_t)$ and $\mathbb{E}[c_t(\mathbf{x}_t, \mathbf{u}_t) \mid r^i_{1:t}, u^i_{1:t}] = \mathbb{E}[c_t(\mathbf{x}_t, \mathbf{u}_t) \mid r^i_t, u^i_t]$.
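The conditional independence in the main lemma can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions, not part of the talk: two binary subsystems, independent initial states and noises, a hypothetical dynamics `f` coupled only through the controls, and an arbitrary hypothetical control law `g` that uses the full local state history. It enumerates all trajectories and measures how far $\mathbb{P}(x^1_{1:T}, x^2_{1:T} \mid \mathbf{u}_{1:T})$ is from the product of its marginals.

```python
from collections import defaultdict
from itertools import product

T = 3
P_W = (0.7, 0.3)                  # P(w=0), P(w=1), same for both subsystems
P_X1 = ((0.6, 0.4), (0.2, 0.8))   # independent priors on x^1_1 and x^2_1

def f(i, x, u, w):
    return (x + u[0] * u[1] + w) % 2   # coupling only through the controls

def g(i, x_hist, u_hist):
    # Arbitrary law using the full local history and all past controls.
    return (sum(x_hist) + sum(u[1 - i] for u in u_hist)) % 2

def joint_distribution():
    """P(x^1_{1:T}, x^2_{1:T}, u_{1:T}) by enumerating initial states and noises."""
    dist = defaultdict(float)
    for x1 in product(range(2), repeat=2):
        for ws in product(product(range(2), repeat=2), repeat=T - 1):
            p = P_X1[0][x1[0]] * P_X1[1][x1[1]]
            xs, us = [[x1[0]], [x1[1]]], []
            for t in range(T):
                u = tuple(g(i, xs[i], us) for i in range(2))
                us.append(u)
                if t < T - 1:
                    p *= P_W[ws[t][0]] * P_W[ws[t][1]]
                    for i in range(2):
                        xs[i].append(f(i, xs[i][-1], u, ws[t][i]))
            dist[(tuple(xs[0]), tuple(xs[1]), tuple(us))] += p
    return dist

def max_violation(dist):
    """Largest |P(a,b,u) P(u) - P(a,u) P(b,u)| over all a, b, u."""
    p_u, p_au, p_bu = defaultdict(float), defaultdict(float), defaultdict(float)
    for (a, b, u), p in dist.items():
        p_u[u] += p
        p_au[(a, u)] += p
        p_bu[(b, u)] += p
    worst = 0.0
    for (a, u), pa in p_au.items():
        for (b, u2), pb in p_bu.items():
            if u2 == u:
                gap = abs(dist.get((a, b, u), 0.0) * p_u[u] - pa * pb)
                worst = max(worst, gap)
    return worst

dist = joint_distribution()
```

On this instance `max_violation(dist)` should be zero up to floating-point error. The check relies on the independence of the initial states and noises assumed above; it is a sanity test on one example, not a proof.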

  9. Structural result (cont.).
     - Original model: $u^i_t = g^i_t(x^i_{1:t}, \mathbf{u}_{1:t-1})$.
     - Implication of the person-by-person optimality argument: $u^i_t = g^i_t(r^i_t) = g^i_t(x^i_t, \mathbf{u}_{1:t-1})$.
     - Design difficulty: Data at the controller is still increasing with time.

  10. A coordinator based on common information. General idea proposed in (Mahajan, Nayyar, and Teneketzis 2008). [Diagram: two controllers, where controller $i$ has inputs $(X^i_t, \mathbf{U}_{1:t-1})$, $i = 1, 2$.]

  11. A coordinator based on common information (cont.). [Diagram: a coordinator $h_t$ with input $\mathbf{U}_{1:t-1}$ and output $(d^1_t, d^2_t)$, where $d^i_t(\cdot) = g^i_t(\cdot, \mathbf{u}_{1:t-1})$; each $d^i_t$ is applied to $X^i_t$.]

  12. A coordinator based on common information (cont.). Solution approach:
     - The coordinated system is a POMDP.
     - Identify the structure of optimal coordination strategies for the coordinated system.
     - Show that the coordinated system is equivalent to the original model.
     - Translate the structure of optimal coordination strategies to the original model.

  13. The coordinated system.
     - State: $\mathbf{x}_t = (x^1_t, \dots, x^n_t)$.
     - Observations: $\mathbf{u}_{t-1} = (u^1_{t-1}, \dots, u^n_{t-1})$.
     - Control actions: $\mathbf{d}_t = (d^1_t, \dots, d^n_t)$.
     - Coordination rule: $h_t : \big(\prod_{i=1}^{n} \mathcal{U}^i\big)^{t-1} \to \prod_{i=1}^{n} (\mathcal{X}^i \to \mathcal{U}^i)$, with $\mathbf{d}_t = h_t(\mathbf{u}_{1:t-1})$.
     - Structure of optimal coordination strategy: Define $\Xi_t = \mathbb{P}(\text{state} \mid \text{history of observations}) = \mathbb{P}(\mathbf{x}_t \mid \mathbf{U}_{1:t-1})$. Then, without loss of optimality, $\mathbf{d}_t = h_t(\xi_t)$.

  14. The coordinated system (cont.).
     - Dynamic programming decomposition: $V_t(\xi) = \min_{\mathbf{d}} \mathbb{E}\big[c_t(\mathbf{X}_t, \mathbf{U}_t) + V_{t+1}(\Xi_{t+1}) \mid \Xi_t = \xi\big]$.
     - Salient features: The optimization at each step is a functional optimization problem. (In our opinion) functional optimization at each step is the only way to circumvent the issue of signaling.
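To make "functional optimization" concrete, the sketch below solves only the last stage of the dynamic program, where there is no continuation term, for two subsystems with finite state and action spaces. The minimization is over pairs of functions $d^i : \mathcal{X}^i \to \mathcal{U}^i$ rather than over actions. The function names, the brute-force enumeration, and the example cost are assumptions made for illustration; the slides do not specify a solution method.

```python
from itertools import product

def prescriptions(n_states, n_actions):
    """All maps from a local state space to a local action space, as tuples."""
    return list(product(range(n_actions), repeat=n_states))

def last_stage(xi, cost, n_states=2, n_actions=2):
    """Minimise E[cost(X, (d1(X1), d2(X2)))] over prescription pairs (d1, d2).

    xi maps joint states (x1, x2) to probabilities. Only the final stage is
    handled, so there is no continuation term V_{t+1}.
    """
    best_value, best_d = None, None
    options = prescriptions(n_states, n_actions)
    for d1, d2 in product(options, repeat=2):
        value = sum(p * cost((x1, x2), (d1[x1], d2[x2]))
                    for (x1, x2), p in xi.items())
        if best_value is None or value < best_value:
            best_value, best_d = value, (d1, d2)
    return best_value, best_d

# Hypothetical cost: each controller is penalised for not matching its own state.
def cost(x, u):
    return (u[0] != x[0]) + (u[1] != x[1])

xi = {x: 0.25 for x in product(range(2), repeat=2)}
value, d = last_stage(xi, cost)
```

The search space has $|\mathcal{U}^i|^{|\mathcal{X}^i|}$ candidates per subsystem, which is why enumeration is only viable for very small examples.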

  15. Translation of results back to the original system.
     - Dynamic programming decomposition: Solve the DP for the coordinated system. Choose $g^i_t(x^i_t, \xi_t) = h^i_t(\xi_t)(x^i_t)$.
     - Structural result: without loss of optimality, $u^i_t = d^i_t(x^i_t) = h^i_t(\xi_t)(x^i_t) = g^i_t(x^i_t, \xi_t)$.

  16. Further simplification of structural result.
     - Recall main lemma: The state processes are conditionally independent given the past control actions: $\mathbb{P}(\mathbf{X}_{1:t} = \mathbf{x}_{1:t} \mid \mathbf{U}_{1:t}) = \prod_{i=1}^{n} \mathbb{P}(X^i_{1:t} = x^i_{1:t} \mid \mathbf{U}_{1:t})$.
     - Implication: $\xi_t(\mathbf{x}) = \mathbb{P}(\mathbf{X}_t = \mathbf{x} \mid \mathbf{U}_{1:t-1}) = \prod_{i=1}^{n} \pi^i_t(x^i)$.

  17. Further simplification of structural result (cont.).
     - Simplified structural result: without loss of optimality, $u^i_t = g^i_t(x^i_t, \xi_t) = g^i_t(x^i_t, \boldsymbol{\pi}_t)$.
     - Significant reduction in size: $\xi_t \in \Delta(\mathcal{X}^1 \times \cdots \times \mathcal{X}^n)$ while $\boldsymbol{\pi}_t \in \Delta(\mathcal{X}^1) \times \cdots \times \Delta(\mathcal{X}^n)$.
     - Simplified dynamic programming decomposition: $V_t(\boldsymbol{\pi}) = \min_{\mathbf{d}} \mathbb{E}\big[c_t(\mathbf{X}_t, \mathbf{U}_t) + V_{t+1}(\boldsymbol{\Pi}_{t+1}) \mid \boldsymbol{\Pi}_t = \boldsymbol{\pi}\big]$.
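The size reduction can be quantified for finite state spaces: a joint belief on $\mathcal{X}^1 \times \cdots \times \mathcal{X}^n$ has $\prod_i |\mathcal{X}^i| - 1$ free parameters, while a tuple of marginals has $\sum_i (|\mathcal{X}^i| - 1)$. The sketch below computes both counts and rebuilds the joint belief from the marginals using the product form. The function names and the example sizes are hypothetical.

```python
from itertools import product
from math import prod

def joint_belief_dim(sizes):
    """Free parameters of a distribution on the product state space."""
    return prod(sizes) - 1

def marginal_belief_dim(sizes):
    """Free parameters of one distribution per subsystem."""
    return sum(s - 1 for s in sizes)

def joint_from_marginals(pis):
    """xi(x) = prod_i pi^i(x^i), returned as a dict over joint states."""
    ranges = [range(len(pi)) for pi in pis]
    return {x: prod(pi[k] for pi, k in zip(pis, x)) for x in product(*ranges)}

sizes = [10, 10, 10]   # hypothetical: three subsystems with ten states each
xi = joint_from_marginals([[0.5, 0.5], [0.25, 0.75]])
```

For the sizes above, the joint belief has 999 free parameters and the marginals have 27.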

  18. Recap of structural results.
     - Original: $u^i_t = g^i_t(x^i_{1:t}, \mathbf{u}_{1:t-1})$.
     - Using person-by-person approach: $u^i_t = g^i_t(x^i_t, \mathbf{u}_{1:t-1})$.
     - Using the common information approach of (NMT 2008, 2011): $u^i_t = g^i_t(x^i_t, \xi_t)$, where $\xi_t = \mathbb{P}(\mathbf{X}_t \mid \mathbf{u}_{1:t-1})$.
     - Using specific conditional independence due to the dynamics: $u^i_t = g^i_t(x^i_t, \boldsymbol{\pi}_t)$, where $\pi^i_t = \mathbb{P}(X^i_t \mid \mathbf{u}_{1:t-1})$.
