
A Unified Framework for Delay-Sensitive Communications Fangwen Fu - PowerPoint PPT Presentation



  1. A Unified Framework for Delay-Sensitive Communications
     Fangwen Fu, fwfu@ee.ucla.edu
     Advisor: Prof. Mihaela van der Schaar

  2. Motivation
     [Figure: in-home network connecting a content server, SDTV, PDA, bridge, Internet access point, HDTV, Internet access, RG, and DVD recorder]
     Example applications: in-home streaming, sensor networks, wireless video phone, video conference, VoIP
     • Delay-sensitive multimedia applications are booming over a variety of time-varying networks (e.g. sensor networks, WiMax, wireless LAN, etc.)
     • Existing dynamic distributed network environments cannot provide adequate support for delay-sensitive multimedia applications
     • This problem has been investigated for a decade, but we still do not have efficient solutions for it.

  3. Challenges
     [Figure: transmitter-receiver pairs, with data arrivals at the transmitter and a time-varying channel state]
     • Challenge 1: Unknown time-varying environments
       – Time-varying data arrivals and channel conditions
       – Lack of statistical knowledge of the dynamics
     • Challenge 2: Heterogeneity in the data to transmit (e.g. media data)
       – Different delay deadlines, importance, and dependencies
     • Challenge 3: Coupling in multi-user transmission
       – Mutual impact due to multiple users dynamically sharing the same network resources (e.g. bandwidth, transmission opportunities)

  4. Existing solutions-1
     • Minimize average delay for homogeneous traffic in point-to-point communications
       – Information theory [Shannon and beyond] (Challenge 1)
         • Water-filling algorithms
         • Maximize the throughput without delay constraints
       – Control theory (Challenge 1)
         • Markov decision process (MDP) formulation [Berry 2002, Borkar 2007, Krishnamurthy 2006]
         • Statistical knowledge of the underlying dynamics is required
       – Online learning [Krishnamurthy 2007, Borkar 2008]
         • Slow convergence and large memory requirement
       – Stability-constrained optimization for single-user transmission [Tassiulas 1992, 2006, Neely 2006, Kumar 1995, Stolyar 2003]
         • Queue is stable, but delay performance is suboptimal (for low-delay applications)

  5. Existing solutions-2
     • Maximize quality of delay-sensitive applications with heterogeneous traffic
       – Multimedia communication theory (Challenge 2)
       – Cross-layer optimization [van der Schaar 2001, 2003, 2005, Katsaggelos 2002]
         • Observes and then optimizes (i.e. myopic optimization)
       – Rate distortion optimization (RaDiO) [Chou 2001, Frossard 2006, Girod 2006, Ortega 2009]
         • Explicitly considers importance, delay deadlines and dependencies of packets
         • Linear transmission cost (e.g. not suitable for energy-constrained transmission)
         • No learning ability in unknown environments
       – Both solutions only explore the heterogeneity in the media data, but do not explore the network dynamics (e.g. time-varying channel conditions) and resource constraints.

  6. Existing solutions-3
     • Multi-user transmission by sharing network resources
     • Network optimization theory
       – Network utility maximization [Chiang 2007, Katsaggelos 2008] (Challenge 3)
         • Uses static utility function without considering the network dynamics
         • No delay guarantee
         • No learning ability in unknown environments
       – Stability-constrained optimization for multi-user transmission [Tassiulas 1992, 2006, Neely 2006, 2007, Kumar 1995, Stolyar 2003] (Challenges 1 and 3)
         • Queue is stable, but delay performance is suboptimal (for low-delay applications)
         • Does not consider heterogeneous media data

  7. A unified foresighted optimization framework
     Challenges → Solutions
     • Dynamic systems → Foresighted optimization framework
     • Unknown dynamics → Online learning
     • Learning efficiency, heterogeneity, multi-user coupling → Separation principles
     Foresighted decision (current utility + state-value function): $\max_{y}\, \mathbb{E}_{w}\{\, u(s, y) + V(f(s, y, w)) \,\}$
     State transition from the current time slot to the next time slot: $s' = f(s, y, w)$
     • State $s$: queue length, channel condition, heterogeneity
     • Action: $y$
     • Dynamics: $w$

  8. Key accomplishments
     |                                    | Previous state-of-art methods                     | Improvements                                    |
     | Energy-efficient data transmission* | Stability constrained optimization [Neely 2006]   | Reduce the delay by 70% (at low delay region)   |
     | Wireless video transmission         | Rate-distortion optimization [Chou 2001]          | Improve up to 5dB in video quality              |
     | Multi-user video transmission       | Network utility maximization [Chiang 2007]        | Improve 1~3dB in video quality                  |
     *minimize the average delay

  9. Roadmap
     • Separation principle 1 (improving learning efficiency)
       – Post-decision state-based formulation
       – Structure-aware online learning with adaptive approximation
     • Separation principle 2 (separating the foresighted decision for heterogeneous media data transmission)
       – Context-based state
       – Priority-based scheduling
     • Separation principle 3 (decomposing multi-user coupling)
       – Multi-user Markov decision process formulation
       – Post-decision state value function decomposition


  11. Energy-efficient data transmission
      [Figure: transmitter with backlog $x_t$ and arrivals $a_t$ sending $y_t$ to the receiver over channel $h_t$]
      • Point-to-point time-slotted communication system
      • System variables
        – Backlog (queue length): $x_t$
        – Channel state: $h_t$, a finite-state Markov chain (e.g. Rayleigh fading)
        – Data arrival process: $a_t$, i.i.d.
      • Decision at each time slot
        – Amount of data to transmit (transmission rate): $y_t$, $0 \le y_t \le x_t$
        – Energy consumption: $\rho(h_t, y_t)$, convex in $y_t$, e.g. $\rho(h_t, y_t) = \sigma^2 (2^{y_t} - 1) / h_t$
      • What is the optimal (queueing) delay and energy trade-off?
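The convexity of the energy cost is what makes the delay-energy trade-off non-trivial: sending a large burst in one slot costs disproportionately more than spreading it out. A minimal Python sketch, assuming the example energy function has the form σ²(2^y − 1)/h (the function names `energy` and `is_convex_in_y` and the default `sigma2=1.0` are illustrative, not from the slides):

```python
def energy(h, y, sigma2=1.0):
    """Energy to send y units of data in one slot over a channel with gain h.

    Assumed form: sigma2 * (2**y - 1) / h, following the slide's example.
    """
    return sigma2 * (2.0 ** y - 1.0) / h


def is_convex_in_y(h, y_max, sigma2=1.0):
    """Check discrete convexity: all second differences in y are non-negative."""
    e = [energy(h, y, sigma2) for y in range(y_max + 1)]
    return all(e[y + 1] - 2.0 * e[y] + e[y - 1] >= 0.0 for y in range(1, y_max))
```

For instance, with `h = 1.0`, sending 4 units in one slot costs `energy(1.0, 4)`, which is 15.0, whereas two slots of 2 units cost `2 * energy(1.0, 2)`, which is 6.0.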

  12. Foresighted optimization formulation
      [Figure: average delay versus energy consumption, comparing constant-energy, constant-rate, and optimal trade-off curves]
      • Foresighted optimization (MDP) formulation
        – State: $(x_t, h_t)$
        – Action: $y_t$
        – Policy: $\pi : (x_t, h_t) \to y_t$
        – Utility function: $u(x_t, h_t, y_t) = -(x_t - y_t + \lambda \rho(h_t, y_t))$
      • Objective (optimize the trade-off between delay and energy consumption)
        $\max_{\pi}\, \mathbb{E} \sum_{t=0}^{\infty} \alpha^{t}\, u(x_t, h_t, \pi(x_t, h_t))$, where $\alpha \in [0, 1)$ is the discount factor
        – State value function: $V(x_t, h_t) = \max_{\pi}\, \mathbb{E} \sum_{k=t}^{\infty} \alpha^{k-t}\, u(x_k, h_k, \pi(x_k, h_k))$
      • Bellman's equations
        $V(x, h) = \max_{\pi} \{\, u(x, h, \pi(x, h)) + \alpha\, \mathbb{E}_{a, h' | h} V(x - \pi(x, h) + a, h') \,\}$
        – Policy iteration
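When the arrival and channel statistics are known, Bellman's equation can be solved numerically. The sketch below uses value iteration rather than the policy iteration named on the slide, since it is shorter to write down. Everything numeric here is an assumption for illustration: the buffer cap `X_MAX`, the two-state channel `GAINS` and `P_H`, the arrival distribution `P_A`, and the energy function form.

```python
ALPHA = 0.9                       # discount factor, alpha in [0, 1)
LAM = 0.5                         # trade-off weight lambda
X_MAX = 6                         # assumed buffer cap (keeps the state space finite)
GAINS = [0.5, 2.0]                # assumed gains of a two-state channel
P_H = [[0.8, 0.2], [0.3, 0.7]]    # assumed transition probabilities P(h' | h)
P_A = {0: 0.5, 1: 0.3, 2: 0.2}    # assumed i.i.d. arrival distribution


def rho(h, y):
    """Assumed energy cost (2**y - 1) / gain, convex in y."""
    return (2.0 ** y - 1.0) / GAINS[h]


def u(x, h, y):
    """Utility: negative of (remaining backlog + lambda * energy)."""
    return -((x - y) + LAM * rho(h, y))


def expected_value(V, x_post, h):
    """E over (a, h' | h) of V(x_post + a, h'), with the backlog capped at X_MAX."""
    return sum(pa * ph * V[min(x_post + a, X_MAX)][h2]
               for a, pa in P_A.items()
               for h2, ph in enumerate(P_H[h]))


def bellman_backup(V, x, h):
    """Return (best value, best y) over feasible actions 0 <= y <= x."""
    return max((u(x, h, y) + ALPHA * expected_value(V, x - y, h), y)
               for y in range(x + 1))


def value_iteration(tol=1e-9):
    """Iterate the Bellman backup until the largest change is below tol."""
    V = [[0.0] * len(GAINS) for _ in range(X_MAX + 1)]
    while True:
        V_new = [[bellman_backup(V, x, h)[0] for h in range(len(GAINS))]
                 for x in range(X_MAX + 1)]
        gap = max(abs(V_new[x][h] - V[x][h])
                  for x in range(X_MAX + 1) for h in range(len(GAINS)))
        V = V_new
        if gap < tol:
            return V


def greedy_policy(V):
    """Policy pi(x, h) that is greedy with respect to V."""
    return [[bellman_backup(V, x, h)[1] for h in range(len(GAINS))]
            for x in range(X_MAX + 1)]
```

Calling `greedy_policy(value_iteration())` yields a table of transmission amounts indexed by backlog and channel state. Note that every backup needs `P_A` and `P_H`, which is exactly the statistical knowledge that is unavailable in an unknown environment.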

  13. Challenges in solving Bellman's equations
      Bellman's equation: $V(x, h) = \max_{\pi} \{\, u(x, h, \pi(x, h)) + \alpha\, \mathbb{E}_{a, h' | h} V(x - \pi(x, h) + a, h') \,\}$
      • Lack of statistical knowledge of the underlying dynamics
        – Unknown traffic characteristics
        – Unknown channel (network) dynamics
      • Coupling between the maximization and expectation
      • Curses of dimensionality
        – Large state space
        – Intractable due to large memory and heavy computation requirements

  14. Conventional online learning methods
      • Decision and dynamics
        [Figure: normal state $(x_t, h_t)$ with state-value function $V(x_t, h_t)$ → decision $y_t$ and exogenous dynamics $a_t, h_{t+1}$ → normal state $(x_{t+1}, h_{t+1})$ with state-value function $V(x_{t+1}, h_{t+1})$]
      • Foresighted optimization
        $V(x, h) = \max_{0 \le y \le x} Q(x, h, y)$, where $Q(x, h, y) = u(x, h, y) + \alpha\, \mathbb{E}_{a, h' | h} V(x - y + a, h')$
      • Online learning
        – Learn the Q-function (Q-learning): $Q(x, h, y)$
        – Slow convergence, high space complexity
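To make the space and convergence cost concrete, here is a minimal tabular Q-learning sketch for this queueing model. It is a generic textbook form, not the specific algorithm of the cited works. The channel and arrival parameters, the buffer cap `X_MAX`, the epsilon-greedy exploration, and the per-entry step size `1 / visits` are all assumptions made for illustration.

```python
import random

ALPHA, LAM, X_MAX = 0.9, 0.5, 6          # discount, trade-off weight, assumed buffer cap
GAINS = [0.5, 2.0]                       # assumed two-state channel gains
P_H = [[0.8, 0.2], [0.3, 0.7]]           # assumed P(h' | h), used only to simulate
ARRIVALS, ARRIVAL_P = [0, 1, 2], [0.5, 0.3, 0.2]   # assumed arrivals, simulation only
CHANNELS = list(range(len(GAINS)))


def u(x, h, y):
    """Utility: negative of (remaining backlog + lambda * assumed energy cost)."""
    return -((x - y) + LAM * (2.0 ** y - 1.0) / GAINS[h])


def new_q_table():
    """One entry per (x, h, y) with 0 <= y <= x: the action dimension inflates the table."""
    return {(x, h, y): 0.0
            for x in range(X_MAX + 1) for h in CHANNELS for y in range(x + 1)}


def q_update(Q, x, h, y, x_next, h_next, beta):
    """Q(x,h,y) <- (1 - beta) Q(x,h,y) + beta [u(x,h,y) + alpha max_y' Q(x',h',y')]."""
    best_next = max(Q[(x_next, h_next, y2)] for y2 in range(x_next + 1))
    target = u(x, h, y) + ALPHA * best_next
    Q[(x, h, y)] = (1.0 - beta) * Q[(x, h, y)] + beta * target


def run(steps, epsilon=0.1, seed=0):
    """Simulate the queue and update only the single visited (x, h, y) entry per slot."""
    rng = random.Random(seed)
    Q, visits = new_q_table(), {}
    x, h = 0, 0
    for _ in range(steps):
        actions = list(range(x + 1))
        if rng.random() < epsilon:
            y = rng.choice(actions)                        # explore
        else:
            y = max(actions, key=lambda c: Q[(x, h, c)])   # exploit
        a = rng.choices(ARRIVALS, ARRIVAL_P)[0]
        h_next = rng.choices(CHANNELS, P_H[h])[0]
        x_next = min(x - y + a, X_MAX)
        visits[(x, h, y)] = visits.get((x, h, y), 0) + 1
        q_update(Q, x, h, y, x_next, h_next, 1.0 / visits[(x, h, y)])
        x, h = x_next, h_next
    return Q
```

With these assumed sizes the Q-table holds 56 entries against 14 for a state-value table, and each time slot refines only one of them, which is one way to see where the slow convergence comes from.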

  15. Our approach: separation via post-decision state
      [Figure: normal state $(x_t, h_t)$ → decision $y_t$ → post-decision state $(x_t - y_t, h_t)$ → exogenous dynamics $a_t, h_{t+1}$ → normal state $(x_{t+1}, h_{t+1})$; state-value functions $V(x_t, h_t)$ and $V(x_{t+1}, h_{t+1})$ at the normal states, post-decision state-value function $U(\tilde{x}_t, h_t)$ in between]
      • Expectation over dynamics: $U(x, h) = \mathbb{E}_{a, h' | h} V(x + a, h')$
      • Foresighted decision: $V(x, h) = \max_{y} \{\, u(x, h, y) + \alpha U(x - y, h) \,\}$
      • Post-decision state separates foresighted decision from dynamics.
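The separation can be read off by substituting the post-decision backlog into Bellman's equation. The derivation below is a short sketch; the symbol $\tilde{x} = x - y$ for the post-decision backlog follows the figure labels, and the constraint $0 \le y \le x$ is carried over from the system model.

```latex
\begin{align*}
V(x, h) &= \max_{0 \le y \le x} \Big\{ u(x, h, y)
           + \alpha\, \mathbb{E}_{a, h' \mid h}\big[ V(x - y + a, h') \big] \Big\} \\
        &= \max_{0 \le y \le x} \Big\{ u(x, h, y) + \alpha\, U(\tilde{x}, h) \Big\},
           \qquad \tilde{x} = x - y, \\
U(\tilde{x}, h) &= \mathbb{E}_{a, h' \mid h}\big[ V(\tilde{x} + a, h') \big].
\end{align*}
```

The maximization in the second line contains no expectation, and the expectation in the third line contains no decision, so the unknown statistics of $a$ and $h'$ only enter through $U$, which is indexed by state alone rather than by state-action pairs.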

  16. Post-decision state-based online learning
      $U(x, h) = \mathbb{E}_{a, h' | h} V(x + a, h')$
      $V(x, h) = \max_{y} \{\, u(x, h, y) + \alpha U(x - y, h) \,\}$
      • Online learning
        – Online update (time-average): $U_t(x, h_{t-1}) = (1 - \beta_t) U_{t-1}(x, h_{t-1}) + \beta_t V_t(x, h_t)$, e.g. $\beta_t = 1/t$
        – Foresighted decision: $V_t(x, h_t) = \max_{y \in \mathcal{Y}} \{\, u(x, h_t, y) + \alpha U_{t-1}(x - y, h_t) \,\}$
      • Theorem: Online adaptation converges to the optimal solution when $t \to \infty$
      • Expectation is independent of backlog → batch update (fast convergence).
      • Batch update incurs high complexity.
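A minimal Python sketch of post-decision state learning with the batch update follows. It is one reading of the slide, not the author's implementation. In particular, the update target adds the observed arrival to the post-decision backlog, following the definition of $U$ as an expectation of $V(x + a, h')$, and it uses a per-channel-state visit count for the step size; both choices are assumptions, as are the buffer cap `X_MAX` and all channel and arrival parameters (which are used only to simulate the environment).

```python
import random

ALPHA, LAM, X_MAX = 0.9, 0.5, 6          # discount, trade-off weight, assumed buffer cap
GAINS = [0.5, 2.0]                       # assumed two-state channel gains
P_H = [[0.8, 0.2], [0.3, 0.7]]           # assumed P(h' | h), simulation only
ARRIVALS, ARRIVAL_P = [0, 1, 2], [0.5, 0.3, 0.2]   # assumed arrivals, simulation only
CHANNELS = list(range(len(GAINS)))


def u(x, h, y):
    """Utility: negative of (remaining backlog + lambda * assumed energy cost)."""
    return -((x - y) + LAM * (2.0 ** y - 1.0) / GAINS[h])


def foresighted_decision(U, x, h):
    """Return (V_t(x, h), best y): a deterministic max with no expectation inside."""
    return max((u(x, h, y) + ALPHA * U[x - y][h], y) for y in range(x + 1))


def batch_update(U, counts, h_prev, a, h_now):
    """Time-average update of U(., h_prev) for every backlog from one observed (a, h_now)."""
    counts[h_prev] += 1
    beta = 1.0 / counts[h_prev]
    # Evaluate all targets with the old table before writing anything back.
    targets = [foresighted_decision(U, min(x + a, X_MAX), h_now)[0]
               for x in range(X_MAX + 1)]
    for x in range(X_MAX + 1):
        U[x][h_prev] = (1.0 - beta) * U[x][h_prev] + beta * targets[x]


def run(steps, seed=0):
    """Simulate the queue; the learner never reads P_H or ARRIVAL_P directly."""
    rng = random.Random(seed)
    U = [[0.0] * len(GAINS) for _ in range(X_MAX + 1)]
    counts = [0] * len(GAINS)
    x, h = 0, 0
    for _ in range(steps):
        _, y = foresighted_decision(U, x, h)
        a = rng.choices(ARRIVALS, ARRIVAL_P)[0]
        h_next = rng.choices(CHANNELS, P_H[h])[0]
        batch_update(U, counts, h, a, h_next)
        x, h = min(x - y + a, X_MAX), h_next
    return U
```

Because the observed arrival and channel transition do not depend on the backlog, one observation refreshes a whole column of `U`, at the price of `X_MAX + 1` maximizations per slot. This is the trade-off the last two bullets describe.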
