

1. Stability of Feedback Equilibrium Solutions for Noncooperative Differential Games
Alberto Bressan, Department of Mathematics, Penn State University

2. Differential games: the PDE approach
The search for equilibrium solutions to noncooperative differential games in feedback form leads to a nonlinear system of Hamilton-Jacobi PDEs for the value functions.
Main focus: existence and stability of the solutions to these PDEs.

3. Any system can be locally approximated by a linear one:
\[ \dot x = f(x, u_1, u_2) \;\longrightarrow\; \dot x = Ax + B_1 u_1 + B_2 u_2 \]
Any cost functional can be locally approximated by a quadratic one.
Main issue: assume that an equilibrium solution is found for an approximating game with linear dynamics and quadratic costs. Does the original nonlinear game also have an equilibrium solution, close to the linear-quadratic one?
Two cases: (i) finite time horizon, (ii) infinite time horizon.
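As a rough illustration of this first approximation step, the matrices $A$, $B_1$, $B_2$ can be recovered by differentiating $f$ at a reference point. The sketch below uses a made-up planar dynamics with scalar controls; the function f and the expansion point are placeholders, not data from the talk.

```python
# Sketch: linearize dx/dt = f(x, u1, u2) around (x0, u1_0, u2_0) by forward
# differences, to obtain dx/dt ≈ A x + B1 u1 + B2 u2 near that point.
import numpy as np

def f(x, u1, u2):
    # arbitrary example dynamics: x in R^2, scalar controls u1, u2
    return np.array([x[1] + u1, -np.sin(x[0]) + u2])

x0, u10, u20, eps = np.zeros(2), 0.0, 0.0, 1e-6
f0 = f(x0, u10, u20)

# A = df/dx (columns by forward differences in each state direction)
A = np.column_stack([(f(x0 + eps * np.eye(2)[:, j], u10, u20) - f0) / eps
                     for j in range(2)])
# B1 = df/du1, B2 = df/du2 (single columns, since the controls are scalar)
B1 = ((f(x0, u10 + eps, u20) - f0) / eps).reshape(-1, 1)
B2 = ((f(x0, u10, u20 + eps) - f0) / eps).reshape(-1, 1)

print(A)   # approx [[0, 1], [-1, 0]]
print(B1)  # approx [[1], [0]]
print(B2)  # approx [[0], [1]]
```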

4. An example: approximating a Cauchy problem for a PDE.
\[ u_t = u_{xx}, \qquad u(0,x) = \varphi(x), \qquad \varphi(x) \approx ax^2 + bx + c \]
Approximate the initial data; then
\[ u(t,x) \approx ax^2 + bx + c + 2at \]
Conclude: CORRECT for $t \ge 0$, WRONG for $t < 0$.
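A quick symbolic check of this computation (a minimal sketch; it only verifies that the stated formula solves the heat equation with the quadratic initial datum):

```python
# Verify that u(t, x) = a*x**2 + b*x + c + 2*a*t solves u_t = u_xx
# and matches the initial datum a*x**2 + b*x + c at t = 0.
import sympy as sp

t, x, a, b, c = sp.symbols('t x a b c')
u = a*x**2 + b*x + c + 2*a*t

print(sp.simplify(sp.diff(u, t) - sp.diff(u, x, 2)))  # 0: the PDE holds
print(u.subs(t, 0))                                   # a*x**2 + b*x + c
```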

5. Noncooperative differential games with finite horizon.
$x \in \mathbb{R}^n$: state of the system; $u_1, u_2$: controls implemented by the players.
Dynamics: $\dot x(t) = f(x, u_1, u_2)$, $x(\tau) = y$.
Goal of the $i$-th player:
\[ \text{maximize:}\quad J_i(\tau, y, u_1, u_2) \doteq \psi_i\big(x(T)\big) - \int_\tau^T L_i\big(x(t), u_1(t), u_2(t)\big)\,dt \]
= [terminal payoff] - [integral of a running cost]

6. Seek: Nash equilibrium solutions in feedback form $u_i = u_i^*(t,x)$.
• Given the strategy $u_2 = u_2^*(t,x)$ adopted by the second player, for every initial data $(\tau, y)$ the assignment $u_1 = u_1^*(t,x)$ is a feedback solution to the optimal control problem for the first player:
\[ \max_{u_1(\cdot)} \; \psi_1\big(x(T)\big) - \int_\tau^T L_1\big(x, u_1, u_2^*(t,x)\big)\,dt \]
subject to $\dot x = f(x, u_1, u_2^*(t,x))$, $x(\tau) = y$.
• Similarly, $u_2 = u_2^*(t,x)$ should provide a solution to the optimal control problem for the second player, given that $u_1 = u_1^*(t,x)$.

7. The system of PDEs for the value functions.
$V_i(\tau, y)$ = value function for the $i$-th player (= expected payoff, if the game starts at $(\tau, y)$).
Assume:
\[ f(x, u_1, u_2) = f_1(x, u_1) + f_2(x, u_2), \qquad L_i(x, u_1, u_2) = L_{i1}(x, u_1) + L_{i2}(x, u_2) \]
Optimal feedback controls:
\[ u_i^* = u_i^*(t, x, \nabla V_i) = \operatorname*{argmax}_{\omega} \Big\{ \nabla V_i(t,x) \cdot f_i(x, \omega) - L_{ii}(x, \omega) \Big\} \]
The value functions satisfy a system of PDEs
\[ \partial_t V_i + \nabla V_i \cdot f(x, u_1^*, u_2^*) = L_i(x, u_1^*, u_2^*), \qquad i = 1, 2, \]
with terminal conditions $V_i(T, x) = \psi_i(x)$.
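The feedback $u_i^*$ is defined pointwise by a maximization over the control value $\omega$. Below is a minimal numerical sketch of that argmax for a scalar control; the model functions f_i and L_ii are placeholders chosen only for illustration.

```python
# Pointwise computation of u_i^* = argmax_w { grad_Vi . f_i(x, w) - L_ii(x, w) }
import numpy as np
from scipy.optimize import minimize_scalar

def f_i(x, w):
    # placeholder control-affine contribution of player i to the dynamics
    return np.array([w, -x[1] + w])

def L_ii(x, w):
    # placeholder running cost of player i in its own control
    return 0.5 * w**2

def feedback_control(x, grad_Vi):
    obj = lambda w: -(np.dot(grad_Vi, f_i(x, w)) - L_ii(x, w))  # maximize = minimize the negative
    return minimize_scalar(obj).x

print(feedback_control(np.array([1.0, 0.5]), np.array([0.3, -0.2])))  # ≈ 0.1 here
```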

8. Systems of Hamilton-Jacobi PDEs. Finite horizon game:
\[ \partial_t V_1 = H^{(1)}(x, \nabla V_1, \nabla V_2), \qquad V_1(T,x) = \psi_1(x) \]
\[ \partial_t V_2 = H^{(2)}(x, \nabla V_1, \nabla V_2), \qquad V_2(T,x) = \psi_2(x) \]
(a backward Cauchy problem, with terminal conditions)

9. Test well-posedness by locally linearizing the equations
\[ \partial_t V_i = H^{(i)}(x, \nabla V_1, \nabla V_2), \qquad i = 1, 2 \]
Perturbed solution: $V_i^{(\varepsilon)} = V_i + \varepsilon Z_i + o(\varepsilon)$.
Differentiating $H^{(i)}(x, p_1, p_2)$, one obtains a linear equation satisfied by $Z_i$:
\[ \partial_t Z_i = \frac{\partial H^{(i)}}{\partial p_1}(x, \nabla V_1, \nabla V_2) \cdot \nabla Z_1 + \frac{\partial H^{(i)}}{\partial p_2}(x, \nabla V_1, \nabla V_2) \cdot \nabla Z_2 \]
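To make the linearization concrete, here is a small symbolic computation of the coefficients $\partial H^{(i)}/\partial p_j$ for a toy Hamiltonian in one space dimension; the Hamiltonian is invented for illustration and is not the one produced by the game.

```python
# Coefficients of the linearized equation for a toy H^(1)(x, p1, p2) with n = 1
import sympy as sp

x, p1, p2 = sp.symbols('x p1 p2')
H1 = p1*p2 - x*p1**2/2   # hypothetical Hamiltonian of player 1

# In  d/dt Z_1 = (dH1/dp1) * dZ_1/dx + (dH1/dp2) * dZ_2/dx  the coefficients are:
print(sp.diff(H1, p1))   # p2 - p1*x
print(sp.diff(H1, p2))   # p1
```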

10. Freezing the coefficients at a point $(\bar x, \nabla V_1(\bar x), \nabla V_2(\bar x))$, one obtains a linear system with constant coefficients
\[ \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}_t + \sum_{j=1}^n A_j \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}_{x_j} = 0 \qquad (1) \]
Each $A_j$ is a $2 \times 2$ matrix. For a given vector $\xi = (\xi_1, \ldots, \xi_n) \in \mathbb{R}^n$, consider the matrix
\[ A(\xi) \doteq \sum_j \xi_j A_j \]
Definition 1. The system (1) is hyperbolic if there exists a constant $C$ such that
\[ \sup_{\xi \in \mathbb{R}^n} \big\| \exp\big(iA(\xi)\big) \big\| \le C \]
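Definition 1 can be tested numerically for a given family of matrices. A minimal sketch with two invented symmetric $2 \times 2$ matrices $A_1$, $A_2$ (a hyperbolic example, so the norms stay bounded); nothing here comes from the game itself.

```python
# Monitor sup over xi of || exp(i A(xi)) || for A(xi) = xi_1*A_1 + xi_2*A_2 (n = 2)
import numpy as np
from scipy.linalg import expm

A1 = np.array([[0.0, 1.0], [1.0, 0.0]])    # invented coefficient matrices
A2 = np.array([[1.0, 0.0], [0.0, -1.0]])   # (symmetric, hence a hyperbolic example)

def A(xi):
    return xi[0] * A1 + xi[1] * A2

sup_norm = 0.0
for r in (1.0, 10.0, 100.0):                        # sample magnitudes of xi
    for theta in np.linspace(0.0, 2 * np.pi, 200):  # and directions
        xi = r * np.array([np.cos(theta), np.sin(theta)])
        sup_norm = max(sup_norm, np.linalg.norm(expm(1j * A(xi)), 2))

print(sup_norm)   # stays ≈ 1: exp(iA(xi)) is unitary when A(xi) is symmetric
```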

11. Computing solutions in terms of the Fourier transform, which is an isometry on $L^2$, the above definition is motivated by
Theorem 1. The system (1) is hyperbolic if and only if the corresponding Cauchy problem is well posed in $L^2(\mathbb{R}^n)$:
\[ \|Z(t)\|_{L^2} = \|\hat Z(t)\|_{L^2} \le \sup_{\xi \in \mathbb{R}^n} \big\| \exp\big(-itA(\xi)\big) \big\| \cdot \|\hat Z(0)\|_{L^2} \le \sup_{\xi \in \mathbb{R}^n} \big\| \exp\big(iA(\xi)\big) \big\| \cdot \|Z(0)\|_{L^2} \]
Lemma 1 (necessary condition). If the system (1) is hyperbolic, then for every $\xi \in \mathbb{R}^n$ the matrix $A(\xi)$ has a basis of eigenvectors $r_1, \ldots, r_n$, with real eigenvalues $\lambda_1, \ldots, \lambda_n$ (not necessarily distinct).
Lemma 2 (sufficient condition). Assume that, for $|\xi| = 1$, the matrices $A(\xi)$ can be diagonalized in terms of a real, invertible matrix $R(\xi)$ depending continuously on $\xi$. Then the system (1) is hyperbolic.

12. A class of differential games.
Dynamics: $\dot x = f_1(x, u_1) + f_2(x, u_2)$
Payoffs:
\[ J_i = \psi_i\big(x(T)\big) - \int_0^T \Big[ L_{i1}(x, u_1) + L_{i2}(x, u_2) \Big]\,dt, \qquad i = 1, 2 \]
The value functions satisfy
\[ \partial_t V_1 + \nabla V_1 \cdot f(x, u_1^\sharp, u_2^\sharp) = L_1(x, u_1^\sharp, u_2^\sharp), \qquad \partial_t V_2 + \nabla V_2 \cdot f(x, u_1^\sharp, u_2^\sharp) = L_2(x, u_1^\sharp, u_2^\sharp) \]
\[ u_i^\sharp = u_i^\sharp(x, \nabla V_i) = \operatorname*{argmax}_{\omega} \Big\{ \nabla V_i \cdot f_i(x, \omega) - L_{ii}(x, \omega) \Big\}, \qquad i = 1, 2 \]
\[ V_1(T, x) = \psi_1(x), \qquad V_2(T, x) = \psi_2(x) \]

13. Write $f = (f_1, \ldots, f_n)$, $\nabla V_i = p_i = (p_{i1}, \ldots, p_{in})$.
Evolution of a perturbation:
\[ Z_{i,t} + \sum_{k=1}^n f_k\, Z_{i,x_k} + \sum_{k=1}^n \sum_{j=1}^2 \Big( \nabla V_i \cdot \frac{\partial f}{\partial u_j} - \frac{\partial L_i}{\partial u_j} \Big) \frac{\partial u_j^\sharp}{\partial p_{jk}}\, Z_{j,x_k} = 0 \]
Maximality conditions $\Longrightarrow$
\[ \nabla V_1 \cdot \frac{\partial f}{\partial u_1} - \frac{\partial L_1}{\partial u_1} = 0, \qquad \nabla V_2 \cdot \frac{\partial f}{\partial u_2} - \frac{\partial L_2}{\partial u_2} = 0 \]

14. Evolution of a first order perturbation:
\[ \begin{pmatrix} Z_{1,t} \\ Z_{2,t} \end{pmatrix} + \sum_{k=1}^n A_k \begin{pmatrix} Z_{1,x_k} \\ Z_{2,x_k} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
where the $2 \times 2$ matrices $A_k$ are given by
\[ A_k = \begin{pmatrix} f_k & \Big( \nabla V_1 \cdot \dfrac{\partial f}{\partial u_2} - \dfrac{\partial L_1}{\partial u_2} \Big) \dfrac{\partial u_2^\sharp}{\partial p_{2k}} \\[8pt] \Big( \nabla V_2 \cdot \dfrac{\partial f}{\partial u_1} - \dfrac{\partial L_2}{\partial u_1} \Big) \dfrac{\partial u_1^\sharp}{\partial p_{1k}} & f_k \end{pmatrix} \]

15. Summing over $k$:
\[ A(\xi) = \sum_{k=1}^n \xi_k A_k = \begin{pmatrix} \displaystyle\sum_k f_k\, \xi_k & \displaystyle\sum_k \Big( \nabla V_1 \cdot \frac{\partial f}{\partial u_2} - \frac{\partial L_1}{\partial u_2} \Big) \frac{\partial u_2^\sharp}{\partial p_{2k}}\, \xi_k \\[8pt] \displaystyle\sum_k \Big( \nabla V_2 \cdot \frac{\partial f}{\partial u_1} - \frac{\partial L_2}{\partial u_1} \Big) \frac{\partial u_1^\sharp}{\partial p_{1k}}\, \xi_k & \displaystyle\sum_k f_k\, \xi_k \end{pmatrix} \]
HYPERBOLICITY $\Longrightarrow$ $A(\xi)$ has real eigenvalues for every $\xi$.
Set
\[ v = (v_1, \ldots, v_n), \quad v_k \doteq \Big( \nabla V_1 \cdot \frac{\partial f}{\partial u_2} - \frac{\partial L_1}{\partial u_2} \Big) \frac{\partial u_2^\sharp}{\partial p_{2k}}, \qquad w = (w_1, \ldots, w_n), \quad w_k \doteq \Big( \nabla V_2 \cdot \frac{\partial f}{\partial u_1} - \frac{\partial L_2}{\partial u_1} \Big) \frac{\partial u_1^\sharp}{\partial p_{1k}} \]
HYPERBOLICITY $\Longrightarrow$ $(v \cdot \xi)(w \cdot \xi) \ge 0$ for all $\xi \in \mathbb{R}^n$.
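The step from real eigenvalues to the sign condition is a short $2 \times 2$ computation; the following sketch, written in the notation just introduced, fills it in:
\[ A(\xi) = \begin{pmatrix} f \cdot \xi & v \cdot \xi \\ w \cdot \xi & f \cdot \xi \end{pmatrix}, \qquad \det\big(A(\xi) - \lambda I\big) = (\lambda - f \cdot \xi)^2 - (v \cdot \xi)(w \cdot \xi), \qquad \lambda_\pm = f \cdot \xi \pm \sqrt{(v \cdot \xi)(w \cdot \xi)}, \]
so the eigenvalues are real for every $\xi$ exactly when $(v \cdot \xi)(w \cdot \xi) \ge 0$.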

16. HYPERBOLICITY $\Longrightarrow$ $(v \cdot \xi)(w \cdot \xi) \ge 0$ for all $\xi \in \mathbb{R}^n$.
TRUE if $v, w$ are linearly dependent, with the same orientation.
FALSE if $v, w$ are linearly independent: for example, $v = (1,0)$, $w = (0,1)$, $\xi = (1,-1)$ gives $(v \cdot \xi)(w \cdot \xi) = -1 < 0$.
[Figure: sketches of the two cases, $v$ and $w$ aligned versus linearly independent, with a direction $\xi$ making the product negative in the second case.]

17. In one space dimension, the Cauchy problem can be well posed for a large set of data. In several space dimensions, generically the system is not hyperbolic, and the Cauchy problem is ill posed.
A. Bressan, W. Shen, Small BV solutions of hyperbolic non-cooperative differential games, SIAM J. Control Optim. 43 (2004), 194–215.
A. Bressan, W. Shen, Semi-cooperative strategies for differential games, Intern. J. Game Theory 32 (2004), 561–593.
A. Bressan, Noncooperative differential games, Milan J. Math. 79 (2011), 357–427.

18. Differential games in infinite time horizon.
Dynamics: $\dot x = f(x, u_1, u_2)$, $x(0) = x_0$; $u_1, u_2$ controls implemented by the players.
Goal of the $i$-th player:
\[ \text{maximize:}\quad J_i \doteq \int_0^{+\infty} e^{-\gamma t}\, \Psi_i\big(x(t), u_1(t), u_2(t)\big)\,dt \]
(running payoff, exponentially discounted in time)

19. A special case.
Dynamics: $\dot x = f(x) + g_1(x)\, u_1 + g_2(x)\, u_2$
Player $i$ seeks to minimize:
\[ J_i = \int_0^\infty e^{-\gamma t} \Big( \phi_i(x(t)) + \frac{u_i^2(t)}{2} \Big)\,dt \]
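For this special case, here is a minimal sketch of how the earlier pointwise optimization specializes, assuming scalar controls and writing $V_i$ for the discounted value function of player $i$ (an illustrative reconstruction, not taken from the slides). Minimizing $\nabla V_i \cdot g_i(x)\, u_i + u_i^2/2$ over $u_i$ gives
\[ u_i^\sharp(x) = -\, g_i(x) \cdot \nabla V_i(x), \]
and substituting both feedbacks into the discounted dynamic programming equation $\gamma V_i = \min_{u_i}\{\cdots\}$ yields the stationary system
\[ \gamma V_i = \phi_i(x) + \tfrac{1}{2}\big(g_i(x) \cdot \nabla V_i\big)^2 + \nabla V_i \cdot \Big( f(x) - \big(g_1(x) \cdot \nabla V_1\big)\, g_1(x) - \big(g_2(x) \cdot \nabla V_2\big)\, g_2(x) \Big), \qquad i = 1, 2. \]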
