Stability of Feedback Equilibrium Solutions for Noncooperative Differential Games
Alberto Bressan
Department of Mathematics, Penn State University
Alberto Bressan (Penn State) Noncooperative Games 1 / 33
Differential games: the PDE approach
Seek: Nash equilibrium solutions in feedback form $u_i = u_i^*(t,x)$.

Given the strategy $u_2 = u_2^*(t,x)$ adopted by the second player, for every initial data $(\tau, y)$, the assignment $u_1 = u_1^*(t,x)$ is a feedback solution to
$$\max_{u_1(\cdot)} \left\{ \psi_1(x(T)) - \int_\tau^T L_1\big(x, u_1, u_2^*(t,x)\big)\, dt \right\}$$
subject to
$$\dot x = f\big(x, u_1, u_2^*(t,x)\big), \qquad x(\tau) = y.$$
Similarly, $u_2 = u_2^*(t,x)$ should provide a solution to the optimal control problem for the second player, given that $u_1 = u_1^*(t,x)$.
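$V_i(\tau, y)$ = value function for the $i$-th player (= expected payoff, if the game starts at $(\tau, y)$)

Assume:
$$f(x,u_1,u_2) = f_1(x,u_1) + f_2(x,u_2), \qquad L_i(x,u_1,u_2) = L_{i1}(x,u_1) + L_{i2}(x,u_2)$$
Optimal feedback controls:
$$u_i^* = u_i^*(t,x,\nabla V_i) = \operatorname*{argmax}_{\omega} \big\{ \nabla V_i \cdot f_i(x,\omega) - L_{ii}(x,\omega) \big\}$$
The value functions satisfy the system of PDEs
$$\partial_t V_i + \nabla V_i \cdot f(x,u_1^*,u_2^*) = L_i(x,u_1^*,u_2^*), \qquad i = 1,2$$
with terminal condition: $V_i(T,x) = \psi_i(x)$.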
Computing solutions in terms of the Fourier transform, which is an isometry on $L^2$, the above definition is motivated by

Theorem 1. The system (1) is hyperbolic if and only if the corresponding Cauchy problem is well posed in $L^2(\mathbb{R}^n)$.
Lemma 1 (necessary condition). If the system (1) is hyperbolic, then for every $\xi \in \mathbb{R}^m$ the matrix $A(\xi)$ has a basis of eigenvectors $r_1, \ldots, r_n$, with real eigenvalues $\lambda_1, \ldots, \lambda_n$ (not necessarily distinct).

Lemma 2 (sufficient condition). Assume that, for $|\xi| = 1$, the matrices $A(\xi)$ can be diagonalized in terms of a real, invertible matrix $R(\xi)$ continuously depending on $\xi$. Then the system (1) is hyperbolic.
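dynamics: $\dot x = f_1(x,u_1) + f_2(x,u_2)$
payoffs: $J_i = \psi_i(x(T)) - \int_\tau^T L_i(x,u_1,u_2)\, dt$, $\quad i = 1,2$

Value functions satisfy
$$\partial_t V_1 + \nabla V_1 \cdot f(x,u_1^\sharp,u_2^\sharp) = L_1(x,u_1^\sharp,u_2^\sharp)$$
$$\partial_t V_2 + \nabla V_2 \cdot f(x,u_1^\sharp,u_2^\sharp) = L_2(x,u_1^\sharp,u_2^\sharp)$$
$$u_i^\sharp = u_i^\sharp(x,\nabla V_i) = \operatorname*{argmax}_{\omega} \big\{ \nabla V_i \cdot f_i(x,\omega) - L_{ii}(x,\omega) \big\}, \qquad i = 1,2$$
$$V_1(T,x) = \psi_1(x), \qquad V_2(T,x) = \psi_2(x)$$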
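$f = (f_1, \ldots, f_n)$, $\quad \nabla V_i = p_i = (p_{i1}, \ldots, p_{in})$

Evolution of a perturbation:
$$Z_{i,t} + \sum_{k=1}^n f_k\, Z_{i,x_k} + \sum_{k=1}^n \sum_{j=1}^2 \left( \nabla V_i \cdot \frac{\partial f}{\partial u_j} - \frac{\partial L_i}{\partial u_j} \right) \left( \frac{\partial u_j^\sharp}{\partial p_{1k}}\, Z_{1,x_k} + \frac{\partial u_j^\sharp}{\partial p_{2k}}\, Z_{2,x_k} \right) = 0$$
Maximality conditions $\Longrightarrow$
$$\nabla V_1 \cdot \frac{\partial f}{\partial u_1} - \frac{\partial L_1}{\partial u_1} = 0, \qquad \nabla V_2 \cdot \frac{\partial f}{\partial u_2} - \frac{\partial L_2}{\partial u_2} = 0$$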
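$$\begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}_t + \sum_{k=1}^n A_k \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}_{x_k} = 0$$
$$A_k = \begin{pmatrix} f_k & \left( \nabla V_1 \cdot \dfrac{\partial f}{\partial u_2} - \dfrac{\partial L_1}{\partial u_2} \right) \dfrac{\partial u_2^\sharp}{\partial p_{2k}} \\[2mm] \left( \nabla V_2 \cdot \dfrac{\partial f}{\partial u_1} - \dfrac{\partial L_2}{\partial u_1} \right) \dfrac{\partial u_1^\sharp}{\partial p_{1k}} & f_k \end{pmatrix}$$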
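$$A(\xi) = \begin{pmatrix} \sum_{k=1}^n f_k \xi_k & \sum_{k=1}^n \left( \nabla V_1 \cdot \dfrac{\partial f}{\partial u_2} - \dfrac{\partial L_1}{\partial u_2} \right) \dfrac{\partial u_2^\sharp}{\partial p_{2k}}\, \xi_k \\[2mm] \sum_{k=1}^n \left( \nabla V_2 \cdot \dfrac{\partial f}{\partial u_1} - \dfrac{\partial L_2}{\partial u_1} \right) \dfrac{\partial u_1^\sharp}{\partial p_{1k}}\, \xi_k & \sum_{k=1}^n f_k \xi_k \end{pmatrix}$$
HYPERBOLICITY $\Longrightarrow$ $A(\xi)$ has real eigenvalues for every $\xi$

$$v = (v_1, \ldots, v_n), \qquad v_k \doteq \left( \nabla V_1 \cdot \frac{\partial f}{\partial u_2} - \frac{\partial L_1}{\partial u_2} \right) \frac{\partial u_2^\sharp}{\partial p_{2k}}$$
$$w = (w_1, \ldots, w_n), \qquad w_k \doteq \left( \nabla V_2 \cdot \frac{\partial f}{\partial u_1} - \frac{\partial L_2}{\partial u_1} \right) \frac{\partial u_1^\sharp}{\partial p_{1k}}$$
HYPERBOLICITY $\Longrightarrow$ $(v \cdot \xi)(w \cdot \xi) \geq 0$ for all $\xi \in \mathbb{R}^n$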
HYPERBOLICITY $\Longrightarrow$ $(v \cdot \xi)(w \cdot \xi) \geq 0$ for all $\xi \in \mathbb{R}^n$
TRUE if $v, w$ are linearly dependent, with same orientation.
FALSE if $v, w$ are linearly independent.
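The dichotomy above can be checked numerically. The following is a minimal sketch, not taken from the talk: it assumes the frozen-coefficient matrix has the form A(ξ) = [[f·ξ, v·ξ], [w·ξ, f·ξ]], whose eigenvalues are f·ξ ± sqrt((v·ξ)(w·ξ)), and it works in R^2 for simplicity. The function names `eigenvalues` and `bad_direction` are hypothetical.

```python
import cmath

def dot(a, b):
    """Euclidean inner product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))

def eigenvalues(f, v, w, xi):
    """Eigenvalues of A(xi) = [[f.xi, v.xi], [w.xi, f.xi]] (assumed form)."""
    a = dot(f, xi)
    root = cmath.sqrt(dot(v, xi) * dot(w, xi))
    return a + root, a - root

def bad_direction(v, w):
    """For linearly independent v, w in R^2, return xi with v.xi = 1, w.xi = -1.

    Such a xi makes (v.xi)(w.xi) negative, so A(xi) has complex eigenvalues.
    The 2x2 linear system is solved by Cramer's rule.
    """
    det = v[0] * w[1] - v[1] * w[0]
    if det == 0:
        raise ValueError("v and w are linearly dependent")
    return ((w[1] + v[1]) / det, (-w[0] - v[0]) / det)

if __name__ == "__main__":
    f = (1.0, 0.0)
    # Linearly independent v, w: hyperbolicity fails in the direction xi.
    v, w = (1.0, 0.0), (0.0, 1.0)
    xi = bad_direction(v, w)
    print(xi, eigenvalues(f, v, w, xi))
    # Linearly dependent with the same orientation: eigenvalues stay real.
    print(eigenvalues(f, (1.0, 2.0), (2.0, 4.0), (0.3, -0.7)))
```

When v and w are independent one can always prescribe the two inner products separately, which is why a direction with a negative product exists; when w is a positive multiple of v the product is a nonnegative multiple of (v·ξ)^2.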
A.B., W. Shen, Small BV solutions of hyperbolic non-cooperative differential games, SIAM J. Control Optim. 43 (2004), 194–215.
A.B., W. Shen, Semi-cooperative strategies for differential games, Intern. J. Game Theory 32 (2004), 561–593.
A.B., Noncooperative differential games, Milan J. Math. 79 (2011), 357–427.
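$$\gamma V_1 = (f \cdot \nabla V_1) - \tfrac{1}{2}(g_1 \cdot \nabla V_1)^2 - (g_2 \cdot \nabla V_1)(g_2 \cdot \nabla V_2) + \phi_1$$
$$\gamma V_2 = (f \cdot \nabla V_2) - \tfrac{1}{2}(g_2 \cdot \nabla V_2)^2 - (g_1 \cdot \nabla V_1)(g_1 \cdot \nabla V_2) + \phi_2$$
$$u_i^*(x) = -\nabla V_i(x) \cdot g_i(x)$$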
Assume that the dynamics are linear:
$$\dot x = (Ax + b_0) + b_1 u_1 + b_2 u_2, \qquad x(0) = y$$
and the cost functions are quadratic:
$$J_i = \int_0^{+\infty} e^{-\gamma t} \left( a_i \cdot x + x^T P_i x + \frac{u_i^2}{2} \right) dt$$
Then the system of PDEs has a special solution of the form
$$V_i(x) = k_i + \beta_i \cdot x + x^T \Gamma_i x, \qquad i = 1,2 \qquad (*)$$
$$u_i^*(x) = -(\beta_i + 2 x^T \Gamma_i) \cdot b_i$$
To find this solution, it suffices to determine the coefficients $k_i, \beta_i, \Gamma_i$ by solving a system of algebraic equations.
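As an illustration of the algebraic system for the coefficients, here is a minimal sketch for a scalar, symmetric game. Everything specific in it is an assumption made for the example, not something stated on the slides: the state is one-dimensional, both players share the same data (b_1 = b_2 = b, a_1 = a_2 = a, P_1 = P_2 = p > 0), each player minimizes the discounted cost, and the resulting equation for the common value function is γV = (a_0 x + b_0)V' − (3/2) b² (V')² + a x + p x². Matching powers of x for V = k + βx + Γx² gives three scalar equations, solved below in closed form. The function names are hypothetical.

```python
import math

def symmetric_lq_coefficients(a0, b0, b, a, p, gamma):
    """Coefficients (k, beta, Gamma) of V(x) = k + beta*x + Gamma*x**2.

    Assumed symmetric scalar setting (illustrative only):
      x^2 terms: 6 b^2 Gamma^2 + (gamma - 2 a0) Gamma - p = 0
      x^1 terms: beta (gamma - a0 + 6 b^2 Gamma) = 2 b0 Gamma + a
      x^0 terms: gamma k = b0 beta - 1.5 b^2 beta^2
    The positive root Gamma is selected.
    """
    disc = (gamma - 2.0 * a0) ** 2 + 24.0 * b * b * p
    Gamma = (-(gamma - 2.0 * a0) + math.sqrt(disc)) / (12.0 * b * b)
    beta = (2.0 * b0 * Gamma + a) / (gamma - a0 + 6.0 * b * b * Gamma)
    k = (b0 * beta - 1.5 * b * b * beta * beta) / gamma
    return k, beta, Gamma

def hj_residual(x, coeffs, a0, b0, b, a, p, gamma):
    """Residual of the assumed equation gamma*V = (a0 x + b0) V' - 1.5 b^2 V'^2 + a x + p x^2."""
    k, beta, Gamma = coeffs
    V = k + beta * x + Gamma * x * x
    xi = beta + 2.0 * Gamma * x  # xi = V'(x); the feedback is u*(x) = -b * xi
    rhs = (a0 * x + b0) * xi - 1.5 * b * b * xi * xi + a * x + p * x * x
    return gamma * V - rhs

if __name__ == "__main__":
    params = dict(a0=0.3, b0=0.2, b=1.0, a=0.5, p=1.0, gamma=1.0)
    coeffs = symmetric_lq_coefficients(**params)
    print(coeffs)
    print([hj_residual(x, coeffs, **params) for x in (-2.0, 0.0, 1.5)])
```

The residuals should vanish up to rounding, which confirms that the quadratic ansatz reduces the differential equation to algebra in this assumed setting. The non-symmetric or vector-valued case leads to a coupled algebraic system that generally has to be solved numerically.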
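One-dimensional, linear-quadratic games in infinite time horizon
$$\dot x = (a_0 x + b_0) + b_1 u_1 + b_2 u_2$$
The ODE for the derivatives of the value functions $\xi_i = V_i'$ takes the form
$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} \xi_1' \\ \xi_2' \end{pmatrix} = \begin{pmatrix} \psi_1(x,\xi_1,\xi_2) \\ \psi_2(x,\xi_1,\xi_2) \end{pmatrix}, \qquad A_{ij} = A_{ij}(x,\xi_1,\xi_2)$$
The map $(x,\xi_1,\xi_2) \mapsto \det A(x,\xi_1,\xi_2)$ is a homogeneous quadratic polynomial
An affine solution exists: $\xi_1^*(x) = k_1 x + \beta_1$, $\quad \xi_2^*(x) = k_2 x + \beta_2$
The map $x \mapsto \det A(x,\xi_1^*(x),\xi_2^*(x))$ is a quadratic polynomial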
(A.B. - Khai Nguyen, 2016)
(positively invariant for the feedback dynamics)
[Figure: the affine solution ξ∗(x) in the (x, ξ) plane, with the set Γ marked; axis labels x, ξ]
Easy case: $\det A(x,\xi_1^*(x),\xi_2^*(x)) \neq 0$ for all $x \in \mathbb{R}$.
$$\begin{pmatrix} \xi_1' \\ \xi_2' \end{pmatrix} = A^{-1}(x,\xi_1,\xi_2) \begin{pmatrix} \psi_1(x,\xi_1,\xi_2) \\ \psi_2(x,\xi_1,\xi_2) \end{pmatrix}$$
The linear-quadratic game has a 2-parameter family of Nash equilibrium solutions in feedback form. One is affine, the others are nonlinear.
All of the above solutions are stable w.r.t. small nonlinear perturbations of the dynamics and the cost functions.
[Figure: solutions near the affine solution ξ∗(x) in (x, ξ1, ξ2) space; axis labels x, ξ1, ξ2]
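$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} \xi_1' \\ \xi_2' \end{pmatrix} = \begin{pmatrix} \psi_1(x,\xi_1,\xi_2) \\ \psi_2(x,\xi_1,\xi_2) \end{pmatrix}$$
Equivalent system:
$$A_{11}\, d\xi_1 + A_{12}\, d\xi_2 - \psi_1\, dx = 0, \qquad A_{21}\, d\xi_1 + A_{22}\, d\xi_2 - \psi_2\, dx = 0$$
Setting:
$$v \doteq \begin{pmatrix} -\psi_1 \\ A_{11} \\ A_{12} \end{pmatrix}, \qquad w \doteq \begin{pmatrix} -\psi_2 \\ A_{21} \\ A_{22} \end{pmatrix},$$
we seek continuously differentiable functions $x \mapsto (\xi_1(x), \xi_2(x))$ whose graph consists of trajectories of the system
$$\begin{pmatrix} \dot x \\ \dot \xi_1 \\ \dot \xi_2 \end{pmatrix} = v \times w = \begin{pmatrix} A_{11}A_{22} - A_{12}A_{21} \\ A_{22}\psi_1 - A_{12}\psi_2 \\ A_{11}\psi_2 - A_{21}\psi_1 \end{pmatrix}.$$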
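The cross-product formula can be verified on sample numbers. The sketch below is only an illustration with made-up values for A and ψ (the talk does not give explicit entries here), and the function names `cross` and `direction_field` are hypothetical. It also checks that, where det A is nonzero, dividing the last two components by the first recovers the solution (ξ1', ξ2') of the original linear system.

```python
def cross(v, w):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def direction_field(A, psi):
    """Return v x w with v = (-psi1, A11, A12), w = (-psi2, A21, A22).

    The result is (det A, A22*psi1 - A12*psi2, A11*psi2 - A21*psi1),
    i.e. the right-hand side of the system for (x, xi1, xi2).
    """
    (A11, A12), (A21, A22) = A
    psi1, psi2 = psi
    v = (-psi1, A11, A12)
    w = (-psi2, A21, A22)
    return cross(v, w)

if __name__ == "__main__":
    A = ((2, 1), (1, 3))   # sample values only
    psi = (4, 5)           # sample values only
    dx, dxi1, dxi2 = direction_field(A, psi)
    print(dx, dxi1, dxi2)
    # Where dx = det A is nonzero, the slopes d(xi_i)/dx solve A xi' = psi.
    s1, s2 = dxi1 / dx, dxi2 / dx
    print(A[0][0] * s1 + A[0][1] * s2, A[1][0] * s1 + A[1][1] * s2)
```

Writing the system in this form keeps the right-hand side polynomial in the entries of A and ψ, so the vector field remains defined on the set where det A vanishes, where the first component is zero.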
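$$\det \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = 0 \quad \text{on} \quad \Sigma_1 \cup \Sigma_2$$
Under generic conditions on the coefficients of the linear-quadratic problem, there exist two curves $\gamma_1 \subset \Sigma_1$, $\gamma_2 \subset \Sigma_2$ such that
$v \times w = 0$ on $\gamma_1$ and on $\gamma_2$
$v \times w$ is vertical on $\Sigma_1 \setminus \gamma_1$ and on $\Sigma_2 \setminus \gamma_2$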
[Figure: the surfaces Σ1, Σ2 with the curves γ1, γ2 and the affine solution ξ∗(x) in (x, ξ1, ξ2) space]
[Figure: configurations of the curves γ1, γ2; panel labels "unique solution" and "two-parameter family of solutions"]
[Figure: the surfaces Σ1, Σ2, the curves γ1, γ2, the sets M1, M2, the points P∗, P2 and the affine solution ξ∗(x) in (x, ξ1, ξ2) space]
(A.B., K.Nguyen, 2016)
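$$\dot x = a_0 x + f_0(x) + (b_1 + h_1(x))\, u_1 + (b_2 + h_2(x))\, u_2$$
$$J_i = \int_0^{+\infty} e^{-\gamma t} \left( \cdots + \frac{u_i^2}{2} \right) dt$$
Under generic assumptions on the coefficients $a_0, b_1, b_2, \ldots$, for any $\varepsilon > 0$ there exists $\delta > 0$ such that the following holds.
If the perturbations satisfy
$$\|f_0\|_{C^2} + \|\eta_1\|_{C^2} + \|\eta_2\|_{C^2} + \|h_1\|_{C^1} + \|h_2\|_{C^1} \leq \delta,$$
then the perturbed equations for $\xi_1 = V_1'$, $\xi_2 = V_2'$ admit a solution such that
$$|\xi_1(x) - \xi_1^*(x)| + |\xi_2(x) - \xi_2^*(x)| \leq \varepsilon (1 + |x|) \qquad \text{for all } x \in \mathbb{R}$$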
$V_i = V_i(x_1, x_2)$
$$\gamma V_1 = (f \cdot \nabla V_1) + \tfrac{1}{2}(g_1 \cdot \nabla V_1)^2 + (g_2 \cdot \nabla V_1)(g_2 \cdot \nabla V_2) + \phi_1$$
$$\gamma V_2 = (f \cdot \nabla V_2) + \tfrac{1}{2}(g_2 \cdot \nabla V_2)^2 + (g_1 \cdot \nabla V_1)(g_1 \cdot \nabla V_2) + \phi_2$$
Linearize around the affine solution of an L-Q game
Determine if this linearized PDE is elliptic, hyperbolic, or mixed type
Construct solutions to the perturbed nonlinear PDE
[Figure: regions of the (x1, x2) plane labeled "hyperbolic" and "elliptic"]