SLIDE 1

Neurodynamic Optimization: New Models and kWTA Applications

Jun Wang

jwang@mae.cuhk.edu.hk

Department of Mechanical & Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong. http://www.mae.cuhk.edu.hk/~jwang

SLIDE 2

Introduction

Optimization is ubiquitous in nature and society.

Optimization arises in a wide variety of scientific problems.

Optimization is an important tool for design, planning, control, operation, and management of engineering systems.

SLIDE 3

Problem Formulation

Consider a general optimization problem:

OP1: minimize f(x) subject to c(x) ≤ 0, d(x) = 0,

where x ∈ ℜ^n is the vector of decision variables, f(x) is an objective function, and c(x) = [c_1(x), …, c_m(x)]^T and d(x) = [d_1(x), …, d_p(x)]^T are vector-valued constraint functions.

If f(x) and c(x) are convex and d(x) is affine, then OP1 is a convex program (CP); otherwise, it is a nonconvex program.

SLIDE 4

Quadratic and Linear Programs

QP1: minimize (1/2)x^T Q x + q^T x subject to Ax = b, l ≤ Cx ≤ h,

where Q ∈ ℜ^{n×n}, q ∈ ℜ^n, A ∈ ℜ^{m×n}, b ∈ ℜ^m, C ∈ ℜ^{n×n}, l ∈ ℜ^n, h ∈ ℜ^n.

When Q = 0 and C = I, QP1 becomes a linear program with equality and bound constraints:

LP1: minimize q^T x subject to Ax = b, l ≤ x ≤ h.

SLIDE 5

Dynamic Optimization

In many applications (e.g., online pattern recognition and onboard signal processing), real-time solutions to optimization problems are necessary or desirable.

For such applications, classical optimization techniques may not be competent due to the problem dimensionality and the stringent requirement on computational time.

It is computationally challenging when optimization procedures have to be performed in real time to optimize the performance of dynamical systems.

One very promising approach to dynamic optimization is to apply artificial neural networks.

SLIDE 6

Neurodynamic Optimization

Because of the inherently parallel and distributed information processing in neural networks, the convergence rate of the solution process does not decrease as the size of the problem increases.

Neural networks can be implemented physically in dedicated hardware such as ASICs, where optimization is carried out in a truly parallel and distributed manner.

This feature is particularly desirable for dynamic optimization in decentralized decision-making situations.

SLIDE 7

Existing Approaches

In their seminal work, Tank and Hopfield (1985, 1986) applied the Hopfield networks to solving a linear program and the traveling salesman problem.

Kennedy and Chua (1988) developed a neural network for nonlinear programming; it contains finite penalty parameters, so its equilibrium points correspond only to approximate optimal solutions.

The two-phase optimization networks by Maa and Shanblatt (1992).

The Lagrangian networks for quadratic programming by Zhang and Constantinides (1992) and Zhang et al. (1992).

SLIDE 8

Existing Approaches (cont'd)

A recurrent neural network for quadratic optimization with bounded variables only by Bouzerdoum and Pattison (1993).

The deterministic annealing network for linear and convex programming by Wang (1993, 1994).

The primal-dual networks for linear and quadratic programming by Xia (1996, 1997).

The projection networks for solving projection equations, constrained optimization, etc. by Xia and Wang (1998, 2002, 2004) and Liang and Wang (2000).

SLIDE 9

Existing Approaches (cont'd)

The dual networks for quadratic programming by Xia and Wang (2001) and Zhang and Wang (2002).

A two-layer network for convex programming subject to nonlinear inequality constraints by Xia and Wang (2004).

A simplified dual network for quadratic programming by Liu and Wang (2006).

Two one-layer networks with discontinuous activation functions for linear and quadratic programming by Liu and Wang (2008).

SLIDE 10

Primal-Dual Network

The primal-dual network for solving LP2 [a]:

ε dx/dt = −(q^T x − b^T y)q − A^T(Ax − b) + x^+,
ε dy/dt = (q^T x − b^T y)b,

where ε > 0 is a scaling parameter, x ∈ ℜ^n is the primal state vector, y ∈ ℜ^m is the dual (hidden) state vector, x^+ = (x_1^+, …, x_n^+)^T, and x_i^+ = max{0, x_i}.

The network is globally convergent to an optimal solution of LP1.

a. Y. Xia, "A new neural network for solving linear and quadratic programming problems," IEEE Transactions on Neural Networks, vol. 7, no. 6, pp. 1544-1548, 1996.

SLIDE 11

Lagrangian Network for QP

If C = 0 in QP1:

ε d/dt [x; y] = [−Qx(t) − A^T y(t) − q; Ax − b],

where ε > 0, x ∈ ℜ^n, y ∈ ℜ^m. It is globally exponentially convergent to the optimal solution [a].

a. J. Wang, Q. Hu, and D. Jiang, "A Lagrangian network for kinematic control of redundant robot manipulators," IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 1123-1132, 1999.

SLIDE 12

Projection Network

A recurrent neural network called the projection network was developed for optimization with bound constraints only [a]:

ε dx/dt = −x + g(x − ∇f(x)),

where g(·) is a vector-valued piecewise-linear activation function.

a. Y.S. Xia and J. Wang, "On the stability of globally projected dynamic systems," J. of Optimization Theory and Applications, vol. 106, no. 1, pp. 129-150, 2000.

SLIDE 13

Piecewise-Linear Activation Function

g(x_i) = l_i, if x_i < l_i;
g(x_i) = x_i, if l_i ≤ x_i ≤ h_i;
g(x_i) = h_i, if x_i > h_i.
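To make these dynamics concrete, the following minimal sketch integrates the projection network of the previous slide with this piecewise-linear activation on a toy bound-constrained problem. The problem data, step size, and horizon are illustrative choices, not values from the slides.

```python
import numpy as np

# Toy problem: minimize f(x) = ||x - c||^2 / 2 over the box [l, h].
# c lies outside the box, so the minimizer is the projection of c onto it.
l = np.array([0.0, 0.0])
h = np.array([1.0, 1.0])
c = np.array([2.0, -0.5])

def grad_f(x):
    return x - c

def g(v):
    # piecewise-linear activation: clips each component to [l_i, h_i]
    return np.clip(v, l, h)

x = np.array([0.5, 0.5])
eps, dt = 1e-6, 1e-8           # scaling constant and Euler step (illustrative)
for _ in range(2000):
    x += (dt / eps) * (-x + g(x - grad_f(x)))

print(x)                        # approx. [1.0, 0.0], the box projection of c
```

For this f, the equilibrium condition x = g(x − ∇f(x)) reduces to x = g(c), i.e., the projection of c onto the box, which is what the trajectory approaches.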

SLIDE 14

Two-layer Projection Network for QP1

If C = I in QP1, let α = 1 in the two-layer neural network for CP:

ε d/dt [x; y] = [−x + g((I − Q)x + A^T y − q); −Ax + b],

where ε > 0, x ∈ ℜ^n, y ∈ ℜ^m, and g(x) = [g(x_1), …, g(x_n)]^T is the piecewise-linear activation function defined before. It is globally asymptotically convergent to the optimal solution [a].

a. Y.S. Xia, H. Leung, and J. Wang, "A projection neural network and its application to constrained optimization problems," IEEE Trans. Circuits and Systems I, vol. 49, no. 4, pp. 447-458, 2002.

SLIDE 15

General Projection Network for QP1

The dynamic equation of the general projection neural network (GPNN):

ε dz/dt = (M + N)^T {−Nz + g((N − M)z)},

where ε > 0, z = (x^T, y^T)^T is the state vector, and M and N are constant block matrices assembled from Q, A, I, and 0 (see the cited paper for their exact definitions).

The GPNN is globally convergent to an exact solution of the problem [a].

a. Y. Xia and J. Wang, "A general projection neural network for solving optimization and related problems," IEEE Trans. Neural Networks, vol. 15, pp. 318-328, 2004.

SLIDE 16

Dual Network for QP2

For strictly convex QP2, Q is invertible. The dynamic equation of the dual network:

ε dy(t)/dt = −CQ^{-1}C^T y + g(CQ^{-1}C^T y − y − Cq) + Cq + b,
x(t) = Q^{-1}C^T y − q,

where ε > 0. It is also globally exponentially convergent to the optimal solution [a][b].

a. Y. Xia and J. Wang, "A dual neural network for kinematic control of redundant robot manipulators," IEEE Trans. on Systems, Man, and Cybernetics, vol. 31, no. 1, pp. 147-154, 2001.

b. Y. Zhang and J. Wang, "A dual neural network for convex quadratic programming subject to linear equality and inequality constraints," Physics Letters A, pp. 271-278, 2002.

SLIDE 17

Simplified Dual Net for QP1

For strictly convex QP1, Q is invertible. The dynamic equation of the simplified dual network [a]:

ε du/dt = −Cx + g(Cx − u),
x = Q^{-1}(A^T y + C^T u − q),
y = (AQ^{-1}A^T)^{-1}(−AQ^{-1}C^T u + AQ^{-1}q + b),

where u ∈ ℜ^n is the state vector and ε > 0. It is proven to be globally asymptotically convergent to the optimal solution.

a. S. Liu and J. Wang, "A simplified dual neural network for quadratic programming with its kWTA application," IEEE Trans. Neural Networks, vol. 17, no. 6, pp. 1500-1510, 2006.

SLIDE 18

Illustrative Example

minimize 3x_1^2 + 3x_2^2 + 4x_3^2 + 5x_4^2 + 3x_1x_2 + 5x_1x_3 + x_2x_4 − 11x_1 − 5x_4
subject to 3x_1 − 3x_2 − 2x_3 + x_4 = 0,
4x_1 + x_2 − x_3 − 2x_4 = 0,
−x_1 + x_2 ≤ −1,
−2 ≤ 3x_1 + x_3 ≤ 4.

In the notation of QP1:

Q = [6 3 5 0; 3 6 0 1; 5 0 8 0; 0 1 0 10],  q = (−11, 0, 0, −5)^T,
A = [3 −3 −2 1; 4 1 −1 −2],  b = (0, 0)^T,
C = [−1 1 0 0; 3 0 1 0],  l = (−∞, −2)^T,  h = (−1, 4)^T.
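As a sanity check, here is a minimal forward-Euler sketch of the simplified dual network on this data. The step size, horizon, and zero initial state are illustrative choices; a stiff-aware ODE integrator could replace the crude Euler loop.

```python
import numpy as np

Q = np.array([[6., 3, 5, 0], [3, 6, 0, 1], [5, 0, 8, 0], [0, 1, 0, 10]])
q = np.array([-11., 0, 0, -5])
A = np.array([[3., -3, -2, 1], [4, 1, -1, -2]])
b = np.array([0., 0])
C = np.array([[-1., 1, 0, 0], [3, 0, 1, 0]])
l = np.array([-np.inf, -2.])
h = np.array([-1., 4])

Qi = np.linalg.inv(Q)
W = np.linalg.inv(A @ Qi @ A.T)

def g(v):                       # piecewise-linear projection onto [l, h]
    return np.clip(v, l, h)

def x_of(u):                    # output equations of the simplified dual net
    y = W @ (-A @ Qi @ C.T @ u + A @ Qi @ q + b)
    return Qi @ (A.T @ y + C.T @ u - q)

u = np.zeros(2)                 # one state neuron per row of C
eps, dt = 1e-6, 1e-9
for _ in range(200000):
    x = x_of(u)
    u += (dt / eps) * (-C @ x + g(C @ x - u))

print(x_of(u))                  # approaches the unique optimum of the QP
```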

SLIDE 19

Illustrative Example (cont'd)

The simplified dual neural network for solving this quadratic program needs only two neurons. In contrast, the Lagrangian neural network needs twelve neurons, the primal-dual neural network needs nine neurons, and the dual neural network needs four neurons.

SLIDE 20

Illustrative Example (cont'd)

[Figure: Transient behaviors of the state vector u (u_1 and u_2) vs. time.]

SLIDE 21

Illustrative Example (cont'd)

[Figure: Transient behaviors of the output vector x (x_1 through x_4) vs. time.]

SLIDE 22

Illustrative Example (cont'd)

[Figure: Trajectories of x_1 and x_2 from different initial states, converging to x*.]

SLIDE 23

Illustrative Example (cont'd)

[Figure: Trajectories of x_3 and x_4 from different initial states, converging to x*.]

SLIDE 24

Improved Dual Net for Special QP1

For the special convex QP1 with Q = I, the dynamic equation of the improved dual network [a]:

ε dy/dt = −y + (y + Ax − b)^+,
ε dz/dt = −Cx + d,
x = g_Ω(−A^T y + C^T z − p),

where g_Ω(·) and (·)^+ are two activation functions. It is proven to be globally convergent to the optimal solution.

a. X. Hu and J. Wang, "An improved dual neural network for solving a class of quadratic programming problems and its k-winners-take-all application," IEEE Trans. Neural Networks, vol. 19, no. 12, 2008.

SLIDE 25

A One-layer Net for LP

A new recurrent neural network model with a discontinuous activation function was recently developed for linear programming LP1 [a]:

ε dx/dt = −Px − σ(I − P)g(x) + s,

where g(x) = (g_1(x_1), g_2(x_2), …, g_n(x_n))^T is the vector-valued activation function, ε is a positive scaling constant, σ is a nonnegative gain parameter, P = A^T(AA^T)^{-1}A, and s = −(I − P)q + A^T(AA^T)^{-1}b.

a. Q. Liu and J. Wang, "A one-layer recurrent neural network with a discontinuous activation function for linear programming," Neural Computation, vol. 20, no. 5, pp. 1366-1383, 2008.

SLIDE 26

Activation Function

A discontinuous activation function is defined as follows. For i = 1, 2, …, n:

g_i(x_i) = 1, if x_i > h_i;
g_i(x_i) ∈ [0, 1], if x_i = h_i;
g_i(x_i) = 0, if x_i ∈ (l_i, h_i);
g_i(x_i) ∈ [−1, 0], if x_i = l_i;
g_i(x_i) = −1, if x_i < l_i.

SLIDE 27

Activation Function (cont'd)

[Figure: graph of g_i(x_i): equal to −1 below l_i, 0 on (l_i, h_i), and 1 above h_i.]

SLIDE 28

Convergence Results

The neural network is globally convergent to an optimal solution of LP1 with C = I if Ω̄ ⊂ Ω, where Ω̄ is the equilibrium point set and Ω = {x | l ≤ x ≤ h}.

The neural network is globally convergent to an optimal solution of LP1 with C = I if it has a unique equilibrium point and σ ≥ 0 when (I − P)c = 0 (c denoting the cost vector, written q in LP1 above), or one of the following conditions holds when (I − P)c ≠ 0:

(i) σ ≥ ‖(I − P)c‖_p / min⁺_{γ ∈ X} ‖(I − P)γ‖_p for p = 1, 2, ∞; or

(ii) σ ≥ c^T(I − P)c / min⁺_{γ ∈ X} |c^T(I − P)γ|,

where X = {−1, 0, 1}^n and min⁺ takes the minimum over the nonzero values of its argument.

SLIDE 29

Simulation Results

Consider the following LP problem: minimize 4x_1 + x_2 + 2x_3, subject to x_1 − 2x_2 + x_3 = 2, −x_1 + 2x_2 + x_3 = 1, −5 ≤ x_1, x_2, x_3 ≤ 5. According to the above condition, the lower bound of σ is 9.
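A minimal forward-Euler sketch of the one-layer network on this LP, taking σ = 15 as in the next slide (step size and horizon are illustrative). Working the LP by hand gives x_3 = 1.5 from the sum of the equality constraints, x_1 = 2x_2 + 0.5 from their difference, and hence the optimum x* = (−5, −2.75, 1.5) at the bound x_1 = −5.

```python
import numpy as np

A = np.array([[1., -2, 1], [-1, 2, 1]])
b = np.array([2., 1])
q = np.array([4., 1, 2])
lo, hi, sigma = -5.0, 5.0, 15.0

AAT_inv = np.linalg.inv(A @ A.T)
P = A.T @ AAT_inv @ A
s = -(np.eye(3) - P) @ q + A.T @ AAT_inv @ b

def g(x):
    # discontinuous activation: +1 above the upper bound, -1 below the
    # lower bound, 0 strictly inside (set-valued points are taken as 0)
    return (x > hi).astype(float) - (x < lo).astype(float)

x = np.zeros(3)
eps, dt = 1e-6, 1e-9
for _ in range(100000):
    x += (dt / eps) * (-P @ x - sigma * (np.eye(3) - P) @ g(x) + s)

print(x)   # chatters slightly around the hand-derived x* = (-5, -2.75, 1.5)
```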

SLIDE 30

Simulation Results (cont'd)

[Figure: Transient behaviors of the states (x_1, x_2, x_3) with σ = 15.]

SLIDE 31

Simulation Results (cont'd)

[Figure: Transient behaviors of the states (x_1, x_2, x_3) with σ = 9.]

SLIDE 32

Simulation Results (cont'd)

[Figure: Transient behaviors of the states (x_1, x_2, x_3) with σ = 5.]

SLIDE 33

Simulation Results (cont'd)

[Figure: Transient behaviors of the states (x_1, x_2, x_3) with σ = 3.]

SLIDE 34

A New One-layer Net for QP

A new one-layer recurrent neural network was recently developed [a]:

ε dz/dt = −(I − P)z − [(I − P)Q + αP]g(z) + q,
x = ((I − P)Q + αP)^{-1}(−(I − P)z + s),

where ε is a positive scaling constant, α > 0 is a parameter, s = −q + Pq + αA^T(AA^T)^{-1}b, and g(·) is a vector-valued activation function.

a. Q. Liu and J. Wang, "A one-layer recurrent neural network with a discontinuous hard-limiting activation function for quadratic programming," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 558-570, 2008.

SLIDE 35

Activation Function

The following hard-limiting activation function is defined:

g_i(z_i) = h_i, if z_i > 0;
g_i(z_i) ∈ [l_i, h_i], if z_i = 0;
g_i(z_i) = l_i, if z_i < 0.

If l_i ≠ h_i, then g_i is discontinuous: at z_i = 0, g_i(z_i) can take any value between l_i and h_i.

SLIDE 36

Activation Function (cont'd)

[Figure: graph of g_i(z_i), jumping from l_i to h_i at z_i = 0.]

SLIDE 37

Convergence Results

Assume that Q is positive definite. If α ≥ λ_max(Q)/2 or α ≥ trace(Q)/2, then the state vector z(t) of the neural network is globally convergent to an equilibrium point, and the output vector x(t) is globally convergent to an optimal solution of the QP.

Assume that the objective function f(x) is strictly convex on the set S = {x ∈ ℜ^n : Ax = b}. If α > λ_max(Q^2)λ_max(Q^{-1})/4, then the state vector z(t) of the neural network is globally convergent to an equilibrium point, and the output vector x(t) is globally convergent to an optimal solution of the QP.

SLIDE 38

Illustrative Example

Consider the following QP problem:

minimize f(x) = −0.5x_1^2 + x_2^2 + 2x_1x_2 + 6x_1 − 2x_2
subject to 3x_1 − 2x_2 = 1, 0 ≤ x_1, x_2 ≤ 10.

As Q = [−0.5 1; 1 1] is not positive definite, the objective function is not convex everywhere. However, substituting x_1 = 2x_2/3 + 1/3 into the objective function gives f̃(x_2) = 19x_2^2/9 + 22x_2/9 − 35/18, which is convex.
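A quick numeric check of this claim (a sketch; this slide's convention appears to be f(x) = x^T Q x + q^T x, so the Hessian of f is taken as 2Q):

```python
import numpy as np

H = 2 * np.array([[-0.5, 1.0], [1.0, 1.0]])  # Hessian of f (assumed = 2Q)
d = np.array([2.0, 3.0])                     # spans null(A) for A = [3, -2]

print(np.linalg.eigvalsh(H))   # one negative eigenvalue: Q is indefinite
print(d @ H @ d)               # 38 > 0: f is strictly convex along Ax = b
```

The directional value 38 is consistent with f̃''(x_2) = 38/9, since the parameterization x = ((2x_2 + 1)/3, x_2) has tangent dx/dx_2 = (2/3, 1) = d/3.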

SLIDE 39

Illustrative Example (cont'd)

The state variables of the new network.

[Figure: state trajectories z_1 and z_2 vs. time.]

SLIDE 40

Illustrative Example (cont'd)

The output variables of the new network.

[Figure: output trajectories x_1 and x_2 vs. time.]

SLIDE 41

Illustrative Example (cont'd)

Phase plot of the state variables.

[Figure: trajectories in the (z_1, z_2) plane converging to z*.]

SLIDE 42

Illustrative Example (cont'd)

The simulation result of the dual network.

[Figure: sign(x_1)·log10(|x_1|) and sign(x_2)·log10(|x_2|) vs. time.]

SLIDE 43

Illustrative Example (cont'd)

The simulation result of the projection network.

[Figure: trajectories of x_1 and x_2 vs. time.]

SLIDE 44

Model Comparisons for QP1

model | layers | neurons | connections | convergence condition
Lagrangian network | 2 | 3n + m | n^2 + 2mn | f(x) is strictly convex
Primal-dual network | 2 | n + m | 3n^2 + 3mn | f(x) is convex
General projection net | 2 | n + m | n^2 + 2mn | f(x) is convex
Dual network | 1 | n + m | (n + m)^2 | f(x) is strictly convex
Simplified dual network | 1 | n | n^2 | f(x) is strictly convex
New neural network | 1 | n | 2n^2 | f(x) is strictly convex on S

where S = {x ∈ ℜ^n : Ax = b}.

SLIDE 45

k-Winners-Take-All Operation

The k-winners-take-all (kWTA) operation selects the k largest inputs out of n inputs (1 ≤ k ≤ n).

The kWTA operation has important applications in machine learning, such as k-nearest-neighbor classification and k-means clustering.

As the number of inputs increases and/or the selection process must operate in real time, parallel algorithms and hardware implementations are desirable.

SLIDE 46

kWTA Problem Formulations

The kWTA function can be defined as:

x_i = f(u_i) = 1 if u_i ∈ {k largest elements of u}, and 0 otherwise,

where u ∈ ℜ^n and x ∈ ℜ^n are the input vector and output vector, respectively.

The kWTA solution can be determined by solving the following linear integer program:

minimize −Σ_{i=1}^n u_i x_i
subject to Σ_{i=1}^n x_i = k, x_i ∈ {0, 1}, i = 1, 2, …, n.
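For reference, the operation itself is a one-liner on a conventional computer. The sketch below (illustrative naming) is handy for checking the network models that follow, though it is an O(n log n) centralized computation rather than a parallel dynamical system.

```python
import numpy as np

def kwta_by_sorting(u, k):
    # reference kWTA: mark the k largest inputs with 1, the rest with 0
    x = np.zeros(len(u))
    x[np.argsort(u)[-k:]] = 1.0
    return x

print(kwta_by_sorting(np.array([3., 1., 4., 1., 5.]), k=2))  # [0. 0. 1. 0. 1.]
```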

SLIDE 47

kWTA Problem Formulations

If the kth and (k + 1)th largest elements of u are different (denoted ū_k and ū_{k+1}, respectively), the kWTA problem is equivalent to the following LP or QP problem:

minimize −u^T x  or  (a/2)x^T x − u^T x
subject to Σ_{i=1}^n x_i = k, 0 ≤ x_i ≤ 1, i = 1, 2, …, n,

where a ≤ ū_k − ū_{k+1} is a positive constant.
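Because the relaxation is exact whenever ū_k > ū_{k+1}, a generic LP solver already recovers the 0-1 kWTA solution. A quick check of the LP formulation against sorting (illustrative random data; SciPy assumed available):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, k = 8, 3
u = rng.normal(size=n)          # distinct values almost surely

# minimize -u^T x  subject to  e^T x = k,  0 <= x_i <= 1
res = linprog(-u, A_eq=np.ones((1, n)), b_eq=[k], bounds=[(0, 1)] * n)

winners = np.zeros(n)
winners[np.argsort(u)[-k:]] = 1.0
print(np.allclose(np.round(res.x), winners))   # True
```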

SLIDE 48

QP-based Primal-Dual Network

The primal-dual network based on the QP formulation needs 3n + 1 neurons and 6n + 2 connections, and its dynamic equations can be written as:

ε dx/dt = −(1 + a)[x − (x + ve + w − ax + u)^+] − (e^T x − k)e − x − y + e,
ε dy/dt = −y + (y + w)^+ − x − y + e,
ε dv/dt = −e^T[x − (x + ve + w − ax + u)^+] + e^T x − k,
ε dw/dt = −x + (x + ve + w − ax + u)^+ − y + (y + w)^+ + x + y − e,

where x, y, w ∈ ℜ^n, v ∈ ℜ, e = (1, 1, …, 1)^T ∈ ℜ^n, ε > 0, x^+ = (x_1^+, …, x_n^+)^T, and x_i^+ = max{0, x_i}.

SLIDE 49

QP-based Projection Network

The projection neural network for kWTA operation based on the QP formulation needs n + 1 neurons and 2n + 2 connections; its dynamic equations can be written as:

ε dx/dt = −x + g(x − η(ax − ye − u)),
ε dy/dt = −e^T x + k,

where x ∈ ℜ^n, y ∈ ℜ, ε and η are positive constants, g(x) = (g(x_1), …, g(x_n))^T, and

g(x_i) = 0 if x_i < 0;  g(x_i) = x_i if 0 ≤ x_i ≤ 1;  g(x_i) = 1 if x_i > 1.
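A minimal simulation sketch of this network on a small instance (u_i = i, n = 5, k = 3, the example used later in the slides; a = 1 ≤ ū_k − ū_{k+1} and η = 1 are illustrative choices):

```python
import numpy as np

n, k, a, eta = 5, 3, 1.0, 1.0
u = np.arange(1, n + 1, dtype=float)
e = np.ones(n)

def g(v):
    # piecewise-linear activation clipping each component to [0, 1]
    return np.clip(v, 0.0, 1.0)

x, y = np.zeros(n), 0.0
eps, dt = 1e-6, 1e-9
for _ in range(100000):
    dx = -x + g(x - eta * (a * x - y * e - u))
    dy = -e @ x + k
    x, y = x + (dt / eps) * dx, y + (dt / eps) * dy

print(np.round(x))   # should approach [0, 0, 1, 1, 1]
```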

SLIDE 50

LP-based Projection Network

Based on the equivalent LP formulation, we propose a recurrent neural network for kWTA operation with the following dynamical equations:

ε dx/dt = −x + g(x + αey + αu),
ε dy/dt = −e^T x + k,

where ε > 0, α > 0, x ∈ ℜ^n, y ∈ ℜ.

SLIDE 51

QP-based Simplified Dual Net

The simplified dual neural network for kWTA operation based on the QP formulation [a] needs n neurons and 3n connections, and its dynamic equation can be written as:

ε dy/dt = −My + g((M − I)y + s) − s,
x = My + s,

where x, y ∈ ℜ^n, M = 2(I − ee^T/n)/a, s = Mu + ke/n, I is the identity matrix, and ε and g(·) are defined as before.

a. S. Liu and J. Wang, "A simplified dual neural network for quadratic programming with its kWTA application," IEEE Trans. Neural Networks, vol. 17, no. 6, pp. 1500-1510, 2006.

SLIDE 52

LP-based One-layer kWTA Net

The dynamic equation of a new LP-based kWTA network model is described as follows:

ε dx/dt = −Px − σ(I − P)g(x) + s,

where P = ee^T/n, s = u − Pu + ke/n, ε is a positive scaling constant, σ is a nonnegative gain parameter, and g(x) = (g(x_1), g(x_2), …, g(x_n))^T is a discontinuous vector-valued activation function.

SLIDE 53

Activation Function

A discontinuous activation function is defined as follows:

g(x_i) = 1, if x_i > 1;
g(x_i) ∈ [0, 1], if x_i = 1;
g(x_i) = 0, if 0 < x_i < 1;
g(x_i) ∈ [−1, 0], if x_i = 0;
g(x_i) = −1, if x_i < 0.

SLIDE 54

Activation Function (cont'd)

[Figure: graph of the discontinuous activation function g(x_i), with jumps at x_i = 0 and x_i = 1.]

SLIDE 55

Convergence Results

The network can perform the kWTA operation if Ω̄ ⊂ {x ∈ ℜ^n : 0 ≤ x ≤ 1}, where Ω̄ is the set of equilibrium point(s).

The network can perform the kWTA operation if it has a unique equilibrium point and σ ≥ 0 when (I − ee^T/n)u = 0, or one of the following conditions holds when (I − ee^T/n)u ≠ 0, writing ū = Σ_{j=1}^n u_j/n:

(i) σ ≥ Σ_{i=1}^n |u_i − ū| / (2n − 2), or

(ii) σ ≥ n [Σ_{i=1}^n (u_i − ū)^2 / (n(n − 1))]^{1/2}, or

(iii) σ ≥ 2 max_i |u_i − ū|, or

(iv) σ ≥ Σ_{i=1}^n (u_i − ū)^2 / min⁺_{γ_i ∈ {−1,0,1}} |Σ_{i=1}^n (u_i − ū)γ_i|.

SLIDE 56

Simulation Results

Consider a kWTA problem with input vector u_i = i (i = 1, 2, …, n), n = 5, k = 3.

[Figure: Transient behaviors of the kWTA network with σ = 6; trajectories of x_1, x_2 and of x_3, x_4, x_5 vs. time.]
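The following sketch reproduces this experiment with forward-Euler integration (step size and horizon are illustrative):

```python
import numpy as np

n, k, sigma = 5, 3, 6.0
u = np.arange(1, n + 1, dtype=float)    # u_i = i
e = np.ones(n)
P = np.outer(e, e) / n                  # P = ee^T / n
s = u - P @ u + k * e / n

def g(x):
    # discontinuous activation of SLIDE 53 (set-valued points taken as 0)
    return (x > 1).astype(float) - (x < 0).astype(float)

x = np.zeros(n)
eps, dt = 1e-6, 1e-9
for _ in range(50000):
    x += (dt / eps) * (-P @ x - sigma * (np.eye(n) - P) @ g(x) + s)

print(np.round(x))   # expected kWTA output: [0, 0, 1, 1, 1]
```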

SLIDE 57

Simulation Results (cont'd)

[Figure: Transient behaviors of the kWTA network with σ = 2.]

SLIDE 58

Simulation Results (cont'd)

[Figure: Convergence behavior of the kWTA network with respect to different values of n (n = 5, 10, 15, 20, σ = 19).]

SLIDE 59

QP-based One-layer kWTA Net

A QP-based kWTA network model with a discontinuous activation function is described as follows:

ε dz/dt = −(I − P)z − [aI + (1 − a)P]g(z) + s,
x = −(1/a)(I − P)z + s/a + (k(a − 1)/(na))e,

where g(z) = (g(z_1), g(z_2), …, g(z_n))^T is a discontinuous activation function and ε is a positive scaling constant.

SLIDE 60

Activation Function

g(z_i) = 1, if z_i > 0;
g(z_i) ∈ [0, 1], if z_i = 0;
g(z_i) = 0, if z_i < 0.

[Figure: graph of the step activation function g(z_i), jumping from 0 to 1 at z_i = 0.]

SLIDE 61

Convergence Results

The neural network with any a > 0 is stable in the sense of Lyapunov, and any trajectory is globally convergent to an equilibrium point.

x* = −(I − P)z*/a + s/a + (a − 1)ke/(na) is an optimal solution of the kWTA problem, where z* is an equilibrium point of the neural network.

SLIDE 62

Simulation Results

[Figure: state trajectories z_1 through z_5 of the QP-based kWTA network vs. time.]

SLIDE 63

Simulation Results (cont'd)

[Figure: output trajectories x_1, x_2 and x_3, x_4, x_5 of the QP-based kWTA network vs. time.]

SLIDE 64

Simulation Results (cont'd)

[Figure: trajectories of z_1 and z_5 for α = 0.01, 0.10, 0.50, 1.00 with n = 5.]

SLIDE 65

Simulation Results (cont'd)

[Figure: trajectories of z_1, z_5, z_10, z_15, z_20 for n = 5, 10, 15, 20 with α = 0.01.]

SLIDE 66

A Dynamic Example

Let the inputs be four sinusoidal signals (i.e., n = 4), u_i(t) = 10 sin[2π(1000t + 0.2(i − 1))], and k = 2.

[Figure: the four inputs v_1, …, v_4 and the corresponding kWTA outputs x_1, …, x_4 vs. time.]
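A sketch of the same experiment with the LP-based one-layer kWTA network tracking the moving inputs. The gain σ, step size, and horizon are illustrative; σ is set above 2·max_i |u_i(t) − ū(t)|, which stays below roughly 25 for these inputs.

```python
import numpy as np

n, k, sigma = 4, 2, 30.0
eps, dt = 1e-6, 1e-9
e = np.ones(n)
P = np.outer(e, e) / n

def g(x):
    # discontinuous activation (set-valued points taken as 0)
    return (x > 1).astype(float) - (x < 0).astype(float)

x, t = np.zeros(n), 0.0
for _ in range(500_000):                 # half of the 1 ms input period
    u = 10 * np.sin(2 * np.pi * (1000 * t + 0.2 * np.arange(n)))
    s = u - P @ u + k * e / n
    x += (dt / eps) * (-P @ x - sigma * (np.eye(n) - P) @ g(x) + s)
    t += dt

print(np.round(x))   # the two currently largest inputs are marked with 1
```

Because ε is three orders of magnitude smaller than the input period, the network settles much faster than the inputs move, which is what makes the real-time selection in the figure possible.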

SLIDE 67

Model Comparisons

model | layer(s) | neurons | connections
LP-based primal-dual network | 2 | n + 1 | 2n + 2
QP-based primal-dual network | 2 | 3n + 1 | 6n + 2
LP-based projection network | 2 | n + 1 | 2n + 2
QP-based projection network | 2 | n + 1 | 2n + 2
QP-based simplified dual network | 1 | n | 3n
LP-based one-layer network | 1 | n | 2n
QP-based one-layer network | 1 | n | 3n

a. Q. Liu and J. Wang, "Two k-winners-take-all networks with discontinuous activation functions," Neural Networks, vol. 21, no. 2-3, pp. 406-413, 2008.

SLIDE 68

Concluding Remarks

Neurodynamic optimization has been demonstrated to be a powerful alternative approach to many optimization problems.

For convex optimization, recurrent neural networks are available with global convergence to the optimal solution.

Neurodynamic optimization approaches provide parallel distributed computational models that are well suited for real-time applications.

SLIDE 69

Future Works

The existing neurodynamic optimization models can still be improved to reduce their model complexity or increase their convergence rate.

The available neurodynamic optimization models can be applied to more areas such as control, robotics, and signal processing.

Neurodynamic approaches to global optimization and discrete optimization are much more interesting and challenging.

There is a pressing need to develop neurodynamic models for nonconvex optimization and combinatorial optimization.