

SLIDE 1

Dictionary Learning Applications in Control Theory

Paul Irofti, Florin Stoican

Politehnica University of Bucharest
Faculty of Automatic Control and Computers
Department of Automatic Control and Systems Engineering
Email: paul@irofti.net, florin.stoican@acse.pub.ro

Recent Advances in Artificial Intelligence, June 20th, 2017

Acknowledgment: This work was supported by the Romanian National Authority for Scientific Research, CNCS - UEFISCDI, project number PN-II-RU-TE-2014-4-2713.

SLIDE 2

Sparse Representation (SR)

A signal $y$ is written as a linear combination of a few dictionary columns: $y = D \cdot x$, with $x$ sparse.

SLIDE 3

Orthogonal Matching Pursuit (OMP)

Algorithm 1: OMP (Pati, Rezaiifar, and Krishnaprasad 1993)

1. Arguments: $D$, $y$, $s$
2. Initialize: $r = y$, $I = \emptyset$
3. for $k = 1 : s$ do
4.   Compute correlations with residual: $z = D^\top r$
5.   Select new column: $i = \arg\max_j |z_j|$
6.   Increase support: $I \leftarrow I \cup \{i\}$
7.   Compute new solution: $x = \mathrm{LS}(D, y, I)$
8.   Update residual: $r = y - D_I x_I$
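For reference, a minimal NumPy sketch of Algorithm 1; the function name `omp` and the dense least-squares solve are our choices, not part of the slides:

```python
import numpy as np

def omp(D, y, s):
    """Greedy sparse coding: select s atoms of D that best explain y (Algorithm 1)."""
    r = y.copy()                        # residual starts as the signal itself
    I = []                              # current support (selected atom indices)
    x = np.zeros(D.shape[1])
    for _ in range(s):
        z = D.T @ r                     # step 4: correlations with the residual
        i = int(np.argmax(np.abs(z)))   # step 5: most correlated atom
        if i not in I:                  # step 6: grow the support
            I.append(i)
        xI, *_ = np.linalg.lstsq(D[:, I], y, rcond=None)  # step 7: LS on support
        x[:] = 0.0
        x[I] = xI
        r = y - D[:, I] @ xI            # step 8: update the residual
    return x, I
```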

SLIDE 4

Dictionary Learning (DL)

Given a collection of signals $Y$, learn both the dictionary $D$ and the sparse representations $X$ such that $Y \approx D \cdot X$.

SLIDE 5

The Dictionary Learning (DL) Problem

Given a data set $Y \in \mathbb{R}^{p \times m}$ and a sparsity level $s$, minimize the bivariate function

$$\begin{aligned} \min_{D,X} \quad & \|Y - DX\|_F^2 \\ \text{subject to} \quad & \|d_j\|_2 = 1, \ 1 \le j \le n \\ & \|x_i\|_0 \le s, \ 1 \le i \le m, \end{aligned} \tag{1}$$

where $D \in \mathbb{R}^{p \times n}$ is the dictionary (whose columns are called atoms) and $X \in \mathbb{R}^{n \times m}$ is the sparse representations matrix.

SLIDE 6

Approach

Algorithm 2: Dictionary learning – general structure

1. Arguments: signal matrix $Y$, target sparsity $s$
2. Initialize: dictionary $D$ (with normalized atoms)
3. for $k = 1, 2, \dots$ do
4.   With fixed $D$, compute sparse representations $X$
5.   With fixed $X$, update atoms $d_j$, $j = 1 : n$
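A sketch of this alternating structure in NumPy, reusing the `omp` function sketched earlier; for step 5 we substitute a simple least-squares (MOD-style) dictionary refresh to keep the sketch short, whereas K-SVD (next slide) updates atoms one at a time:

```python
import numpy as np

def dictionary_learning(Y, n_atoms, s, n_iter=20, seed=0):
    """Alternating sketch of Algorithm 2: sparse coding, then atom update."""
    p, m = Y.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((p, n_atoms))
    D /= np.linalg.norm(D, axis=0)           # normalized atoms
    X = np.zeros((n_atoms, m))
    for _ in range(n_iter):
        # step 4: with D fixed, sparse-code every signal column
        X = np.column_stack([omp(D, Y[:, i], s)[0] for i in range(m)])
        # step 5 (MOD-style stand-in): with X fixed, refit and renormalize D
        D = Y @ np.linalg.pinv(X)
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D, X
```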

SLIDE 7

DL Algorithms

K-SVD (Aharon, Elad, and Bruckstein 2006) solves, in sequence for each atom $j$, the optimization problem

$$\min_{d_j,\, X_{j,I_j}} \left\| Y_{I_j} - \sum_{\ell \neq j} d_\ell X_{\ell,I_j} - d_j X_{j,I_j} \right\|_F^2 \tag{2}$$

where all atoms except $d_j$ are fixed. This is seen as a rank-1 approximation and the solution is given by the singular vectors corresponding to the largest singular value:

$$d_j = u_1, \qquad X_{j,I_j} = \sigma_1 v_1^\top. \tag{3}$$
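The rank-1 update (2)-(3) maps directly onto a truncated SVD; a sketch, assuming `X` holds the current sparse coefficients so that $I_j$ is recovered as the set of signals using atom $j$:

```python
import numpy as np

def ksvd_atom_update(Y, D, X, j):
    """Update atom j and its coefficients via the rank-1 SVD step (2)-(3)."""
    Ij = np.nonzero(X[j, :])[0]              # signals whose representation uses atom j
    if Ij.size == 0:
        return D, X                          # unused atom: nothing to update
    # error matrix without atom j's contribution, restricted to columns I_j
    E = Y[:, Ij] - D @ X[:, Ij] + np.outer(D[:, j], X[j, Ij])
    U, S, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, j] = U[:, 0]                        # d_j = u_1
    X[j, Ij] = S[0] * Vt[0, :]               # X_{j,I_j} = sigma_1 * v_1^T
    return D, X
```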

SLIDE 8

LC-KSVD

$$\begin{aligned} \min_{D,X,A,W} \quad & \|Y - DX\|_F^2 + \alpha \|Q - AX\|_F^2 + \beta \|H - WX\|_F^2 \\ \text{subject to} \quad & \|d_j\|_2 = 1, \ 1 \le j \le n \\ & \|x_i\|_0 \le s, \ 1 \le i \le m, \end{aligned} \tag{4}$$

where:
- dictionary atoms are evenly split among classes;
- $q_i$ has non-zero entries where $y_i$ and $d_i$ share the same label;
- the linear transformation $A$ encourages discrimination in $X$;
- $h_i = e_j$, where $j$ is the class label of $y_i$;
- $W$ represents the learned classifier parameters.
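For illustration, a hedged sketch of how the targets $Q$ and $H$ could be assembled from the training labels, under the slide's assumption that atoms are evenly split among classes (the function name and 0-indexed labels are ours):

```python
import numpy as np

def lcksvd_targets(labels, n_atoms, n_classes):
    """Build the label-consistency target Q and one-hot label matrix H of (4)."""
    m = len(labels)
    per_class = n_atoms // n_classes         # atoms evenly split among classes
    Q = np.zeros((n_atoms, m))
    H = np.zeros((n_classes, m))
    for i, c in enumerate(labels):           # c = class of signal y_i (0-indexed)
        Q[c * per_class:(c + 1) * per_class, i] = 1.0  # atoms sharing y_i's label
        H[c, i] = 1.0                        # h_i = e_c
    return Q, H
```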

SLIDE 9

Fault Detection and Isolation in Water Networks

SLIDE 10

FDI via DL

Water networks pose some interesting issues:
- large-scale, distributed networks with few sensors;
- user demand is unknown or imprecise;
- pressure dynamics are nonlinear (analytic solutions impractical).

The DL approach for FDI:
- a residual signal compares expected and measured pressures,
$$r_i(t) = p_i(f_i(t), f_j(t), t) - \bar{p}_i, \quad \forall i, j; \tag{5}$$
- each fault is assigned a class, and DL provides the atoms which discriminate between them;
- each residual is sparsely described by atoms; thus, FDI is achieved iff the classification is unambiguous.

SLIDE 11

Hanoi

[Figure: the Hanoi water network benchmark, nodes 1–31. Legend: junction node, tank node, node with sensor, junction partition, pipe connection, fault event.]

SLIDE 12

Sensor Placement

Let $R \in \mathbb{R}^{n \times mn}$ be the measured pressure residuals in all $n$ network nodes. For each node we simulate $m$ different faults.

Given $s < n$ available sensors, apply OMP on each column $r$:

$$\min_x \ \|r - I_n x\|_2^2 \quad \text{subject to} \quad \|x\|_0 \le s, \tag{6}$$

resulting in a matrix $X$ with $s$-sparse columns approximating $R$.
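Because the dictionary in (6) is the identity $I_n$, OMP reduces to keeping the $s$ largest-magnitude entries of each residual column; a short sketch (naming is ours):

```python
import numpy as np

def sparse_residual_support(R, s):
    """Solve (6) column by column: keep the s dominant entries of each residual."""
    X = np.zeros_like(R)
    supports = []
    for k in range(R.shape[1]):
        idx = np.argsort(np.abs(R[:, k]))[-s:]   # s most active nodes for fault k
        X[idx, k] = R[idx, k]
        supports.append(idx)
    return X, supports
```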

SLIDE 13

Placement Strategies

(a) select the $s$ most frequently used atoms;
(b) from each $m$-block select the most frequent $s$ atoms; out of the resulting $n \cdot s$ atoms, select again the first $s$.

Both strategies are sketched below.
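A sketch of both strategies, assuming `supports` is the list of per-column supports returned by the previous sketch; the voting scheme in case (b) is our reading of the slide:

```python
import numpy as np

def place_sensors(supports, n, m, s, strategy="a"):
    """Choose s sensor nodes from the supports of X, per strategy (a) or (b)."""
    if strategy == "a":                      # (a): s most frequently used atoms overall
        counts = np.zeros(n, dtype=int)
        for idx in supports:
            counts[idx] += 1
        return np.argsort(counts)[-s:]
    # (b): top-s atoms inside each m-column fault block, then re-select s overall
    votes = np.zeros(n, dtype=int)
    for b in range(0, len(supports), m):
        block = np.zeros(n, dtype=int)
        for idx in supports[b:b + m]:
            block[idx] += 1
        votes[np.argsort(block)[-s:]] += 1   # each block nominates its s busiest nodes
    return np.argsort(votes)[-s:]
```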

[Figure: selected sensor nodes (out of nodes 1–31) versus the number of sensors (2–10), for case (a) and case (b).]

SLIDE 14

Learning

Algorithm 3: Placement and FDI learning (Irofti and Stoican 2017)

1. Inputs: training residuals $R \in \mathbb{R}^{n \times mn}$
2.   parameters $s$, $\alpha$, $\beta$
3. Result: dictionary $D$, classifier $W$, sensor nodes $I_s$
4. Select $s$ sensor nodes $I_s$ based on matrix $R$ using strategy (a) or (b)
5. Let $R_{I_s}$ be the restriction of $R$ to the rows in $I_s$
6. Use $R_{I_s}$, $\alpha$ and $\beta$ to learn $D$ and $W$ from (4)

SLIDE 15

Fault Detection

Algorithm 4: Fault detection and isolation

1. Inputs: testing residuals $R \in \mathbb{R}^{s \times mn}$
2.   dictionary $D$, classifier $W$
3. Result: prediction $P \in \mathbb{N}^{mn}$
4. for $k = 1$ to $mn$ do
5.   Use OMP to obtain $x_k$ from $r_k$ and $D$
6.   Label: $L_k = W x_k$
7.   Classify: $p_k = \arg\max_c (L_k)_c$

The position $c$ of the largest entry of $L_k$ is the predicted class.
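A minimal sketch of the loop, reusing the earlier `omp` sketch; `W @ x` yields the label vector $L_k$ and its largest entry the predicted class:

```python
import numpy as np

def classify_residuals(R, D, W, s):
    """Algorithm 4 sketch: sparse-code each residual column, then classify it."""
    preds = []
    for k in range(R.shape[1]):
        x, _ = omp(D, R[:, k], s)            # step 5: sparse code x_k of residual r_k
        preds.append(int(np.argmax(W @ x)))  # steps 6-7: L_k = W x_k, pick argmax
    return np.array(preds)
```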

SLIDE 16

Today

Improved sensor placement. Iteratively choose $s$ rows from $R$, solving at each step

$$i = \arg\min_k \ \left\| \operatorname{proj}_{R_I} r_k \right\|_2^2 + \lambda \left\| 1 / \delta_{k,I} \right\|_1, \quad r_k \in R_{I^c}, \tag{7}$$

where $I$ is the set of currently selected rows, the reciprocal is taken elementwise, and the elements of the vector $\delta_{k,I}$ are the distances from node $k$ to the nodes in $I$.

Graph-aware DL. Adding graph regularization (Yankelevsky and Elad 2016):

$$\|Y - DX\|_F^2 + \alpha \|Q - AX\|_F^2 + \beta \|H - WX\|_F^2 + \gamma \operatorname{Tr}(D^\top L D) + \lambda \operatorname{Tr}(X L_c X^\top) + \mu \|L\|_F^2, \tag{8}$$

where $L$ is the graph Laplacian.
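As a small illustration of the new $\gamma \operatorname{Tr}(D^\top L D)$ term in (8), a sketch that builds a combinatorial Laplacian from a network adjacency matrix and evaluates the penalty; the convention $L = \operatorname{diag}(A\mathbf{1}) - A$ is our assumption (the cited work can also learn $L$):

```python
import numpy as np

def laplacian_penalty(D, A, gamma):
    """Evaluate gamma * Tr(D^T L D) for a symmetric adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A       # combinatorial graph Laplacian
    # the trace term is small when rows of D at connected nodes are similar
    return gamma * np.trace(D.T @ L @ D)
```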

SLIDE 17

Zonotopic Area Coverage

SLIDE 18

Zonotopic sets

Area packing, mRPI (over)approximation, and other related notions may be described via unions of zonotopic sets:

$$\min_{Z_k} \ \operatorname{vol}(S) - \operatorname{vol}\Big(\bigcup_k Z_k\Big), \quad \text{subject to } Z_k \subseteq S. \tag{9}$$

Zonotopes, given in generator representation (Fukuda 2004)

$$Z_k = Z(c_k, G_k) = \{c_k + G_k \xi : \|\xi\|_\infty \le 1\}, \tag{10}$$

are easy to handle for:
- Minkowski sum: $Z(G_1, c_1) \oplus Z(G_2, c_2) = Z([G_1 \ G_2], c_1 + c_2)$;
- linear mappings: $R Z(G_1, c_1) = Z(R G_1, R c_1)$.
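Both operations are one-liners in generator representation; a minimal sketch:

```python
import numpy as np

def minkowski_sum(c1, G1, c2, G2):
    """Z(c1, G1) + Z(c2, G2) = Z(c1 + c2, [G1 G2]), as on this slide."""
    return c1 + c2, np.hstack([G1, G2])      # centers add, generators concatenate

def linear_map(R, c, G):
    """R Z(c, G) = Z(R c, R G): map the center and every generator."""
    return R @ c, R @ G
```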

SLIDE 19

Formulation

Each zonotope is parameterized by its center and a scaling vector $(c_k, \lambda_k)$. These variables help formulate:

- the inclusion constraint $Z(c_k, G \cdot \operatorname{diag}(\lambda_k)) \subseteq U$:

$$s_i^\top c_k + \sum_j |s_i^\top G_j| \, \lambda_{jk} \le r_i, \quad \forall i, \tag{11}$$

where $U = \{u : s_i^\top u \le r_i\}$;

- the explicit volume (Gover and Krikorian 2010) of $Z(c_k, G\Lambda_k)$:

$$\operatorname{vol}(Z(c_k, G\Lambda_k)) = \sum_{1 \le j_1 < \dots < j_n \le N} \left| \det(G_{j_1 \dots j_n}) \right| \cdot \prod_{j \in \{j_1 \dots j_n\}} \lambda_{jk}. \tag{12}$$

The formulation becomes simpler if the scaling is homogeneous ($\lambda_k^* = \lambda_{jk}, \forall j$).
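A direct (exponential-cost) sketch evaluating (12); the center does not affect the volume, and depending on how $\xi$ is normalized some conventions multiply the sum by $2^n$, so we evaluate the formula exactly as written:

```python
import numpy as np
from itertools import combinations

def zonotope_volume(G, lam):
    """Evaluate (12): sum of |det| over all n-subsets of the N scaled generators."""
    n, N = G.shape
    total = 0.0
    for cols in combinations(range(N), n):
        idx = list(cols)
        total += abs(np.linalg.det(G[:, idx])) * np.prod(lam[idx])
    return total
```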

SLIDE 20

Implementation

We follow the OMP formalism, without its theoretical convergence guarantees:

Algorithm 5: Area coverage with zonotopic sets

1. Inputs: area to be covered $U$, sparsity constraint $s$
2. Result: pairs of centers and scaling factors $(c_k, \lambda_k)$
3. for $k = 1$ to $s$ do
4.   Enlarge the zonotopes until they saturate the constraints
5.   Select $Z_k$ where $k = \arg\min_k \operatorname{vol}(S_k \setminus Z_k)$
6.   Update the uncovered area: $\operatorname{vol}(S_{k+1}) = \operatorname{vol}(S_k \setminus Z_k)$

SLIDE 21

Result

[Figure: resulting zonotopic coverage of the area, plotted over $x \in [-0.4, 0.5]$, $y \in [-0.4, 0.4]$.]

SLIDE 22

Thank You!

Questions?