The Cayley-Hamilton Theorem For Finite Automata Radu Grosu SUNY at - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Radu Grosu SUNY at Stony Brook

The Cayley-Hamilton Theorem For Finite Automata

slide-2
SLIDE 2

How did I get interested in this topic?

slide-3
SLIDE 3
Convergence of Theories

  • Hybrid Systems Computation and Control:
  • convergence between control and automata theory.
  • Hybrid Automata: an outcome of this convergence
  • modeling formalism for systems exhibiting both discrete and continuous behavior,
  • successfully used to model and analyze embedded and biological systems.

slide-4
SLIDE 4

Lack of Common Foundation for HA

  • Mode dynamics:
  • Linear system (LS)
  • Mode switching:
  • Finite automaton (FA)
  • Different techniques:
  • LS reduction
  • FA minimization

[Figure: hybrid automaton with linear mode dynamics of the form $\dot{x} = Ax + Bu$, $v = Cx$, next to a plot of voltage (mV) against time (ms).]

  • LS & FA taught separately: No common foundation!
slide-5
SLIDE 5
Main Conjecture

  • Finite automata can be conveniently regarded as time-invariant linear systems over semimodules:
  • linear systems techniques generalize to automata
  • Examples of such techniques include:
  • linear transformations of automata,
  • minimization and determinization of automata as observability and reachability reductions,
  • Z-transform of automata to compute the associated regular expression through Gaussian elimination.

slide-6
SLIDE 6

Minimal DFA are Not Minimal NFA

(Arnold, Dicky and Nivat’s Example)

L = a (b* + c*)

[Figure: a DFA for L with states x1, ..., x4 and an NFA for L with states x1, x2, x3.]

slide-7
SLIDE 7

Minimal NFA: How are they Related?

(Arnold, Dicky and Nivat’s Example)

L = ab + ac + ba + bc + ca + cb

[Figure: two minimal NFAs for L, each with states x1, ..., x5.]

No homomorphism of either automaton onto the other.

slide-8
SLIDE 8

Minimal NFA: How are they Related?

(Arnold, Dicky and Nivat’s Example)

[Figure: the two minimal NFAs embedded in a single NFA with states x1, ..., x8.]

Carrez’s solution: take both in a terminal NFA.

Is this the best one can do? No! One can use linear (similarity) transformations.

slide-9
SLIDE 9

Observability Reduction HSCC’09

(Arnold, Dicky and Nivat’s Example)

Define linear transformation $\bar{x}^t = x^t T$:

[Matrix T: 0/1 matrix with rows and columns labelled x1, ..., x5.]

$\bar{A} = [AT]_T \quad (T^{-1}AT)$

$\bar{x}_0^t = x_0^t T$

$\bar{C} = [C]_T \quad (T^{-1}C)$

[Figure: automaton $A$ with states x1, ..., x5 and the transformed automaton $\bar{A}$ with states x1, x23, x24, x34, x5.]

slide-10
SLIDE 10

Reachability Reduction HSCC’09

(Arnold, Dicky and Nivat’s Example)

Define linear transformation $\bar{x}^t = x^t T$:

[Matrix T: 0/1 matrix with rows and columns labelled x1, ..., x5.]

$\bar{A}^t = [A^tT]_T \quad (T^{-1}A^tT)$

$\bar{x}_0^t = x_0^t T$

$\bar{C} = [C]_T \quad (T^{-1}C)$

[Figure: automaton $A$ with states x1, ..., x5 and the transformed automaton $\bar{A}$ with states x1, x23, x24, x34, x5.]

slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14

Observability and minimization

slide-16
SLIDE 16

Finite Automata as Linear Systems

Consider a finite automaton M = (X, Σ, δ, S, F) with:

  • finite set of states X, finite input alphabet Σ,
  • transition relation δ ⊆ X × Σ × X,
  • starting and final sets of states S, F ⊆ X.

Let X denote row and column indices. Then:

  • δ defines a matrix A,
  • S and F define corresponding vectors.
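
As a minimal sketch of this construction, the following Python builds the matrix A and the vectors for S and F from a transition relation. The function name `automaton_to_matrices`, the three-state automaton, and its start and final sets are illustrative assumptions, not taken from the slides.

```python
def automaton_to_matrices(states, delta, start, final):
    """Return (A, S, F): A[i][j] is the set of symbols labelling i -> j,
    S and F are 0/1 indicator vectors of the start and final states."""
    index = {x: i for i, x in enumerate(states)}
    n = len(states)
    A = [[set() for _ in range(n)] for _ in range(n)]
    for src, sym, dst in delta:
        A[index[src]][index[dst]].add(sym)
    S = [1 if x in start else 0 for x in states]
    F = [1 if x in final else 0 for x in states]
    return A, S, F

# Hypothetical automaton, chosen only for illustration.
states = ["x1", "x2", "x3"]
delta = {("x1", "a", "x2"), ("x1", "b", "x3"),
         ("x2", "a", "x2"), ("x3", "b", "x3")}
A, S, F = automaton_to_matrices(states, delta, start={"x1"}, final={"x2", "x3"})
```

An empty set plays the role of the entry 0, and a set with several symbols stands for their sum (union).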

slide-18
SLIDE 18

Finite Automata as Linear Systems

Now define the linear system $L_M = [S, A, C]$:

$x^t(n+1) = x^t(n)A, \quad x_0 = S$

$y^t(n) = x^t(n)C, \quad C = F$

Example: consider the following automaton:

[Figure: automaton with states x1, x2, x3 and transitions labelled a, a, b, b.]

$A = \begin{pmatrix} 0 & a & b \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$

[Vectors x0 and C: entries missing in the source.]

slide-19
SLIDE 19

Semimodule of Languages

℘(Σ*) is an idempotent semiring (quantale):

  • (℘(Σ*), +, 0) is a commutative idempotent monoid (union),
  • (℘(Σ*), ·, 1) is a monoid (concatenation),
  • multiplication distributes over addition,
  • 0 is an annihilator: 0 · a = 0

(℘(Σ*))^n is a semimodule over scalars in ℘(Σ*):

  • r(x+y) = rx + ry, (r+s)x = rx + sx, (rs)x = r(sx),
  • 1x = x, 0x = 0

Note: No additive and multiplicative inverses!
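
The semiring laws above can be tried out on finite languages. This is a small sketch, assuming languages are represented as Python `frozenset`s of strings; the names `ZERO`, `ONE`, `plus`, `times` and the sample languages are illustrative choices.

```python
from itertools import product

ZERO = frozenset()        # the empty language (additive identity)
ONE = frozenset({""})     # the language {empty word} (multiplicative identity)

def plus(r, s):
    """Addition is union."""
    return r | s

def times(r, s):
    """Multiplication is elementwise concatenation."""
    return frozenset(u + v for u, v in product(r, s))

# Sample finite languages, chosen only for illustration.
r = frozenset({"a", "ab"})
s = frozenset({"b"})
t = frozenset({"", "a"})
```

With these values, `plus(r, r) == r` shows idempotence, and `times(r, s)` differs from `times(s, r)`, so multiplication is not commutative.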


slide-22
SLIDE 22

Observability

Let L = [S, A, C]. Observe its output up to n-1:

$[y(0)\;\; y(1)\;\; \dots\;\; y(n-1)] = x_0^t\,[C\;\; AC\;\; \dots\;\; A^{n-1}C] = x_0^t O \qquad (1)$

If L operates on a vector space:

  • L is observable if x0 is uniquely determined by (1),
  • the observability matrix O has rank n,
  • n outputs suffice: $A^nC = s_1A^{n-1}C + s_2A^{n-2}C + \dots + s_nC$ (Cayley-Hamilton Theorem)

If L operates on a semimodule:

  • L is observable if x0 is uniquely determined by (1)
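
For the vector-space case, the rank condition can be checked directly. The sketch below builds O = [C AC ... A^(n-1)C] and computes its rank by Gaussian elimination over the rationals; the helper names and the 2×2 example are assumptions made for illustration.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Matrix times column vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def observability_matrix(A, C):
    """Matrix with columns C, AC, ..., A^(n-1)C, returned as a list of rows."""
    n = len(A)
    cols, v = [], list(C)
    for _ in range(n):
        cols.append(v)
        v = mat_vec(A, v)
    return [[cols[k][i] for k in range(n)] for i in range(n)]

def rank(M):
    """Rank over the rationals via Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Hypothetical 2-state system.
A = [[0, 1], [0, 0]]
observable = rank(observability_matrix(A, [0, 1])) == len(A)
unobservable = rank(observability_matrix(A, [1, 0])) < len(A)
```

Over a semimodule there is no subtraction, so this rank computation does not carry over as is; the sketch only covers the classical setting.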
slide-24
SLIDE 24

The Cayley-Hamilton Theorem

( $A^n = s_1A^{n-1} + s_2A^{n-2} + \dots + s_nI$ )

slide-25
SLIDE 25

Permutations

Permutations are bijections of {1,...,n}:

  • Example: π = {(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)}

The graph G(π) of a permutation π:

  • G(π) decomposes into elementary cycles,

The sign of a permutation:

  • Pos/Neg: even/odd number of even-length cycles,
  • $P_n^+$ / $P_n^-$: all positive/negative permutations.
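
A short sketch of the cycle decomposition and the sign rule, applied to the example permutation from the slide. The function names `cycles` and `sign` are illustrative.

```python
def cycles(perm):
    """Decompose a permutation (dict i -> perm[i]) into its cycles."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        result.append(cycle)
    return result

def sign(perm):
    """+1 for an even number of even-length cycles, -1 for an odd number."""
    even_cycles = sum(1 for c in cycles(perm) if len(c) % 2 == 0)
    return 1 if even_cycles % 2 == 0 else -1

# The example permutation from the slide.
pi = dict([(1, 2), (2, 3), (3, 4), (4, 1), (5, 7), (6, 6), (7, 5)])
```

Here `cycles(pi)` yields a 4-cycle, a 2-cycle and a fixed point, so there are two even-length cycles and the permutation is positive.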


slide-28
SLIDE 28

Eigenvalues in Vector Spaces

The eigenvalues of a square matrix A:

  • Eigenvector equation: $x^tA = x^ts$ (eigenvalue s, eigenvector x)

The characteristic equation of A:

  • The characteristic polynomial: $cp_A(s) = |sI - A|$
  • The characteristic equation: $cp_A(s) = 0$

The determinant of A:

  • The determinant: $|A| = \sum_{\pi \in P_n^+} \pi(A) - \sum_{\pi \in P_n^-} \pi(A)$,
  • Permutation application: $\pi(A) = \prod_{i=1}^{n} A(i,\pi(i))$
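
The permutation form of the determinant can be computed literally, as a sum over positive permutations minus a sum over negative ones. This is a brute-force sketch for small matrices; the helper names and the test matrices are assumptions.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation tuple p (0-based): parity of its even-length cycles."""
    seen, even = set(), 0
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length and length % 2 == 0:
            even += 1
    return 1 if even % 2 == 0 else -1

def weight(p, A):
    """Permutation application: product of A[i][p[i]] over all i."""
    w = 1
    for i, j in enumerate(p):
        w *= A[i][j]
    return w

def det(A):
    """Sum over positive permutations minus sum over negative permutations."""
    perms = list(permutations(range(len(A))))
    pos = sum(weight(p, A) for p in perms if sign(p) == 1)
    neg = sum(weight(p, A) for p in perms if sign(p) == -1)
    return pos - neg
```

Keeping the positive and negative sums apart matters later: in a semiring only the two sums exist, not their difference.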


slide-31
SLIDE 31

Matrix-Eigenspaces in Vector Spaces

The eigenvalues of a square matrix A:

  • Eigenvector equation: $x^t(sI - A) = 0$

The characteristic equation of A:

  • The characteristic polynomial: $cp_A(s) = |sI - A|$
  • The characteristic equation: $cp_A(s) = 0$

The determinant of A:

  • The determinant: $|A| = \sum_{\pi \in P_n^+} \pi(A) - \sum_{\pi \in P_n^-} \pi(A)$,
  • Weight of a permutation: $\pi(A) = \prod_{i=1}^{n} A(i,\pi(i))$

slide-32
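SLIDE 32

The Cayley-Hamilton Theorem (CHT)

A satisfies its characteristic equation: $cp_A(A) = 0$

$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}, \quad sI - A = \begin{pmatrix} s & -a_{12} & 0 \\ -a_{21} & s & -a_{23} \\ -a_{31} & 0 & s - a_{33} \end{pmatrix}$

$|sI - A| = s^3 - a_{33}s^2 - a_{12}a_{21}s + a_{12}a_{21}a_{33} - a_{12}a_{23}a_{31} = 0$

$s^3 + a_{12}a_{21}a_{33} = a_{33}s^2 + a_{12}a_{21}s + a_{12}a_{23}a_{31}$

$A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$

[Figure: graph of A on vertices 1, 2, 3 with edges a12, a21, a23, a31, a33; each coefficient above is a product of cycles of this graph.]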
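
The subtraction-free form of the identity can be checked numerically. The sketch below uses arbitrary non-negative integer values for the entries (an assumption for illustration) and compares both sides of $A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def add(X, Y):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]

# Arbitrary sample values for the five nonzero entries.
a12, a21, a23, a31, a33 = 2, 3, 5, 7, 11
A = [[0, a12, 0], [a21, 0, a23], [a31, 0, a33]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# Both sides use only addition and multiplication, no subtraction.
lhs = add(A3, scale(a12 * a21 * a33, I))
rhs = add(add(scale(a33, A2), scale(a12 * a21, A)), scale(a12 * a23 * a31, I))
```

Because neither side needs subtraction, the same comparison makes sense over any commutative semiring, which is the form the following slides build on.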


slide-38
SLIDE 38

The Cayley-Hamilton Theorem (CHT)

A satisfies its characteristic equation: $cp_A(A) = 0$

Implicit assumptions in CHT:

  • Subtraction is available
  • Multiplication is commutative

Does CHT hold in semirings?

  • Subtraction not indispensable (Rutherford, Straubing)
  • Commutativity still problematic
slide-39
SLIDE 39

CHT in Commutative Semirings

(Straubing’s Proof)

Lift original semiring to the semiring of paths:

  • Matrix A is lifted to a matrix $G_A$ of paths

$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}, \quad G_A = \begin{pmatrix} 0 & (1,2) & 0 \\ (2,1) & 0 & (2,3) \\ (3,1) & 0 & (3,3) \end{pmatrix}$

slide-40
SLIDE 40

CHT in Commutative Semirings

(Straubing’s Proof)

Lift original semiring to the semiring of paths:

  • Matrix A is lifted to a matrix $G_A$ of paths
  • Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$

$\pi = \{(1,2),(2,1)\} \;\mapsto\; \bar\pi = (1,2)(2,1)$

slide-41
SLIDE 41

CHT in Commutative Semirings

(Straubing’s Proof)

Lift original semiring to the semiring of paths:

  • Matrix A is lifted to a matrix $G_A$ of paths
  • Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$

Prove CHT in the semiring of paths:

$\sum_{q=0}^{n} \sum_{\pi \in P_q^+} \bar\pi\, G_A^{n-q} = \sum_{q=0}^{n} \sum_{\pi \in P_q^-} \bar\pi\, G_A^{n-q}$ (CHT holds?)

slide-42
SLIDE 42

CHT in Commutative Semirings

(Straubing’s Proof)

Lift original semiring to the semiring of paths:

  • Matrix A is lifted to a matrix $G_A$ of paths
  • Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$

Prove CHT in the semiring of paths:

  • Show bijection between pos/neg products

[Figure: two copies of the graph of $G_A$ on vertices 1, 2, 3 with edges (1,2), (2,1), (2,3), (3,1), (3,3). The product (3,3)(1,2)(2,1) appears on both sides: once with a permutation from $P_3$, and once as a permutation from $P_1$ combined with a path in $G_A^2$.]

slide-43
SLIDE 43

CHT in Commutative Semirings

(Straubing’s Proof)

Lift original semiring to the semiring of paths:

  • Matrix A is lifted to a matrix $G_A$ of paths
  • Permutation cycles $\pi$ are lifted to cyclic paths $\bar\pi$

Prove CHT in the semiring of paths:

  • Show bijection between pos/neg products

Port results back to the original semiring:

  • Apply products: $\bar\pi(A)$
  • Path application: $(\sigma_1 \dots \sigma_n)(A) = A(\sigma_1) \cdots A(\sigma_n)$
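
The lifting and the way results are ported back can be sketched in a few lines. Here a path is a tuple of edges, an entry of the path matrix is a set of paths, and path application multiplies the corresponding entries of A. All function names and the integer values in `A` are illustrative assumptions.

```python
def lift(A):
    """G_A: entry (i, j) holds the single-edge path ((i, j),) when A[i][j] != 0."""
    n = len(A)
    return [[{((i + 1, j + 1),)} if A[i][j] != 0 else set() for j in range(n)]
            for i in range(n)]

def path_mul(G, H):
    """Matrix product in the semiring of paths: union of concatenations."""
    n = len(G)
    return [[{p + q for k in range(n) for p in G[i][k] for q in H[k][j]}
             for j in range(n)] for i in range(n)]

def apply_path(path, A):
    """Path application: (s1 ... sn)(A) = A(s1) * ... * A(sn)."""
    w = 1
    for i, j in path:
        w *= A[i - 1][j - 1]
    return w

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Same zero pattern as the slide's matrix A, with arbitrary sample values.
A = [[0, 2, 0], [3, 0, 5], [7, 0, 11]]
G = lift(A)
G2 = path_mul(G, G)
G3 = path_mul(G2, G)
A3 = mat_mul(mat_mul(A, A), A)

# Applying every path in G3[i][j] to A and summing recovers the entry of A^3.
ported = [[sum(apply_path(p, A) for p in G3[i][j]) for j in range(3)]
          for i in range(3)]
```

Working with paths first and numbers last is what lets the proof argue about which paths occur on each side, independently of the values of the entries.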

slide-45
SLIDE 45

CHT in Idempotent Semirings

Lift original semiring to the semiring of paths:

  • Matrix A: order in paths is important
  • Permutation cycles: rotations are distinct

$\pi = \{(1,2),(2,1)\} \;\mapsto\; \bar\pi = \mathrm{diag}\big((1,2)(2,1),\; (2,1)(1,2),\; 0\big)$

slide-47
SLIDE 47

CHT in Idempotent Semirings

Lift original semiring to the semiring of paths:

  • Matrix A: order in paths is important
  • Permutation cycles: rotations are distinct

Prove CHT in the semiring of paths:

  • Products $\bar\pi \sqcup\!\sqcup G^{n-|\pi|}$: cycles to be properly inserted

$\bar\pi \sqcup\!\sqcup G^{n-|\pi|} = \bar\pi\, G^{n-|\pi|} + G\,\bar\pi\, G^{n-|\pi|-1} + \dots + G^{n-|\pi|}\,\bar\pi$

slide-48
SLIDE 48

CHT in Idempotent Semirings

Lift original semiring to the semiring of paths:

  • Matrix A: order in paths is important
  • Permutation cycles: rotations are distinct

Prove CHT in the semiring of paths:

  • Products $\bar\pi \sqcup\!\sqcup G^{n-|\pi|}$: cycles to be properly inserted

Port results back to the original semiring:

  • Apply products: $(\bar\pi \sqcup\!\sqcup G^{n-|\pi|})(A)$

slide-49
SLIDE 49

CHT in Idempotent Semirings

Theorem: $G_A^n = \sum_{q=1}^{n} \sum_{\pi \in P_q} \bar\pi \sqcup\!\sqcup G_A^{n-|\pi|}$

Proof:

LHS ⊆ RHS: Let σ ∈ LHS

  • Pigeon-hole: σ has at least one cycle π at a state s
  • Structural: π is a simple cycle of length k
  • Remove π in σ: σ[s/π] is in $G_A^{n-|\pi|}$
  • Shuffle-product: $\bar\pi \sqcup\!\sqcup G_A^{n-|\pi|}$ reinserts π

RHS ⊆ LHS: Let σ ∈ RHS

  • No wrong path: the shuffle is sound
  • Idempotence: takes care of multiple copies
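
The LHS ⊆ RHS direction rests on a pigeon-hole step that is easy to test by enumeration. The sketch below lists every path of length n in a small graph, extracts the first simple cycle, and returns the shorter remaining path. The edge set follows the zero pattern of the slides' matrix A; the function names and the vertex-sequence representation are assumptions.

```python
# Edges of the example graph on vertices 1, 2, 3.
EDGES = [(1, 2), (2, 1), (2, 3), (3, 1), (3, 3)]

def paths(edges, n_vertices, length):
    """All vertex sequences that follow `edges` and use `length` edges."""
    result = [[v] for v in range(1, n_vertices + 1)]
    for _ in range(length):
        result = [p + [w] for p in result for (v, w) in edges if v == p[-1]]
    return result

def remove_simple_cycle(path):
    """Return (cycle, rest): the first simple cycle in `path` and the path
    obtained by cutting it out; (None, path) if no vertex repeats."""
    last = {}
    for pos, v in enumerate(path):
        if v in last:
            start = last[v]
            return path[start:pos + 1], path[:start] + path[pos:]
        last[v] = pos
    return None, path

n = 3
all_paths = paths(EDGES, n, n)
```

A path with n edges visits n + 1 vertices, so in an n-vertex graph some vertex repeats and `remove_simple_cycle` always finds a cycle; the remaining path has n minus the cycle length edges, matching the exponent $n - |\pi|$ in the theorem.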


slide-52
SLIDE 52

CHT in Idempotent Semirings

[Definition by cases of a diagonal entry (i,i), equal to 0 in one of the two cases; the symbols are missing in the source.]

Theorem: classic CHT can be derived by using:

  • a decomposition of the product $\bar\pi \sqcup\!\sqcup G^{n-|\pi|}$ into a sum of two shuffle products,
  • application of CHT to the two resulting powers of the form $G^{n-|\pi|}$.

Matrix CHT: can be regarded as a constructive version of the pumping lemma.


slide-55
SLIDE 55
slide-56
SLIDE 56

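Finite Automata as Linear Systems

Now define the linear system $L_M = [S, A, C]$:

$x^t(n+1) = x^t(n)A, \quad x_0 = S(\varepsilon)$

$y^t(n) = x^t(n)C, \quad C = F(\varepsilon)$

Example: consider the following automaton L1:

[Figure: automaton L1 with states x1, x2, x3 and transitions labelled a, a, b, b.]

$A = A(a)\,a + A(b)\,b, \quad A(a) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad A(b) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

[Vectors x0(ε) and C(ε): entries garbled in the source.]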
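
The decomposition A = A(a)a + A(b)b suggests running the automaton as a Boolean linear system, one matrix per input symbol. The sketch below assumes the 0/1 matrices read off from the transition matrix A = [[0, a, b], [0, a, 0], [0, 0, b]]; the start vector `x0` and final vector `C` are assumptions, since their entries are not recoverable from the slides.

```python
# 0/1 matrices per symbol, read off from A = [[0, a, b], [0, a, 0], [0, 0, b]].
A = {
    "a": [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
    "b": [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
}
x0 = [1, 0, 0]  # assumption: x1 is the only start state
C = [0, 1, 1]   # assumption: x2 and x3 are the final states

def step(x, M):
    """Boolean row vector times matrix: the update x^t(n+1) = x^t(n) M."""
    n = len(x)
    return [int(any(x[i] and M[i][j] for i in range(n))) for j in range(n)]

def accepts(word):
    """Run the state update along `word`, then take the output y = x^t C."""
    x = x0
    for sym in word:
        x = step(x, A[sym])
    return any(xi and ci for xi, ci in zip(x, C))
```

Under these assumptions the system accepts the nonempty words consisting only of a's or only of b's, for example `accepts("aaa")` is true and `accepts("ab")` is false.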

slide-57
SLIDE 57

Observability

Let L = [S, A, C] be an n-state automaton. Its output:

$[y(0)\;\; y(1)\;\; \dots\;\; y(n-1)] = x_0^t\,[C\;\; AC\;\; \dots\;\; A^{n-1}C] = x_0^t O \qquad (1)$

L is observable if x0 is uniquely determined by (1).

Example: the observability matrix O of L1 is:

[Matrix O (columns C, AC, A²C, with entries built from ε, a, b) and figure of automaton L1 with states x1, x2, x3: entries garbled in the source.]