Chapter 8. Principal-Components Analysis
Neural Networks and Learning Machines (Haykin)
Lecture Notes on Self-learning Neural Algorithms
Byoung-Tak Zhang
School of Computer Science and Engineering
Seoul National University

Contents
8.1 Introduction
8.2 Principles of Self-Organization
8.3 Self-organized Feature Analysis
8.4 Principal-Components Analysis
8.5 Hebbian-Based Maximum Eigenfilter
8.6 Hebbian-Based PCA
8.7 Case Study: Image Decoding
Summary and Discussion

8.1 Introduction
• Supervised learning
  - Learning from labeled examples
• Semisupervised learning
  - Learning from unlabeled and labeled examples
• Unsupervised learning
  - Learning from examples without a teacher
  - Self-organized learning
    - Neurobiological considerations
    - Locality of learning (immediate local behavior of neurons)
  - Statistical learning theory
    - Mathematical considerations
    - Less emphasis on locality of learning

8.2 Principles of Self-Organization (1/2)
• Principle 1: Self-amplification (self-reinforcement)
  - Synaptic modification self-amplifies according to Hebb's postulate of learning:
    1) If the two neurons on either side of a synapse are activated simultaneously, the synaptic strength is selectively increased.
    2) If the two neurons on either side of a synapse are activated asynchronously, the synaptic strength is selectively weakened or eliminated.

    $$\Delta w_{kj}(n) = \eta \, y_k(n) \, x_j(n)$$

    where $\eta$ is the learning-rate parameter, $x_j(n)$ the presynaptic signal, and $y_k(n)$ the postsynaptic signal at time step $n$.
  - Four key mechanisms of a Hebbian synapse:
    - Time-dependent mechanism
    - Local mechanism
    - Interactive mechanism
    - Conjunctional or correlational mechanism
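The Hebbian update above can be sketched for a single linear neuron as follows. This is a minimal illustration, not code from the lecture: the function name `hebbian_step`, the random Gaussian inputs, and the value of `eta` are assumptions made for the example. It also shows why the plain rule needs stabilizing: every step can only increase the weight norm.

```python
import numpy as np

def hebbian_step(w, x, eta=0.01):
    """One plain Hebbian update: delta_w_j = eta * y * x_j (hypothetical helper)."""
    y = w @ x                  # postsynaptic activity of the linear neuron
    return w + eta * y * x     # weights grow along the input when y and x agree

rng = np.random.default_rng(0)      # illustrative random inputs
w = np.array([0.1, 0.1])
norm_start = np.linalg.norm(w)

norms = []
for _ in range(500):
    x = rng.normal(size=2)
    w = hebbian_step(w, x)
    norms.append(np.linalg.norm(w))

norm_end = norms[-1]
print(f"weight norm: {norm_start:.3f} -> {norm_end:.3f}")
```

Because $\|w + \eta y x\|^2 = \|w\|^2 + 2\eta y^2 + \eta^2 y^2 \|x\|^2$, the norm never decreases, which is the self-amplification described in Principle 1.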

8.2 Principles of Self-Organization (2/2)
• Principle 2: Competition
  - Limitation of available resources
  - The most vigorously growing (fittest) synapses or neurons are selected at the expense of the others.
  - Synaptic plasticity (adjustability of a synaptic weight)
• Principle 3: Cooperation
  - Modifications in synaptic weights at the neural level and in neurons at the network level tend to cooperate with each other.
  - Lateral interaction among a group of excited neurons
• Principle 4: Structural information
  - The underlying structure (redundancy) in the input signal is acquired by a self-organizing system.
  - Inherent characteristic of the input signal

8.3 Self-organized Feature Analysis
Figure 8.1 Layout of Linsker's modular self-adaptive model, with overlapping receptive fields. It is a model of the mammalian visual system.

8.4 Principal-Components Analysis (1/8)
Does there exist an invertible linear transformation $T$ such that the truncation of $Tx$ is optimum in the mean-square-error sense?

$x$: $m$-dimensional vector
$X$: $m$-dimensional random vector (assumed to have zero mean, so that $E[A^2]$ below is a variance)
$q$: $m$-dimensional unit vector

Projection:
$$A = X^T q = q^T X$$
Variance of $A$:
$$\sigma^2 = E[A^2] = E[(q^T X)(X^T q)] = q^T E[X X^T] q = q^T R q$$
$R$: $m$-by-$m$ correlation matrix
$$R = E[X X^T]$$
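The identity $\sigma^2 = q^T R q$ can be checked numerically. The sketch below uses synthetic zero-mean data and a sample estimate of $R$; the data, the direction `q`, and the variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative zero-mean samples with unequal spread along the three axes.
X = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 0.5])
X -= X.mean(axis=0)

R = X.T @ X / len(X)              # sample estimate of R = E[X X^T]

q = np.array([1.0, 2.0, 2.0])
q /= np.linalg.norm(q)            # q must be a unit vector

a = X @ q                         # projections A = q^T X, one per sample
var_direct = np.mean(a ** 2)      # E[A^2] estimated from the projections
var_quadratic = q @ R @ q         # q^T R q

print(var_direct, var_quadratic)
```

The two numbers agree up to floating-point rounding, because the sample average of $(q^T x)^2$ is algebraically the same quantity as $q^T R q$ with the sample $R$.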

8.4 Principal-Components Analysis (2/8)
$$\psi(q) = \sigma^2 = q^T R q \qquad (**)$$
For any small perturbation $\delta q$ (to first order in $\delta q$):
$$\psi(q + \delta q) = \psi(q)$$
$$\vdots$$
Introduce a scalar factor $\lambda$:
$$R q = \lambda q \qquad \text{(eigenvalue problem)}$$
$\lambda_1, \lambda_2, \ldots, \lambda_m$: eigenvalues of $R$
$q_1, q_2, \ldots, q_m$: eigenvectors of $R$
$$R q_j = \lambda_j q_j, \qquad j = 1, 2, \ldots, m$$
$$\lambda_1 > \lambda_2 > \cdots > \lambda_j > \cdots > \lambda_m$$
$$Q = [q_1, q_2, \ldots, q_j, \ldots, q_m]$$
$$R Q = Q \Lambda$$
Eigendecomposition:
i) $Q^T R Q = \Lambda$, that is,
$$q_j^T R q_k = \begin{cases} \lambda_j, & k = j \\ 0, & k \neq j \end{cases} \qquad (**)$$
ii) $$R = Q \Lambda Q^T = \sum_{i=1}^{m} \lambda_i q_i q_i^T \qquad \text{(spectral theorem)}$$
From (**), we see that
$$\psi(q_j) = \lambda_j, \qquad j = 1, 2, \ldots, m$$
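The eigenstructure relations on this slide can be verified with `numpy.linalg.eigh`, which handles symmetric matrices and returns eigenvalues in ascending order. The matrix `R` below is an arbitrary symmetric example chosen for illustration, not data from the lecture.

```python
import numpy as np

# Arbitrary symmetric matrix standing in for a correlation matrix R.
R = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 1.0]])

eigvals, eigvecs = np.linalg.eigh(R)     # ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder so lambda_1 >= lambda_2 >= ...
lam = eigvals[order]
Q = eigvecs[:, order]                    # columns are q_1, ..., q_m
Lambda = np.diag(lam)

# Spectral theorem: R = sum_i lambda_i q_i q_i^T
R_rebuilt = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(len(lam)))

# Variance probe evaluated at each eigenvector: psi(q_j) = q_j^T R q_j
psi = np.array([Q[:, j] @ R @ Q[:, j] for j in range(len(lam))])

print(lam)
print(psi)
```

Sorting in descending order matches the slide's convention that $q_1$ is the direction of largest variance.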

8.4 Principal-Components Analysis (3/8)
• Summary of the eigenstructure of PCA
  1) The eigenvectors of the correlation matrix $R$ of the random vector $X$ define the unit vectors $q_j$, representing the principal directions along which the variance probes $\psi(q_j)$ have their extremal values.
  2) The associated eigenvalues define the extremal values of the variance probes $\psi(q_j)$.

8.4 Principal-Components Analysis (4/8)
Data vector $x$: a realization of $X$
$a$: a realization of $A$
$$a_j = q_j^T x = x^T q_j, \qquad j = 1, 2, \ldots, m$$
$a_j$: the projections of $x$ onto the principal directions (principal components)

Reconstruction (synthesis) of the original data $x$:
$$a = [a_1, a_2, \ldots, a_m]^T = [x^T q_1, x^T q_2, \ldots, x^T q_m]^T = Q^T x$$
$$Q a = Q Q^T x = I x = x$$
$$x = Q a = \sum_{j=1}^{m} a_j q_j$$

8.4 Principal-Components Analysis (5/8)
Dimensionality reduction
$\lambda_1, \lambda_2, \ldots, \lambda_\ell$: largest $\ell$ eigenvalues of $R$
$$\hat{x} = \sum_{j=1}^{\ell} a_j q_j = [q_1, q_2, \ldots, q_\ell] \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_\ell \end{bmatrix}, \qquad \ell \le m$$
Encoder for $x$: linear projection from $\mathbb{R}^m$ to $\mathbb{R}^\ell$
$$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_\ell \end{bmatrix} = \begin{bmatrix} q_1^T \\ q_2^T \\ \vdots \\ q_\ell^T \end{bmatrix} x, \qquad \ell \le m$$
Figure 8.2 Two phases of PCA: (a) Encoding, (b) Decoding
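The encoding and decoding phases can be sketched as two small functions. The helper names `pca_basis`, `encode`, and `decode`, as well as the synthetic data, are assumptions introduced for this illustration; the correlation matrix is estimated from samples rather than known exactly.

```python
import numpy as np

def pca_basis(X):
    """Eigenvalues and eigenvectors of the sample correlation matrix, largest first."""
    R = X.T @ X / len(X)
    lam, Q = np.linalg.eigh(R)
    order = np.argsort(lam)[::-1]
    return lam[order], Q[:, order]

def encode(x, Q, ell):
    """Encoder: a = [q_1, ..., q_ell]^T x, a projection from R^m to R^ell."""
    return Q[:, :ell].T @ x

def decode(a, Q):
    """Decoder: x_hat = sum_j a_j q_j over the retained components."""
    return Q[:, :len(a)] @ a

rng = np.random.default_rng(1)
# Illustrative zero-mean data: two strong directions, two weak ones.
X = rng.normal(size=(2000, 4)) * np.array([5.0, 2.0, 0.3, 0.1])
X -= X.mean(axis=0)

lam, Q = pca_basis(X)
x = X[0]
x_full = decode(encode(x, Q, 4), Q)   # ell = m: exact reconstruction, Q Q^T x = x
x_hat = decode(encode(x, Q, 2), Q)    # ell = 2: approximate reconstruction

print(np.linalg.norm(x - x_full), np.linalg.norm(x - x_hat))
```

With $\ell = m$ the decoder recovers $x$ exactly; with $\ell < m$ the reconstruction is the orthogonal projection of $x$ onto the span of the first $\ell$ eigenvectors.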

8.4 Principal-Components Analysis (6/8)
Approximation error vector:
$$e = x - \hat{x}$$
$$e = \sum_{i=\ell+1}^{m} a_i q_i$$
Figure 8.3: Relationship between data vector $x$, its reconstructed version $\hat{x}$, and error vector $e$.
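Two properties of the error vector follow from the orthonormality of the $q_i$ and can be checked numerically: $e$ is orthogonal to $\hat{x}$, and the mean-square error equals the sum of the discarded eigenvalues, $\sum_{i=\ell+1}^{m} \lambda_i$. The second statement is a standard PCA result that is not written on the slide; the data and variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
# Illustrative correlated zero-mean data in m = 5 dimensions.
X = rng.normal(size=(3000, 5)) @ rng.normal(size=(5, 5))
X -= X.mean(axis=0)

R = X.T @ X / len(X)
lam, Q = np.linalg.eigh(R)
order = np.argsort(lam)[::-1]
lam, Q = lam[order], Q[:, order]

ell = 2
A = X @ Q                               # row n holds the components a_1..a_m of sample n
X_hat = A[:, :ell] @ Q[:, :ell].T       # reconstruction from the first ell components
E = X - X_hat                           # error vectors e = x - x_hat

mse = np.mean(np.sum(E ** 2, axis=1))   # sample average of ||e||^2
discarded = lam[ell:].sum()             # lambda_{ell+1} + ... + lambda_m
max_inner = np.abs(np.sum(E * X_hat, axis=1)).max()   # largest |e^T x_hat|

print(mse, discarded, max_inner)
```

Choosing the eigenvectors with the largest eigenvalues therefore minimizes the mean-square reconstruction error for a given $\ell$.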

8.4 Principal-Components Analysis (7/8)
Figure 8.4: A cloud of data points. The projection onto Axis 1 has maximum variance and reveals the bimodal character of the data.

8.4 Principal-Components Analysis (8/8)
Figure 8.5: Digital compression of handwritten digits using PCA.

8.5 Hebbian-Based Maximum Eigenfilter (1/4)
Linear neuron with Hebbian adaptation
$$y = \sum_{i=1}^{m} w_i x_i$$
Synaptic weight $w_i$ varies with time:
$$w_i(n+1) = w_i(n) + \eta \, y(n) \, x_i(n), \qquad i = 1, 2, \ldots, m$$
$$\vdots$$
$$w_i(n+1) = w_i(n) + \eta \, y(n) \big( x_i(n) - y(n) \, w_i(n) \big)$$
$$x_i'(n) = x_i(n) - y(n) \, w_i(n)$$
$$w_i(n+1) = w_i(n) + \eta \, y(n) \, x_i'(n)$$
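The step between the plain Hebbian update and the stabilized update is not spelled out on the slide. One common way to obtain it, sketched here under the assumptions that the weights are renormalized to unit length after each Hebbian step, that $\|w(n)\| = 1$, and that $\eta$ is small, is the following first-order expansion:

```latex
% Hebbian step followed by normalization (assumed intermediate step)
w_i(n+1) = \frac{w_i(n) + \eta\, y(n)\, x_i(n)}
                {\left( \sum_{i=1}^{m} \left[ w_i(n) + \eta\, y(n)\, x_i(n) \right]^2 \right)^{1/2}}

% With \sum_i w_i^2(n) = 1 and y(n) = \sum_i w_i(n) x_i(n):
\sum_{i=1}^{m} \left[ w_i(n) + \eta\, y(n)\, x_i(n) \right]^2
  = 1 + 2\eta\, y^2(n) + O(\eta^2)

% Inverse square root to first order in \eta:
\left( 1 + 2\eta\, y^2(n) + O(\eta^2) \right)^{-1/2} = 1 - \eta\, y^2(n) + O(\eta^2)

% Multiplying out and dropping O(\eta^2) terms:
w_i(n+1) = w_i(n) + \eta\, y(n) \left[ x_i(n) - y(n)\, w_i(n) \right] + O(\eta^2)
```

The extra term $-\eta\, y^2(n)\, w_i(n)$ acts as a decay that keeps the weight vector close to unit length.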

8.5 Hebbian-Based Maximum Eigenfilter (2/4)
Figure 8.6: Signal-flow graph representation of the maximum eigenfilter

8.5 Hebbian-Based Maximum Eigenfilter (3/4)
Matrix formulation
$$x(n) = [x_1(n), x_2(n), \ldots, x_m(n)]^T$$
$$w(n) = [w_1(n), w_2(n), \ldots, w_m(n)]^T$$
$$y(n) = x^T(n) \, w(n) = w^T(n) \, x(n)$$
$$w(n+1) = w(n) + \eta \, y(n) [x(n) - y(n) \, w(n)]$$
$$\phantom{w(n+1)} = w(n) + \eta \, x^T(n) w(n) [x(n) - w^T(n) x(n) \, w(n)]$$
$$\phantom{w(n+1)} = w(n) + \eta \big[ x(n) x^T(n) w(n) - \big( w^T(n) x(n) x^T(n) w(n) \big) w(n) \big]$$

8.5 Hebbian-Based Maximum Eigenfilter (4/4)
Asymptotic stability of the maximum eigenfilter
$$w(t) \to q_1 \qquad \text{as } t \to \infty$$
A single linear neuron governed by the self-organizing learning rule adaptively extracts the first principal component of a stationary input.
$$x(n) = y(n) \, q_1 \qquad \text{for } n \to \infty$$
A Hebbian-based linear neuron with learning rule
$$w(n+1) = w(n) + \eta \, y(n) [x(n) - y(n) \, w(n)]$$
converges with probability 1 to a fixed point:
1) $\lim_{n \to \infty} \sigma^2(n) = \lambda_1$
2) $\lim_{n \to \infty} w(n) = q_1$ with $\lim_{n \to \infty} \|w(n)\| = 1$
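The convergence statement can be illustrated with a small simulation of the learning rule, which is known in the literature as Oja's rule. Everything specific in this sketch is an assumption chosen for illustration: the covariance `C`, the fixed learning rate `eta` (the convergence result itself assumes suitable conditions on the learning rate), the sample count, and the variable names. With a small fixed `eta` the weights only hover near $q_1$ rather than converge exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stationary zero-mean input with one dominant direction.
C = np.array([[3.0, 1.0],
              [1.0, 1.0]])
X = rng.normal(size=(20000, 2)) @ np.linalg.cholesky(C).T

w = rng.normal(size=2)
w /= np.linalg.norm(w)          # start from a random unit vector
eta = 0.005                     # small fixed learning rate (assumed)

for x in X:
    y = w @ x                           # neuron output y(n) = w^T(n) x(n)
    w = w + eta * y * (x - y * w)       # w(n+1) = w(n) + eta*y(n)*[x(n) - y(n)*w(n)]

# Reference values from the sample correlation matrix.
R = X.T @ X / len(X)
lam, Q = np.linalg.eigh(R)              # ascending order
q1, lam1 = Q[:, -1], lam[-1]            # largest eigenvalue and its eigenvector

alignment = abs(w @ q1)                 # |cos| of the angle between w and q_1
norm_w = np.linalg.norm(w)
output_var = np.mean((X @ w) ** 2)      # variance of the output for the final w

print(alignment, norm_w, output_var, lam1)
```

The absolute value in `alignment` is needed because $q_1$ and $-q_1$ are both valid eigenvectors, so the neuron may settle on either sign.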
