
Introduction to Neural Networks and Deep Learning

Ling Guan

References:

1. S. Haykin, Neural Networks, 2nd Edition, New Jersey: Prentice Hall, 2004.
2. C. M. Bishop, Neural Networks for Pattern Recognition, New York: Oxford University Press, 1995.
3. I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, Massachusetts: MIT Press, 2016.


Outline

  • Evolution of neurocomputing
  • Artificial neural networks
  • Feed forward neural networks
  • Radial basis functions (RBF): A feed forward neural net for classification
  • Modular neural net (MNN): Divide and conquer
  • General regression neural network (GRNN): Optimum feature selection
  • Self-organizing map (SOM): Unsupervised data classification
  • Deep neural network


Evolution of NeuroComputing

Image courtesy of DeView


Evolution of NeuroComputing (2)


Artificial Neural Networks

  • Nonlinear computing machines that mimic the functions of the human brain.
  • Machine learning perspective: Learning process.
  • Signal processing perspective: Nonlinear filters.
  • Pattern recognition perspective: Universal classifiers.
  • Statistical analysis perspective: Non-parametric modeling and estimation.
  • Operational research perspective: Generalized optimization procedures.


The Human Brain

Structure: massive, highly sparse and modularized.

Key aspects in the study of neural nets:

  • learning
  • architecture


Learning in Neural Networks

  • The property of primary significance for a neural network is its ability
      • to learn from its environment, and
      • to improve its performance through learning.
  • Learning: a process by which the free parameters of a NN are adapted through a process of stimulation by the environment in which the network is embedded.
  • The type of learning is determined by the manner in which the parameter changes take place.


Feed Forward Neural Networks

  • Structurally, feed forward neural networks (FFNNs) have clearly defined input and output.
  • With hidden layers and hidden units with nonlinear activation functions, a FFNN can perform nonlinear information processing (a minimal sketch follows this list).
  • A FFNN is a universal approximation machine trained by supervised learning using numerical examples.
  • FFNNs are popularly used in non-parametric statistical pattern recognition.
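As a concrete illustration (not from the slides), here is a minimal sketch of the forward pass of a one-hidden-layer FFNN; the layer sizes and the tanh activation are arbitrary illustrative choices:

```python
import numpy as np

def ffnn_forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer feed-forward network.

    x  : input vector, shape (n_in,)
    W1 : input-to-hidden weights, shape (n_hidden, n_in)
    W2 : hidden-to-output weights, shape (n_out, n_hidden)
    """
    h = np.tanh(W1 @ x + b1)   # hidden layer with a nonlinear activation
    return W2 @ h + b2         # linear output layer

# Illustrative usage with random (untrained) weights
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)
print(ffnn_forward(x, W1, b1, W2, b2))
```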


Radial-basis Function Networks

A radial-basis function (RBF) network consists of three layers.

  • The input layer is made up of source nodes connecting the RBF to its environment.
  • The hidden layer provides a set of functions (radial-basis functions) that transform the feature vectors in the input space to the hidden space.
  • The output layer is linear, supplying the response of the network by calculating the weighted output of the hidden layer at the output layer.


RBF Network Cont.

The Gaussian-shaped RBF function is given by

$$G(\mathbf{x}, \mathbf{x}_m^r, \sigma_m) = \exp\!\left(-\frac{1}{2\sigma_m^2} \sum_{p=1}^{P} \left(x_p - x_{mp}^r\right)^2\right)$$

The summation of M Gaussian units yields a similarity function for the input vector as follows:

$$S(\mathbf{x}) = \sum_{m=1}^{M} w_m\, G(\mathbf{x}, \mathbf{x}_m^r, \sigma_m)$$
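A minimal numerical sketch of the two expressions above (the centres, widths and weights are assumed to be given; in a real RBF network they would be learned from data):

```python
import numpy as np

def rbf_output(x, centres, sigmas, weights):
    """S(x) = sum_m w_m * exp(-||x - x_m||^2 / (2 * sigma_m^2))."""
    d2 = np.sum((centres - x) ** 2, axis=1)     # squared distance to each centre
    g = np.exp(-d2 / (2.0 * sigmas ** 2))       # Gaussian hidden-unit responses
    return weights @ g                          # linear output layer

centres = np.array([[0.0, 0.0], [1.0, 1.0]])    # M = 2 Gaussian units
sigmas = np.array([0.5, 0.5])
weights = np.array([1.0, -1.0])
print(rbf_output(np.array([0.2, 0.1]), centres, sigmas, weights))
```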


Modular Networks

[Figure: a hierarchical modular network. The network input feeds expert modules E1, E2, ..., Er; their outputs y1, ..., yj feed a decision module that produces Ynet, the recognized emotion/class.]

  • Hierarchical structure.
  • Each sub-network Er is an expert system.
  • The decision module classifies the input vector as a particular class when Ynet = arg max_j y_j, or as a linear combination of the y_j's.
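A tiny sketch of this decision rule, assuming each expert sub-network E_j (one per class) has already produced a scalar confidence y_j:

```python
import numpy as np

def modular_decision(y):
    """y[j] is the output of expert sub-network E_j; pick the winning class."""
    return int(np.argmax(y))   # Ynet = arg max_j y_j

print(modular_decision(np.array([0.1, 0.7, 0.2])))   # -> class 1
```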

Sub-network Architecture

 Feedforward architecture
 Backward propagation of errors in learning
 Each sub-network is specialized in one particular class.


Feature Selection

 Sequential forward selection (SFS) for feature subset construction
 General Regression Neural Network (GRNN) for evaluating the relevancy of each subset
 The approach is at least piece-wise linear.


Sequential Forward Selection

 SFS generates a sequence of subsets

$$F_1 \subset F_2 \subset \cdots \subset F_m \subset \cdots \subset F_N$$

 Construct the reduced-dimension training set $\{(\mathbf{x}_p^{F_m}, \mathbf{y}_p),\ p = 1, \ldots, P\}$, $\mathbf{x}_p^{F_m} \in \mathbb{R}^m$, from the original training set $\{(\mathbf{x}_p, \mathbf{y}_p),\ p = 1, \ldots, P\}$, $\mathbf{x}_p \in \mathbb{R}^N$.

 Adopt a suitable error measure

$$E_{F_m} = \frac{1}{P} \sum_{p=1}^{P} \left\| \mathbf{y}_p - g\!\left(\mathbf{x}_p^{F_m}\right) \right\|^2$$


New Subset Construction

 Construct $F_{m+1}$ from $F_m$ and the features not yet selected, $\{i : i \notin F_m\}$.
 Form the new candidate subsets as follows:

$$G_{m+1,j} = F_m \cup \{i_j\}, \quad i_j \notin F_m, \quad j = 1, \ldots, N - m$$

 The subset $F_{m+1}$ is selected as follows:

$$j^* = \arg\min_j E_{G_{m+1,j}}, \qquad F_{m+1} = G_{m+1,j^*}$$
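A sketch of the SFS loop described on the last two slides; `subset_error` is a hypothetical callable standing in for the GRNN-based error E_F evaluated on the reduced training set:

```python
def sfs(n_features, subset_error, n_select):
    """Greedy sequential forward selection of n_select features.

    subset_error : callable mapping a feature subset (tuple of indices) to an
                   error value, e.g. the GRNN evaluation error E_F.
    """
    selected = []                      # F_m, grown one feature at a time
    for _ in range(n_select):
        remaining = [i for i in range(n_features) if i not in selected]
        # candidate subsets G_{m+1,j} = F_m U {i_j}; keep the one with minimum error
        best = min(remaining, key=lambda i: subset_error(tuple(selected + [i])))
        selected.append(best)
    return selected

# Toy usage: pretend that lower-indexed features give lower error
print(sfs(5, lambda s: sum(s), 3))     # -> [0, 1, 2]
```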


General Regression Neural Network (GRNN)

 GRNN is a special example of a radial basis function (RBF) network.
 No iterative training procedures are required.
 Each Gaussian kernel is associated with a training pattern $(\mathbf{x}_p, \mathbf{y}_p),\ p = 1, \ldots, P$.
 The input vector $\mathbf{x}_p$ is assigned as the center of the kernel.
 Award winning work!
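As a sketch, the GRNN prediction can be written as a kernel-weighted average of the training targets, with one Gaussian kernel centred at each training input; a single shared width sigma is assumed here for simplicity:

```python
import numpy as np

def grnn_predict(x, X_train, Y_train, sigma=1.0):
    """GRNN estimate: normalized, kernel-weighted average of the training targets.

    One Gaussian kernel per training pattern (x_p, y_p), centred at x_p;
    no iterative training is required.
    """
    d2 = np.sum((X_train - x) ** 2, axis=1)       # squared distances to the centres
    k = np.exp(-d2 / (2.0 * sigma ** 2))          # kernel activations
    return (k @ Y_train) / np.sum(k)

X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([0.0, 1.0, 4.0])
print(grnn_predict(np.array([1.5]), X, Y, sigma=0.5))
```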


Self-Organizing Map (SOM)

  • A self-organizing map is a neural network which learns without a teacher.
  • Self-organization learning tends to follow neurobiological structure to a much greater extent than supervised learning.
  • Self-organization learning consists of repeatedly modifying the weights
       in response to activation patterns and
       in accordance with prescribed rules
    until a final configuration develops (a minimal sketch of one such update rule follows this list).
  • Essence of self-organization: Global order arises from local interaction (Turing).
  • Three self-organization models:
       Principal components analysis networks (linear).
       Self-organizing maps (SOMs).
       Information-theoretic models.
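As an illustration of such a prescribed rule, here is a minimal sketch of one step of the standard SOM competitive update (the winning node and its map neighbours are pulled toward the input); the Gaussian neighbourhood and learning rate are illustrative choices:

```python
import numpy as np

def som_step(W, grid, x, lr=0.1, radius=1.0):
    """One SOM update: find the winning node, pull it and its neighbours toward x.

    W    : node weight vectors, shape (n_nodes, dim)
    grid : node coordinates on the map lattice, shape (n_nodes, 2)
    """
    winner = np.argmin(np.sum((W - x) ** 2, axis=1))   # competition
    d2 = np.sum((grid - grid[winner]) ** 2, axis=1)    # distances on the map lattice
    h = np.exp(-d2 / (2.0 * radius ** 2))              # neighbourhood function
    return W + lr * h[:, None] * (x - W)               # cooperative weight update

# Illustrative usage: a 3x3 map updated with one 2-D sample
grid = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
W = np.random.default_rng(1).normal(size=(9, 2))
W = som_step(W, grid, np.array([0.5, 0.5]))
```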

Self-Organizing Tree Map

 Self-Organizing Tree Map (SOTM) is a tree-structured Self-Organizing Map.
 It offers:

  • Independent learning based on a competitive learning technique
  • A unique feature map that preserves topological ordering

 SOTM is more suitable than the conventional SOM and k-means when the input feature space is of high dimensionality.


Self-Organizing Tree Map (SOTM): a specialized tool for data/pattern analysis

[Figure: SOTM vs. SOM on the same data. SOTM: no nodes converge to areas of zero data density. SOM: nodes converge to areas of zero data density.]

SOHVM vs. Fixed & Fuzzy Techniques

[Figure: SOHVM (a special form of SOTM) compared with K-means, FCM (Fuzzy C-Means), GK (Gustafson-Kessel) and GG (Gath-Geva). SOHVM self-determines its centres (N = 9); the other methods are initialised with N = 9.]


2D Mapping of SOTM

[Figure: DSOTM classification in a two-dimensional feature space, panels (A)-(F).]


Spherical SOM [6]

 Closed structure, no boundary problem (the conventional SOM has open boundaries).
 3D, provides a step towards 3D analysis and visualization in an immersive environment (e.g. in a CAVE).


SSOM for Rendering

 Use a combination of spatial features and local features (intensity, gradient magnitude, X, Y, Z location).
 Train the Spherical SOM (SSOM).
 Visualize the SOM with colors mapped to cluster densities (U-Matrix).
 User interacts with the map: simple selection/de-selection of nodes.
 RGBA texture generated accordingly for immediate volume rendering.


Sample Results

[Figure: full render, map selection, and the corresponding render.]

Deep Learning

 Architecture: Feedforward NN of more than three layers (deep), massively complicated.

 Motivation:

  • In theory, a three-layer FFNN can approximate any nonlinear function.
  • In practice, this is impossible due to multiple factors: architecture, size of training samples, etc.
  • More layers and more hidden neurons compensate for the abovementioned shortcomings.

 The pros and cons:

  • Performance is extremely impressive in numerous applications, e.g. image recognition.
  • Lack of theoretical justification.
  • Very difficult in architecture design and training of free parameters.


Why Deep Architectures

Inspired by nature: The mammalian visual cortex is hierarchical. Deep machines are more efficient for representing certain classes of functions, particularly those involved in static visual recognition.

 The recognition pathway in the visual cortex has multiple stages.
 Lots of intermediate representations.

Image courtesy of Simon Thorpe


A Popular Architecture of Deep Learning

Learning hierarchical feature representations. Deep: more than one stage of non-linear feature transformation.

[Figure: a stack of stages, Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier, each stage built on a linear transformation followed by a non-linearity; feature visualization of a convolutional net trained on ImageNet, Zeiler & Fergus 2013. A sketch of this stacking follows.]
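A purely illustrative sketch of this stacking, with each stage being a trainable linear transformation followed by a non-linearity and a linear classifier on top of the highest-level features (layer sizes are arbitrary):

```python
import numpy as np

def deep_features(x, layers):
    """Apply a stack of (W, b) stages; each stage is a non-linear feature transform."""
    h = x
    for W, b in layers:
        h = np.maximum(0.0, W @ h + b)   # linear transformation + ReLU non-linearity
    return h                             # highest-level feature representation

rng = np.random.default_rng(0)
x = rng.normal(size=32)                                 # raw input
layers = [(rng.normal(size=(64, 32)), np.zeros(64)),    # low-level features
          (rng.normal(size=(64, 64)), np.zeros(64)),    # mid-level features
          (rng.normal(size=(16, 64)), np.zeros(16))]    # high-level features
W_cls = rng.normal(size=(10, 16))                       # trainable linear classifier
scores = W_cls @ deep_features(x, layers)
```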


Deep learning: Feature Discovery


Deep Network Models

 Energy based model (EBM)
 Restricted Boltzmann machine (RBM)
 Deep belief network (DBN)
 Higher-order Boltzmann machine (HOBM)
 Recurrent neural network (RNN)
 Convolutional neural network (CNN)

Image courtesy of Simon Thorpe


Deep Network Challenges

 Architectures are getting more complex and difficult to interpret
 Deep reinforcement networks
 Attention: focusing computations
 Initializing networks: batch normalization
 Joint optimization of deep networks at multiple stages
 Visual analytics in architecture design of deep networks


Knowledge Discovery in Machine Learning

[Schematic: Data → Feature Generation (Feature 1, Feature 2, ..., Feature N) → Feature Mapping (generate an effective representation) → Classification/Recognition → Result. The feature generation and feature mapping blocks together constitute knowledge discovery.]


In the schematic diagram on the previous slide, knowledge discovery includes:

The feature generation block: generate features/descriptors by

 Identifying the key-points
 Generating features at the key-points, using
   Classical methods incorporating prior knowledge (hand-crafted features), or
   Deep learning structures such as CNNs (hand-crafted architecture)

The feature mapping block: map the features into a more effective representation by

 Feature selection
 Explicit mapping by Statistical Machine Learning (SML), or
 Implicit mapping by FFNN, normally including pooling, a statistical processing step

Feature generation and mapping are two different, but complementary, processing steps. Both are critically important in information discovery. SML methods are solidly rooted in mathematics, and the analysis procedure can be clearly and convincingly presented.


Knowledge Discovery in Machine Learning (2)