SLIDE 1

The Knowledge Content of Neural Networks

Keith L. Downing

The Norwegian University of Science and Technology (NTNU) Trondheim, Norway keithd@idi.ntnu.no

March 25, 2014

SLIDE 2

Overview

Linear Separability
Saliency
Principal Components Analysis
Hierarchical Clustering based on ANN Layer Behavior
Topographic Maps

SLIDE 3

Neurons as Detectors

[Diagram: a detector neuron z with inputs x and y, weights wx = 2 and wy = 5, and threshold tz = 1; netz is the weighted sum of the inputs, and activationz is 1 when netz reaches the threshold.]

2x + 5y ≥ 1 ⇔ y ≥ −(2/5)x + 1/5
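A minimal Python sketch of this detector; the function name and test points are illustrative, not from the slide:

```python
def threshold_detector(x, y, wx=2.0, wy=5.0, tz=1.0):
    """Fire (return 1) when the weighted input sum reaches the threshold."""
    net_z = wx * x + wy * y
    return 1 if net_z >= tz else 0

# Points above the line y = -(2/5)x + 1/5 make the neuron fire.
print(threshold_detector(0.0, 1.0))   # 1: above the line
print(threshold_detector(0.0, -1.0))  # 0: below the line
```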

SLIDE 4

Linear Separability

[Plots: two (X, Y) scatter plots of positive (+) and negative (−) instances, illustrating separability by a line.]

If each data case has n features, then, when the cases are plotted in n-dimensional space, the question is whether the positive and negative instances can be separated by a hyperplane of n − 1 dimensions (e.g., if n = 2, the hyperplane is a line). If so, a single-neuron detector can easily be reverse-engineered to detect the positive instances.

SLIDE 5

Linear Separability of SOME Booleans

[Diagrams: with inputs x, y ∈ {−1, +1}, an AND neuron has weights 0.5, 0.5 and threshold tz = 1, and an OR neuron has weights 0.5, 0.5 and threshold tz = 0; in each accompanying (X, Y) plot, a single line separates the + instances from the − instances.]

AND and OR are linearly separable for any input-vector size.
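A quick Python sketch of these two detectors, assuming the ±1 inputs shown in the diagrams:

```python
def detector(inputs, weights, threshold):
    """Single-neuron detector: fire when the weighted sum reaches the threshold."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0

def AND(x, y):  # weights 0.5, 0.5, threshold 1; inputs in {-1, +1}
    return detector((x, y), (0.5, 0.5), 1.0)

def OR(x, y):   # weights 0.5, 0.5, threshold 0
    return detector((x, y), (0.5, 0.5), 0.0)

for x in (-1, 1):
    for y in (-1, 1):
        print(x, y, AND(x, y), OR(x, y))
```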

SLIDE 6

...but not ALL Booleans

[Diagram: XOR on inputs x, y ∈ {−1, +1}. No single line separates the two + points from the two − points, so no weights and threshold (?? ?? tz = ??) can make a single detector neuron z compute XOR.]

This simple, non-linearly-separable example nearly killed neural network research: Perceptrons, Minsky & Papert (1969). Detecting non-linearly-separable classes requires more than 2 layers of neurons, but weights in multi-layer nets could not be learned prior to the popularization of backprop in the mid-1980s.

SLIDE 7

XOR requires 3 Layers

[Diagram: a three-layer XOR network on inputs x, y ∈ {−1, +1}. Hidden neuron u (weights 0.5, −0.5; threshold tu = 1) fires only for x = 1, y = −1, and hidden neuron v (weights −0.5, 0.5; threshold tv = 1) fires only for x = −1, y = 1; both act as AND-like detectors. Output z ORs u and v (weights 0.5, 0.5; threshold tz = 0). The hidden units' decision boundaries are the lines y = x − 2 and y = x + 2.]
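The full network as a Python sketch, using the weights from the diagram and ±1 activations throughout (this reading of the figure is mine, not verbatim from the slides):

```python
def unit(inputs, weights, threshold):
    """Threshold unit with bipolar output: +1 when the weighted sum reaches the threshold, else -1."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else -1

def XOR(x, y):
    u = unit((x, y), (0.5, -0.5), 1.0)    # fires only for x = 1, y = -1
    v = unit((x, y), (-0.5, 0.5), 1.0)    # fires only for x = -1, y = 1
    return unit((u, v), (0.5, 0.5), 0.0)  # OR of u and v

for x in (-1, 1):
    for y in (-1, 1):
        print(x, y, XOR(x, y))  # +1 exactly when x != y
```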

SLIDE 8

ANNs Realize Complex Mappings

ANNs can perform mappings of any complexity, whether linearly separable or not, although this may require many hidden layers and neurons. However, for any k-layered ANN (with k > 3), an equivalent ANN with k = 3 can be designed.

[Plot: positive (+) and negative (−) training instances in the (X, Y) plane, with the positive instances grouped into regions bounded by the lines L1, L2, and L3.]
SLIDE 9

Level 1: Region Borders

Each of the 3 borderlines is expressed by a simple line, which translates into the weights of three detector neurons:

A1: weights (−1, 1), threshold 0 ⇒ fires when y − x > 0
A2: weights (1, 1), threshold 5 ⇒ fires when y + x > 5
A3: weights (4, 1), threshold 30 ⇒ fires when y + 4x > 30

These fire on all input vectors (x, y) that lie above the corresponding line.

SLIDE 10

Level 2: Regions

Each region of positive training instances is expressed as a conjunction of above and below relationships w.r.t. the borderlines.

[Diagram: the border detectors A1, A2, and A3 from the previous slide feed a region detector R3, with a weight of +1 from A2 and weights of −1 from A1 and A3, so R3 fires only when A2 fires and A1 and A3 do not.]

Region 3 is above border 2 and below borders 1 and 3.

SLIDE 11

Level 3: Final Classification

A positive instance of the concept is an (x, y) case in any of the 3 regions, so the high-level detector, M, represents the disjunction of the 3 regions.

[Diagram: the complete three-level network. Level 1 ("Above Line"): border detectors A1 (weights −1, 1; threshold 0), A2 (weights 1, 1; threshold 5), and A3 (weights 4, 1; threshold 30). Level 2 ("And"): region detectors R1, R2, and R3, each a conjunction of above/below relations on the borders. Level 3 ("Or"): output M with weights 1, 1, 1 and threshold 1, which fires when any region detector fires.]
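The whole construction as a Python sketch. The border inequalities and the M-level OR come straight from the slides; the exact conjunctions for R1 and R2 are not spelled out, so those two definitions are illustrative assumptions:

```python
def above(x, y, wx, wy, t):
    """Border detector: fires when wx*x + wy*y exceeds threshold t."""
    return wx * x + wy * y > t

def classify(x, y):
    # Level 1: border detectors (inequalities from the slides).
    a1 = above(x, y, -1, 1, 0)   # above line 1: y - x > 0
    a2 = above(x, y, 1, 1, 5)    # above line 2: y + x > 5
    a3 = above(x, y, 4, 1, 30)   # above line 3: y + 4x > 30
    # Level 2: regions as conjunctions of above/below relations.
    # R3 is given on the slides; R1 and R2 are assumed for illustration.
    r3 = (not a1) and a2 and (not a3)  # above border 2, below borders 1 and 3
    r1 = a1 and a2 and (not a3)
    r2 = a1 and a2 and a3
    # Level 3: M is the disjunction (OR) of the region detectors.
    return r1 or r2 or r3

print(classify(1, 6))  # above lines 1 and 2, below line 3 -> positive
```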

SLIDE 12

Neurons Detect Salient Contexts

Three-spined stickleback experiments (Tinbergen, 1951). Males develop red bellies when establishing territory. Sight of the salient feature, a red belly, makes males aggressive, even on abstract mock-up figures.

[Figure: mock-ups with a red belly ("Something") versus without ("Nothing").]

SLIDE 13

Saliency for Baby Chickens (Tinbergen, 1951)

Mock-ups resembling hawks elicit fear. Those resembling a goose do not.

"Hawk" Something "Goose" Nothing

SLIDE 14

What Excites a Toad??

Worms or moving rectangles resembling worms (Ewert, 1980). Neurons in area T5(2) of the toad brain detect worm-ness.

[Figure: T5(2) responses: a strong response to the "Worm" stimulus, a weak response to a "Partial Worm", and no response to the "Anti-Worm".]

SLIDE 15

What Excites an Artificial Neuron??

[Diagram: an artificial neuron fed by units A, B, and C through weights such as +2, +6, +1, +7, −3, and +5, each annotated with the facial feature it detects: a sexy movie-star cheek mole, a bright left eye, a smile (preferable to a frown), and a dull nose.]

SLIDE 16

Two Keys to Intelligent Behavior

1. Knowing when to differentiate between two situations based on salient features (for which the situations have unequal values), and thus act differently in each.

2. Knowing when to generalize over two situations based on salient similarities, and thus treat each the same.

Salient features are very task-dependent. Easy task → the salient feature(s) have high variance among the cases. Hard task → the salient feature(s) have low variance among the cases (e.g., Where's Waldo?).

SLIDE 17

Principal Component Analysis (PCA) with ANNs

Principal components of a data set = the vectors that capture the greatest amounts of variance among the features.

Important ANN Property. If:
the values of a data set are scaled (to a common range for each feature, such as [0, 1]) and normalized by subtracting the mean vector from each case,
these values are fed into a single output neuron, z, and
the incoming weights to z are modified by correlation-based Hebbian means,
⇒ z's input-weight vector will reflect the principal components of the data set.
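A runnable sketch of this property. Plain Hebbian updates grow without bound, so this uses Oja's rule, a standard normalized variant of the Hebbian update; that substitution, plus the synthetic data and parameters, are my assumptions rather than the slides':

```python
import random

random.seed(0)
# Correlated 2-D data, already zero-mean: most variance along (1, 1)/sqrt(2).
data = []
for _ in range(500):
    s = random.gauss(0, 1.0)   # large variance along the diagonal
    n = random.gauss(0, 0.1)   # small variance across it
    data.append((s + n, s - n))

w = [0.3, -0.2]                # arbitrary initial weights
lam = 0.01                     # learning rate (lambda)
for x in data * 20:
    y = w[0] * x[0] + w[1] * x[1]        # neuron output
    w[0] += lam * y * (x[0] - y * w[0])  # Oja's rule: Hebb term minus decay
    w[1] += lam * y * (x[1] - y * w[1])

print(w)  # approx ±(0.71, 0.71): the first principal component
```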

SLIDE 18

Weight Vectors Define Region Borders

The border between regions carved out by a single output neuron is perpendicular to that neuron's weight vector:

x·wx + y·wy ≥ tz ⇔ y ≥ −(wx/wy)·x + tz/wy

The border is a line with slope −wx/wy, so any vector with slope +wy/wx is perpendicular to that border. Since neuron z's incoming-weight vector is (wx, wy), it has slope +wy/wx and is therefore perpendicular to the borderline.

SLIDE 19

Of Mice and Elephants

[Plot: raw data points for mice and elephants on Size vs. Gray-Scale Color axes, with the average vector marked.]

Animal   | Raw Data   | Scaled Data  | Normalized Data
Mouse    | (0.05, 60) | (0, 0.6)     | (−0.27, −0.04)
Mouse    | (0.04, 62) | (0, 0.62)    | (−0.27, −0.02)
Mouse    | (0.06, 68) | (0, 0.68)    | (−0.27, 0.04)
Elephant | (5400, 61) | (0.54, 0.61) | (0.27, −0.03)
Elephant | (5250, 66) | (0.53, 0.66) | (0.26, 0.03)
Elephant | (5300, 69) | (0.53, 0.69) | (0.26, 0.05)
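A sketch of the preprocessing; the scaling divisors (10000 for size, 100 for gray-scale) are inferred from the table rather than stated on the slide:

```python
# Raw (size, gray-scale color) pairs from the table.
raw = [(0.05, 60), (0.04, 62), (0.06, 68),
       (5400, 61), (5250, 66), (5300, 69)]

# Scale each feature to roughly [0, 1]; divisors inferred from the table.
scaled = [(s / 10000, c / 100) for s, c in raw]

# Normalize by subtracting the mean vector from each case.
n = len(scaled)
mean = (sum(s for s, _ in scaled) / n, sum(c for _, c in scaled) / n)
normalized = [(round(s - mean[0], 2), round(c - mean[1], 2)) for s, c in scaled]

print(normalized)  # close to the table's normalized column (up to rounding)
```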

SLIDE 20

Hebbian Learning ⇒ Principal Components

Δwi = λ·xi·y

[Plot: the normalized mouse/elephant data points on Size vs. Gray-Scale Color axes, with the average vector, the learned weight vector, and its perpendicular borderline.]

Input (Size, Color) | Output | Δw_size | Δw_color
(−0.27, −0.04)      | −0.031 | +0.0017 | +0.0002
(−0.27, −0.02)      | −0.029 | +0.0016 | +0.0001
(−0.27, 0.04)       | −0.023 | +0.0012 | −0.0002
(0.27, −0.03)       | +0.024 | +0.0013 | −0.0001
(0.26, 0.03)        | +0.029 | +0.0015 | +0.0002
(0.26, 0.05)        | +0.031 | +0.0016 | +0.0003
Sum weight change:           | +0.0089 | +0.0005
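These numbers can be reproduced in Python; the initial weight vector (0.1, 0.1) and learning rate λ = 0.2 are inferred by fitting the table, since the slide does not state them:

```python
# Normalized (size, color) data from the previous slide.
data = [(-0.27, -0.04), (-0.27, -0.02), (-0.27, 0.04),
        (0.27, -0.03), (0.26, 0.03), (0.26, 0.05)]

w = [0.1, 0.1]   # initial weights (inferred, not stated on the slide)
lam = 0.2        # learning rate lambda (inferred)

dw = [0.0, 0.0]  # accumulated (batch) Hebbian weight changes
for x in data:
    y = w[0] * x[0] + w[1] * x[1]   # neuron output for this case
    dw[0] += lam * x[0] * y         # delta-w_i = lambda * x_i * y
    dw[1] += lam * x[1] * y
    print(x, round(y, 3), round(lam * x[0] * y, 4), round(lam * x[1] * y, 4))

print("Sum weight change:", round(dw[0], 4), round(dw[1], 4))
# Size accumulates ~18x more change than color: the size axis is the
# principal component of this data set.
```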

SLIDE 21

PCA via ANN Summary

If the detectors of a network modify their input-weight vectors according to basic Hebbian principles, then, after training, the activation levels of those detectors can be used to differentiate the input patterns along the dimensions of highest variance. Hence, the detectors will differentiate between objects (or situations) that are most distinct relative to the space of feature values observed in the training data.

Train on animal pictures ⇒ differentiate birds from horses better than horses from donkeys.
Train on human faces ⇒ differentiate males from females better than Swedes from Norwegians.

The network figures out the most salient features on its own, via simple Hebbian means.

SLIDE 22

Assessing Generality of an ANN

Generalization: the ability to handle similar cases with similar actions. In ANNs, measuring the correlation between input patterns and the activity patterns of output- or hidden-layer neurons gives a coarse indicator of generalization. Hierarchical clustering (using dendrograms) gives a more detailed, case-by-case assessment: a quick look at the hierarchical tree usually indicates whether or not the ANN has learned useful similarities and distinctions between the inputs.

Animal | Name     | Hidden-Layer Activation Pattern
Cat    | Felix    | 11000011
Dog    | Max      | 00111100
Cat    | Samantha | 10001011
Dog    | Fido     | 00011101
Cat    | Tabby    | 11011001
Dog    | Bruno    | 10110101

SLIDE 23

Hierarchical Clustering

Begin with N items, each of which includes a tag; in this example, the tag is the hidden-layer activation pattern that the item evokes. Encapsulate each item in a singleton cluster and form the cluster set, C, consisting of all these clusters.

Repeat until size(C) = 1:
1. Find the two clusters, c1 and c2, in C that are closest, using distance metric D.
2. Form cluster c3 as the union of c1 and c2; it becomes their parent on the hierarchical tree.
3. Add c3 to C. Remove c1 and c2 from C.

SLIDE 24

Dendrograms

[Dendrogram over the six animals: Bruno 10110101, Fido 00011101, Max 00111100, Tabby 11011001, Felix 11000011, Samantha 10001011.]

Distance metric for clustering:

D(c1, c2) = (1 / (M1·M2)) · Σ_{x∈c1} Σ_{y∈c2} d_ham(tag(x), tag(y))

where M1 and M2 are the sizes of clusters c1 and c2, respectively. We do not need to understand the concepts represented by the hidden nodes, only the similarities of the hidden-layer activation patterns.
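The clustering procedure from the previous slide, applied to these six tags with the average-linkage Hamming metric above (a minimal sketch; the function names are mine):

```python
from itertools import combinations

tags = {"Felix": "11000011", "Max": "00111100", "Samantha": "10001011",
        "Fido": "00011101", "Tabby": "11011001", "Bruno": "10110101"}

def d_ham(a, b):
    """Hamming distance between two activation-pattern strings."""
    return sum(x != y for x, y in zip(a, b))

def D(c1, c2):
    """Average pairwise Hamming distance between two clusters of names."""
    return sum(d_ham(tags[x], tags[y]) for x in c1 for y in c2) / (len(c1) * len(c2))

# Agglomerative clustering: merge the two closest clusters until one remains.
C = [(name,) for name in tags]
while len(C) > 1:
    c1, c2 = min(combinations(C, 2), key=lambda pair: D(*pair))
    C.remove(c1); C.remove(c2)
    C.append(c1 + c2)       # the union becomes the parent in the tree
    print("merged:", c1, c2)
```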

SLIDE 25

Topographic Neural Maps

[Figure: a star-nosed mole next to a body scaled to match brain proportions, so each body region's size reflects the amount of brain devoted to it.]

SLIDE 26

Spatially Correlated Neural Populations

[Diagram: inputs feed neighboring sensory neurons (1, 2, 3, 4, ..., i, j), which in turn feed neighboring pre-motor neurons (a, b*, b, c); similar input patterns Pi and Pj activate neighboring units at every level.]

If similar inputs map to neighboring neurons, and those in turn map to neighboring neurons, etc., then:
generalization occurs naturally,
small errors in perception still lead to the correct action, and
neural wiring can be reduced.

SLIDE 27

Topographic Maps in the Brain

[Diagram: the tonotopic auditory pathway. The frequency ordering (1 kHz, 4 kHz, 10 kHz, 20 kHz) laid out along the cochlea (inner ear) is preserved through the spiral ganglion, ventral cochlear nucleus, superior olive, inferior colliculus, and MGN, all the way up to auditory cortex. Source localization via delay lines.]

Isomorphism between 2 spaces. The spaces: sound frequencies + a layer of neurons. If points p and q are close (distant) in the sound-frequency space, then the neurons that detect frequencies p and q, np and nq, are also close neighbors (distant) in the neuron layer.

SLIDE 28

Self-Organizing Visual Topographic Maps

[Diagram: points in the visual field mapping topographically onto a layer of visual neurons.]

SLIDE 29

Artificial Self-Organizing Maps (SOMs)

[Diagram: Euclidean neighbors (nodes with the closest weight vectors) versus topological neighbors (nodes at the closest grid locations); in a well-trained SOM the two are correlated, in an untrained one they are uncorrelated.]

Competition + Cooperation: nodes compete for input patterns, but then share the win by allowing grid neighbors to also update their input weights to more closely match the input pattern.
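A minimal SOM training sketch, using a 1-D grid for brevity; the Gaussian neighborhood function, learning rate, and shrink schedule are illustrative choices, not from the slides:

```python
import math, random

random.seed(1)
# A 1-D SOM: 10 nodes, each with a 2-D weight vector.
nodes = [[random.random(), random.random()] for _ in range(10)]

def train(inputs, epochs=50, lr=0.3, radius=3.0):
    for e in range(epochs):
        r = radius * (1 - e / epochs) + 1e-9   # neighborhood shrinks over time
        for x in inputs:
            # Competition: find the winner (closest weight vector).
            win = min(range(len(nodes)),
                      key=lambda i: (nodes[i][0]-x[0])**2 + (nodes[i][1]-x[1])**2)
            # Cooperation: grid neighbors share the win and update too.
            for i, w in enumerate(nodes):
                h = math.exp(-((i - win) ** 2) / (2 * r * r))  # neighborhood strength
                w[0] += lr * h * (x[0] - w[0])
                w[1] += lr * h * (x[1] - w[1])

train([(random.random(), random.random()) for _ in range(100)])
print(nodes)  # neighboring nodes now hold neighboring weight vectors
```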

SLIDE 30

There Goes the Neighborhood

[Diagram: the neighborhood radius R shrinking through R = 3, R = 2, R = 1 in neuron space as self-organizing learning proceeds.]

SLIDE 31

TSP Using SOM

[Diagram: five cities at (.57, .11), (.25, .80), (.83, .66), (.37, .08), and (.96, .34), with a neuron ring and its neighborhood laid over them.]

Spaces. Euclidean: city locations. Neural: a ring of neurons ⇒ each neuron has 2 neighbors.
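A sketch of the same competition-plus-cooperation update on the five cities above, with neighborhood distance measured around the ring; the ring size and all rates are illustrative assumptions:

```python
import math, random

cities = [(.57, .11), (.25, .80), (.83, .66), (.37, .08), (.96, .34)]

random.seed(2)
N = 20  # a ring of 20 neurons, each with an (x, y) weight vector
ring = [[0.5 + 0.3 * math.cos(2 * math.pi * i / N),
         0.5 + 0.3 * math.sin(2 * math.pi * i / N)] for i in range(N)]

for epoch in range(200):
    r = max(3.0 * (1 - epoch / 200), 0.1)  # shrinking neighborhood radius
    for cx, cy in random.sample(cities, len(cities)):
        win = min(range(N), key=lambda i: (ring[i][0]-cx)**2 + (ring[i][1]-cy)**2)
        for i, w in enumerate(ring):
            d = min(abs(i - win), N - abs(i - win))  # ring (wrap-around) distance
            h = math.exp(-d * d / (2 * r * r))
            w[0] += 0.2 * h * (cx - w[0])
            w[1] += 0.2 * h * (cy - w[1])

# Read off the tour: visit cities in the order their winning neurons occur on the ring.
order = sorted(range(len(cities)), key=lambda c: min(
    range(N), key=lambda i: (ring[i][0]-cities[c][0])**2 + (ring[i][1]-cities[c][1])**2))
print(order)
```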

SLIDE 32

Emerging Tours

[Plots: the neuron ring among the TSP cities before and after training; the ring unfolds into a tour through the cities.]

SLIDE 33

Development of Topographic Maps

[Diagram: source-layer neurons A, B, C, D projecting axons to target-layer neurons Z, Y, X, W.]

Axons read morphogen concentrations in their source layer and then search for similar chemical signatures in the target layer.

SLIDE 34

Hebbian Fine-Tuning of Topographic Maps

[Diagram 1: a visual stimulus activates a contiguous set of source neurons (A-D) and their targets (Z-W); synapses between co-active neurons are strengthened, while synapses between active and inactive neurons are weakened.]

Lower-layer neurons require 2 or more simultaneous inputs to fire. Hebbian Learning with STDP: fire together, wire together...fire apart, weaken.
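A toy version of this rule in Python; the update amounts are illustrative, and the rule captures only the qualitative sign convention (fire together ⇒ strengthen, fire apart ⇒ weaken):

```python
def hebbian_update(w, pre_fired, post_fired, rate=0.1):
    """Toy Hebbian rule: strengthen when pre and post fire together,
    weaken when one fires without the other."""
    if pre_fired and post_fired:
        return w + rate          # fire together, wire together
    if pre_fired != post_fired:
        return w - rate          # fire apart, weaken
    return w                     # neither fired: no change

w_DW = 0.5  # hypothetical weight of the D-W synapse
w_DW = hebbian_update(w_DW, pre_fired=True, post_fired=True)   # -> 0.6
w_DW = hebbian_update(w_DW, pre_fired=True, post_fired=False)  # -> 0.5
print(w_DW)
```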

SLIDE 35

More Fine Tuning

A B C D Z Y X W

3 D fires but W does not, so D-W synapse weakens. This is a noncontinuous stimulus (less common in the real world), but it does suffice to fire W and D, so the D-W synapse strengthens.

A B C D Z Y X W

2

SLIDE 36

And More Fine Tuning

[Diagrams: case 4, and the network after learning, in which only topographic connections remain.]

4: D fires after W, so the D-W synapse is further depressed. Non-topographic connections weaken so much that they wither away.

When learning begins with many topographic links, the contiguous nature of most real-world stimuli will strongly bias the training set, leading to depression and disappearance of the non-topographic links.
