Perceptrons
2-29-16

What is a neural network?
- A NN is a directed acyclic graph.
- Nodes are organized into layers.
- Consecutive layers are fully connected.
- Edges have a weight.
- Nodes have activation functions.
Other topologies are possible:
- sparser inter-layer connectivity
- edges within layers
- edges jumping layers
What is a neural network?

[Figure: a network with an input layer, hidden layer(s), and an output layer; edges are labeled with connection weights, and nodes apply activation functions.]
What does a neural network compute?
Each node computes the weighted sum of its inputs:

.5 * 1.2 + .8 * (-.8) + 3 * .4 = 1.16

This sum is then passed through the node’s activation function:

f(x) = 1 / (1 + e^-x), so f(1.16) = 1 / (1 + e^-1.16) ≈ .76

This output is passed on to the next layer.
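The node computation can be sketched in Python, using the inputs (.5, .8, 3) and weights (1.2, -.8, .4) from the example above:

```python
import math

def sigmoid(x):
    """Sigmoid activation: f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and weights from the worked example.
inputs = [0.5, 0.8, 3.0]
weights = [1.2, -0.8, 0.4]

# Weighted sum of the inputs: 0.5*1.2 + 0.8*(-0.8) + 3.0*0.4 = 1.16
s = sum(i * w for i, w in zip(inputs, weights))

# Pass the sum through the activation function.
out = sigmoid(s)  # ≈ 0.76
```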
Exercise: finish feeding values through the network.
Sigmoid activation function: f(x) = 1 / (1 + e^-x)
Threshold activation function: f(x) = 1 if x > 0, else 0

[Figure: the example network with its edge weights.]
What type of learning problem is this?
Supervised or unsupervised?
- We’ll be studying neural nets for supervised learning.
- They can also be used for unsupervised learning.
Classification or regression?
- Depends on the output units:
○ Discrete-valued output units for classification.
○ Continuous-valued output units for regression.
What is the hypothesis space?
Unspecified parameters:
- Network topology (typically hand-picked)
○ Number of hidden layers
○ Size of each hidden layer
○ Connectivity of hidden layers
- Activation functions (typically hand-picked)
- Edge weights (typically learned)
What is a perceptron?
A perceptron is a 2-layer neural network.
- Only input & output; no hidden units.
All activation functions are thresholds.
- Threshold at 0.
One input is a constant 1.
[Figure: a perceptron with a constant-1 input, input units, and one layer of weights into threshold output units.]
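A minimal sketch of a single perceptron output unit, assuming a strict threshold (output 1 when the weighted sum exceeds 0) and a constant-1 input prepended so the first weight acts as a bias:

```python
def perceptron(inputs, weights):
    """One perceptron output unit: threshold (at 0) of the weighted sum.

    A constant-1 input is prepended, so weights[0] acts as a bias term.
    """
    s = sum(x * w for x, w in zip([1.0] + list(inputs), weights))
    return 1 if s > 0 else 0

# Hand-picked illustrative weights that make this unit compute AND:
# fires only when both inputs are 1 (1 + 1 - 1.5 > 0).
perceptron([1, 1], [-1.5, 1, 1])  # → 1
perceptron([1, 0], [-1.5, 1, 1])  # → 0
```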
Perceptrons represent a decision surface
[Figure: a perceptron with inputs 1, x1, x2 computing a linear decision surface.]
Linear separability
Perceptrons can only classify linearly separable data.
Our task: learn perceptron weights from data.
Key idea: loop through the training data and update the weights whenever the network gets an example wrong.

while any training example is misclassified:
    for each training example:
        output = run example through the network
        for each node i in the output layer:
            if output[i] != target[i]:
                for each input weight w_j to node i:
                    w_j = w_j + Delta(w_j)

Delta(w_j) = learning_rate * (target[i] - output[i]) * example[j]
Example: learn AND
AND( 0 , 0 ) = 0
AND( 0 , 1 ) = 0
AND( 1 , 0 ) = 0
AND( 1 , 1 ) = 1
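The learning rule can be turned into a runnable sketch and applied to the AND truth table above; the function name, learning rate, and epoch cap are illustrative choices, not part of the slides:

```python
def train_perceptron(examples, targets, learning_rate=0.1, max_epochs=100):
    """Perceptron learning rule for a single threshold output unit.

    A constant-1 input is prepended to each example (bias weight).
    """
    examples = [[1.0] + list(e) for e in examples]
    weights = [0.0] * len(examples[0])

    def output(x):
        return 1 if sum(xi * wi for xi, wi in zip(x, weights)) > 0 else 0

    for _ in range(max_epochs):
        misclassified = False
        for x, target in zip(examples, targets):
            out = output(x)
            if out != target:
                misclassified = True
                # w_j = w_j + learning_rate * (target - output) * example[j]
                for j in range(len(weights)):
                    weights[j] += learning_rate * (target - out) * x[j]
        if not misclassified:  # every example classified correctly
            break
    return weights

# Learn AND from its truth table.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w = train_perceptron(X, y)
```

Because AND is linearly separable, the loop terminates with weights that classify all four examples correctly.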
Exercise: learn OR
OR( 0 , 0 ) = 0
OR( 0 , 1 ) = 1
OR( 1 , 0 ) = 1
OR( 1 , 1 ) = 1
We can compose perceptrons like logic gates.
- We can represent AND, OR, and NOT with perceptrons.
- By composing these, we can make multi-layer networks.
- AND, OR, and NOT constitute a universal gate set, so we can make any boolean function by combining perceptrons. In fact, any boolean function can be represented with just two layers of perceptron units (one hidden layer).
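As a sketch of this composition, hand-picked weights (with a constant-1 bias input and threshold at 0) give AND, OR, and NOT units; XOR, which no single perceptron can represent, then falls out as XOR(a, b) = AND(OR(a, b), NOT(AND(a, b))). The weights are illustrative choices:

```python
def unit(weights):
    """A threshold unit with a constant-1 bias input and hand-picked weights."""
    return lambda *x: 1 if sum(w * v for w, v in zip(weights, (1,) + x)) > 0 else 0

AND = unit([-1.5, 1, 1])  # fires only when both inputs are 1
OR  = unit([-0.5, 1, 1])  # fires when at least one input is 1
NOT = unit([0.5, -1])     # fires when its input is 0

def XOR(a, b):
    # Not linearly separable, but composable from the gates above.
    return AND(OR(a, b), NOT(AND(a, b)))
```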