SLIDE 1

An introduction to Neural Networks and Deep Learning

Talk given at the Department of Mathematics of the University of Bologna

February 20, 2018

Andrea Asperti
DISI - Department of Informatics: Science and Engineering, University of Bologna
Mura Anteo Zamboni 7, 40127, Bologna, ITALY
andrea.asperti@unibo.it

SLIDE 2

A branch of Machine Learning

What is Machine Learning?

There are problems that are difficult to address with traditional programming techniques:

◮ classify a document according to some criteria (e.g. spam detection, sentiment analysis, ...)
◮ compute the probability that a credit card transaction is fraudulent
◮ recognize an object in some image (possibly from an unusual viewpoint, in new lighting conditions, in a cluttered scene)
◮ ...

Typically the result is a weighted combination of a large number of parameters, each one contributing to the solution to a small degree.

SLIDE 3

The Machine Learning approach

Suppose we have a set of input-output pairs (the training set) {x_i, y_i}; the problem consists in guessing the map x_i → y_i. The ML approach:

• describe the problem with a model depending on some parameters Θ (i.e. choose a parametric class of functions)
• define a loss function to compare the results of the model with the expected (experimental) values
• optimize (fit) the parameters Θ to reduce the loss to a minimum
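The three ingredients fit in a few lines of code. A minimal sketch, assuming a toy 1-D dataset, with scipy's off-the-shelf minimizer playing the role of the fitting step:

import numpy as np
from scipy.optimize import minimize

xs = np.array([0.0, 1.0, 2.0, 3.0])      # inputs x_i
ys = np.array([1.1, 2.9, 5.2, 7.1])      # expected outputs y_i

def model(theta, x):                     # parametric class: the lines a*x + b
    a, b = theta
    return a * x + b

def loss(theta):                         # mean squared error against the y_i
    return np.mean((model(theta, xs) - ys) ** 2)

fit = minimize(loss, x0=np.zeros(2))     # fit the parameters Θ
print(fit.x)                             # ≈ [2.03, 1.03]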

SLIDE 4

Why Learning?

Machine Learning problems are in fact optimization problems! So, why talk about learning?

SLIDE 5

Why Learning?

Machine Learning problems are in fact optimization problems! So, why talk about learning? The point is that the solution to the optimization problem is not given in an analytical form (often there is no closed-form solution).

SLIDE 6

Why Learning?

Machine Learning problems are in fact optimization problems! So, why talk about learning? The point is that the solution to the optimization problem is not given in an analytical form (often there is no closed-form solution). So, we use iterative techniques (typically, gradient descent) to progressively approximate the result.

SLIDE 7

Why Learning?

Machine Learning problems are in fact optimization problems! So, why talk about learning? The point is that the solution to the optimization problem is not given in an analytical form (often there is no closed-form solution). So, we use iterative techniques (typically, gradient descent) to progressively approximate the result. This form of iteration over data can be understood as progressively learning the objective function from the experience of past observations.

SLIDE 8

Using gradients

The objective is to minimize some loss function over (fixed) training samples, e.g.

Θ(w) = Σ_i E(o(w, x_i), y_i)

by suitably adjusting the parameters w. See how Θ changes according to small perturbations ∆(w) of the parameters w: this is the gradient

∇_w Θ = [∂Θ/∂w_1, ..., ∂Θ/∂w_n]

of Θ w.r.t. w.

The gradient is a vector pointing in the direction of steepest ascent.
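A sketch (not from the talk) of what "small perturbations" means in practice: the gradient can be approximated numerically by probing the loss one parameter at a time with a central difference.

import numpy as np

def numerical_gradient(loss, w, eps=1e-6):
    grad = np.zeros_like(w)
    for i in range(len(w)):
        dw = np.zeros_like(w)
        dw[i] = eps                                          # perturb the i-th parameter only
        grad[i] = (loss(w + dw) - loss(w - dw)) / (2 * eps)  # central difference
    return grad

# Example: Θ(w) = w_1^2 + 3*w_2, whose gradient is [2*w_1, 3]
loss = lambda w: w[0] ** 2 + 3 * w[1]
print(numerical_gradient(loss, np.array([1.0, 0.0])))        # ≈ [2., 3.]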

SLIDE 9

Gradient descent

Goal: minimize some loss function Θ(w) by suitably adjusting the parameters. We can reach a minimal configuration for Θ(w) by iteratively taking small steps in the direction opposite to the gradient (gradient descent). This is a general technique. Warning: it is not guaranteed to work:

◮ it may end up in local minima
◮ it may get lost in plateaus
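A sketch of the descent loop on a toy loss with a hand-written gradient; the learning rate (0.1) and the step budget (100) are illustrative choices:

import numpy as np

def loss(w):                  # Θ(w) = (w_1 - 3)^2 + (w_2 + 1)^2, minimum at (3, -1)
    return (w[0] - 3) ** 2 + (w[1] + 1) ** 2

def grad(w):                  # the gradient ∇_w Θ, computed analytically
    return np.array([2 * (w[0] - 3), 2 * (w[1] + 1)])

w = np.array([0.0, 0.0])
for _ in range(100):
    w -= 0.1 * grad(w)        # small step opposite to the gradient
print(w)                      # ≈ [3., -1.]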

SLIDE 10

Next arguments

A bit of taxonomy

SLIDE 11

Different types of Learning Tasks

• supervised learning: inputs + outputs (labels)
  • classification
  • regression
• unsupervised learning: just inputs
  • clustering
  • component analysis
  • autoencoding
• reinforcement learning: actions and rewards
  • learning long-term gains
  • planning

SLIDE 12

Classification vs. Regression

Two forms of supervised learning over pairs {x_i, y_i}:

[Figure: classification — a new input is assigned a label ("Probably a cat!"); regression — a new input is mapped to an expected value]

In classification y is discrete (e.g. y ∈ {•, +}); in regression y is (conceptually) continuous.

SLIDE 13

Many different techniques

• Different ways to define the models:
  • decision trees
  • linear models
  • neural networks
  • ...
• Different error (loss) functions:
  • mean squared error
  • logistic loss
  • cross entropy
  • cosine distance
  • maximum margin
  • ...

[Figure: a decision tree (Outlook / Humidity / Wind) and a neural net as example models; mean squared error and maximum margin as example losses]

SLIDE 14

Next argument

Neural Networks

SLIDE 15

Neural Network

A network of (artificial) neurons

Artificial neuron

Each neuron takes multiple inputs and produces a single output (that can be passed as input to many other neurons).

SLIDE 16

The artificial neuron

[Figure: an artificial neuron — inputs x_1, ..., x_n with weights w_1, ..., w_n, a bias b on a constant +1 input, a summation Σ, and an activation function producing the output]

The purpose of the activation function is to introduce a thresholding mechanism (similar to the axon-hillock of cortical neurons).
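A minimal sketch of such a neuron, assuming a logistic activation (any of the activations on the next slide would do):

import numpy as np

def neuron(x, w, b):
    z = np.dot(w, x) + b           # weighted sum of the inputs, plus the bias
    return 1 / (1 + np.exp(-z))    # logistic activation: the thresholding mechanism

x = np.array([0.5, -1.0, 2.0])     # inputs
w = np.array([0.8, 0.2, -0.5])     # weights
print(neuron(x, w, b=0.1))         # a single output in (0, 1)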

SLIDE 17

Different activation functions

The activation function is responsible for threshold triggering.

◮ threshold: if x > 0 then 1 else 0
◮ logistic function: 1/(1 + e^(-x))
◮ hyperbolic tangent: (e^x - e^(-x))/(e^x + e^(-x))
◮ rectified linear (ReLU): if x > 0 then x else 0
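A sketch of the four activations, vectorized with NumPy:

import numpy as np

def threshold(x):
    return (x > 0).astype(float)     # 1 if x > 0 else 0

def logistic(x):
    return 1 / (1 + np.exp(-x))      # squashes into (0, 1)

def tanh(x):
    return np.tanh(x)                # (e^x - e^(-x))/(e^x + e^(-x)), into (-1, 1)

def relu(x):
    return np.maximum(0.0, x)        # x if x > 0 else 0

xs = np.array([-2.0, 0.0, 2.0])
for f in (threshold, logistic, tanh, relu):
    print(f.__name__, f(xs))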

SLIDE 18

A comparison with the cortical neuron

SLIDE 19

Next argument

Networks typology/topology

SLIDE 20

Layers

A neural network is a collection of artificial neurons connected together. Neurons are usually organized in layers. If there is more than one hidden layer the network is deep, otherwise it is called a shallow network.

SLIDE 21

Feed-forward networks

If the network is acyclic, it is called a feed-forward network. Feed-forward networks are (at present) the commonest type of network in practical applications. Important: composing linear transformations makes no sense, since the composition is still a linear transformation. What is the source of non-linearity in Neural Networks?

SLIDE 22

Feed-forward networks

If the network is acyclic, it is called a feed-forward network. Feed-forward networks are (at present) the commonest type of network in practical applications. Important: composing linear transformations makes no sense, since the composition is still a linear transformation. What is the source of non-linearity in Neural Networks? The activation function!

SLIDE 23

Dense networks

The most typical feed-forward network is a dense network, where each neuron at layer k − 1 is connected to each neuron at layer k. The network is defined by a matrix of parameters (weights) W_k for each layer (+ biases). The matrix W_k has dimension L_k × L_{k+1}, where L_k is the number of neurons at layer k.
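A sketch of the forward pass of such a network: one weight matrix and one bias vector per layer, with a ReLU in between (layer sizes and random weights are illustrative):

import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 8, 3]                                  # L_0, L_1, L_2
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    for W, b in zip(weights, biases):
        x = np.maximum(0.0, x @ W + b)             # linear map + non-linear activation
    return x

print(forward(rng.normal(size=4)))                 # a 3-dimensional output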

SLIDE 24

Parameters and hyper-parameters

The weights W_k are the parameters of the model: they are learned during the training phase. The number of layers and the number of neurons per layer are hyper-parameters: they are chosen by the user and fixed before training starts. Other important hyper-parameters govern training, such as the learning rate, the batch size, the number of epochs, and many others.

SLIDE 25

Convolutional networks

Convolutional networks are used on inputs with a topological structure: signal sequences (e.g. sound), or images. They repeatedly apply a (small) uniform linear transformation, called a kernel, shifting it over the whole input image.

SLIDE 26

Example

[Figure: an image convolved with the 2 × 2 kernel ((−1, 1), (−1, 1)) → the filtered output]
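A sketch of the shifting operation (stride 1, no padding); deep-learning libraries actually compute this cross-correlation and still call it convolution:

import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)   # local weighted sum
    return out

image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)    # horizontal difference: responds to vertical edges
print(convolve2d(image, kernel))             # strongest response at the 0 → 1 boundary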

SLIDE 27

Computing features

Many interesting kernels (filters) known from Image Processing:

◮ first and second order derivatives, image gradients
◮ Sobel, Prewitt, ...

In Neural Networks, kernels are learned by training. Since kernels are small and weights are shared, training is relatively fast.

SLIDE 28

Recurrent Networks

In a recurrent network you may have cycles:

• the dynamics is very complex; it is not even clear that it stabilizes
• difficult to train
• biologically more realistic

Restricted models:

◮ Long Short-Term Memory models (LSTM)
◮ Gated Recurrent Units (GRU)

SLIDE 29

LSTM and GRU

LSTMs are useful to model sequences:

◮ equivalent to very deep nets with one hidden layer per time slice (net unrolling)
◮ weights are shared between different time slices
◮ they can keep information for a long time in an internal state

SLIDE 30

Symmetrically connected networks

Similar to recurrent networks, but connections between units are symmetrical (they have the same weight in both directions). They have stable configurations corresponding to local minima of a suitable energy function.

Hopfield nets: symmetrically connected nets without hidden units.
Boltzmann machines: symmetrically connected nets with hidden units:

◮ more powerful models than Hopfield nets
◮ less powerful than general recurrent networks
◮ have a nice and simple learning algorithm

SLIDE 31

What a real network looks like

VGG 16 (Simonyan and Zisserman): 92.7% top-5 accuracy on ImageNet.

Picture by Davi Frossard: VGG in TensorFlow

SLIDE 32

How do we implement a neural net?

Neural nets look complicated. How do we implement them? There exist suitable frameworks:

◮ Theano, University of Montreal
◮ TensorFlow, Google Brain
◮ Caffe, Berkeley Vision
◮ Keras, F. Chollet
◮ PyTorch, Facebook
◮ ...

SLIDE 33

VGG 16 in Keras

From GitHub:

def VGG_16(weights_path=None):
    model = Sequential()
    model.add(ZeroPadding2D((1,1), input_shape=(3,224,224)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    ...

The whole model is defined in 50 lines of code.

SLIDE 34

But what about training?

So complex ...

fit(x, y, batch_size=32, epochs=10)

◮ x: input, an array of data (hence, typically, an array of arrays)
◮ y: labels, an array of target categories
◮ batch_size: integer, number of samples per gradient update
◮ epochs: integer, the number of epochs (passes over the data) used to train the model
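A minimal sketch of the call in context (modern Keras API, not the talk's code), training a small dense classifier on random data:

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

x = np.random.rand(100, 4)                    # 100 samples, 4 features each
y = np.random.randint(0, 2, size=(100,))      # binary labels

model = Sequential([Input(shape=(4,)),
                    Dense(8, activation='relu'),
                    Dense(1, activation='sigmoid')])
model.compile(optimizer='sgd', loss='binary_crossentropy')
model.fit(x, y, batch_size=32, epochs=10)     # the call discussed above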

SLIDE 35

Next arguments

Features and deep features

SLIDE 36

Features

Any individual measurable property of the data useful for the solution of a specific task is called a feature.

Examples:

◮ Emergency C-section: age, first pregnancy, anemia, fetus malpresentation, previous premature birth, anomalous ultrasound, ...
◮ Meteo: humidity, pressure, temperature, wind, rain, snow, ...
◮ Expected lifetime: age, health, annual income, kind of work, ...

SLIDE 37

Derived (inner) features

New interesting features may be derived as combinations of input features. Suppose for instance that we are interested in modeling some phenomenon with a cubic function f(x) = ax^3 + bx^2 + cx + d. We can use x as input or ...

SLIDE 38

Derived (inner) features

New interesting features may be derived as combinations of input features. Suppose for instance that we are interested in modeling some phenomenon with a cubic function f(x) = ax^3 + bx^2 + cx + d. We can use x as input or ... we can precompute x, x^2 and x^3, reducing the problem to a linear model!
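A sketch of the trick on synthetic data: precompute the powers of x as derived features, and a plain linear least-squares fit recovers the cubic's coefficients.

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=50)
y = 2*x**3 - x**2 + 0.5*x + 1 + rng.normal(scale=0.01, size=50)   # a noisy cubic

features = np.stack([x**3, x**2, x, np.ones_like(x)], axis=1)     # derived features
coeffs, *_ = np.linalg.lstsq(features, y, rcond=None)             # linear model on them
print(coeffs)                                                     # ≈ [2, -1, 0.5, 1]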

SLIDE 39

Traditional Image Processing

In order to process an image we start computing interesting derived features on the image:

• first order derivatives
• second order derivatives
• difference of Gaussians
• Laplacian
• ...

[Figure: the original image, two Gaussian blurs (radii 10 and 25), and their Gaussian difference]

Then we use these derived features to get the desired output.

SLIDE 40

Deep learning, in deeper sense

Discovering good features is a complex task. Why not delegate the task to the machine, letting it learn them? Deep learning exploits a hierarchical organization of the learning model, allowing complex features to be computed in terms of simpler ones, through non-linear transformations.

SLIDE 41

AI, machine learning, deep learning

• Knowledge-based systems: take an expert, ask him how he solves a problem, and try to mimic his approach by means of logical rules

SLIDE 42

AI, machine learning, deep learning

• Knowledge-based systems: take an expert, ask him how he solves a problem, and try to mimic his approach by means of logical rules
• Traditional Machine Learning: take an expert, ask him what features of the data are relevant to solve a given problem, and let the machine learn the mapping

SLIDE 43

AI, machine learning, deep learning

• Knowledge-based systems: take an expert, ask him how he solves a problem, and try to mimic his approach by means of logical rules
• Traditional Machine Learning: take an expert, ask him what features of the data are relevant to solve a given problem, and let the machine learn the mapping
• Deep Learning: get rid of the expert

SLIDE 44

Relations between research areas

[Figure: nested sets — Deep Learning (example: MLPs, autoencoders) inside Representation Learning, inside Machine Learning (example: logistic regression), inside Artificial Intelligence (example: knowledge bases)]

Picture taken from “Deep Learning” by Y. Bengio, I. Goodfellow and A. Courville, MIT Press.

SLIDE 45

Components trained to learn

[Figure: input-to-output pipelines for rule-based systems (hand-designed program), classic machine learning (hand-designed features + learned mapping), representation learning (learned features + learned mapping), and deep learning (learned simple features, learned more complex features, learned mapping); shaded boxes mark components trained to learn]

Picture taken from “Deep Learning” by Y. Bengio, I. Goodfellow and A. Courville, MIT Press.

SLIDE 46

Next arguments

Some successful applications

  • MNIST and ImageNet
  • Speech Recognition
  • Lip reading
  • Text generation
  • Deep dreams and Inceptionism
  • Mimicking style
  • Robot navigation
  • Game simulation

SLIDE 47

MNIST

Modified National Institute of Standards and Technology database

◮ grayscale images of handwritten digits, 20 × 20 pixels each
◮ 60,000 training images and 10,000 testing images

SLIDE 48

MNIST

A comparison of different techniques:

Classifier                      Error rate (%)
Linear classifier               7.6
K-Nearest Neighbors             0.52
SVM                             0.56
Shallow neural network          1.6
Deep neural network             0.35
Convolutional neural network    0.21

See LeCun's page "the mnist database" for more data.

SLIDE 49

ImageNet

ImageNet (@Stanford Vision Lab)

◮ high resolution color images covering 22K object classes
◮ over 15 million labeled images from the web

SLIDE 50

ImageNet competition

Annual competition of image classification (since 2010).

◮ 1.2 million images (30K categories)
◮ make five guesses about the image label, ordered by confidence

SLIDE 51

ImageNet samples

SLIDE 52

ImageNet results

SLIDE 53

Speech recognition

Several stages (similar to optical character recognition):

◮ Segmentation. Convert the sound wave into a vector of acoustic coefficients. Typical sampling: 10 milliseconds.
◮ The acoustic model. Use adjacent vectors of acoustic coefficients to associate probabilities with phonemes.
◮ Decoding. Find the sequence of phonemes that best fits the acoustic data and a model of expected sentences.

Deep neural networks, pioneered by George Dahl and Abdel-rahman Mohamed, are replacing previous machine learning methods.

SLIDE 54

Speech recognition

Major industries are investing a lot of money in speech recognition: Amazon (with Intel), Google, Microsoft, ...

Achieving Human Parity in Conversational Speech Recognition, Speech & Dialog research group at Microsoft, 2016. R. Zweig (project manager) attributes the accomplishment to the systematic use of the latest neural network technology in all aspects of the system.

SLIDE 55

Lip reading

Google’s DeepMind AI can lip-read TV shows better than a professional

SLIDE 56

Text Generation

See Andrej Karpathy’s blog The Unreasonable Effectiveness of Recurrent Neural Networks

Examples of fake algebraic documents automatically generated by an RNN.

SLIDE 57

Deep dreams

Visit Deep dreams generator

SLIDE 58

Mimicking style

A Neural Algorithm of Artistic Style, by L.A. Gatys, A.S. Ecker, M. Bethge.

Similar to inceptionism, but with “style” (texture) instead of content.

SLIDE 59

More examples

SLIDE 60

More examples

SLIDE 61

Mimicking style: a different approach

Image-to-image translation with Cycle Generative Adversarial Networks

SLIDE 62

Robot navigation

Quadcopter Navigation in the Forest using Deep Neural Networks

Robotics and Perception Group, University of Zurich, Switzerland & Institute for Artificial Intelligence (IDSIA), Lugano, Switzerland

Based on Imitation Learning

SLIDE 63

Atari Games and Q-learning

Google DeepMind’s system playing Atari games (2013), recently extended to Augmented Imagination (2017) [video]. Based on:

◮ deep neural networks
◮ an innovative reinforcement learning technique called Q-learning
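A sketch of the tabular Q-learning update at the heart of the technique (deep Q-networks replace the table with a neural network); the learning rate alpha and the discount gamma are illustrative:

import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))    # table of action values Q(s, a)
alpha, gamma = 0.1, 0.99               # learning rate, discount factor

def q_update(s, a, reward, s_next):
    # move Q(s, a) toward the observed reward plus the best predicted future value
    target = reward + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

q_update(s=0, a=1, reward=1.0, s_next=2)
print(Q[0])                            # [0., 0.1]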

SLIDE 64

Atari Games and Q-learning

The same network architecture was applied to all games. Inputs are screen frames. It works well for reactive games, not for planning.
