
CSCE 496/896 Lecture 2: Basic Artificial Neural Networks
Stephen Scott


sscott@cse.unl.edu
(Adapted from Vinod Variyam, Ethem Alpaydin, Tom Mitchell, Ian Goodfellow, and Aurélien Géron)

Introduction
- Supervised learning is the most fundamental, "classic" form of machine learning
- The "supervised" part comes from the presence of labels for examples (instances)
- There are many ways to do supervised learning; we'll focus on artificial neural networks, which are the basis for deep learning

Introduction: ANNs
Consider humans:
- Total number of neurons ≈ 10^10
- Neuron switching time ≈ 10^-3 second (vs. 10^-10)
- Connections per neuron ≈ 10^4–10^5
- Scene recognition time ≈ 0.1 second
- 100 inference steps doesn't seem like enough ⇒ massive parallel computation

Properties
Properties of artificial neural nets (ANNs):
- Many "neuron-like" switching units
- Many weighted interconnections among units
- Highly parallel, distributed process
- Emphasis on tuning weights automatically
- Strong differences between ANNs for ML and ANNs for biological modeling

When to Consider ANNs
- Input is high-dimensional discrete- or real-valued (e.g., raw sensor input)
- Output is discrete- or real-valued
- Output is a vector of values
- Possibly noisy data
- Form of target function is unknown
- Human readability of result is unimportant
- Long training times acceptable

History of ANNs
The Beginning: Linear units and the Perceptron algorithm (1940s)
- Spoiler alert: stagnated because of the inability to handle data that are not linearly separable
- Researchers were aware of the usefulness of multi-layer networks, but could not train them
The Comeback: Training of multi-layer networks with Backpropagation (1980s)
- Many applications, but in the 1990s replaced by large-margin approaches such as support vector machines and boosting

History of ANNs (cont'd)
The Resurgence: Deep architectures (2000s)
- Better hardware¹ and software support allow for deep (> 5–8 layers) networks
- Still use Backpropagation, but larger datasets, algorithmic improvements (new loss and activation functions), and deeper networks improve performance considerably
- Very impressive applications, e.g., captioning images
The Inevitable: (TBD)
- Oops

¹ Thank a gamer today.

Outline
- Supervised learning
- Basic ANN units
  - Linear unit
  - Linear threshold units
  - Perceptron training rule
  - Gradient Descent
- Nonlinearly separable problems and multilayer networks
- Backpropagation
- Types of activation functions
- Putting everything together
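As a preview of the basic units listed in the outline, here is a minimal sketch of a linear threshold unit trained with the perceptron rule (w ← w + η(y − o)x) on the linearly separable AND function. The function names, the learning rate, and the use of a constant bias input are illustrative assumptions, not the lecture's notation.

```python
def threshold_unit(w, x):
    """Linear threshold unit: output 1 if w·x > 0, else 0.
    Each input vector x carries a constant bias component of 1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

def perceptron_train(data, eta=0.1, epochs=50):
    """Perceptron training rule: nudge weights on each misclassified example."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            o = threshold_unit(w, x)
            w = [wi + eta * (y - o) * xi for wi, xi in zip(w, x)]
    return w

# AND over {0,1}^2, with a bias input of 1 prepended to each example
AND = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
w = perceptron_train(AND)
print([threshold_unit(w, x) for x, _ in AND])  # -> [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop finds a separating weight vector; the same loop run on XOR would never converge, which is exactly the stagnation the history slides describe.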
Learning from Examples
- Let C be the target function (or target concept) to be learned
- Think of C as a function that takes as input an example (or instance) x and outputs a label
- Goal: Given training set X = {(x^t, y^t)}, t = 1, …, N, where y^t = C(x^t), output a hypothesis h ∈ H that approximates C in its classifications of new instances
- Each instance x is represented as a vector of attributes or features
- E.g., let each x = (x1, x2) be a vector describing attributes of a car; x1 = price and x2 = engine power
- In this example, the label is binary (positive/negative, yes/no, 1/0, +1/−1), indicating whether instance x is a "family car"

Learning from Examples (cont'd)
[Figure: training instances plotted with x1 = price on the horizontal axis and x2 = engine power on the vertical axis]

Thinking about C
- Can think of the target concept C as a function
  - In the example, C is an axis-parallel box, equivalent to upper and lower bounds on each attribute
  - Might decide to set H (the set of candidate hypotheses) to the same family that C comes from; not required to do so
- Can also think of the target concept C as a set of positive instances
  - In the example, C is the continuous set of all positive points in the plane
- Use whichever is convenient at the time

Thinking about C (cont'd)
[Figure: the target concept C drawn as an axis-parallel box with price bounds p1, p2 and engine-power bounds e1, e2]
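The axis-parallel-box view of C can be sketched in code: a hypothetical target box over (price, engine power) supplies the labels, and a simple learner outputs the tightest box around the positive training instances as its hypothesis h. All bounds, ranges, and names here are illustrative assumptions, not values from the lecture.

```python
import random

def C(x):
    """Hypothetical target concept: an axis-parallel box over
    (price, engine power); positive iff 10 <= price <= 20 and 4 <= power <= 8."""
    price, power = x
    return 1 if 10 <= price <= 20 and 4 <= power <= 8 else 0

def fit_tightest_box(data):
    """Hypothesis h in H: the smallest axis-parallel box containing every
    positive training instance (bounds p1, p2 on price and e1, e2 on power)."""
    pos = [x for x, y in data if y == 1]
    p1, p2 = min(x[0] for x in pos), max(x[0] for x in pos)
    e1, e2 = min(x[1] for x in pos), max(x[1] for x in pos)
    return lambda x: 1 if p1 <= x[0] <= p2 and e1 <= x[1] <= e2 else 0

random.seed(0)
# Training set X = {(x^t, y^t)} with labels supplied by the target concept C
X = [(x, C(x)) for x in
     [(random.uniform(0, 30), random.uniform(0, 12)) for _ in range(200)]]
h = fit_tightest_box(X)

# The tightest positive box lies inside C's box, so h agrees with C on
# every training instance, though it may reject new positives near the edges.
print(all(h(x) == y for x, y in X))  # -> True
```

Here H is deliberately chosen as the same family C comes from (axis-parallel boxes); choosing a richer or poorer H changes how well any h can approximate C.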
