Applied Machine Learning
Introduction
Siamak Ravanbakhsh
COMP 551 (fall 2020)
Objectives:
- a short history of ML
- understanding the scope of machine learning and its relation to other areas
- understanding the major families of machine learning tasks

What is machine learning?
ML is the set of "algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions"
ML is the "study of computer algorithms that improve automatically through experience"
while there are some unifying principles, machine learning may still look like a toolbox, with different tools suitable for different tasks
- Artificial Intelligence: a broader domain (includes search, planning, multi-agent systems, robotics, etc.)
- Statistics: historically precedes ML; ML is more focused on algorithmic, practical, and powerful models (e.g., neural networks) and is built around AI
- Vision & Natural Language Processing: use many ML algorithms and ideas
- Optimization: extensively used in ML
- Data mining: scalability and performance come before theoretical foundations; more room for heuristics, exploratory analysis, and unsupervised algorithms
- Data science: an umbrella term for the above, mostly used in industry when the output is knowledge/information to be used for decision making
1950: Turing test
participants in the imitation game: A) a machine, B) a human, C) an interrogator
test: if the machine can imitate a human such that the interrogator C, after some time, cannot reliably tell which one of A or B is the human, then machine A passes the Turing test
there have been extensive debates about the validity of the test and what it actually proves
1956: checker player that learned as it played (Arthur Samuel)
- coined the term Machine Learning
- uses a (min-max) search method
- learning happens in estimating the value of a state
- many important ideas appear in his work:
self-play, alpha-beta pruning, temporal difference learning, function approximation ...
figure from Samuel's paper (1959)
1958: first artificial neural networks: the Perceptron (Rosenblatt) and ADALINE (1959; Widrow and Hoff)
we will discuss the Perceptron's learning algorithm, which was in turn based on Hebbian learning: connecting neural wiring with firing patterns
the Mark I Perceptron could process a 20x20 pixel image
it is based on the McCulloch-Pitts mathematical model of neurons: the output is an activation function applied to a weighted sum of the inputs, y = f(∑_i w_i x_i)
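As a concrete illustration (not from the slides), here is a minimal sketch of the perceptron learning rule in Python; the toy data, epoch count, and function name are made up for illustration.

```python
import numpy as np

def perceptron_train(X, y, epochs=10):
    # Perceptron learning rule for labels y in {-1, +1}:
    # predict with a step activation, update the weights only on mistakes.
    w = np.zeros(X.shape[1])  # one weight per input feature
    b = 0.0                   # bias term
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            y_hat = 1 if np.dot(w, x_i) + b >= 0 else -1  # step activation
            if y_hat != y_i:
                w += y_i * x_i  # move the decision boundary toward the mistake
                b += y_i
    return w, b

# toy linearly separable data (illustrative only)
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_train(X, y)
```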
1963: support vector machines (Vapnik & Chervonenkis)
we will discuss the idea behind SVMs later in the course
meanwhile, neural networks are finding many applications:
speech recognition, weather forecasting, telephones
1969: Minsky and Papert show the limitations of the single-layer Perceptron: for example, it cannot learn a simple XOR function; the limitation does not extend to the multilayer perceptron (which was known back then)
1970-1980: rule-based and symbolic AI dominates, in contrast to connectionist AI as in neural networks; expert systems, rule-based systems with their own specialized hardware, find applications in industry
1980s: Bayesian networks (Judea Pearl)
- combine graph structure with probabilistic (and causal) reasoning
- related to both the symbolic and connectionist approaches
1986: Backpropagation rediscovered (Rumelhart, Hinton & Williams)
- an efficient method for learning the weights of neural networks using gradient descent
- it had been rediscovered many times since the 1960s
- we discuss it later in the course
1980-1990s: expert systems are being replaced with general-purpose computers
...
2012: AlexNet wins ImageNet by a large margin
2012-now: a new AI spring around deep learning, with super-human performance in many tasks
Future: what is next? In the short term, AI will impact the domain sciences. Historically, hypes have been followed by disappointments; is it the same this time?
input: also called features, predictors, independent variables, or covariates
target: also called labels, predictions, dependent variables, or response variables
the ML algorithm learns a function (the hypothesis) mapping inputs to targets
example: <tumor size, texture, perimeter> = <18.2, 27.6, 117.5>, cancer = No
<tumor size, texture, perimeter> , <cancer, size change>
<18.2, 27.6, 117.5> , <No, +2>
<17.9, 10.3, 122.8> , <No, -4>
<20.2, 14.3, 111.2> , <Yes, +3>
<15.5, 15.2, 135.5> , <No, 0>
tasks: classification, regression, structured prediction
notation: the superscript (n) indexes training examples, e.g., x^(n) = (x_1^(n), x_2^(n)) for the features and y^(n) = −1 for the label
<tumor size, texture, perimeter> , <size change>
<18.2, 27.6, 117.5> , <+2>
<17.9, 10.3, 122.8> , <-4>
<20.2, 14.3, 111.2> , <+3>
<15.5, 15.2, 135.5> , <0>
Regression: continuous output
<tumor size, texture, perimeter> , <cancer>
<18.2, 27.6, 117.5> , <No>
<17.9, 10.3, 122.8> , <No>
<20.2, 14.3, 111.2> , <Yes>
<15.5, 15.2, 135.5> , <No>
Classification: categorical/discrete output
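As a minimal sketch of how the two tables above translate into code (assuming scikit-learn, which the slides do not prescribe), the same feature matrix is paired with a categorical target for classification and a continuous target for regression:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# the four examples from the tables above: <tumor size, texture, perimeter>
X = np.array([[18.2, 27.6, 117.5],
              [17.9, 10.3, 122.8],
              [20.2, 14.3, 111.2],
              [15.5, 15.2, 135.5]])
y_cls = np.array([0, 0, 1, 0])           # cancer: No=0, Yes=1 (categorical)
y_reg = np.array([2.0, -4.0, 3.0, 0.0])  # size change (continuous)

clf = LogisticRegression().fit(X, y_cls)  # classification
reg = LinearRegression().fit(X, y_reg)    # regression

x_new = [[18.0, 20.0, 120.0]]             # a hypothetical new tumor
print(clf.predict(x_new), reg.predict(x_new))
```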
Machine translation: data consists of input-output sentence pairs (x,y)
more recently, end-to-end speech translation; similarly, we may consider text-to-speech, with text and voice as input and target (x, y)
a variety of language processing tasks are in this category
in speech recognition, the input and output above are swapped
translation example from CNET
Image captioning
input: image; target: a text caption
(image: COCO dataset)
Object detection
input: image; target: bounding boxes and object categories
(image: https://bitmovin.com/object-detection/)
Unsupervised learning:
- clustering
- dimensionality reduction
- density estimation / generative modeling
- anomaly detection
- discovering latent factors and structures
- ...
it helps explore and understand the data; it is closer to data mining; we have much more unlabeled data, and more open challenges
<tumor size, texture, perimeter> , <cancer>
<18.2, 27.6, 117.5> , <No>
<17.9, 10.3, 122.8> , <No>
<20.2, 14.3, 111.2> , <Yes>
<15.5, 15.2, 135.5> , <No>
Clustering is similar to classification, but the labels/classes are not given to the algorithm and must be inferred (see the sketch below)
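A minimal sketch, again assuming scikit-learn: the clustering algorithm sees only the feature columns, never the cancer labels; the choice of two clusters is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans

# same features as above; no labels are given to the algorithm
X = np.array([[18.2, 27.6, 117.5],
              [17.9, 10.3, 122.8],
              [20.2, 14.3, 111.2],
              [15.5, 15.2, 135.5]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # inferred cluster ids; the numbering itself is arbitrary
```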
Generative modeling (density estimation): learn the data distribution p(x)
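As an illustrative sketch (a Gaussian mixture is one choice among many density models; scikit-learn is assumed): fit p(x) on unlabeled data, then evaluate it and sample from it.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(loc=[0.0, 5.0], scale=1.0, size=(200, 2))  # toy unlabeled data

gm = GaussianMixture(n_components=2, random_state=0).fit(X)  # learn p(x)
log_px = gm.score_samples(X[:3])  # log p(x) for the first three points
X_new, _ = gm.sample(5)           # generative use: draw new points from p(x)
```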
Semi-supervised learning: only a few labeled examples are available
we can include structured problems such as matrix completion (only a few entries are observed) and link prediction
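A minimal sketch using scikit-learn's label propagation (one of several semi-supervised approaches; the toy points are made up): examples marked -1 are unlabeled, and their labels are inferred from the few labeled ones.

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

X = np.array([[1.0, 1.0], [1.2, 0.9], [1.1, 1.1],
              [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])
y = np.array([0, -1, -1, 1, -1, -1])  # -1 marks an unlabeled example

model = LabelPropagation().fit(X, y)
print(model.transduction_)  # inferred labels for all six points
```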
Reinforcement Learning: weak supervision through the reward signal (see the sketch below)
- sequential decision making
- biologically motivated
also related:
- imitation learning: learning from demonstrations
- behavior cloning (which is supervised learning!)
- inverse reinforcement learning (learning the reward function)
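The slides give no code, but a tabular Q-learning update is the classic illustration of learning from a scalar reward signal alone; the state/action sizes, step size, and transition below are made up.

```python
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))  # estimated value of each (state, action)
alpha, gamma = 0.1, 0.9              # step size and discount factor

def q_update(s, a, r, s_next):
    # move Q[s, a] toward the bootstrapped target r + gamma * max_a' Q[s', a']
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

# one made-up transition: in state 0, action 1 yields reward +1, next state 2
q_update(s=0, a=1, r=1.0, s_next=2)
```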
Supervised learning (labeled data): classification, regression, structured prediction
Unsupervised learning (only unlabeled data): clustering, dimensionality reduction, density estimation / generative modeling, anomaly detection, discovering latent factors and structures
Semi-supervised learning: a few labeled examples
Reinforcement learning: a reward signal
note that there are other ways to classify ML methods, and other learning paradigms not covered