CS480/680 Machine Learning Lecture 1: January 7th, 2020



SLIDE 1

CS480/680 Machine Learning Lecture 1: January 7th, 2020

Course Introduction Zahra Sheikhbahaee

University of Waterloo CS480/680 Winter 2020 Zahra Sheikhbahaee 1

SLIDE 2

Outline

  • Introduction to Machine Learning
  • Course website and details:

https://cs.uwaterloo.ca/~zsheikhb/CS480-winter2020.html#

  • Learn (Assignment, grades)

https://learn.uwaterloo.ca/

SLIDE 3

Instructor

Who am I?


  • Dr. Zahra Sheikhbahaee
    – Postdoctoral Researcher
    – PhD in Astrophysics

SLIDE 4

The Team for CS480/680

  • TAs
    – Gaurav Gupta g27gupta@uwaterloo.ca
    – Zeou Hu z97hu@uwaterloo.ca
    – Arash Mollajafari Sohi amollaja@uwaterloo.ca
    – Zahra Rezapour Siahgourabi zrezapou@uwaterloo.ca
    – Colin Vandenhof cm5vande@uwaterloo.ca

SLIDE 5

Prerequisites of this Course

  • Programming: Python
  • Probability: distributions
  • Calculus: partial derivatives
  • Linear algebra: vector/matrix manipulations, properties
  • Statistics: mean, median, mode, standard deviation

SLIDE 6

Exam & Evaluation

  • Midterm 25%
    – Feb 28, start/end time: 8:30-10:00pm
  • Assignment 35%
  • Final 40%
    – Grad students: a project, with a proposal submitted by February 10th (6 pages, written in the format of a paper, presenting a novel and innovative method)
    – Undergrad students: option of either the exam or a project

SLIDE 7


Machine Learning

  • Traditional computer science

– Program computer for every task

  • New paradigm

– Provide examples to the machine
– Machine learns to accomplish a task based on the examples

SLIDE 8

Definitions

  • Arthur Samuel (1959): Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
  • Tom Mitchell (1998): A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
  • Ethem Alpaydın: Machine learning is programming computers to optimize a performance criterion using example data or past experience. We need learning in cases where we cannot directly write a computer program to solve a given problem, but need example data or experience.

In statistics, going from particular observations to general descriptions is called inference; learning is called estimation, and classification is called discriminant analysis.

SLIDE 9

Three Categories

  • Supervised learning
    – Classification
    – Regression
  • Reinforcement learning
  • Unsupervised learning
    – Clustering
    – Dimensionality reduction

SLIDE 10

Supervised Learning

  • Classification: e.g. digit recognition (postal code)

𝑔: ℝᵈ ⟶ {1, … , 𝑙}

  • Simplest approach: memorization
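As an illustration (not course code), memorization can be sketched as a plain lookup table: it predicts perfectly on training inputs but has no answer for unseen ones. The function name and toy data below are hypothetical.

```python
# Memorization as the simplest "learner": store every training pair and
# look queries up verbatim. There is no generalization to unseen inputs.
def memorize(training_pairs):
    table = dict(training_pairs)      # input -> output lookup table
    def predict(x):
        return table.get(x)           # None for any input never seen in training
    return predict

predict = memorize([((0, 0), "zero"), ((1, 1), "one")])
```

The point of the sketch is that `predict` fails the moment a query falls outside the training set, which is why memorization alone does not generalize.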

SLIDE 11

Supervised Learning

  • Nearest neighbour: it can be used to solve both classification and regression problems.
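A minimal pure-Python sketch of k-nearest-neighbour classification (the function and data below are illustrative, not from the slides): each query is labeled by majority vote among its k closest training points.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)

def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.
    `train` is a list of (point, label) pairs."""
    nearest = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
```

For regression, the majority vote would simply be replaced by the mean of the neighbours' target values.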

SLIDE 12

Definition of Supervised Learning

  • Inductive learning, or inferring general rules from a limited set of examples:
    – Given a training set of examples of the form (𝑦, 𝑔(𝑦)), where 𝑦 is the input and 𝑔(𝑦) is the output
    – Return a function ℎ that approximates 𝑔; ℎ is called the hypothesis

SLIDE 13

Prediction

  • Find function ℎ that fits 𝑔 at instances 𝑦
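One concrete way to pick such an ℎ is least-squares fitting. A minimal pure-Python sketch (function names and toy data are illustrative, not course material) fits a line in closed form:

```python
def fit_line(inputs, outputs):
    """Closed-form least-squares fit of h(x) = a*x + b to training pairs."""
    n = len(inputs)
    mx = sum(inputs) / n
    my = sum(outputs) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(inputs, outputs))
         / sum((x - mx) ** 2 for x in inputs))
    b = my - a * mx
    return lambda x: a * x + b

h = fit_line([0, 1, 2, 3], [1, 3, 5, 7])  # toy data generated by g(x) = 2x + 1
```

On exactly linear data the fit recovers g; on noisy data it returns the line minimizing the squared prediction error at the training instances.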

SLIDE 18

Generalization

  • Key: a good hypothesis will generalize well (i.e. predict unseen examples correctly)
  • Occam's razor: prefer the simplest hypothesis consistent with the data
  • Capacity is a measure of complexity: it measures the expressive power, richness, or flexibility of a set of functions (low capacity: struggle to fit the training set; high capacity: overfit by memorizing properties of the training set).
  • The Vapnik-Chervonenkis dimension: a dataset containing N points can be labeled in 2^N ways as positive and negative, so 2^N different learning problems can be defined by N data points. If for any of these problems we can find a hypothesis h ∈ H that separates the positive examples from the negative, then we say H shatters the N points. The maximum number of points that can be arranged so that H shatters them is called the Vapnik-Chervonenkis (VC) dimension of H, is denoted VC(H), and measures the capacity of the hypothesis class H.
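The shattering definition can be checked by brute force for a tiny hypothesis class. The sketch below (illustrative, not from the slides) uses one-dimensional threshold classifiers in both orientations, a class whose VC dimension is 2: it shatters any 2 distinct points but no set of 3.

```python
from itertools import product

def shatters(points, hypotheses):
    """True iff every one of the 2^N labelings of `points` is realised
    by at least one hypothesis in `hypotheses`."""
    for labeling in product([0, 1], repeat=len(points)):
        if not any(tuple(h(p) for p in points) == labeling for h in hypotheses):
            return False
    return True

def thresholds(points):
    """Both orientations of 1-D threshold classifiers, with a cut at each
    point plus one cut above the maximum (enough to realise all behaviours
    on the given points)."""
    cuts = sorted(points) + [max(points) + 1]
    hs = []
    for t in cuts:
        hs.append(lambda x, t=t: int(x >= t))   # positive to the right
        hs.append(lambda x, t=t: int(x < t))    # positive to the left
    return hs
```

Any two distinct points are shattered, but the labeling (1, 0, 1) of three increasing points is never realised by a threshold, so VC = 2 for this class.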

SLIDE 19

ImageNet Classification

  • 1000 classes
  • 1 million images
  • Deep neural networks (supervised learning)

SLIDE 20

Unsupervised Learning

  • An output is not given as part of the training set
  • Find a model that explains the data
    – Clustering: e.g. K-means clustering
    – Compressed representations, features, generative models
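A minimal sketch of K-means (Lloyd's algorithm) on 1-D points, with naive initialisation; all names and data are illustrative:

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm on 1-D points: alternate between assigning each
    point to its nearest centre and moving each centre to the mean of
    its assigned points."""
    centres = points[:k]                      # naive init: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centres[j]) ** 2)
            clusters[nearest].append(p)
        centres = [sum(c) / len(c) if c else centres[j]
                   for j, c in enumerate(clusters)]
    return sorted(centres)

centres = kmeans([1.0, 1.2, 0.8, 10.0, 10.4, 9.6], k=2)
```

No labels are used anywhere: the two cluster centres emerge purely from the structure of the data.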

SLIDE 21

Unsupervised Feature Generation

  • An encoder trained on a large number of images builds a face detector from only unlabeled images

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/38115.pdf
SLIDE 22


Reinforcement Learning

[Diagram: the agent takes an action in the environment, which returns a new state and a reward]

When the output of the system is a sequence of actions, a single action is not important on its own; what matters is the policy, the sequence of correct actions needed to reach the goal. The reward is a numerical signal which indicates how good the actions are.

Goal: Learn to choose actions that maximize rewards
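As a toy illustration of that goal (not material from the slides), tabular Q-learning on a small deterministic chain learns the "always move right" policy that maximises reward; all names and parameters here are hypothetical choices.

```python
import random

def q_learning(n_states=4, episodes=500, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a chain 0..n_states-1: actions are left (0)
    and right (1); reward 1 is given on reaching the rightmost state.
    Returns the greedy action for each non-terminal state."""
    random.seed(0)                             # reproducible toy run
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy exploration
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[s][act])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # one-step temporal-difference update
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return [max((0, 1), key=lambda act: Q[s][act]) for s in range(n_states - 1)]

policy = q_learning()
```

Note that no single action is rewarded in isolation: the discounted update propagates the terminal reward backwards until the whole action sequence (the policy) is correct.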

SLIDE 23


Game Playing

  • Example: Go (one of the oldest and hardest board games)

  • Agent: player
  • Environment: opponent
  • State: board configuration
  • Action: next stone location
  • Reward: +1 win / -1 lose
  • 2016: AlphaGo defeats top player Lee Sedol (4-1)

– Game 2 move 37: AlphaGo plays unexpected move (odds 1/10,000)

SLIDE 24

Reinforcement Learning

The theories that incorporate constraints on the information processing capacities of an agent are called theories of bounded rationality (Herbert Simon).

  • Perfect rationality: the agent can determine the best course of action without taking into account its limited computational resources.
  • Bounded rationality: the rationality of a realistic agent is limited by resources such as time, access to information, capacity for information, and processing power, so it can only be rational to a certain extent. Agents modeled with unbounded rationality act to maximize utility, while agents modeled with bounded rationality can only aim for some satisfactory amount of utility (a regularized expected utility known as the free energy, where the regularizer is given by the information divergence from a prior to a posterior policy).

SLIDE 25

Applications of Machine Learning

  • Speech recognition

– Siri, Cortana

  • Natural Language Processing

– Machine translation, question answering, dialog systems

  • Computer vision

– Image and video analysis

  • Robotic Control

– Autonomous vehicles

  • Intelligent assistants

– Activity recognition, recommender systems

  • Computational finance

– Stock trading, credit scoring, fraud detection

SLIDE 26

This course

  • Supervised and unsupervised machine learning
  • But not reinforcement learning
