Machine Learning: Course Overview CS 760@UW-Madison



SLIDE 1

Machine Learning: Course Overview

CS 760@UW-Madison

SLIDE 2

Class enrollment

  • typically the class was limited to 30
  • we’ve allowed ~100 to register
  • the waiting list is full
  • unfortunately, many on the waiting list will not be able to enroll
  • but CS760 will be offered in the next semester!

SLIDE 3

Instructor

  • Yingyu Liang

email: yliang@cs.wisc.edu

  • office hours: Mon 4-5pm
  • office: 6393 Computer Sciences
SLIDE 4

TA

  • Ying Fan

email: yingfan@cs.wisc.edu

  • office hours: Tues 1-2pm, Thu 1-2pm
  • office: CS 1351
SLIDE 5

Monday, Wednesday and Friday?

  • we’ll have ~30 lectures in all, just like a standard TR class
  • we will push the lectures forward (finish early, leave time for projects and review)
  • see the schedule on the course website: http://pages.cs.wisc.edu/~yliang/cs760_spring20

SLIDE 6

Course emphases

  • a variety of learning settings: supervised learning, unsupervised learning, reinforcement learning, active learning, etc.
  • a broad toolbox of machine-learning methods: decision trees, nearest neighbor, neural nets, Bayesian networks, SVMs, etc.
  • some underlying theory: bias-variance tradeoff, PAC learning, mistake-bound theory, etc.
  • experimental methodology for evaluating learning systems: cross validation, ROC and PR curves, hypothesis testing, etc.
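
As one illustration of the experimental methodology listed above, k-fold cross validation can be sketched in a few lines of Python. The helper name `k_fold_indices` and the round-robin assignment of examples to folds are assumptions made for this sketch; they are not taken from the course materials.

```python
def k_fold_indices(n_examples, k):
    """Yield (train_indices, test_indices) for each of k folds.

    Hypothetical helper: example i is assigned to fold i % k.
    """
    if not 1 < k <= n_examples:
        raise ValueError("k must satisfy 1 < k <= n_examples")
    for fold in range(k):
        test = [i for i in range(n_examples) if i % k == fold]
        train = [i for i in range(n_examples) if i % k != fold]
        yield train, test


if __name__ == "__main__":
    # Each example lands in exactly one test fold.
    for train, test in k_fold_indices(6, 3):
        print("train:", train, "test:", test)
```

A learner would be trained on each `train` split and evaluated on the matching `test` split, and the k scores averaged. In practice the examples would usually be shuffled before the folds are assigned.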

SLIDE 7

Two major goals

  • 1. Understand what a learning system should do
  • 2. Understand how (and how well) existing systems work
SLIDE 8

Course requirements

  • 7-8 homework assignments: 30%
      • programming
      • computational experiments (e.g. measure the effect of varying parameter x in algorithm y)
      • some written exercises
  • Midterm Exam #1: 20%
  • Midterm Exam #2: 20%
  • final project: 30%
      • project group: 3-5 people
SLIDE 9

Expected background

  • CS 540 (Intro to Artificial Intelligence) or equivalent
  • good programming skills
  • probability
  • linear algebra
  • calculus, including partial derivatives
SLIDE 10

Programming languages

  • for the programming assignments, you can use C, C++, Java, Perl, Python, R, or Matlab
  • suggested: Python
  • programs must be callable from the command line and must run on the CS lab machines (this is where they will be tested during grading!)
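
A minimal sketch of what "callable from the command line" can look like in Python, using the standard `argparse` module. The flag names (`--train`, `--test`, `--k`) and their defaults are hypothetical; each assignment will specify its own interface.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments (all names and defaults are hypothetical)."""
    parser = argparse.ArgumentParser(description="Example learner entry point")
    parser.add_argument("--train", default="train.csv",
                        help="path to the training data")
    parser.add_argument("--test", default="test.csv",
                        help="path to the test data")
    parser.add_argument("--k", type=int, default=1,
                        help="an example algorithm parameter")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # A real assignment would load the data and run the learner here.
    print(f"train={args.train} test={args.test} k={args.k}")
    return 0


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `learner.py`, this would be run as `python learner.py --train train.csv --test test.csv --k 3`.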

SLIDE 11

Course readings

We recommend getting one of the following books:

  • Pattern Recognition and Machine Learning. C. Bishop. Springer, 2011.
  • Machine Learning: A Probabilistic Perspective. K. Murphy. MIT Press, 2012.
  • Understanding Machine Learning: From Theory to Algorithms. S. Shalev-Shwartz, S. Ben-David. Cambridge University Press, 2014.

SLIDE 12

Course readings

  • the books can be found online or at Wendt Commons Library
  • additional readings will come from online articles, surveys, and chapters
      • will be posted on course website
SLIDE 13

Machine Learning Examples

SLIDE 14

What is machine learning?

  • “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
      (Machine Learning, Tom Mitchell, 1997)


SLIDE 15

What is machine learning?

  • the study of algorithms that improve their performance P at some task T with experience E
  • to have a well-defined learning task, we must specify: < P, T, E >
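
To make the < P, T, E > triple concrete, here is a toy sketch in Python. The task (deciding whether an integer is at least 50), the 1-nearest-neighbor learner, and the specific examples are all assumptions chosen for illustration; they do not come from the slides.

```python
def nearest_neighbor_predict(experience, x):
    """Predict the label of x from the closest labeled example in experience."""
    _, label = min(experience, key=lambda pair: abs(pair[0] - x))
    return label


def accuracy(experience, test_set):
    """Performance measure P: fraction of test examples classified correctly."""
    correct = sum(nearest_neighbor_predict(experience, x) == y
                  for x, y in test_set)
    return correct / len(test_set)


# Task T (assumed for illustration): decide whether an integer is >= 50.
test_set = [(x, x >= 50) for x in range(100)]

# Experience E: labeled examples; the larger set extends the smaller one.
small_experience = [(10, False), (95, True)]
large_experience = small_experience + [(48, False), (51, True)]

if __name__ == "__main__":
    print(accuracy(small_experience, test_set))
    print(accuracy(large_experience, test_set))
```

With the two-example experience the learner misclassifies 50, 51 and 52 (accuracy 0.97); after two more examples near the boundary it classifies every test point correctly (accuracy 1.0). Performance P at task T improved with experience E.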

SLIDE 16

ML example: image classification

indoor

outdoor

SLIDE 17

ML example: image classification

  • T : given new images, classify as indoor vs. outdoor
  • P : minimize misclassification costs
  • E : given images with indoor/outdoor labels
SLIDE 18

ML example: spam filtering

SLIDE 19

ML example: spam filtering

  • T : given new mail message, classify as spam vs. other
  • P : minimize misclassification costs
  • E : previously classified (filed) messages
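
The measure "minimize misclassification costs" lets different kinds of errors carry different costs: filing a legitimate message as spam is usually worse than letting one spam message through. A minimal Python sketch follows; the function name and the default cost values (5 and 1) are hypothetical and chosen only for illustration.

```python
def misclassification_cost(y_true, y_pred, fp_cost=5.0, fn_cost=1.0):
    """Total cost of the errors in y_pred; the default costs are hypothetical."""
    cost = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred == "spam" and truth == "other":
            cost += fp_cost  # legitimate mail filed as spam
        elif pred == "other" and truth == "spam":
            cost += fn_cost  # spam that reached the inbox
    return cost


if __name__ == "__main__":
    y_true = ["spam", "other", "other", "spam"]
    y_pred = ["spam", "spam", "other", "other"]
    print(misclassification_cost(y_true, y_pred))
```

Setting both costs to 1 reduces this measure to a plain count of misclassified messages.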
SLIDE 20

ML example: predictive text input

SLIDE 21

ML example: predictive text input

  • T : given (partially) typed word, predict the word the user intended to type
  • P : minimize misclassifications
  • E : words previously typed by the user (+ lexicon of common words + knowledge of keyboard layout, i.e. domain knowledge)
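
A rough Python sketch of how experience E could drive the prediction: completions are ranked by how often the user typed them, with a lexicon as fallback. The function name `predict_word` and the ranking rule are assumptions for this example, and knowledge of the keyboard layout is not modeled at all.

```python
from collections import Counter


def predict_word(prefix, user_history, lexicon):
    """Return the most likely completion of prefix, or None if nothing matches.

    Words the user typed before are ranked by frequency; lexicon words the
    user has never typed count as frequency 0 and serve as a fallback.
    """
    counts = Counter(user_history)
    candidates = {w for w in set(lexicon) | set(counts) if w.startswith(prefix)}
    if not candidates:
        return None
    # Highest user frequency first; ties are broken alphabetically.
    return min(candidates, key=lambda w: (-counts[w], w))


if __name__ == "__main__":
    history = ["learning", "learn", "learning", "lecture"]
    lexicon = ["lean", "learn", "least", "machine"]
    print(predict_word("lea", history, lexicon))
    print(predict_word("ma", history, lexicon))
```

Here "lea" completes to "learning" because the user typed it most often, while "ma" falls back to the lexicon entry "machine".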

SLIDE 22

ML example: Netflix Prize

SLIDE 23

ML example: Netflix Prize

  • T : given a user/movie pair, predict the user’s rating (1-5 stars) of the movie
  • P : minimize difference between predicted and actual rating
  • E : histories of previously rated movies (user/movie/rating triples)
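
The "difference between predicted and actual rating" has to be turned into a single number. One common choice, and the one used to score the Netflix Prize, is root mean squared error (RMSE). A minimal Python sketch, with a hypothetical function name and made-up ratings:

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up ratings, for illustration only.
    print(rmse([3.5, 4.0, 2.0], [4, 4, 1]))
```

Because the errors are squared before averaging, one prediction that is off by two stars is penalized more than two predictions that are each off by one.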

SLIDE 24

ML example: autonomous helicopter

video of Stanford University autonomous helicopter from http://heli.stanford.edu/

SLIDE 25

ML example: autonomous helicopter

  • T : given a measurement of the helicopter’s current state (orientation sensor, GPS, cameras), select an adjustment of the controls
  • P : maximize reward (intended trajectory + penalty function)
  • E : state, action and reward triples from previous demonstration flights
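
As a small illustration of scoring one flight from its (state, action, reward) triples, the Python sketch below sums the rewards along a trajectory. The function name, the placeholder states and actions, and the optional discount factor are assumptions of this sketch; the slide does not say how rewards are aggregated.

```python
def total_reward(trajectory, discount=1.0):
    """Sum the rewards of (state, action, reward) triples.

    discount < 1 down-weights later rewards; discounting is an assumption
    of this sketch and is not mentioned on the slide.
    """
    return sum(reward * discount ** t
               for t, (_state, _action, reward) in enumerate(trajectory))


if __name__ == "__main__":
    # Placeholder states and actions, for illustration only.
    flight = [("s0", "a0", 1.0), ("s1", "a1", 0.0), ("s2", "a2", 2.0)]
    print(total_reward(flight))
    print(total_reward(flight, discount=0.5))
```

A reinforcement learner would compare control policies by the reward they collect, preferring those with higher totals.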

SLIDE 26

ML example: Atari Breakout

Google DeepMind's Deep Q-learning playing Atari Breakout. From the paper “Playing Atari with Deep Reinforcement Learning”, by Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller.

SLIDE 27

ML example: AlphaGo

SLIDE 28

Assignments

SLIDE 29

Reading assignment

  • read
      • Chapter 1 of Murphy
      • article by Jordan and Mitchell on course website
  • course website: http://pages.cs.wisc.edu/~yliang/cs760_spring20/

SLIDE 30

HW1: Background test

  • posted on course website
  • submission of solutions will be set up on Canvas
  • two parts: minimum and medium tests
      • if you pass both: in good shape
      • if you pass minimum but not medium: you can still take the course, but expect to fill in background
      • if you fail both: we suggest filling in background before taking the course

SLIDE 31

Minimum background test

  • 80 pts in total; pass: 48 pts
      • linear algebra: 20 pts
      • probability: 20 pts
      • calculus: 20 pts
      • big-O notation: 20 pts
SLIDE 32

Minimum test example

SLIDE 33

Minimum test example

SLIDE 34

Medium background test

  • 20 pts in total; pass: 12 pts
      • algorithm: 5 pts
      • probability: 5 pts
      • linear algebra: 5 pts
      • programming: 5 pts
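
The HW1 outcomes and the two pass marks (48 of 80 and 12 of 20, i.e. 60% on each test) can be combined into one small Python sketch. The function name is hypothetical, and treating "pass" as scoring at or above the pass mark is an assumption. The slides do not say what happens if someone passes the medium test but not the minimum test, so the sketch reports that case as not covered.

```python
def hw1_recommendation(minimum_score, medium_score):
    """Map the two background-test scores to the advice given on the slides."""
    passed_minimum = minimum_score >= 48  # out of 80 (assumes >= counts as pass)
    passed_medium = medium_score >= 12    # out of 20 (assumes >= counts as pass)
    if passed_minimum and passed_medium:
        return "in good shape"
    if passed_minimum:
        return "can still take the course, but expect to fill in background"
    if not passed_medium:
        return "fill in background before taking the course"
    return "not covered by the slides"


if __name__ == "__main__":
    print(hw1_recommendation(60, 15))
    print(hw1_recommendation(50, 8))
    print(hw1_recommendation(30, 5))
```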
SLIDE 35

Medium test example

SLIDE 36

Medium test example

SLIDE 37

THANK YOU

Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven, David Page, Jude Shavlik, Tom Mitchell, Nina Balcan, Elad Hazan, Tom Dietterich, and Pedro Domingos.