STAT 339 Probabilistic Modeling and Machine Learning 30 January - - PowerPoint PPT Presentation




SLIDE 1

STAT 339 Probabilistic Modeling and Machine Learning

30 January 2017 Colin Reimer Dawson

SLIDE 2

Outline

Data Science and Machine Learning Types of Learning Supervised Learning Unsupervised Learning Discovering Model Complexity Course Outline

SLIDE 3
SLIDE 4
SLIDE 5
SLIDE 6
SLIDE 7
SLIDE 8
SLIDE 9

Some Cool Things You Can Do with Data

Thanks to David Shuman at Macalester College for this slide

SLIDE 10

What is Machine Learning?

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E." — Tom Mitchell

SLIDE 11

What is Machine Learning?

"[Machine Learning is a] field of study that gives computers the ability to learn without being explicitly programmed." — Arthur Samuel

SLIDE 12

Statistics, Computer Science, and Machine Learning

Machine Learning

SLIDE 13
SLIDE 14

Types of Learning

◮ Supervised Learning: Learning to make predictions when you have many examples of "correct answers"
  ◮ Classification: the answer is a category / label
  ◮ Regression: the answer is a number
◮ Unsupervised Learning: Finding structure in unlabeled data
◮ Reinforcement Learning: Finding actions that maximize long-run reward (not part of this course)
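The classification/regression distinction above can be made concrete with a tiny sketch (not from the slides; the data and the 1-nearest-neighbor rule are invented for illustration): the same "predict the label of the closest training example" idea acts as a classifier when targets are categories and as a regressor when targets are numbers.

```python
# Illustrative sketch: one prediction rule, two kinds of targets.
# The data below is made up for the example.

def nearest_neighbor_predict(train, x_new):
    """Return the target of the training point closest to x_new.

    train: list of (x, t) pairs with a numeric feature x.
    """
    return min(train, key=lambda pair: abs(pair[0] - x_new))[1]

# Classification: targets are category labels.
clf_data = [(1.0, "cat"), (2.0, "cat"), (8.0, "dog")]
print(nearest_neighbor_predict(clf_data, 7.5))  # -> dog

# Regression: targets are numbers.
reg_data = [(1.0, 1.1), (2.0, 3.9), (3.0, 9.2)]
print(nearest_neighbor_predict(reg_data, 2.2))  # -> 3.9
```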

SLIDE 15

Supervised Learning

SLIDE 16

Supervised Learning with a Probabilistic Model

◮ Training data: {(t_i, x_i)}_{i=1}^n, where t_i is the label and x_i is the feature vector
◮ Fit a model of all of the features: P(x, t), or P(t | x)
◮ Testing: assign P(t_new | x_new, Model)
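A minimal sketch of this pipeline, under assumptions not on the slide: a generative model P(x, t) = P(t) P(x | t) with one-dimensional Gaussian class conditionals, scored on a new point via Bayes' rule. The training data is invented for illustration.

```python
# Hedged sketch: fit P(t) and P(x | t) from labeled data, then compute
# P(t_new | x_new, Model). The Gaussian class-conditional choice and the
# data are assumptions made for this example, not the course's own code.
import math

def fit(data):
    """data: list of (t, x) pairs. Returns per-class (prior, mean, variance)."""
    params = {}
    n = len(data)
    for t in set(label for label, _ in data):
        xs = [x for (label, x) in data if label == t]
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs)
        params[t] = (len(xs) / n, mu, var)
    return params

def gaussian_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior(params, x_new):
    """P(t | x_new, Model) for every class t, by Bayes' rule."""
    joint = {t: prior * gaussian_pdf(x_new, mu, var)
             for t, (prior, mu, var) in params.items()}
    z = sum(joint.values())
    return {t: v / z for t, v in joint.items()}

data = [(0, -1.2), (0, -0.8), (0, -1.0), (1, 0.9), (1, 1.1), (1, 1.3)]
model = fit(data)
print(posterior(model, 1.0))  # class 1 gets nearly all the probability
```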

SLIDE 17

Data in Higher Dimensions

SLIDE 18

Data in Very High Dimensions

SLIDE 19

Aside: Feature Extraction (“Eigenfaces”)

SLIDE 20

Finding Clusters

◮ Clustering: Grouping data into categories without any "ground truth" information
◮ Example Application: Modeling people's taste in movies

SLIDE 21

Model-Free Clustering

Model-free example: given a distance metric, choose cluster centers to maximize the distances among them; then assign each point to its closest center.
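The slide does not name a specific algorithm, but the standard "assign each point to the closest center" procedure is k-means, sketched here in pure Python on one-dimensional data under that assumption (Euclidean distance as the metric).

```python
# Minimal k-means sketch (an assumption: the slide's exact procedure is not
# shown; k-means is the classic center-assignment algorithm). 1-D data.

def kmeans_1d(xs, centers, iters=10):
    clusters = {}
    for _ in range(iters):
        # Assignment step: each point goes to its closest center.
        clusters = {i: [] for i in range(len(centers))}
        for x in xs:
            i = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[i].append(x)
        # Update step: each center moves to the mean of its points.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in clusters.items()]
    return centers, clusters

xs = [0.9, 1.1, 1.0, 4.8, 5.2, 5.0]
centers, clusters = kmeans_1d(xs, centers=[0.0, 6.0])
print(centers)  # -> [1.0, 5.0]
```

Note this is model-free in the slide's sense: no probability distribution is fit, only distances are used.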

SLIDE 22

Clustering with a Probabilistic Model

[Figure (a): data drawn from a three-cluster mixture with weights 0.5, 0.3, 0.2; both axes run from 0 to 1]

Output: A set of cluster weights and a probability distribution for each cluster
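That output object can be sketched directly: the mixture weights (0.5, 0.3, 0.2, as in the figure) plus one distribution per cluster. The Gaussian means and variances below are invented for illustration; a real model would learn them from data (e.g. by EM).

```python
# Sketch of a fitted probabilistic clustering model: weights + per-cluster
# distributions. Component parameters are assumptions for this example.
import math

weights = [0.5, 0.3, 0.2]
components = [(0.0, 1.0), (4.0, 1.0), (8.0, 1.0)]  # (mean, variance) per cluster

def gaussian_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(x):
    """P(cluster k | x): how strongly each cluster 'claims' point x."""
    joint = [w * gaussian_pdf(x, mu, var)
             for w, (mu, var) in zip(weights, components)]
    z = sum(joint)
    return [j / z for j in joint]

print(responsibilities(4.2))  # the middle cluster dominates
```

Unlike the model-free approach, this yields soft assignments: every point gets a probability under every cluster rather than a single hard label.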