SLIDE 1

Geoff Gordon—10-701 Machine Learning—Fall 2013

Recitation

  • First recitation tomorrow 5–6:30 here
  • Linear algebra

SLIDE 2

Probability


  • P(a) =
  • P(u) =
  • P(~a) =

SLIDE 3

Conventions

SLIDE 4

Union, intersection

SLIDE 5

Conditioning

SLIDE 6

Law of total probability

SLIDE 7

Marginals

SLIDE 8

Finite vs. infinite |u|

  • http://www.amazon.com/Probability-Measure-Wiley-Series-Statistics/dp/1118122372

  • http://en.wikipedia.org/wiki/Regular_conditional_probability
  • http://en.wikipedia.org/wiki/Borel%E2%80%93Kolmogorov_paradox

SLIDE 9

How I learned to stop worrying and love the density function…


$\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\tfrac{1}{2}(x-\mu)^2/\sigma^2\right)$ (Gaussian density)

$\frac{1}{b-a}$ for $a \le x \le b$ (uniform density on $[a, b]$)

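The two densities above can be evaluated directly. A minimal Python sketch (function names are mine, not from the slides):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian density: exp(-0.5 * (x - mu)^2 / sigma^2) / (sqrt(2*pi) * sigma)."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma ** 2) / (math.sqrt(2 * math.pi) * sigma)

def uniform_pdf(x, a, b):
    """Uniform density on [a, b]: 1 / (b - a) inside the interval, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

# A density is not a probability; it can exceed 1:
print(gaussian_pdf(0.0, 0.0, 0.1))  # ~3.99
print(uniform_pdf(0.5, 0.0, 0.25))  # 0.0 (outside [a, b])
```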
SLIDE 10

Multivariate densities

SLIDE 11

Random variables


Probability space (σ-algebra)

SLIDE 12

Bayes rule

  • Recall the definition of conditional probability:
  • P(a | b) = P(a ∧ b) / P(b), if P(b) ≠ 0

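Applying the definition of conditional probability in both directions (to P(a | b) and to P(b | a)) and substituting gives Bayes rule:

```latex
P(a \mid b) \;=\; \frac{P(a \wedge b)}{P(b)} \;=\; \frac{P(b \mid a)\,P(a)}{P(b)}, \qquad P(b) \neq 0
```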

SLIDE 13

Bayes rule: sum version
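Expanding the denominator of Bayes rule with the law of total probability gives the sum version:

```latex
P(a \mid b) \;=\; \frac{P(b \mid a)\,P(a)}{\sum_{a'} P(b \mid a')\,P(a')}
```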

SLIDE 14

Test for a rare disease

  • About 0.1% of all people are infected
  • Test detects all infections
  • Test is highly specific: 1% false positive
  • You test positive. What is the probability you have the disease?

SLIDE 15

Test for a rare disease

  • About 0.1% of all people are infected
  • Test detects all infections
  • Test is highly specific: 1% false positive
  • You test positive. What is the probability you have the disease?


Bonus: what is probability an average med student gets this question wrong?
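The slide's numbers (0.1% prevalence, a test that detects all infections, 1% false positives) plug directly into Bayes rule. A short sketch of the computation:

```python
# Rare-disease example from the slide:
p_d = 0.001               # P(disease): about 0.1% of people are infected
p_pos_given_d = 1.0       # P(+ | disease): test detects all infections
p_pos_given_not_d = 0.01  # P(+ | ~disease): 1% false positives

# Bayes rule, with the law of total probability in the denominator.
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
p_d_given_pos = p_pos_given_d * p_d / p_pos

print(p_d_given_pos)  # ~0.091: under 10%, despite the positive test
```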

SLIDE 16

Follow-up test

  • Test 2: detects 90% of infections, 5% false positives
  • P(+disease | +test1, +test2) =
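Assuming the two tests are conditionally independent given disease status (which the slide's setup implies but does not state), the second result can be folded in by reusing the posterior from test 1 as the prior for test 2:

```python
def bayes_update(prior, p_pos_given_d, p_pos_given_not_d):
    """Posterior P(disease | +) from a prior and the test's characteristics."""
    evidence = p_pos_given_d * prior + p_pos_given_not_d * (1 - prior)
    return p_pos_given_d * prior / evidence

# Test 1: detects all infections, 1% false positives; prior 0.1%.
post1 = bayes_update(0.001, 1.0, 0.01)   # ~0.091
# Test 2: detects 90% of infections, 5% false positives; assumes the tests
# are conditionally independent given disease status.
post2 = bayes_update(post1, 0.90, 0.05)  # ~0.64
print(post1, post2)
```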

SLIDE 17

Using Bayes rule

$$$

SLIDE 19

Independence

SLIDE 20

Conditional independence

SLIDE 21


London taxi drivers: a survey pointed out a positive and significant correlation between the number of accidents and wearing coats. It concluded that coats could hinder drivers' movements and be the cause of accidents, and a new law was prepared to prohibit drivers from wearing coats when driving. Finally, another study pointed out that people wear coats when it rains…

Conditionally Independent


Slide credit: Barnabas; humor credit: xkcd
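The taxi story can be checked numerically. Below is a made-up joint distribution (all numbers are mine, purely for illustration) in which coats and accidents are each driven by rain and are conditionally independent given rain, yet are correlated marginally:

```python
# Hypothetical P(rain), P(coat | rain), P(accident | rain).
p_rain = 0.3
p_coat_given = {True: 0.9, False: 0.1}
p_acc_given = {True: 0.2, False: 0.02}

def joint(rain, coat, acc):
    """P(rain, coat, accident); conditional independence given rain is built in."""
    pr = p_rain if rain else 1 - p_rain
    pc = p_coat_given[rain] if coat else 1 - p_coat_given[rain]
    pa = p_acc_given[rain] if acc else 1 - p_acc_given[rain]
    return pr * pc * pa

# Marginally, P(accident | coat) != P(accident): coats "predict" accidents,
# even though rain is doing all the work.
p_coat = sum(joint(r, True, a) for r in (True, False) for a in (True, False))
p_acc = sum(joint(r, c, True) for r in (True, False) for c in (True, False))
p_acc_and_coat = sum(joint(r, True, True) for r in (True, False))
print(p_acc_and_coat / p_coat, p_acc)  # the two numbers differ: marginally dependent
```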

SLIDE 22

Samples

SLIDE 23

Recall: spam filtering

SLIDE 24

Bag of words

SLIDE 25

A ridiculously naive assumption

  • Assume:
  • Clearly false:
  • Given this assumption, use Bayes rule

SLIDE 26

Graphical model

[Figure: naive Bayes graphical model: class node "spam" with arrows to word features x1, x2, …, xn; plate notation spam → xi, i = 1..n]

SLIDE 27

Naive Bayes

  • P(spam | email ∧ award ∧ program ∧ for ∧ internet ∧ users ∧ lump ∧ sum ∧ of ∧ Five ∧ Million)

SLIDE 28

In log space

zspam = ln(P(email | spam) P(award | spam) ... P(Million | spam) P(spam))
z~spam = ln(P(email | ~spam) ... P(Million | ~spam) P(~spam))

SLIDE 29

Collect terms

zspam = ln(P(email | spam) P(award | spam) ... P(Million | spam) P(spam))
z~spam = ln(P(email | ~spam) ... P(Million | ~spam) P(~spam))
z = zspam – z~spam
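The two scores can be computed with a handful of word likelihoods (the probabilities below are hypothetical, for illustration only); working in log space avoids underflow when many small factors multiply:

```python
import math

# Hypothetical per-word likelihoods P(word | class) and prior P(spam).
p_word_spam = {"email": 0.05, "award": 0.02, "million": 0.03}
p_word_ham  = {"email": 0.04, "award": 0.001, "million": 0.0005}
p_spam = 0.4

def log_score(words, p_word, prior):
    # z = ln( P(w1 | class) * ... * P(wn | class) * P(class) ),
    # computed as a sum of logs rather than a product of probabilities.
    return sum(math.log(p_word[w]) for w in words) + math.log(prior)

words = ["email", "award", "million"]
z_spam = log_score(words, p_word_spam, p_spam)
z_ham  = log_score(words, p_word_ham, 1 - p_spam)
z = z_spam - z_ham  # z > 0 means "spam" is the more probable class
print(z > 0)  # True for these made-up numbers
```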

SLIDE 30

Linear discriminant

SLIDE 31

Intuitions

SLIDE 32

How to get probabilities?

SLIDE 33

Improvements
