Learning From Data Lecture 4 Real Learning is Feasible

Real Learning vs. Verification The Two Step Solution to Learning Closer to Reality: Error and Noise

  • M. Magdon-Ismail

CSCI 4100/6100

recap: Verification

[Figure: a hypothesis h on (Age, Income) space with out-of-sample error Eout(h); a data set D gives in-sample error Ein(h) = 2/9.]

Hoeffding: Eout(h) ≈ Ein(h) (with high probability)

P[|Ein(h) − Eout(h)| > ε] ≤ 2e^(−2ε²N).
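The single-hypothesis bound can be checked numerically. A minimal sketch, assuming a fixed hypothesis that misclassifies each point independently with probability μ = Eout(h) = 0.3 (the values of μ, N, and ε are illustrative, not from the slides):

```python
import random
from math import exp

def hoeffding_check(mu=0.3, N=100, eps=0.1, trials=20000, seed=0):
    """Empirical frequency of |Ein - Eout| > eps for one fixed
    hypothesis whose true (out-of-sample) error is mu."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # Ein: fraction of N i.i.d. points the hypothesis gets wrong
        e_in = sum(rng.random() < mu for _ in range(N)) / N
        if abs(e_in - mu) > eps:
            bad += 1
    return bad / trials

freq = hoeffding_check()
bound = 2 * exp(-2 * 100 * 0.1 ** 2)   # 2 e^(-2 N eps^2), about 0.27
```

The empirical frequency comes out well under the bound; Hoeffding is loose but distribution-free.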

© AM L Creator: Malik Magdon-Ismail

Real Learning – Finite Learning Models

[Figure: hypotheses h1, h2, h3, …, hM, each a classifier on (Age, Income) space with its own out-of-sample error Eout(h1), Eout(h2), Eout(h3), …, Eout(hM); on the same data set D they score Ein(h1) = 2/9, Ein(h2) = 0, Ein(h3) = 5/9, …, Ein(hM) = 6/9.]

Pick the hypothesis with minimum Ein; will Eout be small?
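Why this question is dangerous can be seen in a small simulation. A sketch under assumed values (M = 1000 hypotheses, all with true error 0.5, so none of them is any good, and N = 9 points as in the samples above):

```python
import random

def pick_min_ein(M=1000, N=9, mu=0.5, seed=1):
    """Sketch: M hypotheses, each with true error mu (all equally bad).
    Pick the one with minimum in-sample error on N points and
    report its Ein -- it looks far better than mu."""
    rng = random.Random(seed)
    eins = [sum(rng.random() < mu for _ in range(N)) / N for _ in range(M)]
    return min(eins)

best_ein = pick_min_ein()
# Every hypothesis has Eout = 0.5, yet the selected Ein is near 0:
# this is why the finite-H bound must pay the factor |H|.
```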


recap: 1000 Monkeys Behind Closed Doors

5-question A/B test. Monkeys answer randomly. Child gets all right.

Door:   1  2  3  4  5  6  …  1000  1001
Wrong:  2  3  0  5  0  4  …  1     3

  • What are your chances of picking the child?
  • What can you do about it? (You can’t peek behind the door.)

More Monkeys: Ein Can’t Reach Out to Eout.


recap: Selection Bias Illustrated with Coins

Coin tossing example:

  • If we toss one coin N times and get no heads, it's very surprising:

P = 1/2^N.

We suspect the coin is biased: P[heads] ≈ 0.

  • Toss 70 coins N times each, and find one with no heads. Is it surprising?

P = 1 − (1 − 1/2^N)^70.

Do we expect P[heads] ≈ 0 for the selected coin? Similar to the “birthday problem”: among 30 people, two will likely share the same birthday.

  • This is called selection bias.

Selection bias is a very serious trap, for example in medical screening.

Search Causes Selection Bias
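The two probabilities can be computed directly; N = 10 tosses per coin is an assumed value for illustration:

```python
# Chance that at least one of 70 fair coins shows no heads in N = 10 tosses
p_one = 1 / 2**10                   # a single coin: all tails, ~0.001
p_any = 1 - (1 - p_one) ** 70       # at least one coin out of 70, ~0.066
print(f"{p_one:.5f}  {p_any:.5f}")
```

A 1-in-1000 event for one coin becomes a 1-in-15 event once you search over 70 coins.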


Hoeffding says that Ein(g) ≈ Eout(g) for Finite H

P[|Ein(g) − Eout(g)| > ε] ≤ 2|H|e^(−2ε²N), for any ε > 0.
P[|Ein(g) − Eout(g)| ≤ ε] ≥ 1 − 2|H|e^(−2ε²N), for any ε > 0.

We don’t care how g was obtained, as long as it is from H.

Some Basic Probability (events A, B):
  • Implication: if A ⇒ B (A ⊆ B), then P[A] ≤ P[B].
  • Union bound: P[A or B] = P[A ∪ B] ≤ P[A] + P[B].
  • Bayes’ rule: P[A|B] = P[B|A] · P[A] / P[B].

Proof: Let M = |H|. The event “|Ein(g) − Eout(g)| > ε” implies

“|Ein(h1) − Eout(h1)| > ε” OR … OR “|Ein(hM) − Eout(hM)| > ε”.

So, by the implication and union bounds:

P[|Ein(g) − Eout(g)| > ε] ≤ P[OR_{m=1}^{M} |Ein(hm) − Eout(hm)| > ε]
                          ≤ Σ_{m=1}^{M} P[|Ein(hm) − Eout(hm)| > ε]
                          ≤ 2Me^(−2ε²N).

(The last inequality follows by applying the single-hypothesis Hoeffding bound to each summand.)
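The union bound can also be checked numerically. A sketch with illustrative values (M = 50, N = 100, ε = 0.2, and, for simplicity, hypotheses whose in-sample errors are drawn independently; the union bound itself requires no independence):

```python
import random
from math import exp

def union_bound_check(M=50, N=100, mu=0.5, eps=0.2, trials=2000, seed=2):
    """Empirical frequency that ANY of M hypotheses has
    |Ein - Eout| > eps, versus the bound 2*M*exp(-2*eps^2*N)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        for _ in range(M):
            e_in = sum(rng.random() < mu for _ in range(N)) / N
            if abs(e_in - mu) > eps:
                bad += 1
                break  # one bad hypothesis makes the whole trial bad
    return bad / trials, 2 * M * exp(-2 * eps**2 * N)

freq, bound = union_bound_check()
```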


Interpreting the Hoeffding Bound for Finite |H|

P[|Ein(g) − Eout(g)| > ε] ≤ 2|H|e^(−2ε²N), for any ε > 0.
P[|Ein(g) − Eout(g)| ≤ ε] ≥ 1 − 2|H|e^(−2ε²N), for any ε > 0.

Theorem. With probability at least 1 − δ,

Eout(g) ≤ Ein(g) + √( (1/(2N)) log(2|H|/δ) ).

We don’t care how g was obtained, as long as g ∈ H.

Proof: Let δ = 2|H|e^(−2ε²N). Then P[|Ein(g) − Eout(g)| ≤ ε] ≥ 1 − δ. In words, with probability at least 1 − δ, |Ein(g) − Eout(g)| ≤ ε, which implies Eout(g) ≤ Ein(g) + ε. From the definition of δ, solve for ε:

ε = √( (1/(2N)) log(2|H|/δ) ).
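The theorem's error bar is easy to evaluate. A sketch with assumed values N = 1000, |H| = 100, δ = 0.05:

```python
from math import log, sqrt

def error_bar(N, M, delta=0.05):
    """Hoeffding error bar: with probability >= 1 - delta,
    Eout(g) <= Ein(g) + sqrt(log(2*M/delta) / (2*N))."""
    return sqrt(log(2 * M / delta) / (2 * N))

# N = 1000 examples, |H| = 100 hypotheses, 95% confidence
eps = error_bar(1000, 100)
print(f"{eps:.4f}")  # about 0.0644
```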


Ein Reaches Outside to Eout when |H| is Small

Eout(g) ≤ Ein(g) + √( (1/(2N)) log(2|H|/δ) ).

If N ≫ ln |H|, then Eout(g) ≈ Ein(g).

  • Does not depend on X, P(x), f, or how g is found.
  • Only requires P(x) to generate the data points, and the test point, independently.

What about Eout ≈ 0?


The 2 Step Approach to Getting Eout ≈ 0:

(1) Eout(g) ≈ Ein(g).
(2) Ein(g) ≈ 0.

Together, these ensure Eout ≈ 0.

How do we verify (1), since we do not know Eout? We must ensure it theoretically: Hoeffding.

We can ensure (2) directly (for example with the PLA), modulo that we can guarantee (1).

There is a tradeoff:

  • Small |H| ⇒ Ein ≈ Eout.
  • Large |H| ⇒ Ein ≈ 0 is more likely.

[Figure: error versus |H|. The in-sample error decreases as |H| grows; the model-complexity penalty √( (1/(2N)) log(2|H|/δ) ) increases; the out-of-sample error bound, their sum, is minimized at an intermediate |H|∗.]
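The tradeoff can be made concrete by minimizing the bound over |H|. The 1/|H| decay of Ein below is a stylized assumption purely for illustration, not anything the slides claim:

```python
from math import log, sqrt

def bound(M, N=1000, delta=0.05):
    """Stylized out-of-sample bound: an assumed in-sample error that
    shrinks like 1/|H|, plus the Hoeffding complexity penalty."""
    e_in = 1.0 / M                              # assumption: Ein ~ 1/|H|
    penalty = sqrt(log(2 * M / delta) / (2 * N))
    return e_in + penalty

# The sum is minimized at an intermediate |H|*, as in the figure
best_M = min(range(1, 2001), key=bound)
```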


Feasibility of Learning (Finite Models)

  • No Free Lunch: can’t know anything outside D, for sure.
  • Can “learn” with high probability if D is i.i.d. from P(x).

Eout ≈ Ein (Ein can reach outside the data set to Eout).

  • We want Eout ≈ 0.
  • The two step solution. We trade Eout ≈ 0 for 2 goals:

(i) Eout ≈ Ein; (ii) Ein ≈ 0. We know Ein, not Eout, but we can ensure (i) if |H| is small. This is a big step!

  • What about infinite H, e.g. the perceptron?


“Complex” Target Functions are Harder to Learn

What happened to the “difficulty” (complexity) of f?

  • Simple f ⇒ can use small H to get Ein ≈ 0 (need smaller N).
  • Complex f ⇒ need large H to get Ein ≈ 0 (need larger N).


Revising the Learning Problem – Adding in Probability

[Diagram: the unknown target function f : X → Y and the unknown input distribution P(x) generate the training examples (x1, y1), (x2, y2), …, (xN, yN), with yn = f(xn); the learning algorithm A selects the final hypothesis g from the hypothesis set H, aiming for g(x) ≈ f(x).]


Error and Noise

Error Measure: how to quantify that h ≈ f. Noise: yn ≠ f(xn).


Finger Print Recognition

f      +1 you −1 intruder

Two types of error. f +1 −1 h +1 no error false accept −1 false reject no error

In any application you need to think about how to penalize each type of error. f +1 −1 h +1 1 −1 10 f +1 −1 h +1 1000 −1 1 Supermarket CIA

Take Away Error measure is specified by the user. If not, choose one that is – plausible (conceptually appealing) – friendly (practically appealing)


Almost All Error Measures are Pointwise

Compare h and f on individual points x using a pointwise error e(h(x), f(x)):

Binary error: e(h(x), f(x)) = [h(x) ≠ f(x)]   (classification)
Squared error: e(h(x), f(x)) = (h(x) − f(x))²   (regression)

In-sample error: Ein(h) = (1/N) Σ_{n=1}^{N} e(h(xn), f(xn)).
Out-of-sample error: Eout(h) = Ex[e(h(x), f(x))].
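These definitions translate directly to code; the toy values of h and f below are hypothetical:

```python
def binary_error(hx, fx):
    """e(h(x), f(x)) = 1 if h(x) != f(x) else 0  (classification)."""
    return int(hx != fx)

def squared_error(hx, fx):
    """e(h(x), f(x)) = (h(x) - f(x))^2  (regression)."""
    return (hx - fx) ** 2

def e_in(h_vals, f_vals, e):
    """Ein(h) = (1/N) * sum_n e(h(x_n), f(x_n))."""
    return sum(e(hx, fx) for hx, fx in zip(h_vals, f_vals)) / len(h_vals)

h_vals = [+1, -1, +1, +1]
f_vals = [+1, +1, +1, -1]
print(e_in(h_vals, f_vals, binary_error))  # 2 of 4 points disagree -> 0.5
```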


Noisy Targets

age: 32 years; gender: male; salary: 40,000; debt: 26,000; years in job: 1; years at home: 3; … → Approve for credit?

Consider two customers with the same credit data. They can have different behaviors.

The target ‘function’ is not a deterministic function but a stochastic one:

‘f(x)’ = P(y|x)
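A stochastic target can be simulated directly; the approval probability P(y = +1 | x) = 0.8 below is a hypothetical value:

```python
import random

# Stochastic target: identical inputs x can yield different outcomes y.
# P(y = +1 | x) = 0.8 is a hypothetical approval probability.
def sample_y(rng, p_plus=0.8):
    return +1 if rng.random() < p_plus else -1

rng = random.Random(3)
ys = [sample_y(rng) for _ in range(10000)]
frac = ys.count(+1) / len(ys)   # fraction approved, close to 0.8
```

Two customers with the same x are two independent draws from P(y|x), which is exactly how they can behave differently.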


Learning Setup with Error Measure and Noisy Targets

[Diagram: an unknown target distribution P(y | x) (target function f plus noise) and an unknown input distribution P(x) generate the training examples (x1, y1), (x2, y2), …, (xN, yN), with yn ∼ P(y | xn); the learning algorithm A, guided by an error measure, selects the final hypothesis g from the hypothesis set H, aiming for g(x) ≈ f(x).]
