Lecture 10: Review. S. Cheng (OU-Tulsa), October 17, 2017. PowerPoint presentation, 28 slides.


slide-1
SLIDE 1

Lecture 10

Review

  • S. Cheng (OU-Tulsa)

October 17, 2017 1 / 28

slide-2
SLIDE 2

Lecture 10

Review

  • Conditioning reduces entropy: H(X|Y) ≤ H(X)
  • Chain rules:

H(X, Y, Z) = H(X) + H(Y|X) + H(Z|X, Y)
H(X, Y, U|V) = H(X|V) + H(Y|X, V) + H(U|Y, X, V)
I(X, Y, Z; U) = I(X; U) + I(Y; U|X) + I(Z; U|X, Y)
I(X, Y, Z; U|V) = I(X; U|V) + I(Y; U|V, X) + I(Z; U|V, X, Y)

  • Data processing inequality: if X⊥Z|Y (i.e., X → Y → Z form a Markov chain), then I(X; Y) ≥ I(X; Z)
  • Independence and mutual information:

X⊥Y ⇔ I(X; Y) = 0
X⊥Y|Z ⇔ I(X; Y|Z) = 0

  • KL-divergence: KL(p||q) = Σ_x p(x) log [p(x)/q(x)]

  • S. Cheng (OU-Tulsa)

October 17, 2017 2 / 28
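
The identities above are easy to check numerically. Below is a small Python sketch (the joint pmf is made up purely for illustration) that computes H(X), H(X|Y), I(X;Y) and KL(p||q) for discrete distributions, and confirms that conditioning reduces entropy and that I(X;Y) equals the KL divergence between the joint and the product of the marginals:

```python
import numpy as np

def entropy(p):
    """H(p) in bits; ignores zero-probability entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def kl(p, q):
    """KL(p||q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# A made-up joint pmf p(x, y) on a 2x3 alphabet (rows: x, columns: y)
pxy = np.array([[0.10, 0.20, 0.10],
                [0.25, 0.05, 0.30]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

H_X = entropy(px)
H_X_given_Y = entropy(pxy.ravel()) - entropy(py)   # chain rule: H(X,Y) = H(Y) + H(X|Y)
I_XY = H_X - H_X_given_Y                           # mutual information

print(f"H(X) = {H_X:.3f},  H(X|Y) = {H_X_given_Y:.3f}  (conditioning reduces entropy)")
print(f"I(X;Y) = {I_XY:.3f}")
# I(X;Y) is also the KL divergence between the joint and the product of marginals
print(f"KL(p(x,y) || p(x)p(y)) = {kl(pxy.ravel(), np.outer(px, py).ravel()):.3f}")
```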

slide-12
SLIDE 12

Lecture 10 Overview

This time

  • Identification/Decision trees
  • Random forests
  • Law of Large Numbers
  • Asymptotic equipartition (AEP) and typical sequences

  • S. Cheng (OU-Tulsa)

October 17, 2017 3 / 28

slide-13
SLIDE 13

Lecture 10 Identification/Decision tree

Vampire database

(https://www.youtube.com/watch?v=SXBG3RGr Rc)

  • S. Cheng (OU-Tulsa)

October 17, 2017 4 / 28

slide-14
SLIDE 14

Lecture 10 Identification/Decision tree

Identifying vampire

Goal: Design a set of tests to identify vampires

Potential difficulties:
  • Non-numerical data
  • Some information may not matter
  • Some may matter only sometimes
  • Tests may be costly ⇒ conduct as few as possible

  • S. Cheng (OU-Tulsa)

October 17, 2017 5 / 28

slide-15
SLIDE 15

Lecture 10 Identification/Decision tree

Test trees

[Tree diagrams of the four candidate tests applied to the database: Shadow (branches ?, Y, N), Garlic (Y, N), Complexion (A, P, R) and Accent (N, H, O), with the individuals reaching each leaf marked + : Vampire, − : Not vampire]

How to pick a good test? Pick the test that identifies the most vampires (and non-vampires)!

  • S. Cheng (OU-Tulsa)

October 17, 2017 6 / 28

slide-18
SLIDE 18

Lecture 10 Identification/Decision tree

Sizes of homogeneous sets

[Same four test trees as before: Shadow (?, Y, N), Garlic (Y, N), Complexion (A, P, R), Accent (N, H, O); + : Vampire, − : Not vampire]

Number of individuals each test places into homogeneous (single-class) sets:
  • Shadow: 4
  • Garlic: 3
  • Complexion: 2
  • Accent: 0

  • S. Cheng (OU-Tulsa)

October 17, 2017 7 / 28

slide-20
SLIDE 20

Lecture 10 Identification/Decision tree

Picking second test

Let's say we pick “shadow” as the first test after all. Then, for the remaining unclassified individuals (those in the “?” branch):

[Tree diagrams of the remaining tests applied to those individuals: Garlic (Y, N), Complexion (A, P, R), Accent (N, H, O)]

Sizes of homogeneous sets:
  • Garlic: 4
  • Complexion: 2
  • Accent: 0

  • S. Cheng (OU-Tulsa)

October 17, 2017 8 / 28

slide-21
SLIDE 21

Lecture 10 Identification/Decision tree

Combined tests

[Combined tree: the Shadow test is applied first; its Y and N branches are already homogeneous (not vampire / vampire), and its "?" branch is passed to the Garlic test, whose Y and N branches are likewise homogeneous (not vampire / vampire)]

Problem
When our database grows, no single test is likely to completely separate vampires from non-vampires, so every test will then score 0 under the counting criterion above.

Entropy comes to the rescue!

  • S. Cheng (OU-Tulsa)

October 17, 2017 9 / 28

slide-24
SLIDE 24

Lecture 10 Identification/Decision tree

Conditional entropy as a measure of test efficiency

Consider the database as randomly sampled from a distribution. A set that is
  • very homogeneous ≈ high certainty
  • not so homogeneous ≈ high randomness
and this can be measured with entropy.

[Shadow test tree: 8 individuals; the "?" branch holds 4 of them (2 vampires, 2 non-vampires), while the Y branch (3 individuals) and the N branch (1 individual) are homogeneous]

H(V|S = ?) = 1, H(V|S = Y) = 0, H(V|S = N) = 0

Remaining uncertainty given the test:

(4/8) H(V|S = ?) + (3/8) H(V|S = Y) + (1/8) H(V|S = N) = 0.5
= Pr(S = ?) H(V|S = ?) + Pr(S = Y) H(V|S = Y) + Pr(S = N) H(V|S = N)
= H(V|S)

[Plot: binary entropy as a function of Pr(Head)]

  • S. Cheng (OU-Tulsa)

October 17, 2017 10 / 28
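
The weighted average above is easy to reproduce. A minimal sketch, assuming the split implied by the slide (the "?" branch holds 4 of the 8 individuals, 2 vampires and 2 non-vampires, and the Y and N branches are homogeneous):

```python
from math import log2

def binary_entropy(p):
    """Entropy of a Bernoulli(p) variable in bits (the curve in the plot above)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Shadow test: the "?" branch holds 4 people (2 vampires, 2 not); Y and N are homogeneous
H_given_branch = {"?": binary_entropy(2 / 4), "Y": 0.0, "N": 0.0}
weight = {"?": 4 / 8, "Y": 3 / 8, "N": 1 / 8}

H_V_given_S = sum(weight[b] * H_given_branch[b] for b in weight)
print(H_V_given_S)   # 0.5, matching H(V|S) on the slide
```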

slide-30
SLIDE 30

Lecture 10 Identification/Decision tree

Remaining uncertainty

[Trees for the remaining tests, annotated with the entropy of each branch: Garlic: H(V|G = Y) = 0, H(V|G = N) = 0.97; Complexion: branch entropies 0.92, 0, 0.92; Accent: branch entropies 0.92, 0.92, 1]

H(V|S) = 0.5
H(V|G) = (3/8) · 0 + (5/8) · 0.97 = 0.61
H(V|C) = (3/8) · 0.92 + (2/8) · 0 + (3/8) · 0.92 = 0.69
H(V|A) = (3/8) · 0.92 + (3/8) · 0.92 + (2/8) · 1 = 0.94

Order of tests to pick: S ≻ G ≻ C ≻ A

  • S. Cheng (OU-Tulsa)

October 17, 2017 11 / 28
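
The same bookkeeping extends to all four tests. The per-branch (vampire, non-vampire) counts below are not given explicitly in the slides; they are an assumed reconstruction chosen only to reproduce the quoted branch entropies, so treat the table as illustrative:

```python
from math import log2

def entropy2(k, n):
    """Binary entropy (bits) of a branch with k vampires out of n individuals."""
    if n == 0 or k in (0, n):
        return 0.0
    p = k / n
    return -p * log2(p) - (1 - p) * log2(1 - p)

# (vampires, non-vampires) per branch; counts are assumed, chosen to match the slide's numbers
tests = {
    "Shadow":     {"?": (2, 2), "Y": (3, 0), "N": (0, 1)},
    "Garlic":     {"Y": (3, 0), "N": (2, 3)},
    "Complexion": {"A": (2, 1), "P": (2, 0), "R": (1, 2)},
    "Accent":     {"N": (2, 1), "H": (2, 1), "O": (1, 1)},
}

total = 8  # size of the database
remaining = {}
for name, branches in tests.items():
    H = 0.0
    for vamp, non in branches.values():
        m = vamp + non
        H += (m / total) * entropy2(vamp, m)   # Pr(branch) * H(V | branch)
    remaining[name] = H

for name, H in sorted(remaining.items(), key=lambda kv: kv[1]):
    print(f"H(V|{name[0]}) = {H:.2f}")
# Prints roughly 0.50, 0.61, 0.69, 0.94, i.e. the order S > G > C > A
```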

slide-35
SLIDE 35

Lecture 10 Identification/Decision tree

Potential extensions

  • The test does not need to return a discrete result. Let X be the test outcome; it can be continuous as well
  • We should simply pick the test i such that H(V|X_i) is as small as possible
  • Equivalently, I(V; X_i) = H(V) − H(V|X_i) should be as large as possible. This is intuitive because we want to pick the attribute that is most relevant to V (shares the most information with it)
  • Build a number of trees instead of a single tree ⇒ random forests

  • S. Cheng (OU-Tulsa)

October 17, 2017 12 / 28
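
To see the equivalence numerically, the H(V|X_i) values above can be reused. Assuming the database splits 5 vampires to 3 non-vampires (an assumption consistent with the branch entropies quoted earlier), H(V) ≈ 0.95 bits and the information gains I(V;X_i) preserve the same ordering:

```python
from math import log2

# Assumed prior: 5 vampires out of 8 (consistent with the branch entropies above)
H_V = -(5/8) * log2(5/8) - (3/8) * log2(3/8)              # ≈ 0.95 bits
remaining = {"S": 0.50, "G": 0.61, "C": 0.69, "A": 0.94}  # H(V|X_i) from the previous slide

for name, H_cond in remaining.items():
    print(f"I(V;{name}) = {H_V - H_cond:.2f}")
# Largest mutual information corresponds to smallest remaining uncertainty: same ranking S > G > C > A
```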

slide-39
SLIDE 39

Lecture 10 Identification/Decision tree

Random forests

  • Pick random subsets of the training samples
  • Train a tree on each random subset, but limit each tree to a subset of the features/attributes
  • Given a test sample:
      • Classify the sample using each of the trees
      • Make the final decision based on a majority vote

  • S. Cheng (OU-Tulsa)

October 17, 2017 13 / 28
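
As a concrete, library-based illustration (not part of the lecture itself), scikit-learn's RandomForestClassifier follows this recipe: bootstrap resampling of the training data, a random feature subset at each split, entropy-based splitting if requested, and aggregation of the trees at prediction time. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the vampire database (binary labels, a handful of attributes)
X, y = make_classification(n_samples=500, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,       # number of trees
    criterion="entropy",    # split by information gain, as in the slides
    max_features="sqrt",    # each split only sees a random subset of the attributes
    bootstrap=True,         # each tree trains on a random resample of the data
    random_state=0,
)
forest.fit(X_train, y_train)

# score() aggregates the trees' predictions (scikit-learn averages their class probabilities)
print("test accuracy:", forest.score(X_test, y_test))
```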

slide-40
SLIDE 40

Lecture 10 Law of Large Numbers

Law of Large Numbers (LLN)

If we randomly sample x_1, x_2, · · · , x_N from an i.i.d. (independent and identically distributed) source, the average of f(x_i) approaches the expected value as N → ∞. That is,

(1/N) Σ_{i=1}^N f(x_i) → E[f(X)] as N → ∞

Example
This is precisely how polling is supposed to work! A pollster randomly draws a sample from a portion of the population but expects the prediction to match the outcome.

  • S. Cheng (OU-Tulsa)

October 17, 2017 14 / 28
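
A quick simulation illustrates the statement; the source and the function f are made up (X ~ Uniform(0,1), f(x) = x²), but any i.i.d. source with finite variance behaves the same way:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 2                  # any fixed function of the sample
true_mean = 1.0 / 3.0                 # E[X^2] for X ~ Uniform(0, 1)

for N in (10, 100, 10_000, 1_000_000):
    x = rng.uniform(0.0, 1.0, size=N)          # i.i.d. samples
    print(N, abs(f(x).mean() - true_mean))     # gap between sample average and E[f(X)]
# The gap shrinks (typically like 1/sqrt(N)) as the LLN predicts
```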

slide-42
SLIDE 42

Lecture 10 Law of Large Numbers

Proof of LLN

The LLN is a rather strong result. We will only show a weak version here:

Pr( |(1/N) Σ_{i=1}^N f(X_i) − E[f(X)]| ≥ a ) ≤ Var(f(X)) / (N a²) ∝ 1/N

Markov's Inequality
Pr(X ≥ b) ≤ E[X]/b if X ≥ 0

Proof: X = I(X ≥ b) · X + I(X < b) · X ≥ I(X ≥ b) · b ⇒ E[X] ≥ Pr(X ≥ b) · b

  • S. Cheng (OU-Tulsa)

October 17, 2017 15 / 28
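
A quick numerical sanity check of Markov's inequality, using an exponential random variable as an arbitrary non-negative example:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)   # non-negative samples, E[X] = 1
b = 3.0
print("Pr(X >= b) ≈", (x >= b).mean(), "  bound E[X]/b =", x.mean() / b)
# Empirically about 0.05, comfortably below the Markov bound of about 0.33
```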

slide-46
SLIDE 46

Lecture 10 Law of Large Numbers

Proof of LLN

Markov's Inequality
Pr(X ≥ b) ≤ E[X]/b if X ≥ 0

Chebyshev's Inequality
Pr(|Y − E[Y]| ≥ a) ≤ Var(Y)/a²

Proof: Take X = |Y − E[Y]|² and b = a². By Markov's Inequality,

Pr(|Y − E[Y]| ≥ a) = Pr(|Y − E[Y]|² ≥ a²) ≤ E[|Y − E[Y]|²]/a² = Var(Y)/a²

  • S. Cheng (OU-Tulsa)

October 17, 2017 16 / 28

slide-50
SLIDE 50

Lecture 10 Law of Large Numbers

Proof of LLN

Chebyshev's Inequality
Pr(|Y − E[Y]| ≥ a) ≤ Var(Y)/a²

Proof of weak LLN
Let Z = (1/N) Σ_{i=1}^N f(X_i). Clearly E[Z] = E[f(X)], and

Var(Z) = (1/N²) Σ_{i=1}^N Var(f(X)) = Var(f(X))/N

By Chebyshev's Inequality,

Pr( |(1/N) Σ_{i=1}^N f(X_i) − E[f(X)]| ≥ a ) = Pr(|Z − E[Z]| ≥ a) ≤ Var(Z)/a² = Var(f(X))/(N a²)

  • S. Cheng (OU-Tulsa)

October 17, 2017 17 / 28
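
The final bound can also be checked empirically: estimate Pr(|Z − E[Z]| ≥ a) by forming many independent N-sample averages and compare against Var(f(X))/(N a²). The source below (Uniform(0,1) with f(x) = x) is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
N, a, trials = 200, 0.05, 20_000
mean, var = 0.5, 1.0 / 12.0            # E[X] and Var(X) for X ~ Uniform(0, 1), f(x) = x

Z = rng.uniform(0.0, 1.0, size=(trials, N)).mean(axis=1)   # many independent N-sample averages
empirical = np.mean(np.abs(Z - mean) >= a)
bound = var / (N * a ** 2)

print(f"empirical Pr = {empirical:.4f}   Chebyshev bound = {bound:.4f}")
# The empirical probability sits below the (often loose) Chebyshev bound
```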

slide-53
SLIDE 53

Lecture 10 Asymptotic equipartition

Main idea

Consider a sequence of symbols x_1, x_2, · · · , x_N sampled from a DMS, and consider the sample average of the log-probabilities of the sampled symbols:

(1/N) Σ_{i=1}^N log(1/p(x_i)) → E[log(1/p(X))] = H(X)

by LLN. But for the LHS,

(1/N) Σ_{i=1}^N log(1/p(x_i)) = (1/N) log(1/Π_{i=1}^N p(x_i)) = −(1/N) log p(x^N), where x^N = x_1, x_2, · · · , x_N

Rearranging the terms, this implies that for any sequence sampled from the source, the probability of the sampled sequence satisfies p(x^N) → 2^{−N H(X)}!

  • S. Cheng (OU-Tulsa)

October 17, 2017 18 / 28
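
A numerical illustration with an assumed Bernoulli(0.3) source: the per-symbol quantity −(1/N) log₂ p(x^N) concentrates around H(X) ≈ 0.881 bits as N grows:

```python
import numpy as np

p = 0.3                                          # Pr(symbol == 1) for the DMS
H = -p * np.log2(p) - (1 - p) * np.log2(1 - p)   # ≈ 0.881 bits

rng = np.random.default_rng(3)
for N in (10, 100, 10_000):
    x = rng.random(N) < p                        # one sampled sequence x^N
    k = x.sum()                                  # number of ones
    # log2 p(x^N) = k*log2(p) + (N-k)*log2(1-p) for an i.i.d. source
    per_symbol = -(k * np.log2(p) + (N - k) * np.log2(1 - p)) / N
    print(N, per_symbol, "vs H(X) =", round(H, 3))
# -(1/N) log2 p(x^N) -> H(X), i.e. p(x^N) ≈ 2^(-N H(X)) for the sequences you actually see
```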

slide-57
SLIDE 57

Lecture 10 Asymptotic equipartition

Set of typical sequences

Let's call a sequence x^N with p(x^N) ∼ 2^{−N H(X)} typical, and define the set of typical sequences

A^N_ε(X) = {x^N | 2^{−N(H(X)+ε)} ≤ p(x^N) ≤ 2^{−N(H(X)−ε)}}

For any ε > 0, we can find a sufficiently large N such that a sampled sequence from the source is typical with probability arbitrarily close to 1

Since all typical sequences have probability ∼ 2^{−N H(X)} and they fill up essentially the entire probability space, there should be approximately 1 / 2^{−N H(X)} = 2^{N H(X)} typical sequences

  • S. Cheng (OU-Tulsa)

October 17, 2017 19 / 28
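
For a Bernoulli source the typical set can be enumerated exactly, because p(x^N) depends only on the number of ones in the sequence. A sketch with assumed parameters (p = 0.3, N = 100, ε = 0.1):

```python
from math import comb, log2

p, N, eps = 0.3, 100, 0.1                 # assumed source, block length and tolerance
H = -p * log2(p) - (1 - p) * log2(1 - p)  # ≈ 0.881 bits/symbol

size, prob = 0, 0.0
for k in range(N + 1):                    # k = number of ones in the sequence
    logp = k * log2(p) + (N - k) * log2(1 - p)    # log2 p(x^N) for any sequence with k ones
    if -N * (H + eps) <= logp <= -N * (H - eps):  # the typicality condition
        size += comb(N, k)
        prob += comb(N, k) * 2.0 ** logp

print(f"|A_eps| ≈ {size:.3g}   bounds: {2**(N*(H-eps)):.3g} ... {2**(N*(H+eps)):.3g}")
print(f"Pr(X^N typical) = {prob:.3f}")    # already above 0.9 for N = 100
```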

slide-60
SLIDE 60

Lecture 10 Asymptotic equipartition

Precise bounds on the size of the typical set

(1 − δ) 2^{N(H(X)−ε)} ≤ |A^N_ε(X)| ≤ 2^{N(H(X)+ε)}

Upper bound:

1 ≥ Pr(X^N ∈ A^N_ε(X)) = Σ_{x^N ∈ A^N_ε(X)} p(x^N) ≥ Σ_{x^N ∈ A^N_ε(X)} 2^{−N(H(X)+ε)} = |A^N_ε(X)| 2^{−N(H(X)+ε)}

Lower bound: for a sufficiently large N, we have

1 − δ ≤ Pr(X^N ∈ A^N_ε(X)) = Σ_{x^N ∈ A^N_ε(X)} p(x^N) ≤ Σ_{x^N ∈ A^N_ε(X)} 2^{−N(H(X)−ε)} = |A^N_ε(X)| 2^{−N(H(X)−ε)}

Rearranging the two chains gives the upper and lower bounds on |A^N_ε(X)| stated above

  • S. Cheng (OU-Tulsa)

October 17, 2017 20 / 28

slide-66
SLIDE 66

Lecture 10 Asymptotic equipartition

AEP

[Diagram of the probability space, labeled: "Set of typical sequences", "Sequences are equally probable", "Sequences that won't happen"]

Asymptotic equipartition refers to the fact that the probability space is equally partitioned by the typical sequences

  • S. Cheng (OU-Tulsa)

October 17, 2017 21 / 28

slide-67
SLIDE 67

Lecture 10 Asymptotic equipartition

AEP and compression limit

  • Consider coin flipping again; let's say Pr(Head) = 0.3 and N = 1000
  • The typical sequences will be those with approximately 300 heads and 700 tails
  • AEP (LLN) tells us that it is almost impossible to get, say, a sequence of 100 heads and 900 tails
  • AEP also tells us that the number of typical sequences is approximately 2^{N H(X)}
  • Therefore, we can simply assign an index to each typical sequence and ignore the rest. Then we only need log 2^{N H(X)} = N H(X) bits to store a sequence of N symbols, i.e., on average H(X) bits per symbol, as before!

  • S. Cheng (OU-Tulsa)

October 17, 2017 22 / 28
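
Putting numbers to this example: with Pr(Head) = 0.3, H(X) ≈ 0.881 bits, so indexing the typical set takes about 881 bits for N = 1000 flips instead of the 1000 bits of the raw sequence. A short check (the count of sequences with exactly 300 heads gives a comparable figure):

```python
from math import comb, log2

p, N = 0.3, 1000
H = -p * log2(p) - (1 - p) * log2(1 - p)

print(f"H(X) = {H:.3f} bits/symbol")
print(f"bits to index the typical set ≈ N*H(X) = {N * H:.0f} (vs {N} raw bits)")
# Sanity check: even just the sequences with exactly 300 heads need about this many bits
print(f"log2 C(1000, 300) = {log2(comb(1000, 300)):.0f} bits")
```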