
NIPS-2017

THEORETICAL IMPEDIMENTS TO MACHINE LEARNING

WITH SEVEN SPARKS FROM THE CAUSAL REVOLUTION

Judea Pearl

University of California, Los Angeles judea@cs.ucla.edu


OUTLINE

  • Model-blind machine learning is a curve-fitting exercise – slow and dumb

  • The Causal hierarchy
  • What we miss by depriving ML of causal models

  • The Seven Sparks of the Causal Revolution

CAUSAL MODELS AND THE COGNITIVE REVOLUTION
COUNTERFACTUALS: THE HOMO SAPIENS' SECRET

  • 10,000 years ago, human beings accounted for less than a tenth of 1 percent of all vertebrate life on planet Earth. Today, that percentage, including livestock and pets, is in the neighborhood of 98! (Daniel Dennett, 2006)
  • What Happened?
  • What computational facility did humans acquire 10,000 years ago that they did not possess before?

3-LEVEL HIERARCHY

  • 1. ASSOCIATION
ACTIVITY: Seeing, Observing
QUESTIONS: What if I see . . . ? (How are the variables related? How would seeing X change my belief in Y?)
EXAMPLES: What does a symptom tell me about a disease? What does a survey tell us about the election results?

  • 2. INTERVENTION
ACTIVITY: Doing, Intervening
QUESTIONS: What if I do . . . ? Why? (What would Y be if I do X? How can I make Y happen?)
EXAMPLES: If I take aspirin, will my headache be cured?

  • 3. COUNTERFACTUALS
ACTIVITY: Imagining, Retrospection
QUESTIONS: What if I had done . . . ? Why? (Was it X that caused Y? What if X had not occurred? What if I had acted differently?)
EXAMPLES: Was it the aspirin that stopped my headache? Would Kennedy be alive if Oswald had not killed him? What if I had not smoked the last 2 years?

PREDICTION, INTERVENTION, AND COUNTERFACTUALS

Questions:
  • 1. What is the expected value of the demand Q if the price is reported to be P = p0?
  • 2. What is the expected value of the demand Q if the price is set to P = p0?
  • 3. Given that the current price is P = p0, what would the expected value of the demand Q have been if we were to set the price at P = p1?

[Figure: supply-demand model. P – Price, Q – Demand, I – Income, W – Wages; exogenous terms U1, U2; structural coefficients b1, b2, d1, d2.]

E[Q | P = p0]        E[Q | do(P = p0)]        E[Q_{P=p1} | P = p0]
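
The three queries can be made concrete with a simulation. Below is a minimal numpy sketch of the linear supply-demand model suggested by the diagram (demand: Q = b1*P + d1*I + U1; supply: P = b2*Q + d2*W + U2); the numeric coefficients, noise distributions, and price values are illustrative assumptions, not taken from the slides.

# Minimal sketch: prediction vs. intervention vs. counterfactual in an
# assumed linear supply-demand SCM. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
b1, b2, d1, d2 = -1.0, 0.5, 0.8, 0.6   # assumed structural coefficients

i  = rng.normal(10, 1, n)   # income
w  = rng.normal(5, 1, n)    # wages
u1 = rng.normal(0, 1, n)    # unobserved demand shock
u2 = rng.normal(0, 1, n)    # unobserved supply shock

# Solve the simultaneous equations for the observed equilibrium:
p = (b2 * (d1 * i + u1) + d2 * w + u2) / (1 - b1 * b2)
q = b1 * p + d1 * i + u1

p0, p1 = 3.0, 4.0
at_p0 = np.abs(p - p0) < 0.05        # units observed near P = p0

# 1. Prediction E[Q | P = p0]: condition on *seeing* the price.
print("E[Q | P=p0]        ≈", q[at_p0].mean())

# 2. Intervention E[Q | do(P = p0)]: surgery removes the supply equation
#    and sets P = p0 for every unit.
print("E[Q | do(P=p0)]    ≈", (b1 * p0 + d1 * i + u1).mean())

# 3. Counterfactual E[Q_{P=p1} | P = p0]: abduce U1 for the units that
#    actually had P = p0, then set P = p1 and re-evaluate the demand eq.
u1_hat = q[at_p0] - b1 * p[at_p0] - d1 * i[at_p0]
print("E[Q_{P=p1} | P=p0] ≈", (b1 * p1 + d1 * i[at_p0] + u1_hat).mean())

The three printed values differ, which is the point: each query lives on a different rung of the hierarchy and, in general, requires different knowledge to answer.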


THE STRUCTURAL CAUSAL MODEL (SCM)
A BI-LINGUAL LOGIC FOR CAUSAL INFERENCE

[Figure: the SCM inference engine. Inputs: a causal diagram G, to specify what we know (Assumptions), and a query Q posed in counterfactual language, to specify what we wish to know (Queries). The engine derives an estimand E_G(Q), which, combined with data such as P(X,Y,Z), yields an estimate of Q, e.g. P(Y | do(x)).]

THE SEVEN PILLARS

Pillar 1: Transparency and Testability of Causal Assumptions
Pillar 2: The Control of Confounding
Pillar 3: The Algorithmization of Counterfactuals
Pillar 4: Mediation Analysis and the Assessment of Direct and Indirect Effects
Pillar 5: External Validity and Sample Selection Bias
Pillar 6: Missing Data (Karthika Mohan, 2017)
Pillar 7: Causal Discovery

PILLAR 1:

MEANINGFUL COMPACT REPRESENTATION FOR CAUSAL ASSUMPTIONS

Task: Represent causal knowledge in a compact, transparent, and testable way.
  • Are the assumptions plausible? Sufficient?
  • Are the assumptions compatible with the available data? If not, which needs repair?
Result: Transparency and testability galore. Graphical criteria tell us, for any pattern of paths, what pattern of dependencies we should expect in the data.
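
As a concrete illustration of such a graphical criterion, here is a minimal numpy sketch (the graph and all numbers are hypothetical): the chain Z → X → Y implies Z ⊥ Y | X by d-separation, which should appear in data as a vanishing partial correlation.

# Hypothetical chain Z -> X -> Y; the graph predicts Z ⊥ Y | X.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

def partial_corr(a, b, c):
    # Correlate the residuals of a and b after linearly regressing out c.
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

print("corr(Z, Y)     =", np.corrcoef(z, y)[0, 1])  # clearly nonzero
print("corr(Z, Y | X) =", partial_corr(z, y, x))    # ≈ 0, as predicted

If the partial correlation had not vanished, the assumed graph would be refuted by the data; that is the testability the slide refers to.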

PILLAR 2: THE CONTROL OF CONFOUNDING

Problem: Determine if a desired causal relation can be estimated from data, and how.
Solution: The menace of confounding has been demystified and "deconfounded":
  • "back-door" – reduces covariate selection to a game
  • "front-door" – extends it beyond adjustment
  • do-calculus – predicts the effect of policy interventions whenever feasible
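
A minimal numpy sketch of back-door adjustment with one binary confounder; the generating model and all probabilities are assumptions for illustration.

# Assumed model: Z confounds X and Y; Z satisfies the back-door criterion.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
z = rng.binomial(1, 0.5, n)                      # confounder
x = rng.binomial(1, np.where(z == 1, 0.8, 0.2))  # Z influences treatment
y = rng.binomial(1, 0.2 + 0.3 * x + 0.3 * z)     # X and Z influence outcome

# Naive conditioning is confounded by Z:
print("P(Y=1 | X=1)     =", y[x == 1].mean())    # ≈ 0.74, biased

# Back-door adjustment: P(y | do(x)) = sum_z P(y | x, z) P(z)
adj = sum(y[(x == 1) & (z == v)].mean() * (z == v).mean() for v in (0, 1))
print("P(Y=1 | do(X=1)) =", adj)                 # ≈ 0.65, the true effect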

PILLAR 3:

THE ALGORITHMIZATION OF COUNTERFACTUALS

Task: Given {Model + Data}, determine what Joe's salary would be had he had one more year of education.
Solution: Algorithms have been developed for determining if/how the probability of any counterfactual sentence is estimable from experimental or observational studies, or a combination thereof.
How?
  • Every model determines the truth value of every counterfactual by a toy-like "surgery" procedure.
  • Corollary: "Causes of effect" formalized.
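
A sketch of the three-step counterfactual computation (abduction, action, prediction) on an assumed one-equation model, salary = 10 + 2*education + U; the equation, coefficients, and Joe's record are all hypothetical.

# Assumed structural equation: salary = A + B*education + U.
A, B = 10.0, 2.0

joe_edu, joe_salary = 12.0, 38.0   # Joe's observed record (hypothetical)

# 1. Abduction: infer Joe's exogenous term U from the evidence.
u_joe = joe_salary - (A + B * joe_edu)   # = 4.0

# 2. Action: perform surgery on the model, setting education to edu + 1.
edu_cf = joe_edu + 1

# 3. Prediction: evaluate the modified model with the same U.
print("Counterfactual salary:", A + B * edu_cf + u_joe)   # 40.0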

PILLAR 4: MEDIATION ANALYSIS – DIRECT AND INDIRECT EFFECTS

Task: Given {Data + Model}, unveil and quantify the mechanisms that transmit changes from a cause to its effects.
Result: The graphical representation of counterfactuals tells us when direct and indirect effects are estimable from data and, if so, how necessary (or sufficient) mediation is for the effect.
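
A minimal numpy sketch of mediation in an assumed linear model X → M → Y with a direct path X → Y (M = a*X + Um, Y = c*X + b*M + Uy); in this linear case the natural direct effect is c and the natural indirect effect is a*b. All coefficients are illustrative.

# Assumed linear mediation model; recover NDE and NIE by least squares.
import numpy as np

rng = np.random.default_rng(4)
n = 500_000
a, b, c = 0.7, 0.5, 0.3                 # assumed structural coefficients
x = rng.normal(size=n)
m = a * x + rng.normal(size=n)          # mediator
y = c * x + b * m + rng.normal(size=n)  # outcome

a_hat = np.linalg.lstsq(x[:, None], m, rcond=None)[0][0]
c_hat, b_hat = np.linalg.lstsq(np.column_stack([x, m]), y, rcond=None)[0]

print("NDE   ≈", c_hat)            # direct effect, truth 0.3
print("NIE   ≈", a_hat * b_hat)    # mediated effect, truth 0.35
print("Total ≈", c_hat + a_hat * b_hat)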


PILLAR 5:

TRANSFER LEARNING, EXTERNAL VALIDITY, AND SAMPLE SELECTION BIAS

Task: A machine trained in one environment finds that environmental conditions have changed. When/how can it amortize past learning to the new environment?
Solution: Complete formal solution obtained through the do-calculus and "selection diagrams" (Bareinboim et al., 2016).
Lesson: Ancient threats disarmed by working solutions.
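
A minimal numpy sketch of transport: re-weight stratum-specific experimental results from the source environment by the target environment's covariate distribution, P*(y | do(x)) = Σ_z P(y | do(x), z) P*(z). The two environments and all numbers are assumptions for illustration.

# Assumed setting: a randomized experiment in the source environment;
# the target environment differs only in its distribution of Z.
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000

z_s = rng.binomial(1, 0.3, n)                 # source: P(Z=1) = 0.3
x_s = rng.binomial(1, 0.5, n)                 # randomized treatment
y_s = rng.binomial(1, 0.2 + 0.4 * x_s * z_s)  # Z modifies the effect of X

# Stratum-specific experimental results in the source environment:
p_y_do1 = {v: y_s[(x_s == 1) & (z_s == v)].mean() for v in (0, 1)}

# Target environment: P*(Z=1) = 0.8. A selection diagram is what licenses
# this particular re-weighting.
p_star = 0.8
print("P*(Y=1 | do(X=1)) ≈", p_y_do1[0] * (1 - p_star) + p_y_do1[1] * p_star)
# truth: 0.2*(1 - 0.8) + 0.6*0.8 = 0.52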

PILLAR 6: MISSING DATA (Mohan, 2015)

Problem: Given data corrupted by missing values and a model of what causes missingness, determine when relations of interest can be estimated consistently, "as if no data were missing."
Results: Graphical criteria unveil when estimability is possible, when it is not, and how.
Corollaries:
  • When the missingness model is testable and when it is not.
  • When model-blind estimators can yield consistent estimation and when they cannot.
  • All results are query-specific.
  • Missing data is a causal problem.
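
A minimal numpy sketch of why missingness is causal: assume income Y is missing more often for the young, i.e. missingness is caused by an observed age group A rather than by Y itself (so Y ⊥ R_Y | A). The model and numbers are hypothetical.

# Assumed missingness graph: A -> Y and A -> R_Y (missingness indicator).
import numpy as np

rng = np.random.default_rng(6)
n = 1_000_000
a = rng.binomial(1, 0.5, n)              # age group: 1 = older
y = rng.normal(30 + 20 * a, 5)           # income depends on age
obs = rng.binomial(1, np.where(a == 1, 0.9, 0.3)).astype(bool)  # R_Y

print("true E[Y]            =", y.mean())       # 40
print("complete-case mean   =", y[obs].mean())  # ≈ 45, biased upward

# Recovery licensed by the graph (Y ⊥ R_Y | A):
# E[Y] = sum_a E[Y | A=a, observed] P(A=a)
rec = sum(y[obs & (a == v)].mean() * (a == v).mean() for v in (0, 1))
print("graph-based estimate =", rec)            # ≈ 40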

PILLAR 7: CAUSAL DISCOVERY

Task: Search for a set of models (graphs) that are compatible with the data, and represent them compactly.
Results: In certain circumstances, and under weak assumptions, causal queries can be estimated directly from this compatibility set. (Spirtes, Glymour and Scheines, 2000; Jonas Peters et al., 2018)

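
As one hypothetical illustration of what such a search exploits, a collider X → Z ← Y leaves a distinctive signature in data: X and Y are independent marginally but become dependent once Z is conditioned on. The model below is assumed, not from the slides.

# Assumed collider X -> Z <- Y; its signature orients the edges.
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + rng.normal(scale=0.5, size=n)   # common effect (collider)

def partial_corr(a, b, c):
    # Correlate residuals of a and b after linearly regressing out c.
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

print("corr(X, Y)     =", np.corrcoef(x, y)[0, 1])  # ≈ 0: no edge X - Y
print("corr(X, Y | Z) =", partial_corr(x, y, z))    # nonzero: collider at Z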

CONCLUSIONS

  • Model-blind approaches to AI impose intrinsic limitations on the cognitive tasks that they can perform.
  • The seven tasks described exemplify what can be done with models that cannot be done without them, regardless of how big the data.
  • DATA SCIENCE is only as much of a science as it facilitates the interpretation of data – a two-body problem involving both data and reality.
  • DATA SCIENCE lacking a model of reality may be statistics, but hardly a science.
  • Human-level AI cannot emerge from model-blind learning machines.

THANK YOU

Joint work with: Elias Bareinboim, Karthika Mohan, Ilya Shpitser, Jin Tian, and many more.
Paper available: http://ftp.cs.ucla.edu/pub/stat_ser/r475.pdf
Refs: http://bayes.cs.ucla.edu/jp_home.html