Error Exponents for Composite Hypothesis Testing of Markov Forest Distributions


SLIDE 1

Error Exponents for Composite Hypothesis Testing of Markov Forest Distributions

Vincent Tan, Anima Anandkumar, Alan S. Willsky

Stochastic Systems Group, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology

ISIT (Jun 18, 2010)


SLIDE 2

Motivation

Continuation of a line of work on error exponents for learning tree-structured graphical models:

  • Discrete case: Tan, Anandkumar, Tong, Willsky, ISIT 2009.
  • Gaussian case: Tan, Anandkumar, Willsky, Trans. SP 2010.

Instead of learning, we focus here on hypothesis testing. This provides intuition for which classes of graphical models are easy to learn, in terms of the detection error exponent. Is there a relation between the detection error exponent and the exponent associated with structure learning?

SLIDE 3

Background on Tree-Structured Graphical Models

Graphical model: a family of multivariate probability distributions that factorize according to a given graph G = (V, E). Vertices in V = {1, . . . , d} correspond to variables, and edges in E ⊂ \binom{V}{2} correspond to conditional independences.

Example for a tree-structured P(x) with d = 4.

Figure: star tree with hub X1 and leaves X2, X3, X4.

$$P(x_1, x_2, x_3, x_4) = P_1(x_1) \cdot \frac{P_{1,2}(x_1, x_2)}{P_1(x_1)} \cdot \frac{P_{1,3}(x_1, x_3)}{P_1(x_1)} \cdot \frac{P_{1,4}(x_1, x_4)}{P_1(x_1)}.$$
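As a concrete illustration of this factorization, here is a minimal sketch, our own toy example rather than anything from the talk: a binary star model (hub X1, each leaf equal to X1 flipped with probability 0.1; the model and the flip probability are assumptions for illustration) whose joint probability is evaluated via the tree factorization above.

    import numpy as np
    from itertools import product

    # Toy binary star: X1 is the hub; each leaf Xk equals X1 flipped w.p. 0.1.
    P1 = np.array([0.5, 0.5])                        # marginal of X1
    flip = np.array([[0.9, 0.1], [0.1, 0.9]])        # P(Xk = xk | X1 = x1)
    P_pair = {k: P1[:, None] * flip for k in (2, 3, 4)}   # pairwise marginals P_{1,k}

    def p_star(x1, x2, x3, x4):
        """Tree factorization P = P1 * prod_k P_{1,k} / P1 for the 4-node star."""
        val = P1[x1]
        for k, xk in zip((2, 3, 4), (x2, x3, x4)):
            val *= P_pair[k][x1, xk] / P1[x1]
        return val

    # Sanity check: the 2^4 configuration probabilities sum to 1.
    print(sum(p_star(*x) for x in product((0, 1), repeat=4)))   # ≈ 1.0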

SLIDE 4

Learning vs Hypothesis Testing

Canonical problem: given x^1, . . . , x^n ∼ P, learn the structure of P. If P is a tree, Chow and Liu (1968) can be used as an efficient implementation of ML. Denote the set of distributions Markov on a tree T0 ∈ T as D(T0); the set of distributions Markov on any tree is D(T).

Composite hypothesis testing problem considered here:

$$H_0 : x^1, \dots, x^n \sim \Lambda_0 \subset D(\mathcal{T}), \qquad H_1 : x^1, \dots, x^n \sim \Lambda_1 \subset D(\mathcal{T}),$$

with each Λi closed and Λ0 ∩ Λ1 = ∅.

SLIDE 5

Definition of the Worst-Case Type-II Error Exponent

Neyman-Pearson setup, with acceptance regions (An).

Def: Type-II error exponent for a fixed Q ∈ Λ1, given (An):

$$J(\Lambda_0, Q; A_n) := \liminf_{n \to \infty} \, -\frac{1}{n} \log Q^n(A_n)$$

Def: Optimal type-II error exponent:

$$J^*(\Lambda_0, Q) := \sup_{A_n : \, P^n(A_n) \le \alpha, \ \forall P \in \Lambda_0} J(\Lambda_0, Q; A_n)$$

Def: Worst-case optimal type-II error exponent:

$$J^*(\Lambda_0, \Lambda_1) := \inf_{Q \in \Lambda_1} J^*(\Lambda_0, Q)$$

The optimizing distribution Q* is called the least favorable distribution.

SLIDE 6

Why Difficult?

Many trees: if there are d nodes, there are d^(d−2) trees (Cayley's formula)! Searching for the dominant error event may therefore be intractable.

Natural questions: Are there closed-form expressions for the worst-case error exponent for special Λ0, Λ1? How does it depend on the true distribution? Are there connections to learning? What intuition and characterization can be given for the least favorable distribution?

SLIDE 7

A Simplification

Assume that H0 is simple and P is Markov on T0 = (V, E0):

$$H_0 : x^1, \dots, x^n \sim \{P\}, \qquad H_1 : x^1, \dots, x^n \sim \Lambda_1 = D(\mathcal{T}) \setminus D(T_0)$$

Figure: P lies in D(T0); the least favorable distribution Q* lies in Λ1 = D(T) \ D(T0), at "distance" J*(P) from P.

$$J^*(P) := J^*(\{P\}, \, D(\mathcal{T}) \setminus D(T_0))$$

SLIDE 8

Setup for Main Result

For a non-edge e′ = (i, j), let Path(e′) be the unique path joining i and j, and let L(i, j) be the number of hops between i and j.

Figure: a five-node tree in which the non-edge e′ = (i, j) has Path(e′) = {e1 = (i, k), e2 = (k, j)}, so L(i, j) = 2.

The mutual information of the joint distribution Pe = Pi,j is denoted I(Pe).

SLIDE 9

Main Result

Proposition.

$$J^*(P) = \min_{\substack{e' = (i,j) \notin E_0 \\ L(i,j) = 2}} \ \min_{e \in \mathrm{Path}(e')} \, \{ I(P_e) - I(P_{e'}) \}.$$

Illustration: the four-node star with hub X1, with mutual informations 5, 6, 4 on the true edges and 1.5, 3.9, 2 on the (two-hop) non-edges. The minimum is attained by the non-edge with MI 3.9, whose path contains the true edge with MI 4:

$$J^*(P) = 4 - 3.9 = 0.1.$$
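The proposition reduces to a small combinatorial computation once the mutual informations are known. Below is a minimal sketch (ours, not the authors' code) that evaluates J*(P) on the illustration above; the assignment of the slide's MI values to specific node pairs is our assumption, chosen so that the binding pair is the edge with MI 4 and the non-edge with MI 3.9, reproducing J*(P) = 0.1.

    from itertools import combinations

    def jstar(nodes, edges, I):
        """J*(P): min over two-hop non-edges e' of min over e in Path(e')
        of I[e] - I[e'].  I maps frozenset pairs to mutual informations."""
        adj = {v: set() for v in nodes}
        for i, j in edges:
            adj[i].add(j)
            adj[j].add(i)
        best = float("inf")
        for i, j in combinations(nodes, 2):
            if j in adj[i]:
                continue                     # (i, j) is a true edge
            for k in adj[i] & adj[j]:        # common neighbour  <=>  L(i, j) = 2
                for e in (frozenset({i, k}), frozenset({k, j})):   # Path(e')
                    best = min(best, I[e] - I[frozenset({i, j})])
        return best

    # Star with hub 1: edge MIs 5, 6, 4; two-hop non-edge MIs 1.5, 3.9, 2.
    I = {frozenset({1, 2}): 5.0, frozenset({1, 3}): 6.0, frozenset({1, 4}): 4.0,
         frozenset({2, 3}): 1.5, frozenset({3, 4}): 3.9, frozenset({2, 4}): 2.0}
    print(jstar([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)], I))   # ≈ 0.1 (= 4 - 3.9)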

SLIDE 10

Least Favorable Distribution

The least favorable distribution Q* is characterized by

$$E_{Q^*} = \operatorname*{argmax}_{E \neq E_0, \ E \text{ acyclic}} \ \sum_{e \in E} I(P_e),$$

a second-best max-weight spanning tree problem, and

$$Q^*_i(x_i) = P_i(x_i) \quad \forall\, i \in V, \qquad Q^*_{i,j}(x_i, x_j) = P_{i,j}(x_i, x_j) \quad \forall\, (i,j) \in E_{Q^*}.$$
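For small d, the defining optimization can be checked by brute force. Here is a sketch (our code, reusing the MI values assumed in the J*(P) example above) that enumerates all spanning trees other than E0 and picks the one with the largest total mutual information; it recovers the tree obtained from E0 by swapping out the edge with MI 4 for the non-edge with MI 3.9.

    from itertools import combinations

    def spanning_trees(nodes, pairs):
        """Yield every spanning tree (as a set of frozenset edges) over `nodes`."""
        for E in combinations(pairs, len(nodes) - 1):
            parent = {v: v for v in nodes}       # union-find acyclicity check
            def find(v):
                while parent[v] != v:
                    v = parent[v]
                return v
            ok = True
            for i, j in E:
                ri, rj = find(i), find(j)
                if ri == rj:
                    ok = False
                    break
                parent[ri] = rj
            if ok:
                yield {frozenset(e) for e in E}

    I = {frozenset({1, 2}): 5.0, frozenset({1, 3}): 6.0, frozenset({1, 4}): 4.0,
         frozenset({2, 3}): 1.5, frozenset({3, 4}): 3.9, frozenset({2, 4}): 2.0}
    E0 = {frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 4})}
    E_Qstar = max((E for E in spanning_trees([1, 2, 3, 4], list(I)) if E != E0),
                  key=lambda E: sum(I[e] for e in E))
    print(sorted(tuple(sorted(e)) for e in E_Qstar))   # [(1, 2), (1, 3), (3, 4)]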

SLIDE 11

Proof Outline

The optimization for the worst-case exponent is

$$\inf_{Q \in D(\mathcal{T}) \setminus D(T_0)} D(Q \,\|\, P) \;=\; \min_{T \in \mathcal{T} \setminus \{T_0\}} \ \inf_{Q \in D(T)} D(Q \,\|\, P).$$

Use the tree decomposition (junction tree theorem):

$$Q(x) = \prod_{i \in V(T)} Q_i(x_i) \prod_{(i,j) \in E(T)} \frac{Q_{i,j}(x_i, x_j)}{Q_i(x_i)\, Q_j(x_j)}.$$

Emulate Chow and Liu (1968). The second-best max-weight spanning tree differs from the best one by a single edge [Cormen et al. 2003]. Apply the data processing inequality.

SLIDE 12

Intuition

$$J^*(P) = \min_{\substack{e' = (i,j) \notin E_0 \\ L(i,j) = 2}} \ \min_{e \in \mathrm{Path}(e')} \, \{ I(P_e) - I(P_{e'}) \}.$$

The smaller the difference between the mutual information on a true edge and that on a non-edge along its path, the smaller the detection error exponent. The detection error exponent depends only on such bottleneck edges.

SLIDE 13

Comparison to Existing Results

$$J^*(P) = \min_{\substack{e' = (i,j) \notin E_0 \\ L(i,j) = 2}} \ \min_{e \in \mathrm{Path}(e')} \, \{ I(P_e) - I(P_{e'}) \}.$$

This is intuitive in light of the Chow-Liu algorithm for learning trees,

$$\hat{E}_{\mathrm{ML}} := \operatorname*{argmax}_{E \text{ acyclic}} \ \sum_{e \in E} I(\hat{\mu}_e),$$

where \hat{\mu}_e is the pairwise type on edge e.

Learning error exponent in the very-noisy regime:

$$K(P) := \min_{e' \notin E_0} \ \min_{e \in \mathrm{Path}(e')} \ \frac{(I(P_e) - I(P_{e'}))^2}{2\, \mathrm{Var}(S_e - S_{e'})}.$$

Both J*(P) and K(P) depend on differences of mutual informations.
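A sketch (ours) of how K(P) could be evaluated on a binary star model like the one used earlier (hub node 0, leaves flipped with probability 0.1), under the assumption, not spelled out on the slide, that S_e denotes the information density s_e(x_i, x_j) = log[P_e(x_i, x_j) / (P_i(x_i) P_j(x_j))]:

    import numpy as np
    from itertools import combinations, product

    # Binary star with hub node 0: P(x) = P(x0) * prod_k P(xk | x0).
    P0 = np.array([0.5, 0.5])
    flip = np.array([[0.9, 0.1], [0.1, 0.9]])
    joint = {x: P0[x[0]] * flip[x[0], x[1]] * flip[x[0], x[2]] * flip[x[0], x[3]]
             for x in product((0, 1), repeat=4)}

    def pair_marg(i, j):
        m = np.zeros((2, 2))
        for x, p in joint.items():
            m[x[i], x[j]] += p
        return m

    def s(i, j, x):
        """Information density of the pair (i, j) at configuration x."""
        m = pair_marg(i, j)
        return np.log(m[x[i], x[j]] / (m.sum(1)[x[i]] * m.sum(0)[x[j]]))

    def mi(i, j):
        return sum(p * s(i, j, x) for x, p in joint.items())

    K = float("inf")
    for i, j in combinations((1, 2, 3), 2):       # leaf pairs = the non-edges e'
        for e in ((0, i), (0, j)):                # Path(e') in the star
            gap = mi(*e) - mi(i, j)               # I(P_e) - I(P_e')
            var = sum(p * (s(*e, x) - s(i, j, x)) ** 2
                      for x, p in joint.items()) - gap ** 2    # Var(S_e - S_e')
            K = min(K, gap ** 2 / (2 * var))
    print(K)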

SLIDE 14

Performing the Hypothesis Test

It is known that the worst-case error exponent is achieved by the Hoeffding test, but that test is hard to implement for tree distributions. The generalized likelihood ratio test (GLRT) has acceptance regions

$$A_n := \left\{ x^n : \ \frac{1}{n} \log \frac{\max_{Q \in \Lambda_1} Q^n(x^n)}{\max_{P \in \Lambda_0} P^n(x^n)} \ \ge \ \gamma \right\}.$$

When the null hypothesis is simple, the GLRT also simplifies:

$$H_0 : x^1, \dots, x^n \sim \{P\}, \qquad H_1 : x^1, \dots, x^n \sim \Lambda_1 = D(\mathcal{T}) \setminus D(T_0).$$

SLIDE 15

The Generalized Likelihood Ratio Test

Denote the joint type of x^n as \hat{\mu} := \hat{\mu}(\cdot\,; x^n), the pairwise type on e as \hat{\mu}_e, and the true set of edges as E0.

Proposition. The GLRT simplifies as

$$A_n = \left\{ x^n : \ \sum_{e \in E^*} I(\hat{\mu}_e) - \sum_{e \in E_0} I(\hat{\mu}_e) \ \ge \ \gamma \right\},$$

where the "dominating edge set" is

$$E^* = \operatorname*{argmax}_{E \neq E_0, \ E \text{ acyclic}} \ \sum_{e \in E} I(\hat{\mu}_e).$$
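To make the test concrete, here is an end-to-end sketch, our code rather than the authors' implementation: it estimates the pairwise types from binary samples, computes their mutual informations, takes E* as the best spanning tree different from E0 (the max-weight tree if that already differs from E0, otherwise a single-edge swap, using the fact cited on the proof outline slide), and thresholds the gap. The samples, the threshold gamma, and the availability of networkx are all placeholders/assumptions.

    import numpy as np
    import networkx as nx
    from itertools import combinations

    def pairwise_mi(X, i, j):
        """Mutual information of the pairwise type mu_hat_{i,j} (binary data)."""
        m = np.zeros((2, 2))
        for a, b in zip(X[:, i], X[:, j]):
            m[a, b] += 1
        m /= len(X)
        pi, pj = m.sum(1), m.sum(0)
        nz = m > 0
        return float((m[nz] * np.log(m[nz] / np.outer(pi, pj)[nz])).sum())

    def glrt_statistic(E0, I_hat):
        """sum_{E*} I(mu_e) - sum_{E0} I(mu_e), with E* the best tree != E0."""
        G = nx.Graph()
        for (i, j), w in I_hat.items():
            G.add_edge(i, j, weight=w)
        best = nx.maximum_spanning_tree(G)
        E_best = {frozenset(e) for e in best.edges()}
        if E_best != E0:
            return sum(I_hat[e] for e in E_best) - sum(I_hat[e] for e in E0)
        # E0 is itself the max-weight tree: E* differs from it by one swap.
        stat = -float("inf")
        for e_new in set(I_hat) - E0:
            i, j = tuple(e_new)
            path = nx.shortest_path(best, i, j)          # unique path in the tree
            for e_old in (frozenset(p) for p in zip(path, path[1:])):
                stat = max(stat, I_hat[e_new] - I_hat[e_old])
        return stat

    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(500, 4))                # placeholder samples
    I_hat = {frozenset({i, j}): pairwise_mi(X, i, j)
             for i, j in combinations(range(4), 2)}
    E0 = {frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 3})}
    gamma = 0.05                                         # placeholder threshold
    print("declare H1" if glrt_statistic(E0, I_hat) >= gamma else "accept H0")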

SLIDE 16

Interpretation and Extensions

It is easy to implement the GLRT for testing between trees: the tree structure E* can be found efficiently once the pairwise types \hat{\mu}_e have been computed. Extensions of the error exponent and the GLRT to forest-structured distributions are straightforward. There is also recent work on high-dimensional learning of forest-structured distributions.¹

¹ V. Tan, A. Anandkumar, A. Willsky, "Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates", submitted to JMLR, May 2010.

SLIDE 17

Concluding Remarks

We analyzed the worst-case type-II error exponent for composite hypothesis testing of Markov forest distributions and found close relations to learning. Possible extension 1: a Bayesian formulation (Chernoff information). Possible extension 2: decomposable graphical models.