Optimal design, orthogonal polynomials and random matrices
Holger Dette (Ruhr-University Bochum)
Joint work with F. Bretz (Novartis, Basel), S. Delvaux (Katholieke Universiteit Leuven), L. Imhof (University of Bonn), W.J. Studden (Purdue University)
September 2011, Köln

Overview
Contents
  Motivating example: dose finding experiment
  Some optimal design theory
  Optimal design for weighted polynomial regression
  Weak asymptotics of optimal designs
  Random matrices - the Gaussian ensemble
  Random band matrices
  Matrix orthogonal polynomials
  The limiting spectrum of random band matrices

Optimal Design
Motivating example: drug development (clinical phase)
Development pipeline: pre-clinic → clinic (phase I, phase II, phase III) → market
  Phase I: first experiments with humans; 20-40 patients
  Phase II: efficacy, dose finding, safety ...; 100-300 patients
  Phase III: (large) clinical trials (proof of efficacy, side effects); 1000-10000 patients
What dose level should be used in the phase III trial?

Optimal Design
Motivating example: drug development
Confirmatory trial (phase II) to determine the appropriate target dose
Main goal: estimation of the minimum effective dose level (target dose), which produces at least the clinically relevant effect
Mathematical (extremely simplified) description of the dose response relationship (Michaelis Menten model)
[Figure: dose response curve; dose axis from 0 to 150, response axis from 0 to 0.4, with $\Delta$ and the MED marked]

Optimal Design
(Nonlinear) regression model
$$Y = \eta(x,\theta) + \sigma(x,\theta)\,\varepsilon, \qquad x \in \mathcal{X}$$
  $\mathcal{X}$ denotes the design space
  $\varepsilon$ random error, $E[\varepsilon] = 0$, $E[\varepsilon^2] = 1$
  $m$ independent observations $Y_1, \dots, Y_m$ at experimental conditions $x_1, \dots, x_m$ to estimate the vector of parameters $\theta$
  Expectation of $Y$ (at experimental condition $x$) is given by $\eta(x,\theta)$
  Variance of $Y$ (at experimental condition $x$) is given by $\sigma^2(x,\theta)$
Example: Michaelis Menten model
$$\eta(x,\theta) = \frac{\theta_1 x}{x + \theta_2}, \qquad \sigma(x,\theta) = \frac{\theta_1 x}{x + \theta_2}, \qquad x \in \mathcal{X} = (0,\infty)$$
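The Michaelis Menten mean response and its gradient with respect to $\theta$ (the ingredient of the information matrix) can be sketched in a few lines of Python. This is only an illustration: the parameter values in `theta_guess` and the dose levels are hypothetical and do not come from the talk.

```python
import numpy as np

def eta(x, theta):
    """Michaelis Menten mean response theta1 * x / (x + theta2)."""
    theta1, theta2 = theta
    return theta1 * x / (x + theta2)

def grad_eta(x, theta):
    """Gradient of eta with respect to (theta1, theta2), one column per dose."""
    theta1, theta2 = theta
    return np.array([x / (x + theta2), -theta1 * x / (x + theta2) ** 2])

# Hypothetical parameter guess and dose levels, chosen only for illustration.
theta_guess = (0.4, 25.0)
doses = np.array([25.0, 80.0, 150.0])

responses = eta(doses, theta_guess)
gradients = grad_eta(doses, theta_guess)  # shape (2, number of doses)
```

Because the gradient depends on $\theta$, any design computed from it is only locally optimal, that is, optimal for the assumed parameter guess.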

Optimal Design
Problem: At which points $x_i$ should we take observations?
Definition: An approximate design $\xi$ is a probability measure on the design space $\mathcal{X}$.
Example:
$$\xi = \begin{pmatrix} 25 & 80 & 150 \\ 1/3 & 1/3 & 1/3 \end{pmatrix}$$
⇒ 1/3 of the total observations at each of the points 25, 80 and 150
  m = 30 → 10, 10, 10
  m = 40 → 13, 14, 13
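Turning an approximate design into integer sample sizes needs a rounding rule. The sketch below uses a largest-remainder rule, which is an assumption made here for illustration and not necessarily the rule behind the numbers on the slide; the function name `round_design` is hypothetical. Ties are broken by position, so for m = 40 it returns a permutation of the allocation shown above.

```python
from fractions import Fraction
import math

def round_design(weights, m):
    """Allocate m observations to support points with the given weights.

    Largest-remainder rounding: take the integer part of m * w_i, then hand
    the remaining observations to the points with the largest fractional parts.
    """
    quotas = [m * w for w in weights]
    counts = [math.floor(q) for q in quotas]
    leftover = m - sum(counts)
    # Indices sorted by fractional part, largest first (ties keep input order).
    order = sorted(range(len(weights)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

# Exact rational weights avoid floating point surprises in the floor step.
weights = [Fraction(1, 3)] * 3
alloc_30 = round_design(weights, 30)
alloc_40 = round_design(weights, 40)
```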

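Optimal Design
Measuring the quality of designs
Weighted least squares estimator $\hat\theta$ ⇒ $\mathrm{Cov}(\hat\theta) \sim \frac{1}{m} M^{-1}(\xi)$, where
$$M(\xi) = \int_{\mathcal{X}} \left[ \frac{1}{\sigma^2(x,\theta)} \left(\frac{\partial \eta(x,\theta)}{\partial\theta}\right) \left(\frac{\partial \eta(x,\theta)}{\partial\theta}\right)^T + \frac{1}{2\sigma^4(x,\theta)} \left(\frac{\partial \sigma^2(x,\theta)}{\partial\theta}\right) \left(\frac{\partial \sigma^2(x,\theta)}{\partial\theta}\right)^T \right] d\xi(x)$$
denotes the information matrix of the design $\xi$ (this measure refers to the normality assumption).
Goal: Maximize $M(\xi)$ w.r.t. the choice of the design $\xi$ (impossible!!)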

Optimal Design
Optimality criteria
  Only a partial ordering in the space of nonnegative definite matrices
  Maximize real valued (statistically meaningful) functions of $M(\xi)$ → optimality criteria
  The application determines the criterion
c-optimality (MED-estimation):
$$\xi^* = \arg\max_{\xi}\, \bigl(c^T M^{-1}(\xi)\, c\bigr)^{-1}$$
where $c$ is a vector determined by the regression model.
D-optimality (precise estimation of all parameters):
$$\xi^* = \arg\max_{\xi}\, |M(\xi)|$$
In this talk we will only consider D-optimal designs and polynomial models!

Optimal Design
Classical (weighted) polynomial regression model
Polynomial regression model [$\theta = (\theta_0, \dots, \theta_{n-1})^T$, $x \in (-\infty, \infty)$]
$$\eta(x,\theta) = \sum_{j=0}^{n-1} \theta_j x^j, \qquad \sigma^2(x,\theta) = e^{x^2}$$
Example: $n = 2$, linear regression model (with heteroscedastic error)
$$\frac{\partial}{\partial\theta}\eta(x,\theta) = (1, x, \dots, x^{n-1})^T, \qquad \frac{\partial}{\partial\theta}\sigma^2(x,\theta) = 0$$

Optimal Design
D-optimal design problem (weighted polynomial regression)
A D-optimal design maximizes the determinant
$$|M(\xi)| = \left| \left( \int_{\mathbb{R}} x^{i+j} e^{-x^2}\, d\xi(x) \right)_{i,j=0,\dots,n-1} \right|
= \begin{vmatrix}
\int_{\mathbb{R}} e^{-x^2}\, d\xi(x) & \int_{\mathbb{R}} x e^{-x^2}\, d\xi(x) & \cdots & \int_{\mathbb{R}} x^{n-1} e^{-x^2}\, d\xi(x) \\
\int_{\mathbb{R}} x e^{-x^2}\, d\xi(x) & \int_{\mathbb{R}} x^2 e^{-x^2}\, d\xi(x) & \cdots & \int_{\mathbb{R}} x^{n} e^{-x^2}\, d\xi(x) \\
\vdots & \vdots & \ddots & \vdots \\
\int_{\mathbb{R}} x^{n-1} e^{-x^2}\, d\xi(x) & \int_{\mathbb{R}} x^{n} e^{-x^2}\, d\xi(x) & \cdots & \int_{\mathbb{R}} x^{2n-2} e^{-x^2}\, d\xi(x)
\end{vmatrix}$$
in the class of all probability measures on $\mathbb{R}$.
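For a design with finitely many support points the integrals become weighted sums, so the information matrix is easy to assemble numerically. The following is a minimal sketch with a hypothetical helper `info_matrix`; the three-point design is an arbitrary illustration and not a design from the talk.

```python
import numpy as np

def info_matrix(points, weights, n):
    """Information matrix with entries sum_k w_k * exp(-x_k^2) * x_k^(i+j)."""
    x = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Row k of F is the regression vector (1, x_k, ..., x_k^(n-1)).
    F = np.vander(x, N=n, increasing=True)
    lam = w * np.exp(-x ** 2)  # design weight times efficiency function
    return F.T @ (lam[:, None] * F)

# Arbitrary example: n = 2 (linear regression), uniform design on {-1, 0, 1}.
M = info_matrix([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], n=2)
d_criterion = np.linalg.det(M)
```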

Optimal Design
D-optimal design problem
Theorem 1: The D-optimal design $\xi^*$ is a uniform distribution on the set
$$\{\, z \mid H_n(z) = 0 \,\}$$
where $H_n$ denotes the $n$-th Hermite polynomial, orthogonal with respect to the measure $e^{-x^2}dx$.
Two proofs:
  Equivalence theorems (from design theory) and second order differential equations (Stieltjes)
  Moment theory
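Theorem 1 can be probed numerically. The sketch below compares the log-determinant of the uniform design on the roots of $H_n$ with two other uniform $n$-point designs. The roots are taken from NumPy's Gauss-Hermite nodes (`hermgauss`, physicists' convention, weight $e^{-x^2}$); the choice $n = 5$, the competitor designs and the helper name `log_det_info` are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def log_det_info(points, n):
    """log |M(xi)| for the uniform design on `points` (weighted polynomial model)."""
    x = np.asarray(points, dtype=float)
    F = np.vander(x, N=n, increasing=True)
    lam = np.exp(-x ** 2) / len(x)
    M = F.T @ (lam[:, None] * F)
    sign, logdet = np.linalg.slogdet(M)
    return logdet if sign > 0 else -np.inf

n = 5
hermite_roots, _ = hermgauss(n)          # nodes are the roots of H_n
equidistant = np.linspace(-3.0, 3.0, n)  # arbitrary competitor design
rng = np.random.default_rng(0)
perturbed = hermite_roots + 0.1 * rng.standard_normal(n)

ld_hermite = log_det_info(hermite_roots, n)
ld_equidistant = log_det_info(equidistant, n)
ld_perturbed = log_det_info(perturbed, n)
```

A comparison like this only checks the claim against a few competitors; it does not replace the proof.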

Optimal Design
Proof; Step 1 (idea): identification of the weights
Equivalence theorem for D-optimality (Kiefer and Wolfowitz, 1960): $\xi^*$ is D-optimal if and only if
$$e^{-x^2}\,(1, x, \dots, x^{n-1})\, M^{-1}(\xi^*)\, (1, x, \dots, x^{n-1})^T \le n \qquad \forall\, x \in \mathbb{R}$$
Moreover, there is equality for all support points of the D-optimal design.
Example: weighted polynomial regression of degree 7 ($n = 8$)
  D-optimal design (solid curve)
  Equidistant design on 10 points in the interval $[-4, 4]$
  Note: D-optimal design has 8 support points (saturated)
[Figure: curves for the two designs plotted against x; x axis from -6 to 6, y axis from 2 to 12]
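The inequality of the equivalence theorem can be checked on a grid. The sketch below evaluates its left-hand side for the uniform design on the roots of $H_n$. It uses $n = 4$ rather than the slide's $n = 8$ to keep the linear solve well conditioned; the grid and the helper name `variance_function` are likewise assumptions for illustration.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def variance_function(x, support, n):
    """exp(-x^2) * f(x)^T M^{-1} f(x) for the uniform design on `support`."""
    s = np.asarray(support, dtype=float)
    F = np.vander(s, N=n, increasing=True)
    lam = np.exp(-s ** 2) / len(s)
    M = F.T @ (lam[:, None] * F)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    fx = np.vander(x, N=n, increasing=True)  # row k is f(x_k)^T
    # Quadratic form f(x_k)^T M^{-1} f(x_k) for every grid point at once.
    quad = np.einsum("ij,ij->i", fx, np.linalg.solve(M, fx.T).T)
    return np.exp(-x ** 2) * quad

n = 4
support, _ = hermgauss(n)  # roots of H_n
grid = np.linspace(-6.0, 6.0, 2001)

d_grid = variance_function(grid, support, n)
d_support = variance_function(support, support, n)
```

On the grid the values stay below $n$ up to rounding, and at the support points they equal $n$, as the theorem states.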

Optimal Design
Proof; Step 1 (idea): identification of the weights
Equivalence theorem for D-optimality: $\xi^*$ is D-optimal if and only if
$$e^{-x^2}\,(1, x, \dots, x^{n-1})\, M^{-1}(\xi^*)\, (1, x, \dots, x^{n-1})^T \le n \qquad \forall\, x \in \mathbb{R}$$
Moreover, there is equality for all support points of the D-optimal design.
⇒ The optimal design has $n$ support points
$$\xi^* = \begin{pmatrix} x_1 & x_2 & \dots & x_n \\ w_1 & w_2 & \dots & w_n \end{pmatrix}$$
$$|M(\xi^*)| = \prod_{i=1}^{n} w_i \,\prod_{i=1}^{n} e^{-x_i^2} \prod_{1 \le i < j \le n} (x_i - x_j)^2 \;\longrightarrow\; \max_{x_i,\, w_i}$$
$$\Longrightarrow \quad w_i = \frac{1}{n}, \qquad i = 1, \dots, n$$
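The product formula for a saturated design follows from $M = F^T \Lambda F$ with a square Vandermonde matrix $F$, and it can be confirmed numerically. The points and weights below are arbitrary, and the function names are hypothetical.

```python
import numpy as np

def det_direct(x, w):
    """Determinant of the information matrix, computed from its definition."""
    n = len(x)
    F = np.vander(x, N=n, increasing=True)
    lam = w * np.exp(-x ** 2)
    return np.linalg.det(F.T @ (lam[:, None] * F))

def det_product(x, w):
    """prod w_i * prod exp(-x_i^2) * prod_{i<j} (x_i - x_j)^2."""
    diffs = x[:, None] - x[None, :]
    upper = np.triu_indices(len(x), k=1)  # index pairs with i < j
    return np.prod(w) * np.exp(-np.sum(x ** 2)) * np.prod(diffs[upper] ** 2)

# Arbitrary saturated design with n = 4 support points.
x = np.array([-1.3, -0.4, 0.5, 1.7])
w = np.array([0.1, 0.2, 0.3, 0.4])

value_direct = det_direct(x, w)
value_product = det_product(x, w)
```

Since the weights enter only through $\prod w_i$ under the constraint $\sum w_i = 1$, the product is largest for equal weights $w_i = 1/n$.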

Optimal Design
Proof; Step 2 (idea): identification of the support
Let $f(x) = (x - x_1) \cdots (x - x_n)$ denote the supporting polynomial.
The necessary condition for an extremum yields a system of $n$ non-linear equations
$$f''(x_j) - 2 x_j f'(x_j) = 0, \qquad j = 1, \dots, n$$
Derive a differential equation for the supporting polynomial: $f''(x) - 2x f'(x)$ is a polynomial of degree $n$ that vanishes at all $n$ roots of $f$, so it is a constant multiple of $f$, and comparing leading coefficients gives
$$f''(x) - 2x f'(x) = -2n f(x)$$
This differential equation has exactly one polynomial solution (up to the constant $c$): $f(x) = c H_n(x)$
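Both the differential equation and the $n$ non-linear equations at the roots can be verified with NumPy's polynomial classes. This is a sketch for one fixed degree ($n = 6$, an arbitrary choice), using the physicists' Hermite polynomial from `numpy.polynomial`.

```python
import numpy as np
from numpy.polynomial import Hermite, Polynomial
from numpy.polynomial.hermite import hermgauss

n = 6
# H_n written in the monomial basis so that ordinary polynomial algebra applies.
H_n = Hermite.basis(n).convert(kind=Polynomial)
x = Polynomial([0.0, 1.0])

# Residual of f'' - 2 x f' + 2 n f = 0 with f = H_n; it should vanish identically.
residual = H_n.deriv(2) - 2 * x * H_n.deriv(1) + 2 * n * H_n

# Left-hand side f''(x_j) - 2 x_j f'(x_j) evaluated at the roots of H_n.
roots, _ = hermgauss(n)
stieltjes = H_n.deriv(2)(roots) - 2 * roots * H_n.deriv(1)(roots)
```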

Weak asymptotics of optimal designs
Weak asymptotics of roots of Hermite polynomials
Theorem 2: Let
$$\xi_n^*((-\infty, t]) = \frac{1}{n}\, \#\bigl\{\, z \le t \;\bigm|\; H_n(\sqrt{n}\, z) = 0 \,\bigr\}.$$
If $n \to \infty$, then $\xi_n^*$ converges weakly to an absolutely continuous measure $\mu^*$ with density
$$\frac{d\mu^*}{dx} = \frac{1}{\pi} \sqrt{2 - x^2}\; I_{[-\sqrt{2},\, \sqrt{2}]}(x)$$
$\mu^*$ is called the Wigner semi-circle law.
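The convergence in Theorem 2 can be illustrated by comparing the empirical distribution function of the scaled Hermite roots with the distribution function of the semi-circle law, $F(t) = \frac12 + \frac{t\sqrt{2 - t^2}}{2\pi} + \frac{1}{\pi}\arcsin\bigl(t/\sqrt{2}\bigr)$, obtained by integrating the density. The sketch below does this for $n = 60$, an arbitrary choice that stays within the range where NumPy's `hermgauss` is documented as tested.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def semicircle_cdf(t):
    """Distribution function of the density sqrt(2 - x^2) / pi on [-sqrt(2), sqrt(2)]."""
    r = np.sqrt(2.0)
    t = np.clip(np.asarray(t, dtype=float), -r, r)
    root = np.sqrt(np.maximum(2.0 - t ** 2, 0.0))  # guard against rounding below 0
    return 0.5 + t * root / (2.0 * np.pi) + np.arcsin(t / r) / np.pi

n = 60
scaled_roots = np.sort(hermgauss(n)[0]) / np.sqrt(n)  # zeros of H_n(sqrt(n) z)
empirical_cdf = np.arange(1, n + 1) / n               # xi_n^* at each scaled root

# Largest gap between the empirical and the limiting distribution function.
max_gap = np.max(np.abs(empirical_cdf - semicircle_cdf(scaled_roots)))
```

The gap shrinks as $n$ grows, which is what weak convergence predicts; a single value of $n$ only illustrates the trend.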
