
A model selection algorithm for mixture experiments including process variables
Hugo Maruri and Eva Riccomagno
Department of Statistics, London School of Economics, and Dipartimento di Matematica, Università di Genova
mODa 8, Almagro, Spain
June 2, 2007

Abstract
Experiments with mixture and process variables are often constructed as the cross product of a mixture and a factorial design. Often it is not possible to implement all the runs of the cross product design, or the cross product model is too large to be of practical interest. We propose a methodology to select a model with a given number of terms and minimal condition number. The search methodology is based on weighted term orderings and can be extended to consider other statistical criteria.

Contents of the talk
1. Mixture experiments with process variables and their models
2. Homogeneous supports for mixtures
3. An algorithm for model selection
4. Examples
5. Conclusions

Mixture experiments with process variables
• The response is assumed to depend on other factors apart from the mixture components (Cornell, 2002).
• The mixture factors are $x = (x_1, \ldots, x_k)$ and the process variables are $z = (z_1, \ldots, z_q)$.
• For instance, one of the $z_j$ could be the amount of material used, hence the name mixture-amount experiments.
• The design is a finite set of points $D \subset \mathbb{R}^{k+q}$. The projection of $D$ onto the $x$-space is $D_x$, and $D_z$ is the projection onto the $z$-space.
• Often $D_x$ is a simplex centroid or simplex lattice design, while $D_z$ is a full factorial design.

Example 1: The bread data set (Næs et al., 1998)
• Three types of wheat flour $(x_1, x_2, x_3)$ and two process factors $(z_1, z_2)$.
• Response: loaf volume after baking.
• $D_x$ is a simplex lattice design $\{3, 3\}$ and $D_z$ is a $3^2$ factorial. The design $D = D_x \times D_z$ has 90 runs.
[Figure: $D_z$ (labels $z_1$, $z_2$), $D_x$ (labels $x_1$, $x_2$, $x_3$) and the product $D_x \times D_z$.]

Models for the combined effect of the factors (Prescott, 2004)
Additive regression model
$y(x, z) = f(x) + g(z) + \varepsilon$. (1)
Complete cross product model
$y(x, z) = f(x)\, g(z) + \varepsilon$. (2)
Intermediate models
$y(x, z) = f(x) + g(z) + \sum_{i=1}^{k} \sum_{j=1}^{q} f_{ij}(x_i, z_j) + \varepsilon$. (3)
Often $f$ is taken to be a Scheffé quadratic or cubic polynomial model, in a relevant parametrization, and $g$ is a quadratic or cubic model.

Models for the combined effect of the factors 2
A mixture amount model of the form
$y(x, m) = f_0(x) + m f_1(x) + \ldots + m^p f_p(x) + \varepsilon$
is suggested in (Cornell, 2002), with
$f_p(x) = \sum_{i} \gamma_i^{(p)} x_i + \sum_{i<j} \gamma_{ij}^{(p)} x_i x_j + \ldots + \sum_{i_1 < \ldots < i_l} \gamma_{i_1, \ldots, i_l}^{(p)} x_{i_1} \cdots x_{i_l}$,
where $p$ is a positive integer, $l \le q$, and the $\gamma^{(p)}$ are regression parameters. The errors $\varepsilon$ are assumed i.i.d.
Proposal: Search for a submodel of the complete cross product using
• a hierarchy (divisibility) condition
• a statistical criterion (minimal condition number)

Homogeneous models in mixtures with CCA (Maruri et al., 2006)
See every $d \in D$ in $P^{k-1}(\mathbb{R})$, i.e. $C_d = \{\alpha d\}$.
1) We construct the homogeneous ideal $I(C_D)$.
2) Given a term order $\tau$, we use Gröbner basis (GB) driven CoCoA code to obtain a model of degree $s$.
[Figure: the design $D$ and the cone $C_D$, axes $x_1$ and $x_2$, label $LT_s$.]
Example 2: Simplex lattice $\{3, 2\}$. A model with $s = 2$ for any $\tau$ is $\{x_1^2, x_2^2, x_3^2, x_1 x_2, x_1 x_3, x_2 x_3\}$.
⇒ Linear relation with K-models (Draper and Pukelsheim, 1998) and S-models (Scheffé, 1958).
Design (cone) ideal: all polynomials that vanish on the design (cone).

An algorithm for model selection
Cross product support
Consider a product design $D = D_x \times D_z$ with no replicated runs. Let $E_x = \{x^\alpha : \alpha \in L_x\}$ and $E_z = \{z^\alpha : \alpha \in L_z\}$ be sets of linearly independent monomials in $\mathbb{R}[x_1, \ldots, x_k]/I(D_x)$ and $\mathbb{R}[z_1, \ldots, z_q]/I(D_z)$, respectively. Let $E_x \otimes E_z$ be the Kronecker product of $E_x$ and $E_z$. Then $E_x \otimes E_z$ is a set of linearly independent monomials in $\mathbb{R}[x, z]/I(D)$. Moreover, if $D_z$ and $D_x$ also have no replicated points, then it is an $\mathbb{R}$-vector space basis and it has dimension $n_x n_z$, where $n_i$ is the number of points in $D_i$, $i = z, x$.
• Typically $E_x$ and $E_z$ have a simple structure derived from the designs $D_x$ and $D_z$.
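
As an illustration of the cross product support, the following sketch builds $E_x \otimes E_z$ as a list of exponent vectors and evaluates the design-model matrix on a small product design. The helper names `kron_support` and `model_matrix` and the toy designs are assumptions made for this example; they do not come from the talk.

```python
import itertools
import numpy as np

def kron_support(Ex, Ez):
    """Exponent vectors of E_x (x) E_z: one concatenated vector per pair."""
    return [tuple(a) + tuple(b) for a, b in itertools.product(Ex, Ez)]

def model_matrix(points, exponents):
    """Rows are design points, columns are monomials evaluated at them."""
    P = np.asarray(points, dtype=float)
    E = np.asarray(exponents, dtype=int)
    return np.prod(P[:, None, :] ** E[None, :, :], axis=2)

# Hypothetical toy designs: three mixture points and a two-level factor.
Dx = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
Dz = [(-1.0,), (1.0,)]
D = [x + z for x, z in itertools.product(Dx, Dz)]

Ex = [(2, 0), (1, 1), (0, 2)]   # x1^2, x1*x2, x2^2
Ez = [(0,), (1,)]               # 1, z

E = kron_support(Ex, Ez)
X = model_matrix(D, E)
rank = np.linalg.matrix_rank(X)  # equals n_x * n_z when E is a basis

# With this ordering of runs and terms, X is the Kronecker product
# of the two marginal design-model matrices.
X_kron = np.kron(model_matrix(Dx, Ex), model_matrix(Dz, Ez))
```

In this toy case the six product terms are linearly independent on the six runs, in line with the dimension $n_x n_z$ stated above.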

An algorithm for model selection 2
Minimal condition number
The condition number is defined as
$\lambda = \lambda_{\max} / \lambda_{\min}$ (4)
where $\lambda_{\max}$ and $\lambda_{\min} \ge 0$ are the maximum and minimum eigenvalues of the information matrix $X_L^T X_L$, and $X_L$ is the design-model matrix for the model $L$.
• Large values of $\lambda$ indicate that $X_L^T X_L$ is close to singular, i.e. $\lambda_{\min} \approx 0$.
• A small condition number $\lambda$ indicates more stability in the least squares estimates and a smaller variance inflation factor than a large condition number.
• Useful when searching among homogeneous models, as it favours Kronecker models, which are conjectured to be robust to misspecification of the information matrix in mixtures (Prescott et al., 2002).
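
Definition (4) can be computed directly from the eigenvalues of the information matrix. The sketch below is a minimal version; the function name `condition_number`, the singularity tolerance and the two small matrices are assumptions chosen for illustration.

```python
import numpy as np

def condition_number(X, tol=1e-12):
    """lambda_max / lambda_min of X^T X; returns inf if numerically singular."""
    eig = np.linalg.eigvalsh(X.T @ X)   # eigenvalues in ascending order
    lam_min, lam_max = eig[0], eig[-1]
    if lam_min <= tol * lam_max:
        return np.inf
    return lam_max / lam_min

# Well-conditioned example: X^T X = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
X_good = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Nearly collinear columns give a very large condition number.
X_bad = np.array([[1.0, 1.0], [1.0, 1.0001], [1.0, 0.9999]])
# Exactly collinear columns give a singular information matrix.
X_sing = np.array([[1.0, 1.0], [2.0, 2.0]])

lam_good = condition_number(X_good)
lam_bad = condition_number(X_bad)
lam_sing = condition_number(X_sing)
```

Note that this is the ratio of eigenvalues of $X_L^T X_L$, which is the square of the ratio of singular values of $X_L$.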

An algorithm for model selection 3
Algorithm
Input: A fraction $F \subseteq D_x \times D_z$; the designs $D_x$ and $D_z$ and supports $E_x$ and $E_z$; $n$ = number of final terms. For identifiability, $n \le \#F$ must hold.
Output: A submodel $L_0$ with minimal condition number $\lambda_0$, formed with the smallest terms of $E_x \times E_z$ with respect to a weighted order.
Technique: Generate candidate submodels by ordering $E_x \times E_z$ (complete cross product) with weight vectors $w \in W^+$, and look for the candidate with the smallest condition number.
• The search is driven by a finite set of weights $W^+$, i.e. it terminates.
• The model $L_0$ respects a hierarchical structure.
• Arbitrary supports $E_x$ and $E_z$ can be used.
• The Algorithm is of order $O((n_x n_z)^{2(qk-1)} n^2) = \mathrm{poly}(n_x n_z)$.
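
The search described above can be sketched as follows. This is not the authors' implementation: weights are drawn at random from the simplex instead of enumerating $W^+$, and a greedy rank check stands in for the Gröbner basis computation. All function names and the toy design are hypothetical.

```python
import itertools
import numpy as np

def model_matrix(points, exponents):
    P = np.asarray(points, dtype=float)
    E = np.asarray(exponents, dtype=int)
    return np.prod(P[:, None, :] ** E[None, :, :], axis=2)

def condition_number(X, tol=1e-12):
    eig = np.linalg.eigvalsh(X.T @ X)
    return np.inf if eig[0] <= tol * eig[-1] else eig[-1] / eig[0]

def candidate(points, terms, w, n):
    """First n terms in increasing w-weighted degree that remain
    linearly independent on the design (greedy stand-in for the GB step)."""
    ordered = sorted(terms, key=lambda t: (float(np.dot(w, t)), t))
    chosen = []
    for t in ordered:
        trial = chosen + [t]
        if np.linalg.matrix_rank(model_matrix(points, trial)) == len(trial):
            chosen = trial
        if len(chosen) == n:
            break
    return chosen

def search(points, terms, n, n_weights=200, seed=0):
    """Return (lambda_0, L_0, w_0) over randomly simulated weight vectors."""
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(n_weights):
        w = rng.dirichlet(np.ones(len(terms[0])))  # positive weights on the simplex
        L = candidate(points, terms, w, n)
        if len(L) < n:
            continue                                # fewer than n identifiable terms
        lam = condition_number(model_matrix(points, L))
        if lam < best[0]:
            best = (lam, L, w)
    return best

# Hypothetical toy problem: full product of 3 mixture points and 2 levels.
Dx = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
Dz = [(-1.0,), (1.0,)]
F = [x + z for x, z in itertools.product(Dx, Dz)]
terms = [a + b for a, b in itertools.product([(2, 0), (1, 1), (0, 2)], [(0,), (1,)])]

lam0, L0, w0 = search(F, terms, n=4)
```

Different weight vectors can produce the same ordering of the terms, which is why a finite set of representatives suffices in the exact version of the search.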

Example 3: Mixture amount design
Factors $x = (x_1, x_2)$, $z = (m)$, listed as $(x_1, x_2, m)$.

x_1  x_2  m    x_1^2  x_1x_2  x_2^2  mx_1^2  mx_1x_2  mx_2^2
0    1    1    0      0       1      0       0        1
0    2    2    0      0       4      0       0        8
1    1    2    1      1       1      2       2        2
2    0    2    4      0       0      8       0        0

We have $E_x = \{x_1^2, x_1 x_2, x_2^2\}$ and $E_z = \{1, m\}$. The algorithm returns the support for a mixture amount model
$L_0 = \{x_1^2, x_2^2, x_1 x_2, m x_2^2\}$ for $w = (1, 2, 3)$.
• The set of representatives $W^+$ can be expensive to compute, so an approximate set $\tilde{W}^+$, simulated over the $(q + k - 1)$-simplex, is used.
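
The support in Example 3 can be checked numerically. The sketch below orders the six terms of $E_x \otimes E_z$ by weighted degree with $w = (1, 2, 3)$ and keeps each term only if it is linearly independent, on the four design points, of the terms already kept. The greedy rank check is an assumption standing in for the algebraic computation used in the talk.

```python
import numpy as np

# Design of Example 3, columns (x1, x2, m).
D = np.array([[0, 1, 1], [0, 2, 2], [1, 1, 2], [2, 0, 2]], dtype=float)

# E_x (x) E_z as exponent vectors over (x1, x2, m).
terms = [(2, 0, 0), (1, 1, 0), (0, 2, 0), (2, 0, 1), (1, 1, 1), (0, 2, 1)]
w = (1, 2, 3)

def column(t):
    """Values of the monomial with exponent vector t at every design point."""
    return np.prod(D ** np.array(t), axis=1)

def weighted_degree(t):
    return sum(wi * ti for wi, ti in zip(w, t))

ordered = sorted(terms, key=weighted_degree)   # degrees 2, 3, 4, 5, 6, 7

L0 = []
for t in ordered:
    X = np.column_stack([column(s) for s in L0 + [t]])
    if np.linalg.matrix_rank(X) == len(L0) + 1:
        L0.append(t)    # keep t only if it adds a new direction

# L0 is [(2, 0, 0), (1, 1, 0), (0, 2, 0), (0, 2, 1)],
# i.e. x1^2, x1*x2, x2^2, m*x2^2.
```

The terms $m x_1^2$ and $m x_1 x_2$ are skipped because $m = 2$ at every design point where $x_1 \neq 0$, so on this design they equal $2 x_1^2$ and $2 x_1 x_2$.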

Example 1 (cont.): Bread data set
Analysis in (Prescott, 2004).
• Final model with 15 terms, $R^2 = 0.998$, $\hat{\sigma} = 21.04$.
• Condition number $\lambda = 86.83$.
• Fitted model
$\hat{Y} = x_1 (522.8 + 13.0 z_1 + 56.3 z_2 - 39.4 z_1^2 - 10.2 z_2^2)$
$\quad + x_2 (448.1 + 1.7 z_1 + 37.2 z_2 + 3.7 z_1^2 - 28.4 z_2^2)$
$\quad + x_3 (599.3 + 54.3 z_1 + 73.8 z_2 - 46.0 z_1^2 + 1.0 z_2^2)$,
i.e. a mixture of predictive models for every type of flour.
• Symmetric support

Example 1 (cont.): Bread data set (using the Algorithm)
Factors listed as $(x_1, x_2, x_3, z_1, z_2)$.
• Scheffé model $E_x = \{x_1, x_2, x_3, x_1^2, x_2^2, x_3^2, x_1 x_2, x_1 x_3, x_2 x_3\}$ and full product model $E_z = \{1, z_1, z_2, z_1^2, z_1 z_2, z_2^2\}$.
• Model with $\lambda_0 = 47.47$ and support
$L_0 = \{x_1, x_2, x_3\} \otimes \{1, z_1, z_2\} \cup \{x_2, x_3\} \otimes \{z_1^2, z_1 z_2, z_2^2\}$ for $w = (17, 12, 10, 3, 2) \in \tilde{W}$.
• Fitted model with $\hat{\sigma} = 22.7$ and $R^2 = 0.998$:
$\hat{Y} = x_1 (489.7 + 13.0 z_1 + 56.3 z_2) + x_2 (467.9 + 1.7 z_1 + 37.1 z_2) + x_3 (619.1 + 54.2 z_1 + 73.8 z_2)$
$\quad + x_2 (-19.9 z_1^2 + 3.6 z_1 z_2 - 34.6 z_2^2) + x_3 (-69.6 z_1^2 + 13.3 z_1 z_2 - 5.1 z_2^2)$
• Slight asymmetry allows for a reduction in condition number.

Final comments
• The Algorithm blends the change of basis (Faugère et al., 1993) with a statistical criterion.
• The search space of the Algorithm presented is much smaller than that of a full search.
• It can be adapted to consider other criteria, or even composite criteria. For example, it could be used for hierarchical model selection (Peixoto, 1987; Bates et al., 2003).
• Computing the set of weights $W$ is expensive, but the approximate set $\tilde{W}$ allows a fast search. The stopping rule is still empirical.
• A possible drawback is the potential exclusion of symmetric models. This is inherent in the use of term orders ($w$-order), e.g. there is no term order such that $x_1^2 \succ x_2^2 \succ x_1 x_2$.
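
The last remark can be justified in two lines, using only the standard property that a term order is preserved under multiplication by a monomial. The derivation below is a short worked argument added for illustration.

```latex
% Suppose x_1^2 \succ x_2^2. If x_2 \succ x_1 held, multiplying by x_2 and
% by x_1 would give x_2^2 \succ x_1 x_2 \succ x_1^2, a contradiction.
% Hence x_1 \succ x_2, and multiplying by x_1 and by x_2 gives
\[
  x_1 \succ x_2
  \;\Longrightarrow\;
  x_1^2 \succ x_1 x_2
  \quad\text{and}\quad
  x_1 x_2 \succ x_2^2 ,
\]
% so x_1 x_2 always lies strictly between the two squares:
\[
  x_1^2 \succ x_1 x_2 \succ x_2^2 ,
\]
% and the chain x_1^2 \succ x_2^2 \succ x_1 x_2 cannot occur.
```

For a weight vector $w = (w_1, w_2)$ the same fact reads $2 w_1 > w_1 + w_2 > 2 w_2$ whenever $w_1 > w_2$.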

References
Bates et al. (2003). Technometrics 45, 246-255.
Cornell (2002). Experiments with mixtures.
Draper, Pukelsheim (1998). JSPI 71(1-2), 303-311.
Faugère et al. (1993). Jour. Symb. Comp. 16(4), 329-344.
Maruri, Notari, Riccomagno (2006). Statistica Sinica (in print).
Næs et al. (1998). Chem. Int. Lab. Syst. 41, 221-235.
Peixoto (1987). Am. Stat. 41(4), 311-313.
Prescott et al. (2002). Technometrics 44(3), 260-268.
Prescott (2004). Qual. Tech. & Qual. Manag. 1(1), 87-103.
Scheffé (1958). JRSS B 20, 344-360.
