
Session A: Supersaturated Design (Wednesday, March 4, 8:30AM-10:00AM)

Searching for Powerful Supersaturated Designs
David Edwards, Virginia Commonwealth University
An important property of any experimental design is its ability to detect active factors. For supersaturated designs, in which factors outnumber experimental runs, power is even more critical. In this talk, we consider several popular supersaturated design construction criteria, propose several of our own, and discuss the results of an extensive simulation study evaluating these construction criteria in terms of power. We use two analysis methods, forward selection and the Dantzig selector, and find that although the latter clearly outperforms the former, most supersaturated design construction methods are indistinguishable in terms of power. We demonstrate further, however, that when the sign of each main effect can be correctly specified in advance, supersaturated designs obtained by minimizing the variance of all squared pairwise inner products of the information matrix (subject to a constraint on the average of these off-diagonal elements) have significantly higher power to detect active factors than designs obtained from standard criteria.

Benefits and Fast Construction of Efficient Two-Level Foldover Designs
Anna Errore, University of Minnesota
Recent work in two-level screening experiments has demonstrated the advantages of using small foldover designs, even when such designs are not orthogonal for the estimation of main effects. In this paper, we provide further support for this argument and develop a fast algorithm for constructing efficient two-level foldover (EFD) designs. We show that these designs have equal or greater efficiency for estimating the main effects model than competing designs in the literature, and that our algorithmic approach allows the fast construction of designs with many more factors and/or runs. Our compromise algorithm allows the practitioner to choose among many designs, trading off efficiency of the main effect estimates against correlation of the two-factor interactions. Using our compromise approach, practitioners can decide just how much efficiency they are willing to sacrifice to avoid confounded two-factor interactions, as well as to lower an omnibus measure of correlation among the two-factor interactions.
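
The EFD algorithm itself is not spelled out in the abstract, but the underlying foldover construction is simple; a minimal NumPy sketch (illustrative only, not the authors' code):

    import numpy as np

    def foldover(D):
        # Augment a two-level design D (entries +/-1) with its mirror image.
        # Folding over doubles the run size and makes the main effects
        # orthogonal to all even-order interactions, including all
        # two-factor interactions.
        return np.vstack([D, -D])

    # Example: fold over a 4-run, 3-factor design
    D = np.array([[ 1,  1,  1],
                  [ 1, -1, -1],
                  [-1,  1, -1],
                  [-1, -1,  1]])
    print(foldover(D))  # an 8-run foldover design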

E(s2)- and UE(s2)-Optimal Supersaturated Designs
C.S. Cheng, Academia Sinica
The popular E(s2)-criterion for choosing two-level supersaturated designs minimizes the sum of squares of the off-diagonal entries of the information matrix over the designs in which the two levels of each factor appear the same number of times. Jones and Majumdar (2014) proposed the UE(s2)-criterion, which is essentially the same as the E(s2)-criterion except that the requirement of factor-level balance is dropped. We compare UE(s2)-optimal designs and the traditional E(s2)-optimal designs based on the average efficiencies over lower-dimensional projections. Since the requirement of level balance is bypassed, there are usually many UE(s2)-optimal designs with diverse performances when other properties are considered. Jones and Majumdar (2014) mentioned the maximization of the number of level-balanced factors as a possible secondary criterion. We show by example that this does not work well from a projective point of view and propose a more appropriate secondary criterion. We also identify several families of designs that are both E(s2)- and UE(s2)-optimal. This is joint work with Pi-Wen Tsai.
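
The Edwards and Cheng abstracts above both summarize the off-diagonal entries s_ij of the information matrix X'X. As a rough illustration only (generic NumPy, not the speakers' code), the E(s2) value and the variance criterion mentioned by Edwards might be computed as:

    import numpy as np

    def offdiag_inner_products(X):
        # Pairwise inner products s_ij = x_i' x_j (i < j) of the columns
        # of a two-level design matrix X with entries +/-1.
        S = X.T @ X
        return S[np.triu_indices(S.shape[0], k=1)]

    def e_s2(X):
        # E(s^2): average of the squared off-diagonal entries of X'X.
        # The classical criterion restricts the search to level-balanced
        # designs; UE(s^2) evaluates the same quantity without balance.
        s = offdiag_inner_products(X)
        return np.mean(s ** 2)

    def var_s2_squared(X):
        # Variance of the squared pairwise inner products, the quantity
        # minimized (subject to a constraint on the average off-diagonal
        # element) in the sign-informed criterion discussed by Edwards.
        s = offdiag_inner_products(X)
        return np.var(s ** 2)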

Session B: Covering Arrays (Wednesday, March 4, 10:30AM-12:00PM)

Covering Arrays: Applications, Algorithms, and Challenges
Joseph Morgan, SAS Institute Inc.
A homogeneous covering array CA(N; t, k, v) is an N x k array on v symbols such that any projection onto t columns contains all v^t level combinations at least once. In this talk we will describe key generalizations of this basic homogeneous covering array model, show the link to orthogonal arrays, and explain why these constructs are increasingly viewed by the software engineering community as an important tool for software validation. In the process we will provide an overview of algorithms and construction methods and discuss some of the challenges that remain.
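
The defining property lends itself to a brute-force check; a small sketch along these lines (hypothetical helper, practical only for toy sizes):

    import numpy as np
    from itertools import combinations, product

    def is_covering_array(A, t, v):
        # Check CA(N; t, k, v): every t-column projection of the N x k
        # array A (symbols 0..v-1) contains all v**t level combinations.
        N, k = A.shape
        required = set(product(range(v), repeat=t))
        for cols in combinations(range(k), t):
            seen = {tuple(row) for row in A[:, cols]}
            if not required <= seen:
                return False
        return True

    # A CA(4; 2, 3, 2): every pair of columns contains 00, 01, 10, 11
    A = np.array([[0, 0, 0],
                  [0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 0]])
    print(is_covering_array(A, t=2, v=2))  # True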

Covering Arrays Avoiding Forbidden Edges
Elizabeth Maltais, University of Ottawa
Covering arrays avoiding forbidden edges (CAFEs) are combinatorial designs with applications to the design of test suites in the following scenario: all required interactions between pairs of components are covered by at least one test, while a specified list of forbidden interactions is avoided by all tests. When no interactions are forbidden, CAFEs are simply covering arrays of strength two. In this talk, we survey some important introductory results on CAFEs, including their relationship to edge clique covers, and computational complexity results. We also discuss further generalizations of CAFEs which allow for optional interactions as well as forbidden and required interactions.

Column Replacement and Covering Arrays
Charles J. Colbourn, Arizona State University
The construction of covering arrays with many factors is a challenging problem, particularly for larger strengths. Computational methods have proved very effective for strengths two and three, producing covering arrays with tens or even hundreds of factors. For larger strengths and for more factors, certain algebraic constructions furnish examples. Nevertheless, the primary methods for producing such covering arrays are recursive constructions, which make "large" covering arrays from "small" ingredient arrays. Two main classes of recursive constructions have been developed: the "cut-and-paste" constructions and the "column replacement" constructions. Column replacement methods use pattern matrices to select columns from ingredient matrices. Combinatorial requirements on the pattern matrices lead to arrays known as hash families, while those on the ingredients lead to variants of covering arrays. In this talk we outline how these methods can be used to produce covering arrays for large factor spaces that often have the fewest rows. More importantly, we focus on the generality with which these column replacement methods can be applied.


Session C: Blocking (Wednesday, March 4, 1:00PM-2:30PM)

Optimal Regular Graph Designs
Sera Cakiroglu, Cancer Research UK
A typical problem in optimal design theory is finding an experimental design that is optimal with respect to some criterion within a class of designs. The most popular criteria include the A- and D-criteria. In 1977, John and Mitchell conjectured that if an incomplete block design is D-optimal (or A-optimal), then it is a regular graph design (RGD), provided any RGDs exist. The conjecture is wrong in general but holds if the number of blocks is large enough. Using a graph-theoretical representation of the A- and D-optimality criteria, we capitalized on the power of symbolic computing with Mathematica and performed an exact computer search for the best regular graph designs in large systems with up to 20 points. I will present computational and theoretical results, including examples that support some open conjectures and an example showing that A- and D-optimality are not equivalent even among regular graph designs.
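
For context, the graph-theoretic representation alluded to here is commonly stated as follows (a standard result for such block designs, recalled from the literature rather than from the talk): letting G be the concurrence graph of the design,

    \text{D-optimal: } \max_{G}\ \tau(G), \qquad
    \text{A-optimal: } \min_{G}\ \sum_{u<v} R_{uv}(G),

where \tau(G) is the number of spanning trees of G and R_{uv} is the effective resistance between treatments u and v when each edge of G is a unit resistor.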

Optimal semi-Latin squares
Leonard Soicher, Queen Mary University of London
An (n x n)/k semi-Latin square is a block design for nk treatments in blocks of size k, with the blocks arranged to form an n by n array, such that each treatment occurs exactly once in each row and exactly once in each column of the array. Semi-Latin squares have applications in areas including the design of agricultural experiments, consumer testing, and, via their duals, human-machine interaction. I will survey some recent, and some not so recent, results and constructions for optimal semi-Latin squares, including results on A-, D-, E-, and Schur-optimality. I will also mention some open problems.
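
A direct check of this definition is easy to write down; a minimal sketch (generic Python, names hypothetical):

    def is_semi_latin(square, n, k):
        # square[i][j] is the set of k treatments in cell (i, j) of a
        # candidate (n x n)/k semi-Latin square on treatments 0..nk-1.
        treatments = sorted(range(n * k))
        for i in range(n):
            row = sorted(t for j in range(n) for t in square[i][j])
            col = sorted(t for j in range(n) for t in square[j][i])
            # each treatment must occur exactly once per row and column
            if row != treatments or col != treatments:
                return False
        return True

    # A (2 x 2)/2 semi-Latin square on treatments 0..3
    sq = [[{0, 1}, {2, 3}],
          [{2, 3}, {0, 1}]]
    print(is_semi_latin(sq, n=2, k=2))  # True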

Block designs with very low replication
Rosemary Bailey, University of St Andrews
In the early stages of testing new varieties, it is common that there are only small quantities of seed of many new varieties. In the UK (and some other countries with centuries of agriculture on the same land), variation within a field can be well represented by a division into blocks. Even when that is not the case, subsequent phases (such as testing for milling quality, or evaluation in a laboratory) have natural blocks, such as days or runs of a machine. I will discuss how to arrange the varieties in a block design when the average replication is less than two.


Session D: Computer Experiments (Wednesday, March 4, 3:15PM-4:45PM)

Design considerations for sparse basis expansion problems with applications to computational materials science
Shane Reese, Brigham Young University
Economic progress depends critically on high-performance materials such as lightweight alloys, high-energy-density battery materials, recyclable motor vehicle and building components, and energy-efficient lighting. Industrial growth areas depend, in part, on a fundamental understanding of materials science and atomic particle behavior. We discuss the role of statistical model selection in complex computational models that relate crystal structure to material properties. Density functional theory suggests that the stationary electronic state can be expressed through the many-electron time-independent Schrödinger equation, and this presents a high-dimensional model selection problem. The cluster expansion formulation allows rapid assessment of a wide variety of alloy combinations and prediction of important materials science properties. We propose a Bayesian compressive sensing (BCS) approach to perform model selection in the sparse basis formed by the cluster expansion, based on computer experiments run with ab initio codes (VASP). The methods are illustrated by applying the approach to two common alloy systems, and extensions to 700 other systems are discussed. The value of the methodology is that its physically interpretable bases and sparseness-prior implementation are demonstrated to be faster and more feasible for large systems with small training datasets. In this talk we explore design considerations both for the initial training datasets for the BCS approach and for the selection of follow-up experimental runs.
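
In generic form (my paraphrase of standard Bayesian compressive sensing setups, not necessarily the authors' exact prior), the problem is a sparse linear model over the cluster basis:

    \mathbf{y} = \boldsymbol{\Phi}\,\mathbf{w} + \boldsymbol{\epsilon}, \qquad
    w_i \mid \alpha_i \sim N\!\left(0, \alpha_i^{-1}\right),

where y collects the ab initio energies, the columns of \Phi are cluster basis functions, and per-coefficient precisions \alpha_i learned from the data drive most of the weights to zero, leaving a sparse expansion.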

Current Challenges and the Application of Computer Experiments in Industry
William Myers, The Procter & Gamble Company
The implementation of computer experiments is a competitive advantage in business environments where fast and cost-effective product development is critical. In many industrial applications computer experiments are replacing physical experiments because the physical creation and testing of prototypes is prohibitive in terms of time and cost. Computer experiments typically involve complex systems with numerous input variables. A primary goal of computer experiments is to develop a metamodel, a good empirical approximation to the original complex computer model, which provides an easier and faster approach to sensitivity analysis, prediction, and optimization. This talk will cover real industry applications as well as current design and modeling challenges that are encountered.
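
As one concrete (and deliberately generic) illustration of a metamodel, a Gaussian-process emulator can be fit to a handful of computer-model runs; a sketch using scikit-learn, with a made-up test function standing in for the expensive simulator:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def simulator(x):
        # Hypothetical stand-in for an expensive computer model
        return np.sin(3 * x[:, 0]) + x[:, 1] ** 2

    rng = np.random.default_rng(0)
    X_train = rng.uniform(0, 1, size=(20, 2))   # 20 design runs
    y_train = simulator(X_train)

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3))
    gp.fit(X_train, y_train)

    # The cheap emulator can now stand in for the simulator in
    # sensitivity analysis, prediction, and optimization.
    X_new = rng.uniform(0, 1, size=(5, 2))
    mean, sd = gp.predict(X_new, return_std=True)
    print(mean, sd)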


Discretization mesh selection for differential equation models as a problem of statistical design
Oksana Chkrebtii, The Ohio State University
When models are defined implicitly by systems of differential equations with no closed-form solution, small local errors in finite-dimensional solution approximations can propagate into large deviations from the true underlying model trajectory. Therefore the choice of discretization mesh size and its design present a complex trade-off between accuracy of the estimated solution and computational resources. A natural question is how to arrange the discretization grid over the domain in such a way as to reduce uncertainty in the resulting approximation. We combine a new Bayesian framework for quantifying uncertainty in the approximation of unknown ODE and PDE solutions with sequential design principles for obtaining optimal mesh arrangements to evaluate complex differential equation models.

Session E: Orthogonal Arrays: Structure, Properties and Applications (Thursday, March 5, 8:00AM-9:30AM)

Strong Orthogonal Arrays and Related Space-Filling Designs
Boxin Tang, Simon Fraser University

Composite Designs Based on Orthogonal Arrays and Definitive Screening Designs
Hongquan Xu, UCLA

Generalized Supplementary Difference Sets (GSDS): A Key to the Construction of Orthogonal Arrays
Frederick Kin Hing Phoa, Academia Sinica

Session F: Model Selection in Design of Experiments (Thursday, March 5, 10:00AM-11:30AM)

CME Analysis: a New Method for Unraveling Aliased Effects in Two-Level Fractional Factorial Experiments
Jeff Wu, Georgia Tech
Ever since the founding work by Finney, it has been widely known and accepted that aliased effects in two-level regular designs cannot be "de-aliased" without adding more runs. A surprising result by Wu in his 2011 Fisher Lecture showed that aliased effects can sometimes be "de-aliased" using a new framework based on the concept of conditional main effects (CMEs). This idea is further developed here into a methodology that can be readily used. Some key properties are derived that govern the relationships among CMEs, or between them and related effects. As a consequence, some rules for data analysis are developed. Based on these rules, a new CME-based methodology is proposed. Three real examples are used to illustrate the methodology. The CME analysis can offer a substantial increase in the R-squared value with fewer effects in the chosen models. Moreover, the selected CME effects are often more interpretable. (Joint work with Heng Su.)
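
For reference, the conditional main effect of a factor A given another factor B at its high level is, in the standard notation of this literature (recalled here, not quoted from the talk):

    \mathrm{CME}(A \mid B+) = \bar{y}(A+ \mid B+) - \bar{y}(A- \mid B+)
                            = \mathrm{ME}(A) + \mathrm{INT}(A,B),

and similarly CME(A | B-) = ME(A) - INT(A,B); identities of this kind are what underlie the de-aliasing arguments.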

Bayesian model selection and prediction from split-plot experiments with multivariate response: a dissolution case study
Emily Matthews, University of Southampton
Split-plot designs with multivariate responses are common in industrial experimentation, where multiple measurements are often made on each experimental unit. The motivation for this work is the modelling of multivariate dissolution data for formulation testing of a pharmaceutical product. The aim of the study was both to identify the active factors having a substantive impact on the responses of interest and to make a preliminary identification of treatments with predicted maximum response. As is becoming increasingly common, we adopted a Bayesian modelling approach and developed methodology for model selection for multivariate linear mixed models. Gibbs sampling is used to obtain approximations to the posterior probability of each factorial effect being active, which can be interrogated graphically. To identify optimum operating conditions, model-averaged predictions from the Gibbs sampler were treated as the outputs from a black-box simulator, and Expected Improvement was used to maximise the predictions.
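
Expected Improvement, used in the last step, has the standard closed form under a Gaussian predictive distribution (a textbook result, not specific to this talk): for posterior mean \mu(x), standard deviation \sigma(x), and current best value f^{*} (maximization),

    \mathrm{EI}(x) = \left(\mu(x) - f^{*}\right)\Phi(z) + \sigma(x)\,\phi(z),
    \qquad z = \frac{\mu(x) - f^{*}}{\sigma(x)},

where \Phi and \phi are the standard normal CDF and density.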

Design of Variable Resolution for Model Selection
C. Devon Lin, Queen's University
In practice, prior information or background knowledge may suggest that interactions arise only among certain factors. When such knowledge is available, designs of variable resolution were proposed and their merits theoretically justified by Lin (2012). In this talk, we provide algorithms to obtain optimal designs of variable resolution based on the minimum G2-aberration criterion and design efficiency. We demonstrate the merit of designs of variable resolution in model selection via a simulation study. Some comments are made on the performance of optimal designs of variable resolution under certain model selection methods.

Session: Kathryn Chaloner Memorial (Thursday, March 5, 12:30PM-1:30PM)

Kathryn Chaloner: A Leader in Design
Sharon Lohr, Westat
Kathryn Chaloner made outstanding contributions to many areas of statistics, through her research, work in clinical trials, administration, teaching, and mentorship. In this talk I will review a few of her contributions to the field of Bayesian design of experiments, and describe some of the research and applications that have been influenced by her work.


Exploration of Individualized Trial Designs in Oncology Studies
Qian Shi, Mayo Clinic
Common designs used in practice include single-arm designs, such as Simon's optimal (or minimax) and Fleming two-stage designs. Randomized designs have become popular for phase II trials in oncology as the primary endpoint shifts from response to survival endpoints. The research in cancer treatments has entered an exciting new era with the blooming discoveries of biomarkers. These findings enrich the understanding of the biological progression of disease and call for the development of individualized interventions. The commonly used phase II designs cannot fulfill this need when the study goal goes beyond treatment comparison and/or other trial considerations (safety, lack of historical data, limited resources, etc.) are critical to include. We illustrate designs tailored to study-specific needs of targeting, adaptation, biomarker-driven design, and efficient screening through three real-life examples, including one open trial and two studies in development sponsored by the US National Cancer Institute.

Assessing and Comparing Anesthesiologists' Performance on Mandated Metrics using a Bayesian Hierarchical Model
Emine O. Bayman, University of Iowa
Background: The Joint Commission requires periodic evaluation of the clinical performance of anesthesiologists as part of the Ongoing Professional Performance Evaluation quality improvement program.
Methods: We assessed the frequency of cases in which (1) the first arterial or non-invasive blood pressure (BP) or (2) the first pulse oximetry measured oxygen saturation (SpO2) was recorded ≥ 5 minutes after the start of induction (defined as non-compliance). 135 preoperative patient or procedural characteristics were analyzed with decision trees. Performance assessments were based on 63,913 general anesthetics performed over five sequential six-month periods, each with at least 11,743 anesthetics and 53 anesthesiologists. A Bayesian hierarchical model, designed to identify anesthesiologists as "performance outliers" after adjustment for covariates, was developed and compared with frequentist methods.

Results: The global incidences of non-compliance (with frequentist 95% confidence intervals) were 5.35% (5.17 to 5.53) for the BP metric and 1.22% (1.14 to 1.30) for the SpO2 metric. Using unadjusted rates and frequentist statistics, up to 43% of anesthesiologists would be deemed noncompliant for the BP metric and 70% for the SpO2 metric. Using Bayesian analyses with covariate adjustment, only 2.44% (1.28 to 3.60) and 0.00% of the anesthesiologists would be deemed "noncompliant" for BP and SpO2, respectively.
Conclusion: Bayesian hierarchical multivariate methodology with covariate adjustment is better suited to faculty monitoring than the non-hierarchical frequentist approach.
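
The abstract does not state the model, but a generic Bayesian hierarchical logistic model for this kind of provider monitoring (my illustration, not necessarily the authors' specification) takes the form

    y_{ij} \sim \mathrm{Bernoulli}(p_{ij}), \qquad
    \mathrm{logit}(p_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \theta_j, \qquad
    \theta_j \sim N(0, \sigma^2),

where y_{ij} indicates non-compliance on case i of anesthesiologist j, x_{ij} holds the case covariates, and the provider effects \theta_j are shrunk toward zero; a "performance outlier" is flagged when the posterior distribution of \theta_j concentrates away from zero.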


Session G: DOE for Pharmaceutical Applications (Thursday, March 5, 1:45PM-3:15PM)

On parameter estimation and optimal design for nonlinear mixed effects models
Sergei Leonov, AstraZeneca
We discuss methods of model-based optimal experimental design applied in population pharmacokinetic/pharmacodynamic studies. The derivation of the Fisher information matrix (FIM) for a single individual is a key component of efficient design algorithms. In this presentation we focus on the links between various parameter estimation methods for nonlinear mixed effects models and the corresponding options for approximating the individual FIM.
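
For reference, the individual Fisher information matrix at the heart of these algorithms is the standard quantity (general definition, not the speaker's specific approximation):

    \mathbf{M}(\boldsymbol{\theta};\xi) =
    \mathrm{E}\!\left[-\frac{\partial^2 \log L(\boldsymbol{\theta};\mathbf{y},\xi)}
    {\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^{\top}}\right],

where L is the likelihood of one individual's observations y under sampling design \xi. For nonlinear mixed effects models the marginal likelihood involves an intractable integral over the random effects, so each estimation method (such as first-order or first-order conditional approximations) induces its own approximation of this matrix.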

Adaptive Design and Analysis of Dose-Finding Studies under Model Uncertainty
Tobias Mielke, Icon plc
The characterization of the dose-response relation and the determination of the optimal dose level for confirmatory testing are key objectives in exploratory phases of drug development. Optimal design theory supports the determination of appropriate allocation probabilities to different dose levels, such that the prediction variance of the dose-response curve or the variance in estimating the optimal dose may, in theory, be minimized. The optimality criterion depends on the underlying dose-response assumption, and assuming the wrong model at the design stage may lead to relatively useless trial data. Analysis procedures such as the MCP-Mod approach or Bayesian modeling can take this model uncertainty into account to some extent. Adaptive designs additionally offer the adjustment of dose allocation rates and the inclusion of additional dose levels during the running trial. A short introduction to MCP-Mod will be followed by an overview and motivation of adaptive design options. A case study will present the benefits and limitations of the adaptive dose-finding approach compared to standard fixed designs.

Design of clinical studies with "time-to-event" endpoints subject to random censoring
Valerii Fedorov, Quintiles
In addition to the observational uncertainties generated by measurement error or by variability between units/subjects, which are typical of traditional experiments, in clinical trials we face uncertainties caused by the enrollment process, which can often be viewed as a stochastic process. The latter makes the amount of information that can be gained during experimentation uncertain at the design stage. To address the problem we have to modify the concept of "optimal design" and develop methods that guarantee either that the information metrics exceed predefined levels with at least a prescribed probability, or that the average information is maximized. We illustrate the approach using proportional hazards models with censored observations and enrollment described by a Poisson process.
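
One way to formalize the guarantee just described (my reading of the abstract, not a quoted formula) is as a chance-constrained criterion on the now-random information matrix M(\xi):

    \max_{\xi}\ \Pr\{\psi(\mathbf{M}(\xi)) \ge c\}
    \qquad\text{or}\qquad
    \max_{\xi}\ \mathrm{E}\left[\psi(\mathbf{M}(\xi))\right],

where the randomness in M(\xi) comes from enrollment and censoring, \psi is an information functional, and c is a predefined level.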


Session H: Discrete Choice Designs in Marketing and Industry (Thursday, March 5, 4:00PM-5:30PM)

Some Recent Developments in Discrete Choice Experiments
William Li, University of Minnesota
Conjoint analysis and discrete choice experiments, which were developed in fields such as marketing and economics, are useful for understanding the voice of the customer to guide quality improvement efforts. In the first part of the work, we discuss what they are, why they are useful methodologies for quality improvement, and how a discrete choice experiment can be carried out. We demonstrate the methodology by discussing a real case study in quality improvement in detail. We then introduce a new class of designs for discrete choice experiments that are robust over a class of possible models. We provide several examples in which an optimal design based on main-effects-only models is shown to have limited capability for estimating two-factor interactions, whereas the proposed robust designs perform well in the presence of two-factor interactions. In the second part of the talk, we present another real case study, for which a new model that takes response time into account is used. We also introduce some results on the impact of fatigue effects on choice experiments.

The use of Firth penalized-likelihood estimates for individual-level discrete choice modeling and market segmentation
Roselinde Kessels, University of Antwerp
Individual-level choice data often exhibit separation, which occurs if the responses can be perfectly classified by a linear combination of the attributes of the alternatives. In this case, the maximum likelihood (ML) estimator does not exist. A lesser problem with ML estimation is that, when the ML estimator does exist, it tends to over-estimate the utility of strongly preferred attribute levels and under-estimate the utility of weakly preferred attribute levels. We show how to overcome these two problems using the penalized-likelihood method of Firth, which we apply to the multinomial logit (MNL) model. A major advantage of the method is that it allows fitting an MNL model to individual-level data when the number of choice sets evaluated by each respondent permits it, and, subsequently, exploring the heterogeneity in the respondents' preferences and identifying market segments. Unlike panel mixed logit or latent class models, Firth's approach does not require imposing an a priori preference heterogeneity distribution for each of the model parameters. This is important since it is not at all clear what an appropriate a priori preference heterogeneity distribution would be when markets are segmented. We demonstrate the use of Firth's method using a real-life study that investigates preferences for various forms of compensation of employees.
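
Firth's penalized likelihood, referenced above, replaces the log-likelihood \ell(\beta) by (the standard form of the method, not specific to this talk):

    \ell^{*}(\boldsymbol{\beta}) = \ell(\boldsymbol{\beta})
    + \tfrac{1}{2}\log\left|\mathbf{I}(\boldsymbol{\beta})\right|,

where I(\beta) is the Fisher information. The penalty is equivalent to a Jeffreys-prior contribution; the penalized estimates exist even under complete separation and have reduced small-sample bias.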

Efficient Design and Analysis for a Selective Choice Process
Qing Liu, University of Wisconsin
Variable selection is a decision heuristic that describes a selective choice process: choices are made based on only a subset of product attributes, while the presence of other attributes plays no active role in the decision ("inactive attributes"). This paper addresses two integrated topics of interest to marketing that have received scant attention: the efficient design of choice experiments and the analysis of data in the context of a selective choice process. For efficient design, the authors propose a new compound design criterion that effectively incorporates prior information to serve the joint purpose of efficient detection of the inactive attributes and efficient estimation of the active attributes in the selective choice process. For efficient analysis, the authors propose a heterogeneous variable selection model that incorporates respondent-specific information about the active/inactive status of an attribute through the prior specification. The authors demonstrate the superior performance of the proposed approach relative to benchmark approaches using both simulated data and actual data obtained from a conjoint choice field experiment, and discuss the substantive implications of the results.

Session I: Combinatorial Design (Friday, March 6, 8:30AM-10:00AM)

Optimal Experimental Designs for fMRI via Circulant Biased Weighing Designs
Ming-Hung (Jason) Kao, Arizona State University
Functional magnetic resonance imaging (fMRI) technology is widely used in many fields for studying how the brain reacts to mental stimuli. Identifying optimal fMRI experimental designs (i.e., sequences of stimuli) is crucial for collecting informative data that render precise statistical inference. However, research on this important topic remains scarce. Here, we study optimal fMRI designs for the estimation of the individual hemodynamic response function (HRF), which describes the effect of a stimulus over time, and for the comparison of HRFs. We provide a useful connection between fMRI design issues and circulant biased weighing design problems, and derive analytical results to guide the selection of fMRI designs. Our results allow us to establish the optimality of some well-known fMRI designs and to identify several new classes of fMRI designs. Some construction methods for optimal fMRI designs using a certain type of Hadamard matrices and difference sets are also provided.

Session J: Teaching DOE to Non-Statisticians (Friday, March 6, 10:30AM-12:00PM)

Teaching Design of Experiments to Engineers and Scientists

Doug Montgomery, Arizona State University

Today designed experiments are widely used in industrial research and development for product design and formulation, process design and development, and process improvement. Much of this work is performed by engineers and scientists, in some cases with little formal assistance from statisticians. The last two decades have seen considerable growth in the number of students from these disciplines taking university courses in design of experiments, as well as an increasing number of short courses, both public and as part of business improvement initiatives such as lean six sigma. This presentation focuses on some practical issues that arise in designing and teaching these courses. Emphasis is on a one-term course designed for advanced undergraduates and first-year graduate students. The presentation covers course topics, emphasis on content, the role of software, and managing a project experience for class participants.