Statistics & Experimental Design with R

Barbara Kitchenham
Keele University

Introduction
Part 1

Scope of Workshop
• Basic Statistics
  – Classical statistical methods
    • Parametric & Non-Parametric
  – Newer methods
    • Randomisation (Permutation methods)
    • Sample-based robust methods
• Experimental design
  – Experiments and Quasi-experiments

Population and Samples
• Population
  – All participants or objects relevant to a study
    • All Java programmers
    • All software development companies
• Sample
  – A subset of subjects or objects belonging to the relevant population
• Random sample
  – Sample where every member of the population has the same probability of being included
  – Assumption underlying many statistical methods
  – Basis of generalisations from sample to population
  – You need to be sure you know whether or not you have a random sample
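The idea of a random sample can be sketched in a few lines. The workshop's own examples use R; this and the later sketches use only Python's standard library, and names such as `simple_random_sample` and the `programmer_…` IDs are hypothetical, introduced purely for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members so that every member of the population
    has the same probability (k / N) of being included."""
    rng = random.Random(seed)
    return rng.sample(list(population), k)

# Hypothetical sampling frame standing in for "all Java programmers".
population = [f"programmer_{i}" for i in range(1, 101)]

sample = simple_random_sample(population, k=10, seed=42)
print(sample)
```

In practice the hard part is obtaining the sampling frame (a complete list of the population), not the draw itself.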

Fundamental Concepts of Statistics
• Design
  – Planning & carrying out an experiment
• Description
  – Methods for summarizing data
• Inference
  – Making predictions or generalizations about the population from the sample

Design
• Types of Study (for this tutorial)
  – Experiment
    • A test under controlled conditions to examine the validity of a hypothesis
    • Randomised experiment
      – Subjects/objects in a sample are allocated at random to one of two or more experimental treatments/interventions
    • Quasi-experiment
      – Subjects/objects cannot be allocated at random
        » Males v. Females
        » Projects that used CMM v. those that did not
  – Observational study/Correlational study
    • Features of a sample of subjects/objects are measured
  – You always need to be sure you know what type of study you are doing
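Random allocation in a randomised experiment can be illustrated as follows. This is a minimal Python sketch, not the workshop's own R code; the function name `allocate_at_random` and the subject labels are assumptions made for the example.

```python
import random

def allocate_at_random(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn,
    giving groups whose sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical subject labels.
subjects = [f"S{i}" for i in range(1, 13)]
groups = allocate_at_random(subjects, ["control", "treatment"], seed=1)
for name, members in groups.items():
    print(name, members)
```

In a quasi-experiment this step is not available: group membership (for example, whether a project used CMM) is fixed before the study begins.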

Description
• Descriptive statistics
  – Measures that describe or display graphically properties of the sample
• Measures of central tendency
  – Also called measures of location
  – Aim to identify the value of a typical member of the sample
• Measures of dispersion
  – Aim to identify the spread of values in the sample
• Graphical displays
  – Aim to reveal distribution of values
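A short Python sketch of the measures listed above, using the standard `statistics` module. The data values are made up for illustration and include one deliberately large value to show how the mean and median can disagree.

```python
import statistics

# Made-up sample with one extreme value.
values = [2, 3, 3, 4, 5, 6, 30]

# Measures of central tendency (location).
mean = statistics.mean(values)
median = statistics.median(values)

# Measures of dispersion.
sd = statistics.stdev(values)                    # sample standard deviation
q1, q2, q3 = statistics.quantiles(values, n=4)   # quartiles
iqr = q3 - q1                                    # interquartile range

print(f"mean={mean:.2f} median={median} sd={sd:.2f} IQR={iqr:.2f}")
```

The single extreme value pulls the mean and standard deviation upward, while the median and interquartile range are barely affected.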

Inference
• Inferential statistics
  – Often the same as descriptive statistics
  – But intended to apply to the population
• Statistical claims are based on random samples
  – Without random samples claims need to be justified
• However, generalization may not cover the entire range of
  – Settings
  – Task and material complexity
  – Possible outcome measures
  – Subjects/objects of study
  – Interventions/treatments
• Random sampling does not rule out the possibility of errors
  – Type 1 error (α): incorrectly rejecting the null hypothesis
  – Type 2 error (β): incorrectly accepting (failing to reject) the null hypothesis
  – Note: Power = 1 − β
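Written as conditional probabilities, with $H_0$ denoting the null hypothesis, the two error rates and power are conventionally defined as below. This is the standard textbook formulation rather than a formula taken from the slide.

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true})
  && \text{(Type 1 error rate)} \\
\beta  &= P(\text{fail to reject } H_0 \mid H_0 \text{ is false})
  && \text{(Type 2 error rate)} \\
\text{Power} &= 1 - \beta
  = P(\text{reject } H_0 \mid H_0 \text{ is false})
\end{align*}
```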

Statistical Approaches - 1
• Classical Statistics
  – Parametric methods
    • Frequency Distributions
    • ANOVA
    • Regression & Correlation
    • Contingency Tables
    • Usually based on Normal/Gaussian Distribution
      – May be unreliable if Normality assumptions don't hold
    • Starting point for developing improved methods
  – Found in all statistical packages and textbooks
  – Tutorial will discuss these methods

Statistical Approaches - 2
• Robust methods
  – Often based on ranks
    • Spearman's rank correlation
    • Wilcoxon Mann-Whitney test for comparing two groups
    • Kruskal-Wallis test for comparing three or more groups
  – Recent studies suggest these techniques can have low power when comparing groups with different distributions
    • e.g. different variances (although they are supposed to be non-parametric)
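To show what "based on ranks" means, here is a minimal Python sketch of Spearman's rank correlation: replace each value by its rank (ties receive the average rank), then compute the ordinary Pearson correlation on the ranks. The helper names are hypothetical, and a real analysis would use a tested library routine instead.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotone but strongly non-linear relationship (made-up data).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 1000]
print(spearman(x, y), pearson(x, y))
```

Because only the ordering of the values is used, the extreme value 1000 has no more influence on the rank correlation than any other largest value would.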

Statistical Approaches - 3
• Permutation/Randomisation methods
  – Used to compare different treatment groups
  – Assume random allocation to treatment (not a random sample)
  – Identify the distribution of the test statistic under the null hypothesis by permuting the observations over the groups
• Very plausible method but has a problem
  – For comparing two populations
    • Non-parametric but not robust if populations differ in more than just location
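The permutation idea can be sketched as a Monte Carlo test for a difference in group means. This Python version is illustrative only: the function name, the choice of the absolute mean difference as test statistic, and the data are assumptions, not the workshop's material.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10000, seed=None):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Under the null hypothesis the group labels are exchangeable, so the
    pooled observations are repeatedly shuffled and re-split to build the
    null distribution of the absolute difference in means.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)

# Made-up outcome scores for two treatment groups.
treatment = [11, 12, 13, 14, 15]
control = [1, 2, 3, 4, 5]
print(permutation_test(treatment, control, n_permutations=2000, seed=0))
```

The test compares means only, which is why it is sensitive to groups that differ in spread or shape as well as location.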

Statistical Approaches - 4
• Bootstrapping
  – Assumes a random sample
  – Like permutation methods
    • Creates many different samples from the original data
    • But uses sampling with replacement
  – Non-parametric approach
    • Evidence suggests better properties than standard non-parametric tests
• Other effective non-parametric methods
  – Trimmed means
  – Kernel Density estimation
• We will cover some aspects of these methods
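A minimal Python sketch combining two of the ideas above: a trimmed mean, and a percentile bootstrap confidence interval built by resampling with replacement. The function names, the 20% trimming proportion, and the data are assumptions chosen for illustration; the percentile interval is only one of several bootstrap interval constructions.

```python
import random

def trimmed_mean(values, proportion=0.2):
    """Mean after discarding the given proportion of values from each tail."""
    data = sorted(values)
    g = int(proportion * len(data))      # number trimmed from each end
    kept = data[g:len(data) - g]
    return sum(kept) / len(kept)

def bootstrap_ci(values, statistic, n_resamples=2000, alpha=0.05, seed=None):
    """Percentile bootstrap confidence interval for `statistic`."""
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        statistic([rng.choice(values) for _ in range(n)])  # with replacement
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Made-up data with one outlier.
data = [12, 15, 14, 10, 13, 16, 11, 14, 13, 95]
print("trimmed mean:", trimmed_mean(data))
print("95% bootstrap CI:", bootstrap_ci(data, trimmed_mean, seed=0))
```

Unlike the permutation approach, each resample may contain the same observation more than once, which is what "sampling with replacement" refers to.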

Statistical Approaches - 5
• Bayesian Statistics
  – Change prior probabilities that parameters take a particular value to new (posterior) probabilities
    • Based on data + prior distribution
    • Assume θ can take on n different values θ_i
  – Can be solved using Markov Chain Monte Carlo methods, e.g. Gibbs Sampler
    • WinBUGS software
  – Assumes
    • The prior distribution is known
    • Data are a random sample from that distribution
  – Not covered in tutorial except for issues associated with logistic regression
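The formula that accompanied this slide did not survive extraction. For a parameter θ restricted to n values θ_i, the update from prior to posterior is conventionally written with the discrete form of Bayes' theorem, shown here as an assumption about what the slide displayed; $x$ denotes the observed data.

```latex
\[
P(\theta_i \mid x)
  = \frac{P(x \mid \theta_i)\, P(\theta_i)}
         {\sum_{j=1}^{n} P(x \mid \theta_j)\, P(\theta_j)},
  \qquad i = 1, \dots, n
\]
```

Here $P(\theta_i)$ is the prior probability, $P(x \mid \theta_i)$ the likelihood of the data, and $P(\theta_i \mid x)$ the posterior probability.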

Design Topics
• Basic types of experimental design
  – Randomised (One factor)
  – Multiple factor (Factorials)
  – Blocking
  – Within subject v. Between Groups
  – Random v. Fixed Factors
• Quasi-experiments
  – Apply when randomisation is impossible
    • Used for assessing impact of "programs" e.g. CMM
  – Specific types of design:
    • Differences in Differences
    • Interrupted Time Series
• Assessing Causality

The R Statistical Language
• The examples presented in this workshop use R
• R is Open Source
• It is a very flexible language
  – Many packages are supported by leading statistical researchers
  – Many textbooks available
  – Easy to program your own functions
• I find it sometimes difficult to use
  – Data handling is messy
  – No consistency among different packages that perform similar functions
• But arguably the best statistical software available
