Statistical Modelling Helen Ogden & Antony Overstall University - PowerPoint PPT Presentation

Statistical Modelling Helen Ogden & Antony Overstall University of Southampton c � 2019 (Chapters 1–2 closely based on original notes by Anthony Davison, Jon Forster & Dave Woods) APTS: Statistical Modelling April 2019 – slide 0

Statistical Modelling Statistical Modelling 1. Model Selection 1. Model ◃ Selection 2. Beyond the Generalised Linear Model Basic Ideas 3. Non-linear models Linear Model Bayesian Inference APTS: Statistical Modelling April 2019 – slide 0

Statistical Modelling 1. Model ◃ Selection Overview Basic Ideas Linear Model Bayesian Inference 1. Model Selection APTS: Statistical Modelling April 2019 – slide 0

Overview Statistical Modelling 1. Basic ideas 1. Model Selection ◃ Overview 2. Linear model Basic Ideas 3. Bayesian inference Linear Model Bayesian Inference APTS: Statistical Modelling April 2019 – slide 1

Statistical Modelling 1. Model Selection ◃ Basic Ideas Why model? Criteria for model selection Motivation Setting Logistic regression Basic Ideas Nodal involvement Kullback–Leibler discrepancy Log likelihood Wrong model Out-of-sample prediction Information criteria Nodal involvement Theoretical aspects Properties of AIC, NIC, BIC Linear Model Bayesian Inference APTS: Statistical Modelling April 2019 – slide 2

Why model? Statistical Modelling 1. Model Selection Basic Ideas ◃ Why model? Criteria for model selection Motivation George E. P. Box (1919–2013): Setting Logistic regression All models are wrong, but some models are useful. Nodal involvement Kullback–Leibler discrepancy Some reasons we construct models: � Log likelihood – to simplify reality (e ffi cient representation); Wrong model Out-of-sample – to gain understanding; prediction Information criteria – to compare scientific, economic, . . . theories; Nodal involvement – Theoretical aspects to predict future events/data; Properties of AIC, – to control a process. NIC, BIC Linear Model We (statisticians!) rarely believe in our models, but regard them as � Bayesian Inference temporary constructs subject to improvement. Often we have several and must decide which is preferable, if any. � APTS: Statistical Modelling April 2019 – slide 3

Criteria for model selection Substantive knowledge, from prior studies, theoretical Statistical Modelling � 1. Model Selection arguments, dimensional or other general considerations Basic Ideas (often qualitative) Why model? Criteria for model Sensitivity to failure of assumptions (prefer models that are ◃ � selection Motivation robustly valid) Setting Quality of fit—residuals, graphical assessment (informal), or Logistic regression � Nodal involvement goodness-of-fit tests (formal) Kullback–Leibler discrepancy Prior knowledge in Bayesian sense (quantitative) Log likelihood � Wrong model Generalisability of conclusions and/or predictions: Out-of-sample � prediction same/similar models give good fit for many di ff erent datasets Information criteria Nodal involvement Theoretical aspects Properties of AIC, . . . but often we have just one dataset . . . � NIC, BIC Linear Model Bayesian Inference APTS: Statistical Modelling April 2019 – slide 4

Motivation Statistical Modelling Even after applying these criteria (but also before!) we may 1. Model Selection compare many models: Basic Ideas linear regression with p covariates, there are 2 p possible � Why model? Criteria for model combinations of covariates (each in/out), before allowing for selection ◃ Motivation transformations, etc.— if p = 20 then we have a problem; Setting Logistic regression choice of bandwidth h > 0 in smoothing problems � Nodal involvement Kullback–Leibler the number of di ff erent clusterings of n individuals is a Bell � discrepancy number (starting from n = 1 ): 1, 2, 5, 15, 52, 203, 877, Log likelihood Wrong model 4140, 21147, 115975, . . . Out-of-sample prediction we may want to assess which among 5 × 10 5 SNPs on the � Information criteria Nodal involvement genome may influence reaction to a new drug; Theoretical aspects Properties of AIC, . . . � NIC, BIC Linear Model For reasons of economy we seek ‘simple’ models. Bayesian Inference APTS: Statistical Modelling April 2019 – slide 5

Albert Einstein (1879–1955) Statistical Modelling 1. Model Selection Basic Ideas Why model? Criteria for model selection ◃ Motivation Setting Logistic regression Nodal involvement Kullback–Leibler discrepancy Log likelihood Wrong model Out-of-sample prediction Information criteria Nodal involvement Theoretical aspects Properties of AIC, NIC, BIC ‘Everything should be made as simple as possible, but no Linear Model simpler .’ Bayesian Inference APTS: Statistical Modelling April 2019 – slide 6

William of Occam (?1288–?1348) Statistical Modelling 1. Model Selection Basic Ideas Why model? Criteria for model selection ◃ Motivation Setting Logistic regression Nodal involvement Kullback–Leibler discrepancy Log likelihood Wrong model Out-of-sample prediction Information criteria Nodal involvement Theoretical aspects Properties of AIC, NIC, BIC Occam’s razor: Entia non sunt multiplicanda sine Linear Model necessitate : entities should not be multiplied beyond Bayesian Inference necessity . APTS: Statistical Modelling April 2019 – slide 7

Setting To focus and simplify discussion we will consider parametric models, but the � ideas generalise to semi-parametric and non-parametric settings We shall take generalised linear models (GLMs) as example of moderately � complex parametric models: – Normal linear model has three key aspects: structure for covariates : linear predictor η = x T β ; ◃ response distribution : y ∼ N ( µ, σ 2 ) ; and ◃ relation η = µ between µ = E( y ) and η . ◃ – GLM extends last two to y has density ◃ � y θ − b ( θ ) � f ( y ; θ , φ ) = exp + c ( y ; φ ) , φ where θ depends on η ; dispersion parameter φ is often known; and η = g ( µ ) , where g is monotone link function . ◃ APTS: Statistical Modelling April 2019 – slide 8

Logistic regression Statistical Modelling Commonest choice of link function for binary reponses: � 1. Model Selection exp( x T β ) 1 Basic Ideas Pr( Y = 1) = π = 1 + exp( x T β ) , Pr( Y = 0) = 1 + exp( x T β ) , Why model? Criteria for model selection giving linear model for log odds of ‘success’, Motivation Setting � Pr( Y = 1) � � � Logistic π ◃ regression log = log = x T β . Pr( Y = 0) 1 − π Nodal involvement Kullback–Leibler discrepancy Log likelihood Log likelihood for β based on independent responses y 1 , . . . , y n Wrong model � Out-of-sample with covariate vectors x 1 , . . . , x n is prediction Information criteria n n � � Nodal involvement � � ℓ ( β ) = y j x T log 1 + exp( x T j β ) Theoretical aspects j β − Properties of AIC, j =1 j =1 NIC, BIC � � Linear Model ℓ (˜ β ) − ℓ ( � , where � Good fit gives small deviance D = 2 β ) β is � Bayesian Inference model fit MLE and ˜ β is unrestricted MLE. APTS: Statistical Modelling April 2019 – slide 9

Nodal involvement data Statistical Modelling 1. Model Selection Table 1: Data on nodal involvement: 53 patients with prostate Basic Ideas cancer have nodal involvement ( r ), with five binary covariates a ge Why model? Criteria for model etc. selection Motivation a ge s tage g rade x ray a cid m r Setting 6 5 0 1 1 1 1 Logistic regression 6 1 0 0 0 0 1 Nodal ◃ 4 0 1 1 1 0 0 involvement 4 2 1 1 0 0 1 Kullback–Leibler 4 0 0 0 0 0 0 discrepancy 3 2 0 1 1 0 1 Log likelihood 3 1 1 1 0 0 0 Wrong model 3 0 1 0 0 0 1 Out-of-sample 3 0 1 0 0 0 0 prediction 2 0 1 0 0 1 0 Information criteria 2 1 0 1 0 0 1 Nodal involvement 2 1 0 0 1 0 0 Theoretical aspects 1 1 1 1 1 1 1 Properties of AIC, . . . . . . NIC, BIC . . . . . . . . . . . . 1 1 0 0 1 0 1 Linear Model 1 0 0 0 0 1 1 Bayesian Inference 1 0 0 0 0 1 0 APTS: Statistical Modelling April 2019 – slide 10

Nodal involvement deviances Deviances D for 32 logistic regression models for nodal involvement data. + denotes a term included in the model. a ge s t g r x r a c df a ge s t g r x r a c df D D 52 40.71 + + + 49 29.76 + 51 39.32 + + + 49 23.67 + 51 33.01 + + + 49 25.54 + 51 35.13 + + + 49 27.50 + 51 31.39 + + + 49 26.70 + 51 33.17 + + + 49 24.92 + + 50 30.90 + + + 49 23.98 + + 50 34.54 + + + 49 23.62 + + 50 30.48 + + + 49 19.64 + + 50 32.67 + + + 49 21.28 + + 50 31.00 + + + + 48 23.12 + + 50 24.92 + + + + 48 23.38 + + 50 26.37 + + + + 48 19.22 + + 50 27.91 + + + + 48 21.27 + + 50 26.72 + + + + 48 18.22 + + 50 25.25 + + + + + 47 18.07 APTS: Statistical Modelling April 2019 – slide 11

Statistical Modelling Helen Ogden & Antony Overstall University - PowerPoint PPT Presentation

Statistical Modelling Helen Ogden & Antony Overstall University of Southampton c 2019 (Chapters 12 closely based on original notes by Anthony Davison, Jon Forster & Dave Woods) APTS: Statistical Modelling April 2019 slide 0

Workshop 4: Statistical modelling intro Murray Logan 10 Mar 2019 Section 1 Introduction

Workshop 4: Statistical modelling intro Murray Logan March 10, 2019 Table of contents 1

Statistical Statistical Statistical Model Statistical Model Model Checking Model Checking

The Modelling and Simulation Process 1. History of Modelling and Simulation 2. Modelling and

(Modelling) Semantics of Modelling Languages Hans Vangheluwe 7 September 2010, Lisboa, Portugal

Modelling with Differential Equations Modelling with Differential Equations Modelling with

Statistical graphics with Statistical graphics with ggplot2 ggplot2 Programming for Statistical

Physical Modelling Physical Modelling with with ModelVision ModelVision, , Physical Modelling

Modelling and Synthesis of User Interfaces for Complex, Web-Based Modelling Environments Jacob

Course Overview CMPUT 654: Modelling Human Strategic Behaviour Strategic Modelling This course

Course Overview CMPUT 654: Modelling Human Strategic Behaviour Strategic Modelling This

Traditional and Modern Approaches to Modelling with R: An Advanced Course Bill Venables, CSIRO,

How to make R, PostGIS and QGis cooperate for statistical modelling duties a case study on

Processing Clustering Bhiksha Raj Class 11. 13 Oct 2016 1 Statistical Modelling and Latent

Statistical Device Variability and Statistical Device Variability and its Impact on Design its

STK-IN4300 Statistical Learning Methods in Data Science Statistical Boosting Boosting as a

XDP - challenges and future work Jesper Dangaard Brouer (Red Hat) Toke Hiland-Jrgensen

HyperLoop: NIC Offloaded Primitives to Accelerate Replicated Transactions in Multi-tenant Storage

HARROGATE TOWN MEETING Managing Phorid Flies Wednesday, March 23, 2016 Meeting Purpose Continue

SoLi SoLid: Search for neutrino oscillations using a Lithium-6 Detector at a nuclear reactor

+ Mohammad Mahmoody Rafael Pass + Modern Cryptography and One-Way Functions Modern

Should We Defy Amdahls Law (or DALs motivations) Andr Seznec Andr Seznec INRIA/IRISA

Books Promote Brands Promote Bili Bilingual Lit Literacy NICHE: multicultural marketing +

Practical Takeaways for a Healthy European Cyber Market Sponsored By: 1 Practical Takeaways for