Fun with Mixed Models Vic Biostats Seminar 30th April 2015

Fun with Mixed Models Vic Biostats Seminar 30th April 2015 doug.speed@ucl.ac.uk

Overview
1 Estimating SNP Heritability
2 Extensions
3 Computational Technicalities
4 Classification

The Linear Mixed Model
Suppose our GWAS data comprise:
Phenotype $Y$ (vector of length $n$)
SNP calls $S$ (matrix of size $n \times N$)
Plus any covariates $Z$.
$Y = Z\alpha + g + e$ with $g \sim N(0, K\sigma_g^2)$ and $e \sim N(0, I\sigma_e^2)$
$\alpha$ is a vector of fixed effects; $g$ and $e$ are the genetic and environmental random effects (with corresponding components of variance $\sigma_g^2$ and $\sigma_e^2$).
Typically, use kinship matrix $K = XX^T / N$, where $X_{ij} = (S_{ij} - \bar{S}_j) / \mathrm{SD}(S_j)$.
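The kinship matrix on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the seminar's own code: the function names are hypothetical, the genotypes are simulated, and SD is taken to be the population standard deviation (the slide does not say which form is used).

```python
import numpy as np

def standardize_genotypes(S):
    """Centre and scale each SNP column: X_ij = (S_ij - mean_j) / SD_j."""
    S = np.asarray(S, dtype=float)
    sd = S.std(axis=0)  # population SD (ddof=0); an assumption
    if np.any(sd == 0):
        raise ValueError("monomorphic SNPs must be removed first")
    return (S - S.mean(axis=0)) / sd

def kinship(S):
    """Kinship matrix K = X X^T / N for an n x N genotype matrix S."""
    X = standardize_genotypes(S)
    return X @ X.T / X.shape[1]

# Simulated genotypes coded 0/1/2 (illustrative data only)
rng = np.random.default_rng(0)
n, N = 50, 2000
maf = rng.uniform(0.05, 0.5, size=N)
S = rng.binomial(2, maf, size=(n, N))
S = S[:, S.std(axis=0) > 0]  # drop SNPs that are monomorphic in this sample

K = kinship(S)
```

With this scaling the diagonal of K averages exactly 1 and each row sums to zero, which is a quick sanity check on any kinship matrix built this way.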

Traditionally Used for Heritability Estimation

Solved Via REML
$Y = Z\alpha + g + e$ with $g \sim N(0, K\sigma_g^2)$ and $e \sim N(0, I\sigma_e^2)$
The raw model likelihood follows from assuming $Y \sim N(Z\alpha, V)$, where $V = K\sigma_g^2 + I\sigma_e^2$:
$l(Y \mid \alpha, K, \sigma_g^2, \sigma_e^2) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}(Y - Z\alpha)^T V^{-1} (Y - Z\alpha) - \frac{1}{2}\log|V|$.
The restricted likelihood is obtained by “integrating across” $\alpha$. With $P = V^{-1} - V^{-1} Z (Z^T V^{-1} Z)^{-1} Z^T V^{-1}$ and $p$ the number of fixed effects:
$l(Y \mid K, \sigma_g^2, \sigma_e^2) = -\frac{n-p}{2}\log(2\pi) - \frac{1}{2} Y^T P Y - \frac{1}{2}\log|V| - \frac{1}{2}\log|Z^T V^{-1} Z|$.
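The restricted log-likelihood can be evaluated directly with NumPy. The sketch below is an illustration under stated assumptions, not the solver used by any particular package: `reml_loglik` is a hypothetical name, the data are simulated with Gaussian "genotypes", and the maximisation is a crude grid search over the heritability that fixes the total variance at the sample variance of Y.

```python
import numpy as np

def reml_loglik(Y, Z, K, sigma2_g, sigma2_e):
    """Restricted log-likelihood with V = K*sigma2_g + I*sigma2_e."""
    n, p = Z.shape
    V = sigma2_g * K + sigma2_e * np.eye(n)
    L = np.linalg.cholesky(V)                   # requires V positive definite
    logdet_V = 2.0 * np.sum(np.log(np.diag(L)))
    Vinv_Y = np.linalg.solve(V, Y)
    Vinv_Z = np.linalg.solve(V, Z)
    ZtVinvZ = Z.T @ Vinv_Z
    _, logdet_ZVZ = np.linalg.slogdet(ZtVinvZ)
    # Y' P Y = Y' V^-1 Y - Y' V^-1 Z (Z' V^-1 Z)^-1 Z' V^-1 Y
    ZtVinvY = Z.T @ Vinv_Y
    YPY = Y @ Vinv_Y - ZtVinvY @ np.linalg.solve(ZtVinvZ, ZtVinvY)
    return -0.5 * ((n - p) * np.log(2 * np.pi) + YPY + logdet_V + logdet_ZVZ)

# Simulated example (illustrative values only)
rng = np.random.default_rng(1)
n, N = 200, 1000
X = rng.standard_normal((n, N))
X = (X - X.mean(axis=0)) / X.std(axis=0)
K = X @ X.T / N
Z = np.ones((n, 1))                             # intercept as the only covariate
beta = rng.normal(0.0, np.sqrt(0.5 / N), size=N)
Y = 2.0 + X @ beta + rng.normal(0.0, np.sqrt(0.5), size=n)

# Crude grid search over h2 = sigma2_g / (sigma2_g + sigma2_e)
grid = np.linspace(0.01, 0.99, 99)
total = Y.var()
ll = [reml_loglik(Y, Z, K, h2 * total, (1 - h2) * total) for h2 in grid]
h2_hat = grid[int(np.argmax(ll))]
```

Because P Z = 0, the value returned does not change when any multiple of a covariate is added to Y, which is the sense in which the fixed effects have been integrated out. Practical software typically maximises over both variance components with an iterative scheme rather than a grid.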

1 Estimating SNP Heritability
2 Extensions
3 Computational Technicalities
4 Classification

Estimating Total SNP Heritability
Jian Yang et al. realised that, by applying the mixed model to “unrelated individuals”, they could estimate the total proportion of phenotypic variance explained by all SNPs.

Linear Random Effects Regression Model
Suppose we assume the following relationship:
$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_{500\,000} X_{500\,000} + e$,
where $\beta_j \sim N(0, \sigma_g^2 / N)$ and $e \sim N(0, \sigma_e^2)$.
Then $g = \sum_{j=1}^{N} \beta_j X_j = X\beta \sim N(0, \frac{XX^T}{N}\sigma_g^2)$ and $Y \sim N(\alpha, K\sigma_g^2 + I\sigma_e^2)$, where $K = XX^T / N$.
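The step from the prior on the effect sizes to the distribution of $g$ can be spelled out. This short derivation treats $X$ as fixed and assumes the $\beta_j$ are independent of each other and of $e$, which the slide implies but does not state.

```latex
\begin{align*}
\mathbb{E}[g] &= X\,\mathbb{E}[\beta] = 0, \\
\operatorname{Var}(g) &= X \operatorname{Var}(\beta)\, X^T
  = X \left( \frac{\sigma_g^2}{N} I_N \right) X^T
  = \frac{XX^T}{N}\,\sigma_g^2
  = K\sigma_g^2, \\
\operatorname{Var}(Y) &= \operatorname{Var}(g) + \operatorname{Var}(e)
  = K\sigma_g^2 + I\sigma_e^2 .
\end{align*}
```

Since $g$ is a linear combination of Gaussian variables it is itself Gaussian, so the mean and variance above determine its distribution.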

Estimating Total SNP Heritability
The heritability of human height is 80%.
Jian Yang et al. calculated that 45% of phenotypic variance could be explained by common SNPs.

Solves the “Missing Heritability” Problem
[Figure: Human Height. Labels: Environment, Genetics, GWAS SNPs, Missing Herit., ALL SNPs, Still Missing.]
[Figure: Human Height, Schizophrenia, Obesity, Crohn's Disease, Bipolar Disorder, Epilepsy. Labels: GWAS SNPs, Other SNPs, Other Genetics, Environment.]

The Method Generally Works
REML Assumptions:
All SNPs are Causal
Gaussian Effect Sizes
Gaussian Noise Terms
Inverse Relationship between MAF and Effect Size
Overall, we found the approach amazingly robust to misspecification.

Computing $K = XX^T / N$

Impact of Uneven Tagging
Regions of high LD make a disproportionately large contribution.

Impact of Uneven Tagging
A common problem when performing principal component analysis.

Estimates Can be Sensitive to LD of Causal Variants
Causal variants in high LD areas ⇒ over-estimation of $h^2_{SNP}$
Causal variants in low LD areas ⇒ under-estimation of $h^2_{SNP}$

Adjusting for Uneven Tagging
[Diagram]
Weightings: 1, 1, 1, 1
Effects: β1, β2, β3, β7
Genotyped SNPs: X1, X2, X3, X4
Underlying Variation: U1, U2, U3, U4

Adjusting for Uneven Tagging
[Diagram]
Weightings: ½, ½, 1, ½, ½, ¼, ¼, ¼, ¼
Effects: β1, β5, β2, β3, β6, β8, β7, β8, β9
Genotyped SNPs: X1, X1, X2, X3, X3, X4, X4, X4, X4
Underlying Variation: U1, U2, U3, U4

Weightings Reduce the Biases
LDAK weightings down-weight SNPs well-tagged by neighbours and up-weight SNPs poorly-tagged by neighbours.
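The idea behind the weightings can be illustrated with a toy case in which the tagging is known exactly. This is a deliberately simplified sketch and not LDAK's algorithm, which derives its weightings from observed LD rather than from known groups. The function names are hypothetical, and the weighted kinship formula (sum of w_j X_j X_j^T divided by the sum of weights) is an assumption made for this illustration.

```python
import numpy as np

def toy_weights(tag_groups):
    """Weight each SNP by 1 / (number of SNPs tagging the same underlying variant)."""
    groups = np.asarray(tag_groups)
    _, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]

def weighted_kinship(X, w):
    """Weighted kinship: sum_j w_j X_j X_j^T / sum_j w_j (assumed form)."""
    return (X * w) @ X.T / w.sum()

# Nine genotyped SNPs tagging four underlying variants: 2, 1, 2 and 4 copies
tag_groups = [1, 1, 2, 3, 3, 4, 4, 4, 4]
w = toy_weights(tag_groups)   # 1/2, 1/2, 1, 1/2, 1/2, 1/4, 1/4, 1/4, 1/4

# Extreme case: genotyped SNPs are exact copies of the underlying variants
rng = np.random.default_rng(3)
U = rng.standard_normal((20, 4))
U = (U - U.mean(axis=0)) / U.std(axis=0)
X = U[:, np.array(tag_groups) - 1]

K_unweighted = X @ X.T / X.shape[1]    # over-counts the heavily tagged variant
K_weighted = weighted_kinship(X, w)    # each underlying variant counts once
K_underlying = U @ U.T / U.shape[1]
```

In this idealised setting the weighted kinship matches the kinship computed from the underlying variation, while the unweighted version is dominated by the variant that happens to be tagged four times.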

LDAK: Linkage Disequilibrium Adjusted Kinships
LDAK weights offer an alternative to pruning, e.g., when computing genetic profile risk scores.

GCTA vs LDAK
In the end, whether SNPs explain 50% or 60% of heritability is not a big deal.

1 Estimating SNP Heritability
2 Extensions
3 Computational Technicalities
4 Classification

Basic Model
$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_{500\,000} X_{500\,000} + e$.
Assume $\beta_j \sim N(0, \sigma_g^2 / N)$ and $e \sim N(0, \sigma_e^2)$.
Then $Y \sim N(\alpha, K\sigma_g^2 + I\sigma_e^2)$, where $K = XX^T / N$.

Bivariate Analysis
Trait 1: $Y_1 = Z\alpha_1 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_{500\,000} X_{500\,000} + e_1 = Z\alpha_1 + g_1 + e_1$
Trait 2: $Y_2 = Z\alpha_2 + \gamma_1 X_1 + \gamma_2 X_2 + \ldots + \gamma_{500\,000} X_{500\,000} + e_2 = Z\alpha_2 + g_2 + e_2$
Now interested in the correlation between genetic effects: $\rho = \mathrm{cor}(g_1, g_2)$.
Or equivalently can think of the average correlation between effect sizes: $\rho = \mathrm{cor}(\beta_j, \gamma_j)$.
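The link between the two views of ρ can be checked by simulation. The sketch below is illustrative only: it uses independent Gaussian "genotypes" in place of real SNP data, and all sizes and the value of ρ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N, rho = 500, 2000, 0.6

# Standardised stand-in genotypes (independent columns; an assumption)
X = rng.standard_normal((n, N))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Draw (beta_j, gamma_j) pairs with correlation rho, each with variance 1/N
pair_cov = np.array([[1.0, rho], [rho, 1.0]])
effects = rng.multivariate_normal(np.zeros(2), pair_cov, size=N) / np.sqrt(N)
beta, gamma = effects[:, 0], effects[:, 1]

# Genetic values for the two traits
g1, g2 = X @ beta, X @ gamma

cor_effects = np.corrcoef(beta, gamma)[0, 1]   # correlation across SNPs
cor_genetic = np.corrcoef(g1, g2)[0, 1]        # correlation across individuals
```

Both correlations should land near the chosen ρ, with the gap shrinking as n and N grow. With real genotypes, LD between SNPs would make the agreement less direct than in this independent-column setting.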

Examining Concordance Between Traits

Genome Partitioning
$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_{500\,000} X_{500\,000} + e$.
Basic model: assume $\beta_j \sim N(0, \sigma_g^2 / N)$ and $e \sim N(0, \sigma_e^2)$. Then $Y \sim N(\alpha, K\sigma_g^2 + I\sigma_e^2)$, where $K = XX^T / N$.
Partitioned model: assume $\beta_j \sim N(0, \sigma_{g1}^2 / N_1)$ for the $N_1$ SNPs in the first subset and $\beta_k \sim N(0, \sigma_{g2}^2 / N_2)$ for the $N_2$ SNPs in the second. Then $Y \sim N(\alpha, K_1\sigma_{g1}^2 + K_2\sigma_{g2}^2 + I\sigma_e^2)$, where $K_1 = X_1 X_1^T / N_1$ and $K_2 = X_2 X_2^T / N_2$ (here $X_1$ and $X_2$ denote the genotype submatrices for the two subsets).
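The partitioned covariance can be assembled from two kinship matrices, one per SNP subset. This is a minimal sketch with hypothetical function names, simulated Gaussian "genotypes" and arbitrary variance components; it only builds the covariance and does not fit the model.

```python
import numpy as np

def partitioned_kinships(X, in_first):
    """Split standardised genotypes X (n x N) into two subsets and return K1, K2."""
    in_first = np.asarray(in_first, dtype=bool)
    X1, X2 = X[:, in_first], X[:, ~in_first]
    K1 = X1 @ X1.T / X1.shape[1]
    K2 = X2 @ X2.T / X2.shape[1]
    return K1, K2

def phenotype_covariance(K1, K2, s2_g1, s2_g2, s2_e):
    """V = K1*s2_g1 + K2*s2_g2 + I*s2_e."""
    return s2_g1 * K1 + s2_g2 * K2 + s2_e * np.eye(K1.shape[0])

rng = np.random.default_rng(4)
n, N, N1 = 100, 600, 200
N2 = N - N1
X = rng.standard_normal((n, N))
X = (X - X.mean(axis=0)) / X.std(axis=0)

in_first = np.arange(N) < N1          # first 200 SNPs form subset 1 (arbitrary)
K1, K2 = partitioned_kinships(X, in_first)
K = X @ X.T / N                       # kinship from all SNPs

V_part = phenotype_covariance(K1, K2, 0.3, 0.2, 0.5)

# Equal per-SNP variance in both subsets recovers the basic model
V_equal = phenotype_covariance(K1, K2, 0.6 * N1 / N, 0.6 * N2 / N, 0.4)
V_basic = 0.6 * K + 0.4 * np.eye(n)
```

Since $XX^T = X_1 X_1^T + X_2 X_2^T$, the overall kinship is the SNP-count-weighted average of $K_1$ and $K_2$, so the basic model is the special case in which both subsets share the same per-SNP variance.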

Genome Partitioning
[Figure: panels for Height, BMI, vWF and QTi]

Intensity of Heritability
We define the “intensity of heritability” of a set of SNPs as their heritability divided by their genetic variation.
Can then test for differences in intensity of heritability.

Intensity of Heritability
Are genic SNPs more important than inter-genic SNPs?
Inter-genic defined as all SNPs > 100 kbp from a coding region.

Intensity of Heritability
Are genic SNPs more important than inter-genic SNPs?
Can test eQTLs vs non-eQTLs; high-quality SNPs vs low-quality, etc.

Concordance Between Traits
Are SNPs associated with one trait more important for others?
p-values for Schizophrenia and Crohn’s were obtained from independent studies.
