Latent Variable models for GWAs Oliver Stegle Machine Learning and - PowerPoint PPT Presentation

Latent Variable models for GWAs
Oliver Stegle
Machine Learning and Computational Biology Research Group, Max-Planck-Institutes Tübingen, Germany
September 2011
O. Stegle Latent variable models for GWAs Tübingen 1

Motivation
Why latent variables?

Causal influences on phenotypes
◮ Genotype
  ◮ Primary variable of interest
◮ Known confounding factors
  ◮ Covariates
  ◮ Population structure
◮ Unknown (latent) confounders
  ◮ Sample handling
  ◮ Sample history
  ◮ Subtle environmental perturbations

[Figure: the genome (SNPs × individuals), covariates, population structure and confounders each influence the phenome (phenotypes y_1, y_2, ..., y_N).]

Outline
◮ Motivation
◮ Dimension reduction and the Gaussian Process Latent Variable Model (GPLVM)
◮ Modeling hidden confounders in GWAs
  ◮ Model
  ◮ Applications
◮ Modeling unobserved cellular phenotypes in genetic analyses
  ◮ Model
  ◮ Applications
◮ A unifying view
◮ Summary

Dimension reduction and the Gaussian Process Latent Variable Model (GPLVM)

Manifolds and dimension reduction
[Figure: manifold example, from Olivier Grisel, generated using the Modular Data Processing toolkit and matplotlib.]

Linear dimension reduction

◮ Map G-dimensional data onto a K-dimensional manifold, K << G:

$$\underbrace{Y}_{N \times G} = \underbrace{H}_{N \times K}\,\underbrace{W}_{K \times G} + \underbrace{\Psi}_{N \times G}$$

◮ H: latent factors in the low-dimensional space
◮ W: weights of the factors on the data dimensions
◮ Ψ: noise, $\psi_{n,g} \sim \mathcal{N}(0, \sigma^2)$
◮ Challenge: neither W nor H is known!
◮ Depending on the assumptions on W and H:
  ◮ Principal component analysis (PCA)
  ◮ Independent component analysis (ICA)
  ◮ ...
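As an illustration of the model Y = HW + Ψ, the following is a minimal NumPy sketch that simulates data from it. The function name, the sizes N, G, K, the noise level and the standard-normal draws for H and W are assumptions made for this example, not values from the talk.

```python
import numpy as np

def simulate_linear_latent_model(N=200, G=50, K=3, sigma=0.1, seed=0):
    """Draw Y = H W + Psi (all entries Gaussian; hypothetical settings)."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((N, K))             # latent factors, N x K
    W = rng.standard_normal((K, G))             # factor weights, K x G
    Psi = sigma * rng.standard_normal((N, G))   # noise, psi_{n,g} ~ N(0, sigma^2)
    Y = H @ W + Psi                             # observed data, N x G
    return Y, H, W

Y, H, W = simulate_linear_latent_model()
# With small noise, the singular values of Y drop sharply after the K-th one,
# which is what makes a K-dimensional description of the data plausible.
singular_values = np.linalg.svd(Y, compute_uv=False)
print(Y.shape, H.shape, W.shape)
print(np.round(singular_values[:5], 2))
```

Only Y would be observed in practice. H and W are returned here purely so that a fitted model can be compared against the simulated truth.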

Linear dimension reduction: PCA

PCA corresponds to a noise-free version of the model:

$$\underbrace{Y}_{N \times G} = \underbrace{H}_{N \times K}\,\underbrace{W}_{K \times G}$$

◮ PCA components (H) correspond to directions of maximum data variance in the original dataset:
  ◮ Covariance matrix: $C = Y Y^{T}$
  ◮ Eigenvalues/eigenvectors: $C v_i = \lambda_i v_i$
  ◮ Projection matrix: $P = [v_1, \dots, v_K]$
  ◮ Principal components: $H_n = P \cdot Y_n$
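The eigendecomposition steps listed above can be sketched in NumPy as follows. This is a sketch under stated assumptions: samples are stored in rows (Y is N x G) and each data dimension is centred, so the covariance over data dimensions is computed as Y^T Y / (N - 1). This matches the notation C = YY^T when samples are stored as columns. The helper name `pca_eig` and the demo data are hypothetical.

```python
import numpy as np

def pca_eig(Y, K):
    """PCA of N x G data via eigendecomposition of the G x G covariance."""
    Yc = Y - Y.mean(axis=0)                    # centre each data dimension
    C = Yc.T @ Yc / (Yc.shape[0] - 1)          # G x G covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:K]      # indices of the K largest
    P = eigvecs[:, order]                      # projection matrix, G x K
    H = Yc @ P                                 # principal components, N x K
    return H, P, eigvals[order]

rng = np.random.default_rng(1)
# Noise-free rank-2 data, so two components should reconstruct it exactly.
Y_demo = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 10))
H_demo, P_demo, lam = pca_eig(Y_demo, K=2)
Y_centered = Y_demo - Y_demo.mean(axis=0)
reconstruction_error = np.abs(H_demo @ P_demo.T - Y_centered).max()
print(lam, reconstruction_error)
```

In the noise-free case the weights of the model can be read off as W = P^T, so that the centred data equal H W up to numerical precision.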

Linear dimension reduction: Bayesian PCA and GPLVM

Assumption: data dimensions or sample dimension independent given H and W.

Probabilistic PCA [Tipping and Bishop, 1999]:

$$p(Y \mid H, W) = \prod_{n=1}^{N} \mathcal{N}\left(y_n \mid h_n W, \sigma^2 I\right)$$
$$p(H) = \prod_{n=1}^{N} \mathcal{N}\left(h_n \mid 0, \sigma_h^2 I\right)$$
$$p(Y \mid W) = \prod_{n=1}^{N} \mathcal{N}\left(y_n \mid 0, \sigma_h^2 W^{T} W + \sigma^2 I\right)$$

GPLVM [Lawrence, 2005]:

$$p(Y \mid H, W) = \prod_{g=1}^{G} \mathcal{N}\left(y_{:,g} \mid H w_g, \sigma^2 I\right)$$
$$p(W) = \prod_{g=1}^{G} \mathcal{N}\left(w_{:,g} \mid 0, \sigma_h^2 I\right)$$
$$p(Y \mid H) = \prod_{g=1}^{G} \mathcal{N}\Big(y_{:,g} \mid 0, \underbrace{\sigma_h^2 H H^{T} + \sigma^2 I}_{N \times N}\Big)$$
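To make the GPLVM marginal likelihood concrete, here is a minimal NumPy sketch that evaluates log p(Y | H) for the linear covariance σ_h² HH^T + σ² I, summing over the G independent data dimensions. The function name, default hyperparameters and simulated data are assumptions for illustration. A full GPLVM would additionally optimise H and the hyperparameters, which is not shown.

```python
import numpy as np

def gplvm_log_marginal(Y, H, sigma_h2=1.0, sigma2=0.1):
    """log p(Y | H) = sum_g log N(y_{:,g} | 0, sigma_h2 * H H^T + sigma2 * I)."""
    N, G = Y.shape
    Kmat = sigma_h2 * H @ H.T + sigma2 * np.eye(N)   # N x N, shared by all g
    L = np.linalg.cholesky(Kmat)                     # Kmat = L L^T
    alpha = np.linalg.solve(L, Y)                    # L^{-1} Y, shape N x G
    log_det = 2.0 * np.sum(np.log(np.diag(L)))       # log |Kmat|
    quad = np.sum(alpha ** 2)                        # sum_g y_g^T Kmat^{-1} y_g
    return -0.5 * (G * N * np.log(2 * np.pi) + G * log_det + quad)

rng = np.random.default_rng(2)
N, G, K = 30, 40, 2
H_true = rng.standard_normal((N, K))
W_true = rng.standard_normal((K, G))
Y_sim = H_true @ W_true + np.sqrt(0.1) * rng.standard_normal((N, G))
H_random = rng.standard_normal((N, K))               # unrelated latent factors

ll_true = gplvm_log_marginal(Y_sim, H_true)
ll_random = gplvm_log_marginal(Y_sim, H_random)
print(ll_true, ll_random)
```

Because W has been integrated out, the likelihood depends on H only through the N x N matrix HH^T. The latent factors that generated the data should score higher than unrelated ones.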
