Comparison of Five Commonly Used Gene-Gene Interaction Detecting Methods in Schizophrenia

SLIDE 1

Comparison of Five Commonly Used Gene-Gene Interaction Detecting Methods in Schizophrenia

Chung-Keng Hsieh and Guan-Hua Huang
Institute of Statistics, National Chiao Tung University

SLIDE 2

Outline

INTRODUCTION
METHODOLOGY
  Study population
  Preliminary analyses
  Study design
  Methods
  Cross validation
RESULTS
CONCLUSION

06/26/2009

SLIDE 4

INTRODUCTION

Single-locus methods
Gene-gene interaction
Genotype data
Haplotype data

SNP

Haplotype Block

SLIDE 5

INTRODUCTION

In the present study, we assessed the importance of gene-gene interactions for schizophrenia risk.

Data:

65 SNPs from 5 candidate genes
514 cases and 376 controls

SLIDE 6

INTRODUCTION

Five commonly used gene-gene interaction detecting methods
Cross validation

SLIDE 8

Study population

Schizophrenia dataset

Data collection was based on the TSLS program

Genotyping of markers on 5 candidate genes:

DISC1, NRG1, DAO, G72 and CACNG2

SLIDE 9

Study population

514 schizophrenia cases and 376 controls
Total of 65 SNPs in five candidate genes

SLIDE 11

Preliminary analyses

Data quality control:

Exclude a SNP if
  HWE p-value < 0.001
  missing genotypes > 25% (SNP call rate < 75%)
  MAF < 1%

Exclude an individual if
  percentage of missing SNPs > 50%

After filtering
  55 SNPs
  889 individuals (513 cases / 376 controls)
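
The quality-control thresholds above can be sketched in code. This is only an illustration under stated assumptions: genotypes are coded as 0/1/2 copies of one allele with None for missing, the HWE check is a 1-df chi-square goodness-of-fit test applied to all supplied individuals, and the function names are hypothetical. The slides do not say which HWE test was used, whether it was run on controls only, or in which order the SNP and individual filters were applied.

```python
import math


def snp_passes_qc(genotypes, hwe_alpha=0.001, max_missing=0.25, min_maf=0.01):
    """Return True if one SNP survives the three SNP-level filters.

    genotypes: list with 0/1/2 (copies of one allele) or None for missing.
    Assumption: HWE is checked with a chi-square goodness-of-fit test (1 df).
    """
    observed = [g for g in genotypes if g is not None]
    if not observed:
        return False

    # Filter 1: missing genotypes > 25% (call rate < 75%)
    missing_rate = 1 - len(observed) / len(genotypes)
    if missing_rate > max_missing:
        return False

    # Filter 2: minor allele frequency below 1%
    n = len(observed)
    p = sum(observed) / (2 * n)  # frequency of the counted allele
    if min(p, 1 - p) < min_maf:
        return False

    # Filter 3: HWE p-value below 0.001
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    counts = [observed.count(k) for k in (0, 1, 2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 df
    return p_value >= hwe_alpha


def individual_passes_qc(genotypes, max_missing=0.5):
    """Return True unless more than 50% of the individual's SNPs are missing."""
    missing = sum(g is None for g in genotypes)
    return missing / len(genotypes) <= max_missing
```

A SNP is kept only when all three checks pass, and an individual is kept only when at most half of their SNP genotypes are missing.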

SLIDE 12

Preliminary analyses

Missing data imputation:

Imputation: replacing missing genotypes with predicted values based on the observed genotypes at neighboring SNPs.

We performed imputation with the MDR Data Tool software, which applies a simple frequency-based imputation.
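
A frequency-based imputation can be sketched as follows. The slides do not spell out the exact rule, so this sketch assumes that a missing genotype is replaced by the most common observed genotype at the same SNP; the function name and the None coding for missing values are hypothetical, and this is not the MDR Data Tool's own code.

```python
from collections import Counter


def impute_most_frequent(column):
    """Fill missing genotypes (None) in one SNP column.

    Assumption: "frequency-based" means using the most common observed
    genotype at that SNP. Ties go to the genotype seen first.
    """
    observed = [g for g in column if g is not None]
    if not observed:
        return list(column)  # nothing to learn from; leave as is
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if g is None else g for g in column]
```

Note that this rule uses only the SNP's own genotype frequencies and ignores neighboring SNPs.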

SLIDE 14

Study design

The data were analyzed using two strategies:

Genotype-based data
  55 SNPs

Haplotype-based data
  10 haplotype blocks + 29 SNPs

In the haplotype-based study, we used the Haploview software to define haplotype blocks and the PHASE software to estimate each individual's haplotypes.

SLIDE 16

Methods

Chi-square test
Logistic regression model (LRM)
Bayesian epistasis association mapping (BEAM) algorithm
Classification and regression trees (CART)
Multifactor dimensionality reduction (MDR) method

SLIDE 18

Cross Validation

We want to compare the predictive ability of the five methods.

We randomly divided our genotype-based data into a training set and a testing set.

The training set is twice the size of the testing set.

We repeated this procedure 100 times to create 100 datasets.
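
The repeated random split can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the fixed seed, and the rounding of the training-set size are assumptions, and the slides do not say whether the split was stratified by case-control status (this sketch does not stratify).

```python
import random


def make_splits(n_individuals, n_repeats=100, seed=0):
    """Create repeated random 2:1 training/testing splits of row indices.

    Assumption: the training set takes two thirds of the individuals,
    rounded to the nearest integer, and the rest form the testing set.
    """
    rng = random.Random(seed)
    n_train = round(n_individuals * 2 / 3)
    splits = []
    for _ in range(n_repeats):
        indices = list(range(n_individuals))
        rng.shuffle(indices)
        splits.append((indices[:n_train], indices[n_train:]))
    return splits


# 889 individuals remain after quality control
splits = make_splits(889)
```

With 889 individuals this gives 593 training and 296 testing individuals in each of the 100 replicates.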

SLIDE 19

Cross Validation

For each CV replicate, we apply the five methods to the training set and select the best model for one-way, two-way, and three-way interaction.

We use the training set to build a prediction rule for the best model.

As in MDR, we compute the case-control ratio for each genotype combination.

Once the prediction rule is built, we can calculate the prediction error on the testing set.
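
An MDR-style prediction rule of this kind can be sketched as follows. The details are assumptions rather than the authors' procedure: a genotype combination is labeled high-risk when its case-control ratio in the training set exceeds the overall training case-control ratio, combinations never seen in training are predicted as controls, and the function names are hypothetical.

```python
from collections import Counter


def build_rule(train_genotypes, train_labels, snps):
    """Return the set of high-risk genotype combinations.

    train_genotypes: one genotype sequence per individual.
    train_labels: 1 for case, 0 for control.
    snps: column indices of the SNPs in the selected model.
    """
    cases, controls = Counter(), Counter()
    for row, label in zip(train_genotypes, train_labels):
        combo = tuple(row[j] for j in snps)
        (cases if label == 1 else controls)[combo] += 1
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    # High risk if cases/controls in the cell exceeds n_cases/n_controls;
    # cross-multiplied so that empty control cells need no division.
    return {
        combo
        for combo in set(cases) | set(controls)
        if cases[combo] * n_controls > controls[combo] * n_cases
    }


def prediction_error(high_risk, test_genotypes, test_labels, snps):
    """Fraction of test individuals whose predicted status is wrong."""
    wrong = 0
    for row, label in zip(test_genotypes, test_labels):
        predicted = 1 if tuple(row[j] for j in snps) in high_risk else 0
        wrong += predicted != label
    return wrong / len(test_labels)
```

The same two functions apply to one-way, two-way, and three-way models by passing one, two, or three column indices in `snps`.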

SLIDE 21

RESULTS

SLIDE 22

RESULTS

SLIDE 23

RESULTS

SLIDE 24

RESULTS

SLIDE 25

RESULTS

SLIDE 26

RESULTS

Box plots of prediction error for one-way, two-way, and three-way interactions (figure not reproduced)

SLIDE 28

CONCLUSION

The aim of this study was to address a methodological issue in detecting gene-gene interactions.

We chose five commonly used methods and applied them to a schizophrenia dataset.

SLIDE 29

CONCLUSION

We find that:

SNPs rsDAO_13 and rsDAO_7 have strong main effects
SNPs rsDAO_6, rsDAO_7, and rsG72_16 have strong gene-gene interaction effects
LRM shows the best predictive ability in our data

SLIDE 30

THANK YOU!
