slide-1
SLIDE 1

Identifying Conserved Protein Complexes between Species by Constructing Interolog Networks

InCoB 2013, Taicang China September 2013

Sriganesh SRIHARI

Institute for Molecular Bioscience, The University of Queensland, QLD, Australia

In collaboration with Phi Vu Nguyen and Hon Wai Leong Department of Computer Science, National University of Singapore

slide-2
SLIDE 2

Protein Complexes: The fundamental functional units of the cell

Proteins physically interact to form complexes. Multi-protein complexes drive important cellular functions.

Identifying the entire complement of complexes (the ‘complexosome’) is crucial to understanding the underlying cellular machinery and organization.

Protein complexes

  • Drive several biological processes in the cell
  • Example: RNA polymerase plays a crucial role in transcription by binding to DNA to generate mRNA

Complexes: proteins come together at the same time and the same place, and physically interact.

[Figure: RNA polymerase bound to DNA, producing an mRNA transcript]

slide-3
SLIDE 3

Reconstructing the ‘complexosome’: still a long way to go!

  • Yeast (most complete, most studied): 60-75%
  • Mainly missing are the membrane complexes
  • Wodak CYC2008 (Pu et al., 2009), MIPS (Mewes et al., 2004)
  • Human: 30-40%
  • CORUM (Ruepp et al., 2011), Human soluble (Havugimana et al., 2012)
  • Many complexes are conserved
  • Complexes are functional units
  • Useful to integrate evolutionary conservation to detect complexes

Sriganesh Srihari, Institute for Molecular Bioscience, The University of Queensland 4

slide-4
SLIDE 4

Interolog networks

  • Integrating evolutionary information with PPI networks
  • Detect evolutionarily conserved protein interactions
  • Detect evolutionarily conserved complexes

slide-5
SLIDE 5

CONSTRUCTING INTEROLOG NETWORKS

Identifying conserved complexes between human and yeast


slide-6
SLIDE 6

Orthologous Proteins between Yeast and Human

[Figure: yeast proteins y1, y2, y3 linked to human proteins h1, h2, h3 by orthology*]

*Orthology is mainly measured by sequence similarity in the literature, e.g. BLAST similarity with E < 10^-3.

slide-7
SLIDE 7

Interologs: Interactions Conserved Between Orthologous Proteins

[Figure: yeast PPI network (y1, y2, y3) and human PPI network (h1, h2, h3). Orthologs* link proteins across the two species; interologs are the interactions conserved between orthologous pairs.]

*Orthology is mainly measured by sequence similarity in the literature.

slide-8
SLIDE 8

Constructing Interolog Network

[Figure: yeast PPI (y1, y2, y3) and human PPI (h1, h2, h3) joined by orthologs*. Each orthologous pair becomes one node of the interolog network: y1|h1, y2|h2, y3|h3.]

*Mainly sequence similarity is used in the literature.

(Orthology graph: Sharan et al. (2005), J Comp Biol)
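The construction on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy interactions, the ortholog pairs and the function name `build_interolog_network` are assumptions, and the sketch assumes that an edge is kept only when the interaction is present in both species.

```python
from itertools import combinations


def build_interolog_network(yeast_ppi, human_ppi, orthologs):
    """Return (nodes, edges) of an interolog network.

    yeast_ppi, human_ppi: iterables of 2-tuples (undirected interactions).
    orthologs: iterable of (yeast_protein, human_protein) pairs.
    Assumption: an edge joins two ortholog pairs only if the yeast proteins
    interact in yeast AND the human proteins interact in human.
    """
    yeast_edges = {frozenset(e) for e in yeast_ppi}
    human_edges = {frozenset(e) for e in human_ppi}
    nodes = sorted(set(orthologs))
    edges = set()
    for (y_a, h_a), (y_b, h_b) in combinations(nodes, 2):
        conserved_in_yeast = frozenset((y_a, y_b)) in yeast_edges
        conserved_in_human = frozenset((h_a, h_b)) in human_edges
        if conserved_in_yeast and conserved_in_human:
            edges.add(((y_a, h_a), (y_b, h_b)))
    return nodes, edges


# Toy data (hypothetical), using the slide's naming.
yeast_ppi = [("y1", "y2"), ("y2", "y3"), ("y1", "y3")]
human_ppi = [("h1", "h2"), ("h2", "h3")]
orthologs = [("y1", "h1"), ("y2", "h2"), ("y3", "h3")]

nodes, edges = build_interolog_network(yeast_ppi, human_ppi, orthologs)
for a, b in sorted(edges):
    print(f"{a[0]}|{a[1]} -- {b[0]}|{b[1]}")
```

In this toy case the y1-y3 interaction has no human counterpart (h1-h3), so no edge is drawn between y1|h1 and y3|h3.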

slide-9
SLIDE 9

Constructing Interolog Network

[Figure: yeast PPI network and human PPI network joined by orthologs*]

*Mainly sequence similarity is used in the literature.

Clusters in the interolog network correspond to conserved regions between the two PPI networks. If a region is “dense”, check whether it is a conserved complex (Sharan et al., 2005).

In general, this is network alignment: max graph isomorphism, maximal clique (NP-complete problems).
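To make the “dense region” check concrete, the sketch below computes the edge density of a candidate cluster, 2|E| / (n(n-1)). The function name, the toy graph and the 0.7 cutoff are illustrative assumptions; the slide does not state which density measure or threshold was used.

```python
def cluster_density(cluster, edges):
    """Fraction of possible node pairs inside `cluster` that are connected."""
    members = set(cluster)
    n = len(members)
    if n < 2:
        return 0.0
    # Keep only distinct, non-self-loop edges with both endpoints in the cluster.
    internal = {
        frozenset(e) for e in edges
        if len(set(e)) == 2 and set(e) <= members
    }
    return 2 * len(internal) / (n * (n - 1))


# Hypothetical interolog-network edges between ortholog-pair nodes.
edges = [
    ("y1|h1", "y2|h2"),
    ("y2|h2", "y3|h3"),
    ("y1|h1", "y3|h3"),
    ("y3|h3", "y4|h4"),
]
cluster = ["y1|h1", "y2|h2", "y3|h3"]

density = cluster_density(cluster, edges)
is_candidate = density >= 0.7  # assumed cutoff, for illustration only
print(density, is_candidate)
```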

slide-10
SLIDE 10

Conserved Complexes Identified from Interolog Networks

[Figure: yeast eIF3 complex and human eIF3 complex]

Bork et al., Curr Opinion Struct Biol (2004); Sharan et al., J Comp Biol (2005); Teunis et al., PLoS Comp Biol (2008); Zaslavskiy et al., Bioinformatics (2009).

On average, proteins in a conserved yeast complex account for 30-35% of proteins in the corresponding human complex (Teunis et al., PLoS Comp Biol 2008).

In fact, Teunis et al. say: “Protein complex evolution does not involve extensive PPI rewiring” (among the conserved proteins within a complex).

But all this is just one part of the story, told mainly using sequence similarity!

Larger complexes are more evolutionarily conserved than smaller ones, which tend to be restricted to vertebrates, suggesting recent innovations (Havugimana et al., 2012).

slide-11
SLIDE 11

Functional Conservation: Going closer to real orthology

[Figure: yeast PPI (y1, y2, y3, y5) and human PPI (h1, h2, h3, h4) linked by orthologs]

  • y2 performs a function F1 in yeast. F1 is performed by h2 and h4 in human.
  • y1 and y5 perform a function F2 in yeast. F2 is performed by h1 in human.
  • Segregation: y2 → {h2, h4}
  • Fusion: {y1, y5} → h1

slide-12
SLIDE 12

Functional Conservation by Domain Conservation

[Figure: yeast PPI (y1, y2, y3, y5) and human PPI (h1, h2, h3, h4) linked by orthologs]

  • Rad9 is a cell-cycle and DNA damage response (DDR) protein in yeast. In human: hRAD9, BRCA1 and 53BP1. The BRCT domain is conserved in all these proteins!
  • RECQL helicases: BLM and WRN (SGS1), RECQ1-4

Integrate domain conservation in interolog construction.

slide-13
SLIDE 13

Constructing Interolog Networks by Adding Domain Information

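[Figure: yeast PPI (y1, y2, y3, y5) and human PPI (h1, h2, h3, h4) linked by orthologs. The orthology relationships are shown both as multi-vertices ({y1,y5}|h1, y2|{h2,h4}, y3|h3) and as the pairwise interolog nodes y1|h1, y2|h2, y5|h1, y2|h4, y3|h3.]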
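One way to obtain many-to-many pairs like those on this slide is to pair every yeast protein with every human protein that shares at least one domain. The sketch below assumes this simple rule; the domain labels D1 to D3 and the function name `domain_ortholog_pairs` are hypothetical, and real orthology resources combine domain and sequence evidence in more involved ways.

```python
def domain_ortholog_pairs(yeast_domains, human_domains):
    """Pair yeast and human proteins that share at least one domain.

    Both arguments map a protein name to a set of domain identifiers.
    Returns a sorted list of (yeast_protein, human_protein) tuples.
    """
    pairs = []
    for y, y_doms in yeast_domains.items():
        for h, h_doms in human_domains.items():
            if y_doms & h_doms:  # non-empty intersection means a shared domain
                pairs.append((y, h))
    return sorted(pairs)


# Hypothetical domain annotations chosen to reproduce the slide's example.
yeast_domains = {"y1": {"D1"}, "y5": {"D1"}, "y2": {"D2"}, "y3": {"D3"}}
human_domains = {"h1": {"D1"}, "h2": {"D2"}, "h4": {"D2"}, "h3": {"D3"}}

pairs = domain_ortholog_pairs(yeast_domains, human_domains)
print(["|".join(p) for p in pairs])
```

Here y1 and y5 both pair with h1 (fusion), while y2 pairs with both h2 and h4 (segregation).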

slide-14
SLIDE 14

Advantages of Using Domain Information for Interolog Network Construction

Ensembl: Uses domain information + sequence similarity

OrthoMCL: Sequence similarity (mainly BLAST)

slide-15
SLIDE 15

Advantages of Using Domain Information for Interolog Network Construction

  • Integrates functional conservation
      • Beyond simple sequence similarity
      • Integrates orthology relationships (multi-vertices)
  • Creates a denser network
      • Many-to-many relationships using domain information, as against predominantly one-to-one using only sequence similarity
  • Avoids false-positive interactions
      • Adds only conserved interactions

Better complex prediction!

  • Higher accuracy and less noise!
  • More complexes!
slide-16
SLIDE 16

Pipeline for Predicting Conserved Complexes between Yeast and Human

1. Inputs: yeast PPI, human PPI, sequence similarity, domain conservation
2. Build the interolog network
3. Apply clustering algorithms to obtain clusters in the interolog network
4. Map back to yeast PPI: conserved yeast complexes
5. Map back to human PPI: conserved human complexes

slide-17
SLIDE 17

Pipeline for Predicting Conserved Complexes between Yeast and Human

1. Inputs: yeast PPI, human PPI, sequence similarity, domain conservation
2. Build the interolog network
3. Apply clustering algorithms to obtain clusters in the interolog network
4. Map back to yeast PPI: conserved yeast complexes
5. Map back to human PPI: conserved human complexes

COCIN: COnserved Complexes from Interolog Networks
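The final “map back” step of the pipeline can be illustrated as a projection of each interolog-network cluster onto its yeast and human members. This is a sketch under the assumption that each node is a (yeast, human) pair; `map_back` is a hypothetical helper name, not part of any published COCIN code.

```python
def map_back(cluster):
    """Split a cluster of (yeast, human) nodes into per-species protein sets."""
    yeast_complex = {y for y, _ in cluster}
    human_complex = {h for _, h in cluster}
    return yeast_complex, human_complex


# Hypothetical cluster containing many-to-many ortholog pairs.
cluster = [("y1", "h1"), ("y5", "h1"), ("y2", "h2"), ("y2", "h4")]
yeast_complex, human_complex = map_back(cluster)
print(sorted(yeast_complex), sorted(human_complex))
```

Because proteins can appear in several nodes, using sets avoids counting y2 or h1 twice in the mapped-back complexes.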

slide-18
SLIDE 18

Improvement Over Earlier Orthology-network Methods including Sharan et al. (2006)

  • Improved interolog network construction
      • Uses domain information in addition to sequence similarity
      • Preserves many-to-many orthology relationships
  • Uses ‘state-of-the-art’ PPI network clustering algorithms
      • CMC (Liu et al., Bioinformatics 2009)
      • HACO (Wang et al., Mol Cell Proteomics 2009)
      • MCL (van Dongen 2000/2004) and MCL-CAw (Srihari et al., BMC Bioinformatics 2010)

These algorithms have been shown to perform significantly better than traditional clustering methods (Srihari et al., 2010, 2012, 2013).

slide-19
SLIDE 19

EXPERIMENTAL EVALUATION

Identifying conserved complexes between yeast and human

slide-20
SLIDE 20

PPI Datasets

Yeast

Database                                  # proteins   # interactions
IntAct (version Nov 13, 2012)             5276         18834
BioGRID (version 3.2.95, Nov 30, 2012)    5886         73923
IntAct ∪ BioGRID                          6332         83777
IntAct ∩ BioGRID                          4620         8930
ICDScore(IntAct ∪ BioGRID)                5239         71636

Human

Database                                  # proteins   # interactions
HPRD (Release 9, 2010)                    9617         39184
BioGRID (April 25, 2012)                  12515        59027
HPRD ∪ BioGRID                            13624        76719
HPRD ∩ BioGRID                            8615         21491
ICDScore(HPRD ∪ BioGRID)                  8521         61868
ICDEnrich(HPRD ∪ BioGRID)                 9764         192053

Yeast: 5239 proteins, 71636 interactions
Source: IntAct, BioGRID (Kerrien et al. 2007, Stark et al. 2011)

H: 9764 proteins, 192053 interactions
Source: BioGRID, HPRD (Stark et al. 2011, Prasad et al. 2009)
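Combining two interaction databases by union or intersection is a set operation on undirected interaction lists. A minimal sketch, assuming interactions are given as pairs of protein names; the toy data and the helper names `normalise` and `proteins` are illustrative, and the ICDScore and ICDEnrich steps are not modelled here.

```python
def normalise(interactions):
    """Treat (a, b) and (b, a) as the same interaction; drop self-loops."""
    return {frozenset(pair) for pair in interactions if pair[0] != pair[1]}


def proteins(edge_set):
    """Set of proteins taking part in at least one interaction."""
    return set().union(*edge_set) if edge_set else set()


# Toy stand-ins for two source databases (hypothetical data).
db_a = normalise([("p1", "p2"), ("p2", "p3"), ("p3", "p3")])
db_b = normalise([("p2", "p1"), ("p3", "p4")])

union = db_a | db_b
intersection = db_a & db_b
print(len(proteins(union)), len(union))                # proteins, interactions
print(len(proteins(intersection)), len(intersection))  # proteins, interactions
```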

slide-21
SLIDE 21

Protein Benchmark Complexes Datasets

  • Wodak CYC2008 yeast complexes
  • 149 with size>3 (36.5%)
  • Total: 408
  • Pu S et al., NAR 2009
  • CORUM human complexes
  • 722 with size>3 (39.1%)
  • Total: 1843
  • Ruepp et al. NAR 2008

slide-22
SLIDE 22

COCIN Identifies More Conserved Complexes than Direct Clustering

Comparisons between CMC on interolog network and CMC directly on the individual PPI networks

(Using Ensembl)

slide-23
SLIDE 23

COCIN Identifies More Conserved Complexes than Direct Clustering

Method        # Predicted   # Matched     Precision   # Gold standard       # Detected            Recall (of conserved
              complexes     predictions               conserved complexes   conserved complexes   complexes)
COCIN         71            36            50.7%       118                   78                    66.1%
CMC           1389          156           11.2%       118                   66                    55.9%
HACO          1290          80            6.2%        118                   36                    30.5%
MCL-CAw/MCL   631           45            7.1%        118                   24                    20.3%

Similar results comparing against HACO and MCL-CAw/MCL

(Using Ensembl)
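The precision and recall figures follow directly from the reported counts: precision is matched predictions divided by predicted complexes, and recall is detected conserved complexes divided by gold-standard conserved complexes. The short script below recomputes them from the counts reported above; only the variable and function names are illustrative.

```python
def percent(numerator, denominator):
    """Ratio expressed as a percentage, rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# (predicted, matched, gold-standard conserved, detected conserved), as reported.
results = {
    "COCIN": (71, 36, 118, 78),
    "CMC": (1389, 156, 118, 66),
    "HACO": (1290, 80, 118, 36),
    "MCL-CAw/MCL": (631, 45, 118, 24),
}

for method, (predicted, matched, gold, detected) in results.items():
    precision = percent(matched, predicted)
    recall = percent(detected, gold)
    print(f"{method}: precision {precision}%, recall {recall}%")
```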

slide-24
SLIDE 24

Using Domain Information Identifies Many-to-Many Complexes Mapping

The function of a yeast complex is performed by multiple human complexes. Throws further light on the mechanisms of conservation.

slide-25
SLIDE 25

Conserved Complexes Identified from Interolog Networks

[Figure: yeast eIF3 complex and human eIF3 complex]

Bork et al., Curr Opinion Struct Biol (2004); Sharan et al., J Comp Biol (2005); Teunis et al., PLoS Comp Biol (2008); Zaslavskiy et al., Bioinformatics (2009).

On average, proteins in a conserved yeast complex account for 30-35% of proteins in the corresponding human complex (Teunis et al., PLoS Comp Biol 2008).

In fact, Teunis et al. say: “Protein complex evolution does not involve extensive PPI rewiring” (among the conserved proteins within a complex).

I told you this was just one part of the story!

Larger complexes are more evolutionarily conserved than smaller ones, which tend to be restricted to vertebrates, suggesting recent innovations (Havugimana et al., 2012).

slide-26
SLIDE 26

Novel Insights into the Mechanism of Conservation of Complexes

slide-27
SLIDE 27

Novel Insights into the Mechanism of Conservation of Complexes

BE CAREFUL!

In Prof Manyuan Long’s words

slide-28
SLIDE 28

Novel Insights into the Mechanism of Conservation of Complexes

Cellular processes have evolved multi-fold from yeast to human.

1. Proteins have fused as well as segregated
2. Multiple proteins “invented” for buffering purposes
3. Co-functional proteins have parted ways (belong to different complexes)

Key relationships have been broken, new ones formed

slide-29
SLIDE 29

THANK YOU

National University of Singapore: Phi Vu Nguyen, Prof Hon Wai Leong

slide-30
SLIDE 30

Thank You…

Acknowledgements

Institute for Molecular Bioscience: Prof Mark Ragan and his group
UQ Centre for Clinical Research: Dr Peter T. Simpson
Queensland Institute of Medical Research: Prof Kum Kum Khanna and her group

NHMRC grant to Dr Peter T. Simpson & Prof Mark A. Ragan
