A novel index of protein-protein interface propensity improves interface residue recognition



SLIDE 1

A novel index of protein-protein interface propensity improves interface residue recognition

Wentao Dai
Email: wtdai AT scbit.org
Shanghai Center for Bioinformation Technology (SCBIT)

SLIDE 2

Outline

  • I. Background and Motivation
  • II. Protein-Protein Interface Datasets
    – Astral2.05-40-4506
  • III. Characteristics of Interface
    – QIPI: Quantitative protein-protein Interface Propensity Index
  • IV. Evaluation
    – SPR: Single domain based Patch Recognition
  • V. Summary
SLIDE 3

I. Background and Motivation

SLIDE 4
  • I. Background and Motivation
  • protein-protein interaction
  • protein-protein interface properties
  • protein-protein interface prediction (residue recognition)

SLIDE 5

II. Protein-Protein Interface Datasets

  • Comprehensive interface dataset
    – Training – Astral2.05-40-4506
  • Testing interface dataset
    – Docking Benchmark 2.0
    – CAPRI25 and Enz35

SLIDE 6

II. Protein-Protein Interface Datasets

  • SCOPe: Structural Classification of Proteins — extended database (v2.05)
  • Astral2.05-40: a subset of SCOPe2.05 with less than 40% identity between any two domains
  • Astral2.05-40-4506: 4506 interfaces obtained from the Astral2.05-40 dataset

SLIDE 7
  • III. Characteristics of Interface
  • Relative Interface Ratio (RIR) and Contact Preferences
  • Residue Composition and QIPI
  • Secondary Structure
  • Contact preference
  • Interface Size
SLIDE 8
  • III. Characteristics of Interface
  • --Relative Interface Ratio (RIR)

$$w_i = \frac{f_i}{\sum_m f_m}, \qquad W_i = \frac{F_i}{\sum_m F_m}, \qquad RIR_i = \frac{w_i}{W_i}$$

f_i: number of interface residues of type i
F_i: number of non-interface surface residues of type i
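The RIR definition can be illustrated with a short Python sketch. It assumes the formula on this slide reads w_i = f_i / Σ f_m, W_i = F_i / Σ F_m and RIR_i = w_i / W_i. The function name, the input format (strings of one-letter residue codes) and the toy counts are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def relative_interface_ratio(interface_residues, surface_residues):
    """Return {residue type: RIR} from two collections of one-letter codes.

    interface_residues supplies the counts f_i, surface_residues (the
    non-interface surface residues) supplies the counts F_i.
    """
    f = Counter(interface_residues)
    F = Counter(surface_residues)
    f_total = sum(f.values())
    F_total = sum(F.values())
    rir = {}
    for res in f:
        if F[res] == 0:
            continue  # RIR is undefined without non-interface observations
        w = f[res] / f_total  # interface frequency w_i
        W = F[res] / F_total  # non-interface surface frequency W_i
        rir[res] = w / W
    return rir


# Toy counts, made up for illustration (not taken from Astral2.05-40-4506).
toy_rir = relative_interface_ratio("RRRKA", "RKKKAA")
```

A value above 1 means the residue type is enriched at interfaces relative to the rest of the surface; in the toy data only R is enriched.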

SLIDE 9
  • III. Characteristics of Interface
  • --Contact Preferences

$$ContactFreq_{ij} = \frac{C_{ij}}{\sum_{m,n} C_{mn}}$$

$$ContactPref_{ij} = \log_2\left(\frac{C_{ij} / \sum_{m,n} C_{mn}}{w_i \times w_j}\right)$$

C_ij: number of interface-crossing contacts between residues of types i and j
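A minimal Python sketch of the contact preference follows. It assumes the formula is the base-2 logarithm of the contact frequency C_ij / Σ C_mn divided by the product w_i × w_j of the interface residue frequencies. The function name, the dictionary layout (each unordered residue pair stored once) and the toy numbers are hypothetical.

```python
import math


def contact_preferences(contacts, w):
    """Return {(i, j): log2 contact preference}.

    contacts maps a residue-type pair (i, j) to C_ij, the number of
    interface-crossing contacts; w maps a residue type to its interface
    frequency w_i. Each unordered pair is assumed to be stored once.
    """
    total = sum(contacts.values())
    prefs = {}
    for (i, j), count in contacts.items():
        freq = count / total  # ContactFreq_ij
        prefs[(i, j)] = math.log2(freq / (w[i] * w[j]))
    return prefs


# Toy inputs, made up for illustration.
toy_contacts = {("C", "C"): 8, ("R", "D"): 4, ("A", "K"): 4}
toy_w = {"C": 0.25, "R": 0.25, "D": 0.25, "A": 0.125, "K": 0.125}
toy_prefs = contact_preferences(toy_contacts, toy_w)
```

A positive value means the pair is observed across the interface more often than the two residue frequencies alone would suggest.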

SLIDE 10
  • III. Characteristics of Interface
  • --QIPI

[Figure: frequency of each of the 20 residue types (H R K A V I L M P F W Y G C S T N Q D E) at interface and non-interface surface positions, with the RIR plotted as a ratio]

SLIDE 11

  • III. Characteristics of Interface
  • --QIPI

[Figure: frequency of each of the 20 residue types (H R K A V I L M P F W Y G C S T N Q D E) at interface and non-interface surface positions, with the RIR plotted as a ratio; residue groups labelled Basic, Hydrophobic, Polar, Acidic and Aromatic]

SLIDE 12

  • III. Characteristics of Interface
  • --QIPI

[Figure: frequency of each of the 20 residue types (H R K A V I L M P F W Y G C S T N Q D E) at interface and non-interface surface positions, with the RIR plotted as a ratio; residue groups labelled Basic, Hydrophobic, Polar, Acidic and Aromatic]

SLIDE 13

  • III. Characteristics of Interface
  • --QIPI

[Figure: frequency of each of the 20 residue types (H R K A V I L M P F W Y G C S T N Q D E) at interface and non-interface surface positions, with the RIR plotted as a ratio; residue groups labelled Basic, Hydrophobic, Polar, Acidic and Aromatic]

SLIDE 14
  • III. Characteristics of Interface
  • --QIPI
  • Interface preference:
    – hydrophobic residues
    – aromatic residues
    – residues with long side chain
  • High interface propensity
    – Arg, Phe, Met, Trp and Tyr

Quantitative residue interface propensity index

H      R      K      A      V      I      L      M      P      F
1.147  1.346  0.784  0.841  0.994  1.084  1.144  1.451  1.109  1.334

W      Y      G      C      S      T      N      Q      D      E
1.284  1.368  0.823  1.172  0.873  0.966  0.958  0.909  0.830  0.805
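For anyone who wants to reuse the index, the QIPI values on this slide can be kept as a simple lookup table. The sketch below copies the 20 values as shown; the helper names and the idea of averaging QIPI over a set of residues are illustrative assumptions, not part of the presented method.

```python
# QIPI values as listed on this slide, keyed by one-letter residue code.
QIPI = {
    "H": 1.147, "R": 1.346, "K": 0.784, "A": 0.841, "V": 0.994,
    "I": 1.084, "L": 1.144, "M": 1.451, "P": 1.109, "F": 1.334,
    "W": 1.284, "Y": 1.368, "G": 0.823, "C": 1.172, "S": 0.873,
    "T": 0.966, "N": 0.958, "Q": 0.909, "D": 0.830, "E": 0.805,
}


def rank_by_propensity(index):
    """Residue codes sorted from highest to lowest propensity."""
    return sorted(index, key=index.get, reverse=True)


def mean_propensity(residues, index=QIPI):
    """Average propensity of a set of residues (hypothetical helper)."""
    scores = [index[res] for res in residues if res in index]
    return sum(scores) / len(scores) if scores else float("nan")


top5 = rank_by_propensity(QIPI)[:5]
example_mean = mean_propensity("RFW")
```

The five highest-ranked residues are Met, Tyr, Arg, Phe and Trp, which matches the "High interface propensity" bullet.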

SLIDE 15
  • III. Characteristics of Interface
  • --Secondary structure

[Figure: interface and non-interface frequency, with RIR, for the secondary-structure classes H, E and C]

SLIDE 16
  • III. Characteristics of Interface
  • --Secondary structure

[Figure: frequency at interface and non-interface surface positions, with the RIR plotted as a ratio, for each residue type split by secondary-structure class (axis labels such as A-H, A-C, R-E)]

SLIDE 17
  • III. Characteristics of Interface
  • --Secondary structure

[Figure: frequency at interface and non-interface surface positions, with the RIR plotted as a ratio, for each residue type split by secondary-structure class (axis labels such as A-H, A-C, R-E)]

SLIDE 18
  • III. Characteristics of Interface
  • --Secondary structure
  • strand (E) residues: negative interface propensity
  • coil (C): positive interface propensity
  • Residue type: the principal factor of interface propensity

SLIDE 19
  • III. Characteristics of Interface
  • --Contact Preferences

$$ContactFreq_{ij} = \frac{C_{ij}}{\sum_{m,n} C_{mn}}$$

$$ContactPref_{ij} = \log_2\left(\frac{C_{ij} / \sum_{m,n} C_{mn}}{w_i \times w_j}\right)$$

C_ij: number of interface-crossing contacts between residues of types i and j

SLIDE 20
  • III. Characteristics of Interface
  • --Contact preference
  • High preferences
    – Cys-Cys contacts
    – hydrophobic contacts (A-W)
    – aromatic contacts (P-Y: Phe-Cys, Phe-Phe, Phe-Trp, Phe-Tyr, Trp-Tyr, Tyr-His, Tyr-Lys and Tyr-Met)
    – contacts between oppositely charged residues (Arg-Asp, Arg-Glu)

Arg, Phe, Trp and Tyr are among the residues with the highest interface propensity.

The RIR of these residues is above 1.2, and many of the contacts that include them have a high contact preference (more than 1.5, shown in pink).

SLIDE 21
  • III. Characteristics of Interface
  • --Interface Size

Fig. A: The average interface size is about 800 Å², and about 86% of interfaces have a size in the range of 0-2000 Å².

Fig. B: The number of interface residues follows a gamma distribution, with an average of about 20 residues per interface.

Fig. C: The average domain size is about 9000 Å², much larger than the average interface size.

SLIDE 22
  • IV. Evaluation
  • --Interface residue recognition
  • Identification of surface residues
  • Generation of residue side-chain distance matrix
  • Construction of candidate interface patches
  • Merging the candidate interface patches
  • Selecting the top-ranked candidate interface patch
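The distance-matrix step in the list above can be sketched in Python. The slide does not say how the side-chain distance is measured, so this sketch assumes each surface residue is represented by a single side-chain centroid and uses Euclidean distance between centroids. The function name and the toy coordinates are hypothetical.

```python
import math


def side_chain_distance_matrix(centroids):
    """Symmetric matrix of pairwise distances (Å) between centroids.

    centroids is a list of (x, y, z) tuples, one per surface residue.
    Using one centroid per side chain is an assumption of this sketch.
    """
    n = len(centroids)
    matrix = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            d = math.dist(centroids[a], centroids[b])
            matrix[a][b] = matrix[b][a] = d
    return matrix


# Toy coordinates, made up for illustration.
toy_centroids = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
toy_matrix = side_chain_distance_matrix(toy_centroids)
```

Row k of the matrix then gives the distance from residue k to every other surface residue, which is the quantity a patch grown around a seed residue would need.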

SLIDE 23
  • IV. Evaluation
  • --SPR : Single domain based Patch Recognition

Table 1 Patch Generation Thresholds

A: The ASA and distance with seed residue of patch residue

Distance (Å)    ASA (> Å²)
(2,5)
(5,7)           20
(7,9)           40
(9,11)          60
(11,13)         80
(13,15)         100

B: Thresholds for patch merging

Domain ASA (Å²)    Identity Ratio
(0,5000)           0.8
(5000,7500)        0.7
(7500,10000)       0.6
(10000,+∞)         0.5
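The two threshold tables can be encoded as lookup tables, as in the hedged Python sketch below. Several details are assumptions: the bins are treated as half-open intervals [low, high), no ASA value survives for the (2,5) Å bin so it is stored as None, and the names are hypothetical. How SPR applies the thresholds is not specified beyond the table captions.

```python
import math

# Table 1A: distance bin to the seed residue (Å) -> minimum ASA (Å²).
# No ASA value is recoverable for the (2,5) bin; None keeps that gap.
PATCH_ASA_THRESHOLDS = [
    ((2, 5), None),
    ((5, 7), 20),
    ((7, 9), 40),
    ((9, 11), 60),
    ((11, 13), 80),
    ((13, 15), 100),
]

# Table 1B: domain ASA bin (Å²) -> identity ratio used for patch merging.
MERGE_IDENTITY_RATIOS = [
    ((0, 5000), 0.8),
    ((5000, 7500), 0.7),
    ((7500, 10000), 0.6),
    ((10000, math.inf), 0.5),
]


def lookup(table, value):
    """Return the threshold whose bin [low, high) contains value.

    The half-open bin convention is an assumption of this sketch.
    """
    for (low, high), threshold in table:
        if low <= value < high:
            return threshold
    raise ValueError(f"{value} is outside every bin")


asa_cutoff = lookup(PATCH_ASA_THRESHOLDS, 10.0)
merge_ratio = lookup(MERGE_IDENTITY_RATIOS, 8200.0)
```

Both tables show the same trend: residues farther from the seed need more exposed area to join a patch, and larger domains use a lower identity ratio for merging.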

SLIDE 24

  • IV. Evaluation
  • --SPR : Single domain based Patch Recognition

$$E_{Patch} = E_{res} + w_1 E_{hydro} + w_2 E_{cons} + w_3 E_{sol}$$

Each term is evaluated over the residues in the patch (residue type r at sequence position i):

  • E_res: residue term, from ASA_i, RIR_r and REF_r
  • E_hydro: hydrophobic term, from H_i
  • E_cons: conservation term, from C_ir and B_rr
  • E_sol: solvation term (Cyscore), from V_i,sphere and V_i,out

Symbols:

  • H_i: the hydrophobic score in the CASG920101 matrix of AAindex for the residue type r at sequence position i
  • ASA_i: the relative accessible surface area of residue r at sequence position i
  • RIR_r: the RIR values for the 20 amino acid residues, obtained from QIPI
  • REF_r: the element of JANJ780101 in AAindex for residue type r
  • C_ir: the self-substitution score in the position-specific substitution matrix produced by PSI-BLAST for the residue type r at sequence position i
  • B_rr: the diagonal element of BLOSUM62 for residue type r
  • V_i,sphere: the sphere volume in the solvent accessible surface
  • V_i,out: the volume out of the solvent accessible surface on residue i in the patch
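As a small illustration of how the four energy terms combine, the Python sketch below assumes the patch energy on this slide is the weighted sum E_Patch = E_res + w1·E_hydro + w2·E_cons + w3·E_sol. The class, the function and all numbers are hypothetical; the slide does not give the weight values, and the individual terms are taken here as precomputed inputs.

```python
from dataclasses import dataclass


@dataclass
class PatchTerms:
    """Precomputed energy terms for one candidate patch (hypothetical)."""
    e_res: float
    e_hydro: float
    e_cons: float
    e_sol: float


def patch_energy(terms, w1, w2, w3):
    """Weighted sum E_res + w1*E_hydro + w2*E_cons + w3*E_sol."""
    return (terms.e_res
            + w1 * terms.e_hydro
            + w2 * terms.e_cons
            + w3 * terms.e_sol)


# Placeholder term values and weights, chosen only to make the sum easy
# to follow; they are not taken from the presentation.
example_energy = patch_energy(
    PatchTerms(e_res=2.0, e_hydro=1.0, e_cons=0.5, e_sol=4.0),
    w1=0.5, w2=2.0, w3=0.25,
)
```

Setting a weight to zero drops the corresponding feature, which is one way to run single-feature or leave-one-out comparisons.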

SLIDE 25

  • IV. Evaluation
  • --SPR : Single domain based Patch Recognition

$$F = ACC \times COV$$

TP: true positive (real interface residues, correctly predicted)
FP: false positive (non-interface residues predicted as interface)
FN: false negative (real interface residues missed by the prediction)
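The evaluation measures can be sketched in Python as follows. The slide only preserves the product F = ACC × COV and the TP/FP/FN definitions; the sketch assumes the conventional forms ACC = TP / (TP + FP) and COV = TP / (TP + FN), which are not legible in the source. The function name and the residue numbers are hypothetical.

```python
def interface_metrics(predicted, actual):
    """Return (ACC, COV, F) for predicted vs. real interface residues.

    Assumes ACC = TP / (TP + FP) and COV = TP / (TP + FN); only
    F = ACC * COV is stated explicitly on the slide.
    """
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)   # real interface, predicted
    fp = len(predicted - actual)   # predicted, not real interface
    fn = len(actual - predicted)   # real interface, missed
    acc = tp / (tp + fp) if tp + fp else 0.0
    cov = tp / (tp + fn) if tp + fn else 0.0
    return acc, cov, acc * cov


# Toy residue numbers, made up for illustration.
acc, cov, f = interface_metrics(
    predicted=[1, 2, 3, 4],
    actual=[3, 4, 5, 6, 7, 8, 9, 10],
)
```

Here two of the four predicted residues are real interface residues (ACC = 0.5), and they cover two of the eight real ones (COV = 0.25), so F = 0.125.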

SLIDE 26

  • IV. Evaluation
  • --SPR : Single domain based Patch Recognition

$$F = ACC \times COV$$

TP: true positive (real interface residues, correctly predicted)
FP: false positive (non-interface residues predicted as interface)
FN: false negative (real interface residues missed by the prediction)

SLIDE 27
  • IV. Evaluation
  • --Contribution of interface features to interface residue recognition

Feature             Coverage  Accuracy  F
QIPI                0.472     0.188     0.089
Hydrophobic         0.321     0.238     0.076
Conservation        0.266     0.191     0.051
Solvation           0.147     0.160     0.023
QIPI+Hydrophobic    0.467     0.186     0.087
All-QIPI            0.312     0.239     0.075
All                 0.475     0.194     0.092

Note: Bold values denote the best performance in each category.
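The F column in this table appears to equal Coverage × Accuracy up to rounding. The short Python check below recomputes the product for each row from the values as listed; the variable and function names are hypothetical, and the 0.001 tolerance is an assumption about rounding to three decimals.

```python
# (coverage, accuracy, F) per feature set, copied from the table.
ROWS = {
    "QIPI": (0.472, 0.188, 0.089),
    "Hydrophobic": (0.321, 0.238, 0.076),
    "Conservation": (0.266, 0.191, 0.051),
    "Solvation": (0.147, 0.160, 0.023),
    "QIPI+Hydrophobic": (0.467, 0.186, 0.087),
    "All-QIPI": (0.312, 0.239, 0.075),
    "All": (0.475, 0.194, 0.092),
}


def max_f_deviation(rows):
    """Largest |coverage * accuracy - F| over all rows."""
    return max(abs(cov * acc - f) for cov, acc, f in rows.values())


deviation = max_f_deviation(ROWS)
best_by_f = max(ROWS, key=lambda name: ROWS[name][2])
best_by_accuracy = max(ROWS, key=lambda name: ROWS[name][1])
```

Every row agrees with the product to within 0.001. The full feature set ("All") has the highest F, while the set without QIPI ("All-QIPI") has the highest accuracy but a much lower coverage.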

SLIDE 28
  • IV. Evaluation
  • --Performance of SPR

Comparisons of SPR with several popular interface prediction programs on CAPRI25 dataset

Method        ACC    COV
SPR           0.34   0.40
Cons-PPISP    0.26   0.30
Meta-PPISP    0.28   0.39
Promate       0.26   0.30
PINUP         0.25   0.43

Comparisons of SPR with several popular interface prediction programs on Enz35 dataset

Method        ACC    COV
SPR           0.36   0.58
Cons-PPISP    0.36   0.50
Meta-PPISP    0.48   0.55
Promate       0.40   0.45
PINUP         0.47   0.53

Note: Bold values denote the best performance in each category.

SLIDE 29

V. Summary

  • A large-scale comprehensive interface dataset, Astral2.05-40-4506, for analysis
  • A novel quantitative residue interface propensity index (QIPI)
  • An interface prediction method, Single domain based Patch Recognition (SPR)

SLIDE 30

Acknowledgements

  • Shanghai Sailing Program (16YF1408600)
  • Shanghai Center for Bioinformation Technology
    – Prof. Yuan-Yuan Li, Prof. Yi-Xue Li and Liangxiao Ma
  • Suzhou Institute of Systems Medicine
    – Prof. Aiping Wu and Prof. Taijiao Jiang