A novel index of protein-protein interface propensity improves interface residue recognition
Wentao Dai
Email: wtdai AT scbit.org
Shanghai Center for Bioinformation Technology (SCBIT)
Outline
- I. Background and Motivation
- II. Protein-Protein Interface Datasets
– Astral2.05-40-4506
- III. Characteristics of Interface
– QIPI : Quantitative protein-protein Interface Propensity Index
- IV. Evaluation
– SPR : Single domain based Patch Recognition
- V. Summary
I. Background and Motivation
- protein-protein interaction
- protein-protein interface properties
- protein-protein interface prediction (residue recognition)
II. Protein-Protein Interface Datasets
- Comprehensive interface dataset
– Training
– Astral2.05-40-4506
- Testing interface dataset
– Docking Benchmark 2.0
– CAPRI25 and Enz35
II. Protein-Protein Interface Datasets
- SCOPe : Structural Classification of Proteins, extended database (v2.05)
- Astral2.05-40 : a subset of SCOPe2.05 with less than 40% identity between any two domains
- Astral2.05-40-4506 : 4506 interfaces obtained from the Astral2.05-40 dataset
- III. Characteristics of Interface
- Relative Interface Ratio (RIR) and Contact Preferences
- Residue Composition and QIPI
- Secondary Structure
- Contact preference
- Interface Size
- III. Characteristics of Interface
- --Relative Interface Ratio (RIR)
$$ w_i = \frac{f_i}{\sum_m f_m}, \qquad W_i = \frac{F_i}{\sum_m F_m}, \qquad \mathrm{RIR}_i = \frac{w_i}{W_i} $$
$f_i$ : number of interface residues of type i
$F_i$ : number of non-interface surface residues of type i
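The RIR definition can be tried out on toy data. The following is a minimal Python sketch, assuming residues are given as one-letter codes; the function name `relative_interface_ratio` and the toy counts are invented for illustration and are not taken from the presentation.

```python
from collections import Counter

def relative_interface_ratio(interface_residues, surface_residues):
    """Return {residue_type: RIR} from two iterables of one-letter codes.

    interface_residues: residues observed at interfaces (counts f_i)
    surface_residues:   non-interface surface residues (counts F_i)
    """
    f = Counter(interface_residues)
    F = Counter(surface_residues)
    f_total = sum(f.values())
    F_total = sum(F.values())
    rir = {}
    for res in f:
        if F[res] == 0:
            continue  # ratio is undefined if the type never occurs off-interface
        w = f[res] / f_total   # w_i: frequency at the interface
        W = F[res] / F_total   # W_i: frequency on the non-interface surface
        rir[res] = w / W
    return rir

# Toy input, invented for illustration only.
toy = relative_interface_ratio("FFFK", "FKKK")
```

In this toy case Phe makes up 3/4 of the interface but only 1/4 of the remaining surface, so its RIR is above 1, while Lys falls below 1.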
- III. Characteristics of Interface
- --Contact Preferences
$$ \mathrm{ContactFreq}_{ij} = \frac{C_{ij}}{\sum_{m,n} C_{mn}} $$
$$ \mathrm{ContactPref}_{ij} = \log_2 \left( \frac{C_{ij} / \sum_{m,n} C_{mn}}{w_i \times w_j} \right) $$
$C_{ij}$ : number of interface-crossing contacts between residues of types i and j
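A small Python sketch of the contact preference calculation follows. It assumes the formula is read as the base-2 logarithm of the contact frequency divided by the product of the interface residue frequencies w_i and w_j; the function name and the toy numbers are hypothetical.

```python
import math

def contact_preference(contacts, w):
    """contacts: {(i, j): C_ij}, counts of interface-crossing contacts.
    w: {i: w_i}, interface residue frequencies.
    Returns {(i, j): log2(ContactFreq_ij / (w_i * w_j))}.
    """
    total = sum(contacts.values())
    prefs = {}
    for (i, j), count in contacts.items():
        if count == 0:
            continue  # the logarithm of zero is undefined
        freq = count / total                      # ContactFreq_ij
        prefs[(i, j)] = math.log2(freq / (w[i] * w[j]))
    return prefs

# Toy input, invented for illustration only.
toy_contacts = {("R", "D"): 6, ("R", "R"): 2}
toy_w = {"R": 0.5, "D": 0.5}
toy_prefs = contact_preference(toy_contacts, toy_w)
```

A positive value means the pair is seen across interfaces more often than the residue frequencies alone would suggest; a value of 0 means no preference.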
- III. Characteristics of Interface
- --QIPI
[Figure: Frequency (Interface, Non-Inter) and Ratio (RIR) for the 20 residue types H R K A V I L M P F W Y G C S T N Q D E, grouped as Basic, Hydrophobic, Polar, Acidic and Aromatic]
- III. Characteristics of Interface
- --QIPI
- Interface preference:
– hydrophobic residues
– aromatic residues
– residues with long side chain
- High interface propensity
– Arg, Phe, Met, Trp and Tyr

Quantitative residue interface propensity index
Residue  H      R      K      A      V      I      L      M      P      F
QIPI     1.147  1.346  0.784  0.841  0.994  1.084  1.144  1.451  1.109  1.334
Residue  W      Y      G      C      S      T      N      Q      D      E
QIPI     1.284  1.368  0.823  1.172  0.873  0.966  0.958  0.909  0.830  0.805
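For use in scripts, the index can be stored as a lookup table. The Python sketch below copies the 20 values from the slide; the helper functions `mean_qipi` and `favoured` are hypothetical conveniences, not part of the presented method.

```python
# QIPI values as listed on the slide, keyed by one-letter residue code.
QIPI = {
    "H": 1.147, "R": 1.346, "K": 0.784, "A": 0.841, "V": 0.994,
    "I": 1.084, "L": 1.144, "M": 1.451, "P": 1.109, "F": 1.334,
    "W": 1.284, "Y": 1.368, "G": 0.823, "C": 1.172, "S": 0.873,
    "T": 0.966, "N": 0.958, "Q": 0.909, "D": 0.830, "E": 0.805,
}

def mean_qipi(residues):
    """Average QIPI over a collection of residues (hypothetical helper)."""
    values = [QIPI[r] for r in residues]
    return sum(values) / len(values)

def favoured(threshold=1.0):
    """Residue types whose QIPI exceeds the threshold, highest first."""
    hits = [r for r, v in QIPI.items() if v > threshold]
    return sorted(hits, key=QIPI.get, reverse=True)
```

Calling `favoured(1.2)` returns Met, Tyr, Arg, Phe and Trp, the same five residues listed under high interface propensity.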
- III. Characteristics of Interface
- --Secondary structure
[Figure: Interface, Non-Inter and RIR values for the secondary structure states H, E and C]
- III. Characteristics of Interface
- --Secondary structure
[Figure: Frequency (Interface, Non-Inter) and Ratio (RIR) for each combination of residue type and secondary structure state (H, E, C), for example A-H, A-C, R-E]
- III. Characteristics of Interface
- --Secondary structure
- strand (E) residues : negative interface propensity
- coil (C) : positive interface propensity
- Residue type : the principal factor of interface propensity
- III. Characteristics of Interface
- --Contact preference
High preferences:
– Cys-Cys contacts
– hydrophobic contacts (A-W)
– aromatic contacts (P-Y : Phe-Cys, Phe-Phe, Phe-Trp, Phe-Tyr, Trp-Tyr, Tyr-His, Tyr-Lys and Tyr-Met)
– contacts between oppositely charged residues (Arg-Asp, Arg-Glu)
Arg, Phe, Trp and Tyr have the highest interface propensity.
The RIR of these residues is greater than 1.2, and a number of the contacts that include these residues have a high contact preference (more than 1.5, shown in pink).
- III. Characteristics of Interface
- --Interface Size
Fig. A : The average interface size is about 800 Å², and about 86% of interfaces have a size in the range of 0-2000 Å².
Fig. B : The number of interface residues follows a gamma distribution, with an average of about 20.
Fig. C : The average domain size is about 9000 Å², which is much larger than the average interface size.
- IV. Evaluation
- --Interface residue recognition
- Identification of surface residues
- Generation of residue side-chain distance matrix
- Construction of candidate interface patches
- Merging the candidate interface patches
- Selecting the top-ranked candidate interface patch
- IV. Evaluation
- --SPR : Single domain based Patch Recognition

Table 1 Patch Generation Thresholds

A. ASA of a patch residue and its distance from the seed residue
Distance (Å)    ASA (> Å²)
(2,5)
(5,7)           20
(7,9)           40
(9,11)          60
(11,13)         80
(13,15)         100

B. Thresholds for patch merging
Domain ASA (Å²)    Identity Ratio
(0,5000)           0.8
(5000,7500)        0.7
(7500,10000)       0.6
(10000,+∞)         0.5
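The merging thresholds in part B of Table 1 can be expressed as a lookup. The Python sketch below rests on two assumptions that the slide does not state: the bins are treated as half-open (lower bound included), and the identity ratio of two patches is taken to be the number of shared residues divided by the size of the smaller patch. All function names are hypothetical.

```python
import math

# (lower bound, upper bound, identity ratio) rows from Table 1, part B.
MERGE_THRESHOLDS = [
    (0, 5000, 0.8),
    (5000, 7500, 0.7),
    (7500, 10000, 0.6),
    (10000, math.inf, 0.5),
]

def merge_threshold(domain_asa):
    """Identity ratio required for merging, given the domain ASA in Å²."""
    for low, high, ratio in MERGE_THRESHOLDS:
        if low <= domain_asa < high:  # half-open bins: an assumption
            return ratio
    raise ValueError("domain ASA must be non-negative")

def identity_ratio(patch_a, patch_b):
    """Assumed definition: shared residues / size of the smaller patch.

    Both patches must be non-empty.
    """
    a, b = set(patch_a), set(patch_b)
    return len(a & b) / min(len(a), len(b))

def should_merge(patch_a, patch_b, domain_asa):
    """True if two candidate patches overlap enough to be merged."""
    return identity_ratio(patch_a, patch_b) >= merge_threshold(domain_asa)
```

Under these assumptions, two four-residue patches sharing three residues (ratio 0.75) would be merged on a domain of 8000 Å² but not on a domain of 4000 Å².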
$$ E_{Patch} = E_{res} + w_1 E_{hydro} + w_2 E_{cons} + w_3 E_{sol} $$
$$ E_{res} = \sum_{i \in patch,\, r} ASA_i \, (RIR_r - REF_r) $$
$$ E_{hydro} = \sum_{i \in patch,\, r} H_i $$
$$ E_{cons} = \sum_{i \in patch,\, r} (C_{ir} - B_{rr}) $$
$$ E_{sol} = \sum_{i \in patch} \left( \frac{V_{i,out}}{V_{i,sphere} - V_{i,out}} \right) $$

- $H_i$ is the hydrophobic score in the CASG920101 matrix of AAindex for the residue type r at sequence position i
- $ASA_i$ is the relative accessible surface area of residue r at sequence position i
- The $RIR_r$ values for the 20 amino acid residues are obtained from QIPI
- $REF_r$ is the element of JANJ780101 in AAindex for residue type r
- $C_{ir}$ is the self-substitution score in the position-specific substitution matrix produced by PSI-BLAST for the residue type r at sequence position i
- $B_{rr}$ is the diagonal element of BLOSUM62 for residue type r
- Cyscore: $V_{i,sphere}$ is defined as the sphere volume in the solvent accessible surface; $V_{i,out}$ represents the volume outside the solvent accessible surface on residue i in the patch
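The patch score can be sketched in Python from the term definitions above. Several things here are assumptions rather than facts from the slides: the residue term is read as ASA_i multiplied by (RIR_r - REF_r), the conservation term as C_ir - B_rr, the solvation contribution of each residue is supplied precomputed instead of being derived from volumes, and the weights default to 1.0 only as placeholders because the slides give no values. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PatchResidue:
    asa: float          # ASA_i, relative accessible surface area
    rir: float          # RIR_r, taken from QIPI
    ref: float          # REF_r, from JANJ780101
    hydro: float        # H_i, hydrophobic score
    pssm_self: float    # C_ir, PSI-BLAST self-substitution score
    blosum_diag: float  # B_rr, BLOSUM62 diagonal element
    solvation: float    # precomputed per-residue solvation contribution

def patch_energy(residues, w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the four patch terms (weights are placeholders)."""
    e_res = sum(r.asa * (r.rir - r.ref) for r in residues)
    e_hydro = sum(r.hydro for r in residues)
    e_cons = sum(r.pssm_self - r.blosum_diag for r in residues)
    e_sol = sum(r.solvation for r in residues)
    return e_res + w1 * e_hydro + w2 * e_cons + w3 * e_sol
```

Setting all three weights to zero isolates the residue term, which is one way to look at how much a single feature contributes to the score.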
- IV. Evaluation
- --SPR : Single domain based Patch Recognition
$$ F = COV \times ACC $$
TP : true positive (a real interface residue that is predicted as interface)
FP : false positive (a non-interface residue that is predicted as interface)
FN : false negative (a real interface residue that is not predicted as interface)
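The evaluation measures can be computed from sets of residue identifiers, as in the Python sketch below. The slide text gives F as the product of coverage and accuracy; the expressions ACC = TP / (TP + FP) and COV = TP / (TP + FN) are the usual definitions and are assumed here, since they do not appear in the extracted text. The function name `evaluate` is hypothetical.

```python
def evaluate(predicted, actual):
    """Compare predicted and real interface residues (any hashable IDs)."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)   # real interface residues, predicted
    fp = len(predicted - actual)   # predicted, but not interface
    fn = len(actual - predicted)   # interface, but not predicted
    acc = tp / (tp + fp) if tp + fp else 0.0   # assumed: ACC = TP / (TP + FP)
    cov = tp / (tp + fn) if tp + fn else 0.0   # assumed: COV = TP / (TP + FN)
    return {"ACC": acc, "COV": cov, "F": acc * cov}

# Toy input, invented for illustration only.
toy_scores = evaluate({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
```

Because F is a product, it is high only when both coverage and accuracy are reasonably high.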
- IV. Evaluation
- --Contribution of interface features to interface residue recognition

Feature            Coverage  Accuracy  F
QIPI               0.472     0.188     0.089
Hydrophobic        0.321     0.238     0.076
Conservation       0.266     0.191     0.051
Solvation          0.147     0.160     0.023
QIPI+Hydrophobic   0.467     0.186     0.087
All-QIPI           0.312     0.239*    0.075
All                0.475*    0.194     0.092*
Note: * denotes the best performance in each category.
- IV. Evaluation
- --Performance of SPR
Comparison of SPR with several popular interface prediction programs on the CAPRI25 dataset
Program      ACC    COV
SPR          0.34*  0.4
Cons-PPISP   0.26   0.3
Meta-PPISP   0.28   0.39
Promate      0.26   0.3
PINUP        0.25   0.43*

Comparison of SPR with several popular interface prediction programs on the Enz35 dataset
Program      ACC    COV
SPR          0.36   0.58*
Cons-PPISP   0.36   0.5
Meta-PPISP   0.48*  0.55
Promate      0.4    0.45
PINUP        0.47   0.53
Note: * denotes the best performance in each category.
V. Summary
- A large-scale comprehensive interface dataset, Astral2.05-40-4506, for analysis
- A novel quantitative residue interface propensity index (QIPI)
- An interface prediction method, Single domain based Patch Recognition (SPR)
Acknowledgements
- Shanghai Sailing Program (16YF1408600)
- Shanghai Center for Bioinformation Technology
– Prof. Yuan-Yuan Li, Prof. Yi-Xue Li and Liangxiao Ma
- Suzhou Institute of Systems Medicine