Protein Interaction Prediction: The PIPE and InSiPS Projects
Frank Dehne ■ www.dehne.net
School of Computer Science Centre For Advanced Studies Canada
Parallel Computational Biochemistry
Protein-Protein Interactions
– F.Dehne
– A.Golshani
– A.Wong
– J.Greenblatt (Toronto)
– J.Green
– S.Pitre, C.North, A.Amos-Binks, A.Schoenrock, ...
– B.Samanfar, M.Hooshyar, M.Alamgir, K.Omidi, D.Burnside, ...
Primary Sequence: V H L T P E E K ...
3D Structure:
Partial Arabidopsis PPI Network
Do YGL227W and YMR135C interact?
species        # proteins   # protein pairs   # known interactions   # unknown interactions
S. cerevisiae       6,300        19,867,056                 15,151   ???
                   23,684       280,454,086                  6,607   ???
H. sapiens         22,513       253,406,328                 41,678   ???
– No 3D structure information needed. (PDB is small)
– Can be applied to all proteins, even those without known 3D structure.
– Can be applied to all genomes, even newly sequenced ones.
String comparison
Match = (Sum of pairwise PAM values > Threshold)
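The match rule above can be sketched as follows. This is a minimal illustration, not the PIPE implementation: the function names are hypothetical, and `toy_score` is a placeholder substitution score that does NOT contain real PAM matrix values. In practice a real PAM matrix lookup would be passed in as `score`.

```python
from typing import Callable


def toy_score(x: str, y: str) -> int:
    """Placeholder substitution score (assumption, NOT real PAM values)."""
    return 2 if x == y else -1


def windows_match(w1: str, w2: str, threshold: int,
                  score: Callable[[str, str], int] = toy_score) -> bool:
    """Match = (sum of pairwise scores of two equal-length windows > threshold)."""
    if len(w1) != len(w2):
        raise ValueError("windows must have equal length")
    total = sum(score(x, y) for x, y in zip(w1, w2))
    return total > threshold


# Compare the example window V H L T P E E K against three other windows.
print(windows_match("VHLTPEEK", "VHLTPEEK", threshold=10))  # identical
print(windows_match("VHLTPEEK", "VHLSPEEK", threshold=10))  # one substitution
print(windows_match("VHLTPEEK", "KEEPTLHV", threshold=10))  # reversed
```

Because the score is summed position by position, a window can still match after a few substitutions, which is what makes this an approximate rather than an exact string comparison.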
Positive
Negative
Yeast: YGL227W - YMR135C
Banting and Best Institute of Medical Research, Toronto
Yeast: YGL227W - YMR135C Experimental Verification
Protein complex: YGL227W, YMR135C, YIL017C, YDL176W, YIL097W, YDR255C, YBR105C
– Requires innovative data structures for approx. string matching (Hamming distance via PAM matrix).
– Requires high performance computing.
– Requires very high specificity, ~99.95% (i.e. a false positive rate below 0.05%).
– Otherwise: #false positives > #true positives
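The specificity requirement follows from simple arithmetic on the table above. The sketch below is a rough estimate under one stated assumption: nearly all protein pairs are non-interacting, so the number of negatives is approximated by the total number of pairs. The function names are illustrative. The true number of interactions is unknown ("???"), so the known-interaction counts serve only as a sense of scale.

```python
def n_pairs(n_proteins: int) -> int:
    """Number of unordered protein pairs, n * (n - 1) / 2."""
    return n_proteins * (n_proteins - 1) // 2


def expected_false_positives(pairs: int, false_positive_rate: float) -> float:
    """Expected false positives, assuming almost every pair is a true negative."""
    return pairs * false_positive_rate


# (# protein pairs, # known interactions), taken from the table above.
rows = [
    (19_867_056, 15_151),
    (280_454_086, 6_607),
    (253_406_328, 41_678),
]

for pairs, known in rows:
    for rate in (0.01, 0.0005):  # 1% versus the 0.05% bound quoted above
        fp = expected_false_positives(pairs, rate)
        print(f"pairs={pairs:>11,}  FPR={rate:.2%}  "
              f"expected FP={fp:>12,.0f}  known interactions={known:,}")

# The pair counts in the table are consistent with n * (n - 1) / 2.
print(n_pairs(23_684), n_pairs(22_513))
```

At a 0.05% false positive rate the estimate is roughly 9,934, 140,227, and 126,703 false positives for the three rows. At 1% it is twenty times larger, far beyond the number of known interactions in every row.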
[Figure labels: PIPE; Consensus (incl. PIPE); 2nd. Panels: Yeast, Human]
H.sapiens protein pairs
Architecture:
Cluster of multi-core processors
One MP-PIPE worker per proc.
Each worker with multiple threads
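The two-level structure above (workers, each running several threads) can be sketched as below. This is an illustration of the work partitioning only, not MP-PIPE itself: `toy_predict` is a hypothetical stand-in for the real predictor, and the worker chunks are processed one after another in a single process, whereas on a cluster each chunk would be handed to its own worker process.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def toy_predict(a: str, b: str) -> bool:
    """Hypothetical stand-in predictor: 'interact' if the lengths are equal."""
    return len(a) == len(b)


def scan_chunk(chunk, predict, n_threads):
    """One worker: evaluate its share of the pairs with a pool of threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        flags = list(pool.map(lambda pair: predict(*pair), chunk))
    return [pair for pair, hit in zip(chunk, flags) if hit]


def scan_all_pairs(proteins, predict=toy_predict, n_workers=2, n_threads=4):
    """Split all unordered protein pairs round-robin over the workers."""
    pairs = list(combinations(proteins, 2))
    chunks = [pairs[i::n_workers] for i in range(n_workers)]
    hits = []
    for chunk in chunks:  # assumption: run sequentially here, in parallel on a cluster
        hits.extend(scan_chunk(chunk, predict, n_threads))
    return sorted(hits)


print(scan_all_pairs(["AAA", "CCC", "DD", "EE", "F"]))
```

Since every pair can be scored independently of the others, the round-robin split needs no communication between workers until the results are gathered.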
species        # proteins   # protein pairs   # known interactions   # novel PIPE pred.*   running time (1,000 proc. cores)
S. cerevisiae       6,300        19,867,056                 15,151                14,438   1 hour
                   23,684       280,454,086                  6,607                32,548   1 week
H. sapiens         22,513       253,406,328                 41,678               130,470   3 months
* False positive rate: 0.0001
MP-PIPE's superior performance and prediction accuracy enabled the first-ever complete scan of entire protein interaction networks.
Input:
– a set of target proteins and
– a set of non-target proteins.
Output: a protein sequence that is
– predicted to interact with the target proteins and
– predicted not to interact with the non-targets.
[Figure labels: targets, non-targets]
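One way to turn the two goals above into a single score for a candidate sequence is sketched below. The slides do not give the actual InSiPS scoring function, so the formula here (weakest target score times one minus the strongest non-target score) is an assumption, and `toy_predict` is a hypothetical stand-in for an interaction predictor that returns a score between 0 and 1.

```python
from typing import Callable, Sequence


def toy_predict(a: str, b: str) -> float:
    """Hypothetical predictor: fraction of a's 2-mers that also occur in b."""
    kmers_a = {a[i:i + 2] for i in range(len(a) - 1)}
    kmers_b = {b[i:i + 2] for i in range(len(b) - 1)}
    return len(kmers_a & kmers_b) / len(kmers_a) if kmers_a else 0.0


def fitness(candidate: str,
            targets: Sequence[str],
            non_targets: Sequence[str],
            predict: Callable[[str, str], float] = toy_predict) -> float:
    """High when the candidate scores well on every target and on no non-target."""
    target_score = min(predict(candidate, t) for t in targets)
    off_target = max((predict(candidate, n) for n in non_targets), default=0.0)
    return target_score * (1.0 - off_target)  # assumed combination rule


# Pick the best candidate from a small population of sequences.
population = ["VHLT", "EEKK", "PEEK"]
best = max(population, key=lambda s: fitness(s, ["KVHLTP"], ["VHEE"]))
print(best, fitness(best, ["KVHLTP"], ["VHEE"]))
```

Taking the worst case on both sides means one strong non-target hit is enough to lower the score, which fits the "predicted not to interact with the non-targets" requirement.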
[Figure labels: target; pathway intercept; no side effects]
[Two plots, x-axis: #Nodes (16 cores per node)]
Population size: 1,500 sequences. 1 target. 250 non-targets.
“Good” Cases “Bad” Cases
blocking its function.
HHHHHHSDNEHLHKCQRLKTRWKMARQFSDPQHNMYWIINWAQAMNIHADQNQEEEEELHDASVNNAEQYMAQCAPE EACQYPVRRSYGLHATNCIERRKCCMIMYQHPTCRQWEAKNTCAISRAGKGVYWKGIIFMRAWKHWCTRRLVQ
Blue Gene/Q
UV Light
Deletion
Expression of Anti-Psk1 causes sensitivity to UV light.
Strains: WT; WT (empty vector); WT + Anti-PSK1 expressed; PSK1 knockout.
Equal numbers of cells serially diluted and exposed to 30 s of UV light.
[Figure axis: decreasing cell density]
With Alex Blais, Ottawa General Hospital
Healthy donor Dystrophic patient
Stem Cell Therapy
Problem: Immediate fusion of satellite cells
Research Questions:
interaction between Six1 and Eya?
factors it interacts with directly? Can we disrupt their interaction too?