Supervised Convolutional GSN for Protein Secondary Structure Prediction
Jian Zhou Olga Troyanskaya
Princeton University
What's in this talk
Problem: Predict protein secondary structure
Iterative prediction with multi-layer
– Supervised GSN
– Convolutional architecture for GSN
– A trick for improving convergence and performance
Previous approaches: neural networks, from 1988 (Qian & Sejnowski); bidirectional recurrent neural networks (Baldi et al., 1999); conditional neural fields (Peng et al., 2009); many more…
MDLSALRVEEVQNVINAMQKILECP ICLELIKEPVSTKCDHIFCKFCMLKL LNQKKGPSQCPLCKNDITKRSLQE STRFSQLVEELLKIICAFQLDTGLEY ANSYNFAKKGK
Protein sequence
CCGGGSSHHHHHHHHHHHHHHTS CSSSCCCCSSCCBCTTSCCCCSH HHHHHHHSSSSSCCCTTTSCCCC TTTCBCCCSSSHHHHHHHHHHHH HHHHTCCCCCC
Secondary structure
Image credit: Wikimedia Commons
[Figure: Protein sequence (20 types of amino acids) → Predict → Secondary structure label sequence (8 classes). Other labels in the figure: evolutionary neighborhood, 3D structure, connections.]
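As a rough illustration of the input and output of this prediction task, the sketch below one-hot encodes a short stretch of the example sequence and its labels. The two alphabets are assumptions, not taken from the slides: the 20 standard amino acid letters, and the 8 DSSP states with "C" used for coil as in the example label string.

```python
# Assumed alphabets (not given in the slides): the 20 standard amino acids and
# the 8 DSSP states, with "C" standing for coil as in the example labels.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SS8_CLASSES = "HGIEBTSC"


def one_hot(symbols, alphabet):
    """Return one row per position, with a single 1.0 in the symbol's column."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in symbols:
        row = [0.0] * len(alphabet)
        row[index[ch]] = 1.0
        rows.append(row)
    return rows


sequence = "MDLSALRVEE"  # first 10 residues of the example sequence
labels = "CCGGGSSHHH"    # first 10 labels of the example structure

x = one_hot(sequence, AMINO_ACIDS)  # 10 positions x 20 channels
y = one_hot(labels, SS8_CLASSES)    # 10 positions x 8 channels
print(len(x), len(x[0]), len(y), len(y[0]))
```

Each position becomes one row, so a protein of length n is an n x 20 input and an n x 8 target.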
$H_{t+1} \sim P_{\theta_1}(H \mid H_t, X_t)$
$X_{t+1} \sim P_{\theta_2}(X \mid H_{t+1})$
[Diagram: Markov chain over $H_0, H_1, H_2, H_3$ and $X_0, X_1, X_2$]
Learning the transition operators of a Markov chain whose stationary distribution estimates the data distribution $P(X)$.
Learning $P(X \mid H)$ can be much easier than learning $P(X)$, by design. Trainable using back-propagation.
Bengio, Y., Thibodeau-Laufer, É., Alain, G., and Yosinski, J. Deep Generative Stochastic Networks Trainable by Backprop
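The alternating sampling scheme of the chain can be sketched as follows. This is only a toy: `run_chain`, `toy_sample_h`, and `toy_sample_x` are hypothetical names, and the hand-written operators stand in for the neural networks that a GSN would learn by back-propagation.

```python
import random


def run_chain(x0, h0, sample_h, sample_x, steps, rng):
    """Alternate H_{t+1} ~ P(H | H_t, X_t) and X_{t+1} ~ P(X | H_{t+1})."""
    h, x = h0, x0
    samples = []
    for _ in range(steps):
        h = sample_h(h, x, rng)
        x = sample_x(h, rng)
        samples.append(x)
    return samples


# Toy transition operators (hypothetical, hand-written rather than learned).
def toy_sample_h(h, x, rng):
    # Latent state: noisy blend of the previous latent state and the sample.
    return 0.5 * h + 0.5 * x + rng.gauss(0.0, 0.1)


def toy_sample_x(h, rng):
    # Binary sample that is more likely to be 1 when h is large.
    p = min(max(h, 0.05), 0.95)
    return 1.0 if rng.random() < p else 0.0


rng = random.Random(0)
samples = run_chain(1.0, 0.0, toy_sample_h, toy_sample_x, steps=5, rng=rng)
print(samples)
```

The point of the structure is that each step only needs the previous state of the chain, so the two conditional distributions can be simple even when the distribution over X is not.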
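Supervised GSN
GSN:
$H_{t+1} \sim P_{\theta_1}(H \mid H_t, X_t)$
$X_{t+1} \sim P_{\theta_2}(X \mid H_{t+1})$
Supervised GSN:
$H_{t+1} \sim P_{\theta_1}(H \mid H_t, Y_t, X_0)$
$Y_{t+1} \sim P_{\theta_2}(Y \mid H_{t+1})$
[Diagram: GSN chain over $H_0, H_1, H_2, H_3$ and $X_0, X_1, X_2$; supervised GSN chain over $H_0, H_1, H_2, H_3$ and $Y, Y_1, Y_2$, with $X_0$ as a fixed input]
Learning $P(Y \mid H)$ can be much easier than learning $P(Y \mid X)$, utilizing the previous state of the chain.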
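A minimal sketch of the supervised chain, in which the input X_0 stays fixed and only the hidden state and the label sequence are resampled. All function names and the two-class toy operators are hypothetical placeholders for the learned networks.

```python
import random


def run_supervised_chain(x0, y0, h0, sample_h, sample_y, steps, rng):
    """Alternate H_{t+1} ~ P(H | H_t, Y_t, X_0) and Y_{t+1} ~ P(Y | H_{t+1})."""
    h, y = h0, y0
    predictions = []
    for _ in range(steps):
        h = sample_h(h, y, x0, rng)  # X_0 stays fixed at every step
        y = sample_y(h, rng)
        predictions.append(y)
    return predictions


# Toy stand-ins for the learned operators (hypothetical, two classes only).
def toy_sample_h(h, y, x0, rng):
    return [0.4 * hi + 0.2 * yi + 0.4 * xi + rng.gauss(0.0, 0.05)
            for hi, yi, xi in zip(h, y, x0)]


def toy_sample_y(h, rng):
    out = []
    for hi in h:
        p = min(max(hi, 0.0), 1.0)  # clip to a valid probability
        out.append(1 if rng.random() < p else 0)
    return out


rng = random.Random(0)
x0 = [0.9, 0.8, 0.1, 0.2]   # per-position input feature (toy)
y0 = [0, 0, 0, 0]           # initial label sequence
h0 = [0.0, 0.0, 0.0, 0.0]   # initial hidden state
predictions = run_supervised_chain(x0, y0, h0, toy_sample_h, toy_sample_y, 8, rng)
print(predictions[-1])
```

Each new label sequence depends on the previous one through the hidden state, which is how the chain can refine its prediction over iterations.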
Maximize log-likelihoods
[Diagram: supervised GSN chain ($H_1, H_2, H_3$; $Y, Y_1, Y_2$; input $X_0$), annotated with the true $P(Y \mid X_0)$ and the chain's $P_{\theta}(Y \mid H_1)$, $P_{\theta}(Y \mid H_2)$]
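One plausible reading of "maximize log-likelihoods" is that the log-probability of the true labels under each step's output distribution is summed over chain steps and positions. The sketch below computes that quantity; the summation over steps and the function name `chain_log_likelihood` are assumptions, since the slide does not spell out the objective.

```python
import math


def chain_log_likelihood(step_probs, true_labels):
    """Sum of log P_theta(Y = true label | H_t) over chain steps and positions.

    step_probs[t][i][c] is the predicted probability of class c at position i
    after chain step t; true_labels[i] is the true class index at position i.
    """
    total = 0.0
    for probs in step_probs:                         # one entry per chain step
        for dist, label in zip(probs, true_labels):  # one distribution per position
            total += math.log(dist[label])
    return total


# Two chain steps, two positions, two classes (toy numbers).
step_probs = [
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.8, 0.2], [0.1, 0.9]],
]
true_labels = [0, 1]
total = chain_log_likelihood(step_probs, true_labels)
print(total)
```

Training would adjust the parameters to increase this value; the second step in the toy data contributes more because its distributions put more mass on the true labels.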
Multi-scale representation – multi-layer convolutional architecture
Local information sensitive – output unit at bottom layer
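[Diagram: convolutional GSN architecture. Input $X$ and output $Y$ connect to the bottom layer $H_0$; higher layers $H_1, H_2, H_3$ are linked by weights $W_1, W_1', W_2, W_2'$ through Conv, Pool (mean pooling), and Conv operations with tanh units; successive predictions $Y_1, Y_2, Y_3$]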
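The building blocks named in the diagram (convolution along the sequence, tanh, mean pooling) can be sketched in plain Python as below. Kernel width, zero padding, and pooling size are assumptions for illustration; the slides do not give them.

```python
import math


def conv1d_tanh(x, kernel, bias):
    """Zero-padded 1D convolution along positions, followed by tanh.

    x: list of positions, each a list of input channels.
    kernel: kernel[out_channel][offset][in_channel], with an odd width.
    As in most deep learning code, this is a cross-correlation (no kernel flip).
    """
    width = len(kernel[0])
    half = width // 2
    n_in = len(x[0])
    out = []
    for pos in range(len(x)):
        row = []
        for k, b in zip(kernel, bias):
            s = b
            for off in range(width):
                src = pos + off - half
                if 0 <= src < len(x):  # positions outside the sequence count as zero
                    for c in range(n_in):
                        s += k[off][c] * x[src][c]
            row.append(math.tanh(s))
        out.append(row)
    return out


def mean_pool(x, size):
    """Non-overlapping mean pooling along the position axis."""
    pooled = []
    for start in range(0, len(x), size):
        window = x[start:start + size]
        pooled.append([sum(col) / len(window) for col in zip(*window)])
    return pooled


x = [[1.0], [2.0], [3.0], [4.0]]    # 4 positions, 1 channel
kernel = [[[0.0], [1.0], [0.0]]]    # 1 output channel, width 3, picks the center
h = conv1d_tanh(x, kernel, [0.0])   # 4 positions, 1 channel
p = mean_pool(h, 2)                 # 2 positions, 1 channel
print(len(h), len(p))
```

Pooling halves the number of positions here, so a convolution applied on top of `p` sees a wider stretch of the sequence than one applied to `x`, which is the multi-scale idea.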
Experiments on initialization of chain during training
Initialize the chain at a specified test initialization value for a subset of training batches:
[Plots: accuracy vs. # of iterations, with curves for 0%, 20%, 50%, 80%, and 100%; panels labeled $Y_0^{true}$ and $Y_0^{test}$]
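The initialization trick can be sketched as a per-batch choice of the chain's starting labels. This assumes that $Y_0^{true}$ means starting from the true labels and $Y_0^{test}$ means the fixed value used at test time, and that the percentages are the fraction of batches using the latter; `choose_chain_init` and its parameters are hypothetical names.

```python
import random


def choose_chain_init(true_labels, test_init, test_init_fraction, rng):
    """Pick the chain's starting labels Y_0 for one training batch.

    With probability test_init_fraction, start from the value that will be
    used at test time; otherwise start from the true labels.
    """
    if rng.random() < test_init_fraction:
        return list(test_init)
    return list(true_labels)


rng = random.Random(0)
true_labels = [3, 3, 1, 0]
test_init = [0, 0, 0, 0]  # assumed fixed starting value at test time

# Count how often the test-time value is chosen with a 50% fraction.
picks = [choose_chain_init(true_labels, test_init, 0.5, rng) for _ in range(1000)]
n_test = sum(p == test_init for p in picks)
print(n_test)
```

Mixing in the test-time starting point during training exposes the model to the chain states it will actually visit at prediction time.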
CullPDB-30 test set
Model       Overall accuracy (8-class)
1 layer     0.714 ± 0.006
2 layers    0.720 ± 0.006
3 layers    0.721 ± 0.006

CB513 dataset
Method          Overall accuracy (8-class)
RaptorSS8/CNF   0.649 ± 0.003
Our method      0.664 ± 0.005
Cull PDB dataset (6133 proteins with <30% identity between any pair of proteins); available at www.princeton.edu/~jzthree/datasets
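For reference, 8-class accuracy on a single protein can be computed as the fraction of positions whose predicted label matches the true label. How the reported overall figures aggregate across proteins is not stated in the slides, so this sketch covers only the per-sequence case.

```python
def per_residue_accuracy(predicted, true):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(true):
        raise ValueError("label sequences must have the same length")
    correct = sum(p == t for p, t in zip(predicted, true))
    return correct / len(true)


# Toy example: 7 of 8 positions agree.
acc = per_residue_accuracy("CCHHHHEE", "CCHHHTEE")
print(acc)  # 0.875
```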
Single protein prediction example
Performance through averaging iterative predictions:
[Figure: predictions $Y_1$, $Y_2$, $Y_4$, $Y_8$, $Y_{16}$, $Y_{32}$ shown alongside the label]
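Averaging iterative predictions can be sketched as taking the mean of the per-position class distributions over chain steps and then picking the most probable class. Averaging probabilities (rather than, say, majority voting over sampled labels) is an assumption here, and `average_predictions` is a hypothetical name.

```python
def average_predictions(step_probs):
    """Average class distributions over chain steps, then take the argmax.

    step_probs[t][i][c] is the probability of class c at position i at step t.
    Returns (averaged distributions, predicted class index per position).
    """
    n_steps = len(step_probs)
    n_pos = len(step_probs[0])
    n_cls = len(step_probs[0][0])
    avg = [[sum(step[i][c] for step in step_probs) / n_steps
            for c in range(n_cls)]
           for i in range(n_pos)]
    labels = [max(range(n_cls), key=lambda c: row[c]) for row in avg]
    return avg, labels


# Three chain steps, two positions, two classes (toy numbers).
step_probs = [
    [[0.6, 0.4], [0.3, 0.7]],
    [[0.2, 0.8], [0.4, 0.6]],
    [[0.1, 0.9], [0.2, 0.8]],
]
avg, labels = average_predictions(step_probs)
print(labels)  # [1, 1]
```

In the toy data the first step alone would predict class 0 at the first position, while the average over three steps predicts class 1.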
– Stochastic iterative prediction through a Markov chain
– The initialization trick improves both performance and convergence rate empirically
– Combines high-level representation and local prediction
– Improved over the previous best performance
[Figure: weights plotted by Position and Channel: $W_{X \to H_0}$ (amino acids), $W_{Y \to H_0}$ (secondary structure), $W_{H_0 \to Y}$ (secondary structure)]