Non-Homogeneous Hidden Markov Model
Qingyuan Liu
Introduction (Why Homogeneous HMM)
- Classify new sequences into a new family
- Add related sequences to an MSA
- Compute an MSA for groups of related sequences
Introduction (Building an HMM)
- Seed sequences for HMM building
- Ultra-large multiple sequence alignment using Phylogeny-aware Profiles (UPP)
- Parameters
  – Emission probability
  – Transition probability
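As a rough illustration of where emission probabilities can come from, the sketch below estimates them for one column of a seed alignment by counting residues and adding a pseudocount. The function name `column_emissions`, the DNA alphabet, the pseudocount of 1, and the example column are all assumptions made for this illustration; this is not the procedure used by UPP or any specific tool.

```python
from collections import Counter
from typing import Dict


def column_emissions(column: str, alphabet: str = "ACGT",
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """Estimate emission probabilities for one match state from a
    single alignment column. Gap characters ('-') are ignored."""
    counts = Counter(ch for ch in column if ch in alphabet)
    total = sum(counts.values()) + pseudocount * len(alphabet)
    return {ch: (counts[ch] + pseudocount) / total for ch in alphabet}


# One column from a hypothetical seed alignment of five sequences.
emissions = column_emissions("AAAC-")
```

The pseudocount keeps residues that never appear in the seed column from getting probability zero, which would otherwise rule out any new sequence carrying them.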
Background (Long Indels)
- A standard HMM cannot handle long indels well.
- Example: a loss of 10 consecutive residues
- Assume each deletion transition has probability 0.5
- 0.5^10 (about 0.001) is extremely small
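The arithmetic in this example can be checked directly. The sketch below assumes, as the slide does, that every deletion transition is taken with the same fixed probability; the helper name `run_probability` is hypothetical.

```python
import math


def run_probability(p_delete: float, run_length: int) -> float:
    """Probability of `run_length` consecutive deletion transitions,
    each taken with the same fixed probability `p_delete`."""
    return p_delete ** run_length


p = run_probability(0.5, 10)   # 1/1024, roughly 0.001
log2_p = math.log2(p)          # the same quantity in bits
```

In log space the penalty is easy to read: at probability 0.5, each additional deleted residue costs one more bit, so the cost grows linearly with the length of the indel.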
Significance
- Cause: Emission probability is fixed
- Use a non-homogeneous HMM instead
  – Emission probability is not fixed
  – Different parameters for different cases
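One way to picture the difference is a forward algorithm whose parameters are looked up per position instead of being shared across the whole sequence. The sketch below is a minimal illustration under that assumption, not the method proposed in this project; the callables `trans_at` and `emit_at`, the two-state toy model, and all numbers are hypothetical.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def forward_likelihood(
    init: Sequence[float],
    trans_at: Callable[[int], Matrix],
    emit_at: Callable[[int], Matrix],
    obs: Sequence[int],
) -> float:
    """Forward algorithm with position-dependent parameters.

    trans_at(t) is the transition matrix for the step t -> t+1 and
    emit_at(t) is the emission matrix at position t. A homogeneous
    HMM is the special case where both return the same matrix for
    every t.
    """
    n = len(init)
    e0 = emit_at(0)
    alpha = [init[i] * e0[i][obs[0]] for i in range(n)]
    for t in range(1, len(obs)):
        a = trans_at(t - 1)
        e = emit_at(t)
        alpha = [
            sum(alpha[i] * a[i][j] for i in range(n)) * e[j][obs[t]]
            for j in range(n)
        ]
    return sum(alpha)


# Toy model (all values are made up for illustration).
init = [0.9, 0.1]
emit = [[0.7, 0.3], [0.4, 0.6]]


def fixed_emit(t: int) -> Matrix:
    return emit


def fixed_trans(t: int) -> Matrix:
    # Homogeneous case: the same matrix at every position.
    return [[0.5, 0.5], [0.5, 0.5]]


def varying_trans(t: int) -> Matrix:
    # Non-homogeneous case: state 1 is more likely to persist
    # at positions 2..5 than elsewhere.
    stay = 0.9 if 2 <= t <= 5 else 0.5
    return [[0.5, 0.5], [1.0 - stay, stay]]


obs = [0, 1, 1, 1, 1, 1, 0]
lik_fixed = forward_likelihood(init, fixed_trans, fixed_emit, obs)
lik_varying = forward_likelihood(init, varying_trans, fixed_emit, obs)
```

Passing the parameters as functions of the position keeps the recursion itself unchanged; only the lookup differs between the homogeneous and non-homogeneous cases.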
Project
- Literature review on how to build a non-homogeneous HMM
- Propose ideas for how to build a non-homogeneous HMM for MSA
- Literature review of other possible MSA methods for dealing with long indels
- Combined tree- and profile-based alignment
- Simulation based approach
- Group-to-group sequence alignment
Literature:
- Sarkar, Abhra, Anindya Bhadra, and Bani K. Mallick. "Nonparametric Bayesian Approaches to Non-homogeneous Hidden Markov Models." (n.d.): n. pag. 8 May 2012. Web. 4 Apr. 2017.
- Ghavidel, Fatemeh Zamanzad, Jürgen Claesen, and Tomasz Burzykowski. "A Nonhomogeneous Hidden Markov Model for Gene Mapping Based on Next-Generation Sequencing Data." Journal of Computational Biology 22.2 (2015): 178-88. Web.
- Grzegorczyk, Marco. "A Non-homogeneous Dynamic Bayesian Network with a
Hidden Markov Model Dependency Structure among the Temporal Data Points." Machine Learning 102.2 (2015): 155-207. Web.
- Gowri-Shankar, V., & Rattray, M. (2007). A Reversible Jump Method for Bayesian Phylogenetic Inference with a Nonhomogeneous Substitution Model. Molecular Biology and Evolution, 24(6), 1286-1299. doi:10.1093/molbev/msm046
- Aalen, O., & Johansen, S. (1978). An Empirical Transition Matrix for Non-Homogeneous Markov Chains Based on Censored Observations. Scandinavian Journal of Statistics, 5(3), 141-150. Retrieved from http://www.jstor.org/stable/4615704
Literature:
- Vassiliou, P. G. (1997). The evolution of the theory of non-homogeneous Markov systems. Applied Stochastic Models and Data Analysis, 13(3-4), 159-176. doi:10.1002/(sici)1099-0747(199709/12)13:3/4<159::aid-asm309>3.3.co;2-h
- Loytynoja, A., & Goldman, N. (2008). A model of evolution and structure for multiple sequence alignment. Philosophical Transactions of the Royal Society B: Biological Sciences, 363(1512), 3913-3919. doi:10.1098/rstb.2008.0170
- Karin, E. L., Rabin, A., Ashkenazy, H., Shkedy, D., Avram, O., Cartwright, R. A., & Pupko, T. (2015). Inferring Indel Parameters using a Simulation-based Approach. Genome Biology and Evolution, 7(12), 3226-3238. doi:10.1093/gbe/evv212
- Yamada, S., Gotoh, O., & Yamana, H. (2006). BMC Bioinformatics,7(1), 524.
doi:10.1186/1471-2105-7-524