SLIDE 1

Computational Challenges in Computing Nearest Neighbor Estimates of Entropy for Large Molecules

E. James Harner, Harshinder Singh, Shengqiao Li, and Jun Tan

Research supported by: Biostatistics Branch, National Institute for Occupational Safety and Health, Morgantown, WV

September 19, 2003

SLIDE 2

Probabilistic Modelling of Molecular Vibrations

⋆ Modelling random vibrations in molecules is important for studying their properties and functions.
⋆ Entropy is a measure of the freedom of a system to explore its available configuration space.
⋆ Entropy evaluation is important in order to understand the factors involved in the stability of a conformation and the change from one conformation to another.

SLIDE 3

Entropy in Protein Folding

⋆ Proteins are biological molecules that are of primary importance to all living organisms.
⋆ Proteins are made up of many amino acids (called residues) linked together.
⋆ A human body contains over 30,000 different kinds of proteins.
⋆ Protein misfolding is the cause of protein-folding diseases: Alzheimer's disease, mad cow disease, cystic fibrosis, and some types of cancer.
⋆ It is important to study the stability of a protein, and the key is to find a small molecule (a drug) that can stabilize the normally folded structure.

SLIDE 4

Insulin Protein

SLIDE 5

Entropy

⋆ The entropy of a molecular conformation depends on the coordinates of the conformation. These are:
  – Bond lengths
  – Bond angles
  – Torsional angles (dihedral or rotational degrees of freedom)
⋆ Since bond lengths and bond angles are rather hard coordinates, entropy is mainly determined by fluctuations in the torsional angles.
⋆ Probability modeling of the torsional angles of a molecular system is therefore important for entropy evaluation.

SLIDE 6

Methanol Molecule

SLIDE 7

Probabilistic Modelling of Torsional Angles

⋆ In the molecular biology literature, torsional angles are assumed to have a multivariate Gaussian (normal) distribution (Karplus and Kushick (1981), Macromolecules; Levy et al. (1984), Macromolecules). The entropy is then given by
$$S_c = \frac{m k_B}{2} + \frac{k_B}{2} \ln\left[(2\pi)^m |\Sigma|\right]$$
⋆ S_c is estimated by using the maximum likelihood estimate of the determinant of the variance-covariance matrix Σ, computed from data on the torsional angles of the molecular system.
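A minimal R sketch of this plug-in estimate (function and variable names are mine, not from the talk): the sample covariance of an n × m matrix of torsional angles stands in for Σ, with k_B = 1 so the entropy is in nats.

# Sketch: Gaussian (quasi-harmonic) entropy estimate from an n x m matrix
# `theta` of torsional angles; kB = 1 gives the entropy in nats.
gaussian.entropy <- function(theta, kB = 1) {
  m <- ncol(theta)
  Sigma <- cov(theta)  # sample variance-covariance matrix of the angles
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  kB * (m / 2 + (m * log(2 * pi) + logdet) / 2)
}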
SLIDE 8

Probability Modeling of Torsional Angles

⋆ There are common situations where assuming a Gaussian distribution for torsional angles is not realistic, e.g.:
  – Modeling a torsional angle which has more than one peak.
  – Modeling a torsional angle where there is more free movement, e.g., in gases.
⋆ In Demchuk and Singh (2001, Molecular Physics):
  – We proposed a circular probability modeling approach for modeling torsional angles.
  – The torsional angle of the methanol molecule was modeled using a von Mises distribution (the most commonly used distribution on the circle).

SLIDE 9

Probability Modeling of Torsional Angles

⋆ A circular random variable Θ follows an l-mode von Mises distribution if its probability density function is given by
$$f(\theta) = \frac{1}{2\pi I_0(\kappa)}\, e^{\kappa \cos[l(\theta - \theta_0)]}, \quad -\pi \le \theta < \pi,$$
where
  – κ = concentration parameter,
  – l = number of modes,
  – I_0 = modified Bessel function of order 0,
  – θ_0 = position of the first mode.
For l ≥ 2, the modes are 2π/l radians apart.
⋆ For l = 1:
  – The mean angle is θ_0.
  – If κ = 0, it is the uniform distribution on the circle.
  – For large κ, it is approximately Gaussian.
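A direct R transcription of this density (a sketch; the function name is mine), using the base R Bessel function besselI for I_0:

# Sketch: density of the l-mode von Mises distribution defined above.
dvonmises <- function(theta, kappa, l = 1, theta0 = 0) {
  exp(kappa * cos(l * (theta - theta0))) / (2 * pi * besselI(kappa, nu = 0))
}

For example, curve(dvonmises(x, kappa = 2, l = 3), -pi, pi) displays three modes 2π/3 radians apart.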

SLIDE 10

von Mises Distribution

SLIDE 11

von Mises Distribution

SLIDE 12

Probability Modeling of Torsional Angles

We assumed independent von Mises distributions for the torsional angles. Let Θ_i have an l_i-mode von Mises distribution with concentration parameter κ_i, i = 1, 2, …, m. Then the entropy of the system is given by
$$S_c = k_B\left[m \ln 2\pi + \sum_{i=1}^m \ln I_0(\kappa_i) - \sum_{i=1}^m \kappa_i\, \frac{I_1(\kappa_i)}{I_0(\kappa_i)}\right],$$
where I_1 is the modified Bessel function of order 1. From the Boltzmann-Gibbs distribution, the potential energy of the system is given by
$$V(\theta_1, \theta_2, \ldots, \theta_m) = \frac{1}{\beta} \sum_{i=1}^m \kappa_i \left[1 - \cos\big(l_i(\theta_i - \theta_{i0})\big)\right], \quad \beta = \frac{1}{k_B T}.$$
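In R, the entropy formula above (in units of k_B) is a short function of the vector of concentration parameters; a sketch with naming of my choosing:

# Sketch: entropy (in units of kB) of m independent von Mises torsional
# angles with concentration parameters `kappa`, per the S_c formula above.
# (The number of modes l_i does not enter the entropy.)
vm.entropy <- function(kappa) {
  A <- besselI(kappa, 1) / besselI(kappa, 0)  # A(kappa) = I1/I0
  length(kappa) * log(2 * pi) + sum(log(besselI(kappa, 0))) - sum(kappa * A)
}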

SLIDE 13

Modeling Torsional Angle of Methanol

As a case study, we considered the torsional angle of a methanol molecule. We assumed a 3-mode von Mises distribution for its torsional angle Θ, i.e.,
$$f(\theta) = \frac{1}{2\pi I_0(\kappa)}\, e^{\kappa \cos[3(\theta - \theta_0)]}, \quad -\pi \le \theta < \pi.$$
The potential energy is
$$V(\theta) = \frac{\kappa}{\beta}\left[1 - \cos\big(3(\theta - \theta_0)\big)\right] = \frac{V_0}{2}\left[1 - \cos 3(\theta - \theta_0)\right],$$
where V_0 is the maximum potential energy.

SLIDE 14

A Bathtub Shaped Distribution for Potential Energy

For the methanol molecule, the potential energy is
$$V = \frac{V_0}{2}\left[1 - \cos 3(\theta - \theta_0)\right].$$
Assuming Θ has a 3-mode von Mises distribution, we derived the following p.d.f. for V:
$$g(v) = \frac{1}{\pi I_0(\kappa)}\, e^{\kappa(1 - 2v/V_0)}\, v^{-1/2} (V_0 - v)^{-1/2}, \quad 0 < v < V_0.$$
This is a bathtub-shaped probability distribution. For κ = 0, V/V_0 has a Beta(1/2, 1/2) distribution.
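A sketch of this density in R (my naming), writing v^{-1/2}(V_0 - v)^{-1/2} as 1/sqrt(v(V_0 - v)):

# Sketch: bathtub-shaped density g(v) of the potential energy, 0 < v < V0.
dbathtub <- function(v, kappa, V0) {
  exp(kappa * (1 - 2 * v / V0)) / (pi * besselI(kappa, 0) * sqrt(v * (V0 - v)))
}

Setting kappa = 0 recovers the scaled Beta(1/2, 1/2) density noted above.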

SLIDE 15

A Bath-tub Shaped Distribution

SLIDE 16

Histograms of Torsional Angle and Energy

SLIDE 17

Fitting von Mises and Bath-tub Shaped Distributions

SLIDE 18

A Bivariate Circular Model (Singh et al., 2002, Biometrika)

⋆ Let Θ_1 and Θ_2 be two circular random variables. We introduced a joint probability distribution for Θ_1 and Θ_2 with p.d.f.
$$f(\theta_1, \theta_2) = C\, e^{\kappa_1 \cos(\theta_1 - \mu_1) + \kappa_2 \cos(\theta_2 - \mu_2) + \lambda \sin(\theta_1 - \mu_1)\sin(\theta_2 - \mu_2)}, \quad -\pi \le \theta_1, \theta_2 < \pi,$$
where κ_1, κ_2 ≥ 0, −∞ < λ < ∞, −π ≤ μ_1, μ_2 < π, and C is the normalizing constant.
⋆ If the fluctuations in Θ_1 and Θ_2 are sufficiently small, then (Θ_1, Θ_2) approximately follows a bivariate normal distribution with
$$\sigma_1^2 = \frac{\kappa_2}{\kappa_1 \kappa_2 - \lambda^2}, \quad \sigma_2^2 = \frac{\kappa_1}{\kappa_1 \kappa_2 - \lambda^2}, \quad \rho = \frac{\lambda}{\sqrt{\kappa_1 \kappa_2}}.$$
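The density is easy to evaluate up to C; a sketch in R (names mine):

# Sketch: the bivariate circular density above, up to the normalizing
# constant C (a series expression for C is given on the next slide).
bvm.kernel <- function(th1, th2, mu1, mu2, kappa1, kappa2, lambda) {
  exp(kappa1 * cos(th1 - mu1) + kappa2 * cos(th2 - mu2) +
      lambda * sin(th1 - mu1) * sin(th2 - mu2))
}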

SLIDE 19

A Bivariate Circular Model

⋆ The normalizing constant C is given by
$$\frac{1}{C} = 4\pi^2 \sum_{m=0}^{\infty} \binom{2m}{m} \left(\frac{\lambda^2}{4\kappa_1\kappa_2}\right)^m I_m(\kappa_1)\, I_m(\kappa_2),$$
where I_m is the modified Bessel function of order m.
⋆ E[sin(Θ_i − μ_i)] = 0, i = 1, 2, which implies that μ_i is the circular mean of Θ_i.
⋆ The circular variance of Θ_1 is given by
$$1 - E[\cos(\Theta_1 - \mu_1)] = 1 - 4C\pi^2 \sum_{m=0}^{\infty} \binom{2m}{m} \left(\frac{\lambda^2}{4\kappa_1\kappa_2}\right)^m I_{m+1}(\kappa_1)\, I_m(\kappa_2).$$
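A truncated-series sketch of the normalizing constant in R (my naming; M is the truncation point, and the series converges quickly because I_m(κ) decays rapidly in m):

# Sketch: normalizing constant C via the series above, truncated at M terms.
bvm.C <- function(kappa1, kappa2, lambda, M = 50) {
  m <- 0:M
  s <- sum(choose(2 * m, m) * (lambda^2 / (4 * kappa1 * kappa2))^m *
             sapply(m, function(j) besselI(kappa1, j) * besselI(kappa2, j)))
  1 / (4 * pi^2 * s)
}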

SLIDE 20

A Bivariate Circular Model

⋆ The conditional distributions of Θ_1 and Θ_2 are von Mises.
⋆ The marginal distribution of Θ_1 is symmetric around θ_1 = μ_1 and unimodal (bimodal) when
$$A(\kappa_2) = \frac{I_1(\kappa_2)}{I_0(\kappa_2)} \le\ (\ge)\ \frac{\kappa_1 \kappa_2}{\lambda^2}.$$
⋆ A generalization which allows multiple peaks in the marginal distributions is
$$f(\theta_1, \theta_2) = C\, e^{\kappa_1 \cos(l_1(\theta_1 - \mu_1)) + \kappa_2 \cos(l_2(\theta_2 - \mu_2)) + \lambda \sin(l_1(\theta_1 - \mu_1))\sin(l_2(\theta_2 - \mu_2))}, \quad -\pi \le \theta_1, \theta_2 < \pi,$$
where l_1, l_2 are positive integers.

SLIDE 21

Nearest Neighbor Estimates of Entropy (Singh et al., 2002)

⋆ Let X_1, X_2, …, X_n be a random sample from a population with p.d.f. f(x).
⋆ Let R_{i,k} be the Euclidean distance from X_i to its kth closest neighbor.
⋆ Then a reasonable estimate of f(X_i) is obtained by equating the probability mass of the p-dimensional ball of radius R_{i,k} centered at X_i to k/n:
$$\hat f(X_i)\, \frac{R_{i,k}^p\, \pi^{p/2}}{\Gamma(p/2 + 1)} = \frac{k}{n}.$$
⋆ The above equation gives
$$\hat f(X_i) = \frac{k\, \Gamma(p/2 + 1)}{n\, R_{i,k}^p\, \pi^{p/2}}, \quad i = 1, 2, \ldots, n.$$

SLIDE 22

Nearest Neighbor Estimates

⋆ Thus a reasonable estimator of the entropy of a population with p.d.f. f(x) is given by
$$\hat G_k = -\frac{1}{n} \sum_{i=1}^n \ln \hat f(X_i) = -\frac{1}{n} \sum_{i=1}^n T_i,$$
where T_i = ln f̂(X_i) and
$$\hat f(X_i) = \frac{k\, \Gamma(p/2 + 1)}{n\, R_{i,k}^p\, \pi^{p/2}}, \quad i = 1, 2, \ldots, n.$$
⋆ The asymptotic mean of the estimator is given by
$$\lim_{n\to\infty} E[\hat G_k] = L_{k-1} - \gamma - \ln k + E[-\ln f(X)],$$
where $L_0 = 0$, $L_j = \sum_{i=1}^{j} \frac{1}{i}$ for j ≥ 1, and γ ≈ 0.5772 is Euler's constant.

SLIDE 23

Nearest Neighbor Estimates

⋆ Thus the estimator Ĝ_k is asymptotically biased.
⋆ We consider the modified estimator
$$\hat H_k = \hat G_k - L_{k-1} + \gamma + \ln k.$$
⋆ In terms of the kth nearest neighbor distances,
$$\hat H_k = \frac{p}{n} \sum_{i=1}^n \ln R_{i,k} + \ln \frac{\pi^{p/2}}{\Gamma(p/2 + 1)} + \gamma - L_{k-1} + \ln n.$$
⋆ Ĥ_k is asymptotically unbiased. For k = 1, it reduces to the estimator proposed by Kozachenko and Leonenko (1987).
⋆ Ĥ_k is a consistent estimator of the entropy E[− ln f(X)].
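A serial brute-force R sketch of Ĥ_k (my naming; the parallel Rmpi version used in the timing studies appears on the later slides):

# Sketch: serial O(n^2) computation of the estimator H_k above for an
# n x p sample matrix `x`.
nn.entropy <- function(x, k = 1) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  D <- as.matrix(dist(x))                       # pairwise Euclidean distances
  R <- apply(D, 1, function(d) sort(d)[k + 1])  # kth NN distance; entry 1 is the self-distance 0
  L <- if (k > 1) sum(1 / seq_len(k - 1)) else 0  # L_{k-1}
  p * mean(log(R)) + log(pi^(p / 2) / gamma(p / 2 + 1)) + 0.5772157 - L + log(n)
}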

SLIDE 24

Nearest Neighbor Estimates

⋆ For sample sizes of 10, 25, and 50, and for some standard distributions, we used simulations to study the performance of the estimators Ĥ_k for various values of k under the mean square error criterion.
⋆ The estimator based on k = 4 performs reasonably well.
⋆ We used these estimators to evaluate the entropy of the methanol molecule and of diethyl ether.

SLIDE 25

Nearest Neighbor Estimates

⋆ For methanol at room temperature, based on 15,000 observations, the values of Ĥ_k for k = 1, 2, 3, 4 are 1.840, 1.777, 1.770, and 1.756. Parametric fitting using a von Mises distribution yielded an entropy estimate of 1.744.
⋆ For diethyl ether, which has four torsional angles, the entropy estimates for k = 1, 2, 3, 4 are 3.216, 3.236, 3.196, and 3.199, respectively, based on 15,000 observations at room temperature.
SLIDE 26

Nearest Neighbor Estimates

Root Mean Square Error of Estimators

Distribution            n    k = 1   k = 4   k_b   k = k_b
U[0, 1]                25    0.292   0.166    4     0.166
t, df = 4              25    0.345   0.242   20     0.198
N(0, 1)                25    0.320   0.196   19     0.167
Biv. normal, ρ = 0.5   25    0.364   0.251    8     0.232

(k_b denotes the best-performing k for each distribution.)

SLIDE 27

Summary of Proposed Approaches for Modelling Torsional Angles

⋆ Circular probability approaches based on von Mises, bivariate, and multivariate circular models
⋆ A Fourier series expansion of the potential function
⋆ A nonparametric method of entropy estimation based on nearest neighbor distances
⋆ These methods will be used for efficient estimation of the entropy of proteins.

SLIDE 28

Computation of Nearest Neighbor Distances

⋆ Direct (brute-force) method: O(n²)
⋆ ANN method: O(n log n)

SLIDE 29

Direct Method

⋆ Simple (brute-force) R code
⋆ Uses Rmpi, an R wrapper for the MPI API
⋆ Provides interactive R slave functionality
⋆ Requires the LAM/MPI runtime environment

SLIDE 30

R code for the Master Processor

library(Rmpi)

# Spawn four R slaves: two on each of two dual-CPU hosts.
mpi.spawn.Rslaves(hosts = c(0, 0, 1, 1))

# Load the slave-side function knnb.part on every slave.
mpi.remote.exec(load, file = "Projects/comp/RNnbMpi.Rdata")

k <- 10

# Each slave returns c(n, p, partial sums of log kth-NN distances)
# as one column of G.
G <- mpi.remote.exec(knnb.part, "Projects/comp/dim14_2k.dat", c(2000, 14), k, 1)

n <- G[1, 1]
p <- G[2, 1]

# Combine the slaves' partial sums and add the constant terms, giving the
# vector of estimates G_1, ..., G_k (see the formula for G_k above).
G <- p * rowSums(G) / n
G <- G[3:length(G)] + p / 2 * log(pi) - log(gamma(p / 2 + 1)) + log(n) - log(1:k)

# Partial harmonic sum L_k = sum_{i=1}^k 1/i, with L_0 = 0.
lk <- function(k) {
  if (k == 0) {
    return(0)
  }
  sum(1 / (1:k))
}

EulerGamma <- 0.577216

# Bias-corrected estimates H_1, ..., H_k.
H <- G - apply(rbind(0:(k - 1)), 2, lk) + EulerGamma + log(1:k)

SLIDE 31

R code for the Slave

knnb.part <- function(data.file = "", dimension = NULL, k = 1, comm = 1) {
  # Read the sample and shape it into an n x p matrix.
  x <- scan(data.file)
  dim(x) <- dimension

  size <- mpi.comm.size(comm)  # total number of processes
  rank <- mpi.comm.rank(comm)  # this slave's rank

  if (is.matrix(x)) {
    n <- nrow(x)
    p <- ncol(x)
  } else {
    n <- length(x)
    p <- 1
  }

  # The k nearest neighbors are sorted entries 2..k2; entry 1 is the
  # self-distance 0.
  k2 <- k + 1
  R <- matrix(1, nrow = n, ncol = k)

SLIDE 32

R code for the Slave (cont.)

  # Round-robin over rows: slave `rank` handles rows rank, rank + (size - 1), ...
  i <- rank
  while (i <= n) {
    if (p == 1) {
      # Distances from x[i] to all points, dropping the self-distance.
      R[i, ] <- sort(abs(x[i] - x), method = "quick")[2:k2]
    } else {
      # Euclidean distances from row i to all rows (note the sqrt of the
      # sorted squared distances, matching the p == 1 branch).
      R[i, ] <- sqrt(sort(rowSums(sweep(x, 2, x[i, ])^2), method = "quick")[2:k2])
    }
    i <- i + size - 1
  }
  R <- colSums(log(R))  # this slave's partial sums of log kth-NN distances
  return(c(n, p, R))
}

SLIDE 33

ANN Method (Approximate Nearest Neighbors)

⋆ A C++ library for exact and approximate (k) nearest neighbor searching in p-dimensional spaces
⋆ Points are preprocessed into a data structure (e.g., a tree)
⋆ Data are stored in main memory
⋆ Running time is exponential in p
⋆ The distance can be any Minkowski metric

SLIDE 34

Parallelizing ANN

Suppose the sample size is n and m slaves are available, and let l = n/m. More specifically, the details are:
⋆ Build the tree on each slave concurrently
⋆ Slave i (1 ≤ i ≤ m) finds the nearest neighbors for data samples i + jm (0 ≤ j < l); for example, if i = 1 and m = 4, the nearest neighbors of data samples 1, 5, 9, 13, … are searched on slave 1
⋆ Send the search results back to the master for calculating the final result
The advantages of this solution are that it:
⋆ Fully utilizes the ability of each slave
⋆ Minimizes the communication between the master and slaves
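The per-slave search step is easy to express serially; a sketch using the RANN package, a modern R wrapper for the ANN library that this talk predates (the function name slave.search is mine):

# Sketch: slave i searches the ANN kd-tree (built over the full data) for
# the k nearest neighbors of its round-robin subset of rows.
library(RANN)
slave.search <- function(x, k, i, m) {
  idx <- seq(from = i, to = nrow(x), by = m)  # rows assigned to slave i
  # Column 1 of nn.dists is the self-distance 0; columns 2..k+1 are the k NNs.
  nn <- nn2(data = x, query = x[idx, , drop = FALSE], k = k + 1)
  colSums(log(nn$nn.dists[, -1, drop = FALSE]))  # partial sums of log distances
}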

SLIDE 35

Brute-force Algorithm without MPI on One G4 Processor

Sample Size   Dimension   Seconds
1000          1              1.22
2000          1              2.77
10000         1             42.46
1000          7              5.27
2000          7             17.94
10000         7            717.06
1000          14             8.75
2000          14            32.87
10000         14          1518.88

SLIDE 36

Brute-force Algorithm with MPI: 2 Slaves on One 2-CPU G4 Computer

Sample Size   Dimension   Seconds
1000          1              2.11
2000          1              2.99
10000         1             28.27
1000          7              4.65
2000          7             12.85
10000         7            464.11
1000          14             7.05
2000          14            22.35
10000         14           991.65

SLIDE 37

Brute-force Algorithm with MPI: 4 Slaves on Two 2-CPU G4 Computers

Sample Size   Dimension   Seconds
1000          1              2.04
2000          1              2.48
10000         1             16.78
1000          7              3.35
2000          7              7.39
10000         7            231.74
1000          14             4.64
2000          14            12.44
10000         14           499.46

SLIDE 38

ANN Algorithm with MPI: 2 Slaves on One 2-CPU Computer

Sample Size   Dimension   Seconds
1000          1              0.61
2000          1              0.64
10000         1              0.88
1000          7              0.73
2000          7              0.88
10000         7              2.22
1000          14             0.90
2000          14             1.31
10000         14             5.82

SLIDE 39

ANN Algorithm with MPI: 4 Slaves on Two 2-CPU Computers

Sample Size   Dimension   Seconds
1000          1              0.67
2000          1              0.74
10000         1              0.91
1000          7              0.79
2000          7              0.96
10000         7              2.11
1000          14             0.98
2000          14             1.30
10000         14             4.72

SLIDE 40

Conclusions

Computing entropy for moderately sized molecules, such as peptides, is feasible, depending on the number of MD simulations. The ANN algorithms greatly improve the run time for these macromolecules relative to the brute-force method.

However, it is less clear how much cluster computing improves performance for the ANN algorithms. Although the preliminary results above show little improvement, this is likely to change when the number of slaves is increased for large n and/or p. For example, on a data set with 1.44M observations, the ANN kd-tree algorithm took 727.52 seconds to run with 2 slaves and 435.12 seconds with 4 slaves (using the same computers as in the tests above). These results look promising, but more testing must be done before definitive answers can be given on the benefits of using MPI for large n and/or high dimensionality p.