Predicting virus mutations through relational learning, AIMM 2012

Outline: Intro, A Relational Learning Approach, HIV RT Drug Resistance, Learning from mutations, Learning from mutants, Conclusion
Predicting virus mutations through relational learning
AIMM 2012
E Cilia (1), S Teso (2), S Ammendola (3), T Lenaerts (1), and A Passerini (2)
September 9th, 2012
1 - Département d'Informatique, FS, Université Libre de Bruxelles
2 - Department of Computer Science and Information Engineering, University of Trento
3 - Ambiotec sas

Motivations
Mining relevant features from protein mutation data:
  understanding the properties of functional sites
  developing novel proteins with useful/relevant function
Rational Design:
  engineering technique modifying existing proteins by site-directed mutagenesis
  assumes knowledge (or intuition) about the effects of specific mutations
  involves extensive trial-and-error experiments
  also serves to improve understanding of protein function

Introduction: An artificial system mimicking rational design
Goal: to build an artificial system mimicking the rational design process
A relational learning approach to:
  1. mine rules from mutation data describing mutations relevant to a certain behavior
  2. use the rules to infer novel mutations that may induce a similar behavior

A Relational Learning Approach
[Diagram labels: dataset of mutations / mutants; background knowledge; hypothesis; rank of novel relevant mutations]

Step 1: Relational Learning Phase
Learning in First Order Logic
Data D, background knowledge B and features induced during learning are represented in first order logic:
  res_against(M,nnrti) ← mut(M,P) AND close_to_site(P)
  (head: res_against(M,nnrti); body: mut(M,P) AND close_to_site(P))
Learning searches for a set of clauses (the hypothesis) covering all or most positive examples, and none or few negative ones.
Advantages:
  expressivity and interpretability of the learned model
  possibility to make use of specific background knowledge
  ability to learn rules from descriptions of complex, structured entities
  the learnt rules constrain the rational design space
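The notion of a clause "covering" examples can be illustrated with a minimal Python sketch. It is not the Aleph implementation; the mutation identifiers, positions, and the contents of CLOSE_TO_SITE below are hypothetical toy facts chosen only to show how the clause body is checked against positive and negative examples.

```python
# Toy facts; identifiers and positions are hypothetical, not taken from the datasets.
MUT = {"m1": 103, "m2": 181, "m3": 300}   # mut(M, P): mutation M occurs at position P
CLOSE_TO_SITE = {103, 181}                # close_to_site(P)


def res_against_nnrti(m):
    """Body of the example clause: mut(M,P) AND close_to_site(P)."""
    p = MUT.get(m)
    return p is not None and p in CLOSE_TO_SITE


def coverage(clause, positives, negatives):
    """Count how many positive and negative examples the clause covers."""
    pos = sum(1 for m in positives if clause(m))
    neg = sum(1 for m in negatives if clause(m))
    return pos, neg


covered = coverage(res_against_nnrti, positives=["m1", "m2"], negatives=["m3"])
```

A learner would prefer clauses whose first count is high and whose second count is low, which is the search criterion stated on the slide.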

Step 2: Generative Phase
Mutation Generation Algorithm
Algorithm: Mutation generation
input: background knowledge B, learned model H, k
output: rank of the most relevant mutations R
procedure GenerateMutations(B, H, k)
    Initialize D_M ← ∅
    M ← find all mutations m that satisfy at least one clause c_i ∈ H
    for m ∈ M do
        score ← S_M(m)    ▷ number of clauses c_i satisfied by m
        D_M ← D_M ∪ {(m, score)}
    end for
    R ← RankMuts(D_M, B, H, k)    ▷ rank relevant mutations
    return R
end procedure
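The generation procedure can be sketched in Python under a few stated assumptions: candidate mutations are enumerated as all single amino-acid substitutions of a wild-type string, clauses are plain Python predicates, and the ranking step simply sorts by score. The slides do not specify how RankMuts works, so the sort below is only a stand-in, and the wild-type fragment, POLAR set, and clauses are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def generate_mutations(wild_type, clauses, k):
    """Enumerate single-point mutations, score each by the number of
    satisfied clauses, and return the k highest-scoring ones."""
    scored = []
    for pos, new_aa in product(range(1, len(wild_type) + 1), AMINO_ACIDS):
        old_aa = wild_type[pos - 1]
        if new_aa == old_aa:
            continue  # not a mutation
        m = (old_aa, pos, new_aa)
        score = sum(1 for c in clauses if c(m))
        if score > 0:  # keep only mutations satisfying at least one clause
            scored.append((m, score))
    # Stand-in for RankMuts: highest score first, ties broken by position and residue.
    scored.sort(key=lambda ms: (-ms[1], ms[0][1], ms[0][2]))
    return scored[:k]


# Hypothetical clauses over (old_aa, pos, new_aa) triples.
POLAR = set("STNQ")  # simplified, assumed set of polar residues
clauses = [
    lambda m: m[1] == 3,
    lambda m: m[1] == 3 and m[2] in POLAR,
]
ranking = generate_mutations("AGLK", clauses, k=3)
```

Scoring by the number of satisfied clauses means a mutation matched by both a general and a more specific clause ranks above one matched by the general clause alone.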

HIV-1 RT Drug Resistance
Mining rules from HIV mutation data to:
  understand the virus adaptation mechanism
  design drugs that effectively counter potentially resistant mutants
Datasets
  1. Reverse Transcriptase (RT) mutations from the Los Alamos National Laboratories HIV resistance database
     NRTI → 95 mutations
     NNRTI → 56 mutations
  2. RT mutants from the Stanford HIV drug resistance database
     NRTI → 639 mutants
     NNRTI → 747 mutants

Learning settings: Learning from mutations
Mutation-based learning
Input examples: single amino-acid mutations conferring resistance to a class of drugs
  aa(Pos,AA)
  mut(MutationID,AA,Pos,AA1)
Target concept: a model (i.e. a set of rules) describing a mutation conferring resistance to a certain class of drugs
  res_against(MutationID,Drug)
Learning setting: learn from positive examples only (annotation on mutations NOT conferring resistance is scarce)
Output: generated resistance mutations

Background Knowledge
Background Knowledge Predicates (excerpt)
  typeaa(T,AA)
  same_type_aa(R1,R2,T)
  same_type_mut_t(MutID,Pos,T)
  close_to_site(Pos)
  location(L,Pos)
  (Betts and Russell, 2003)
  catalytic_propensity(AA,CP)
Background Knowledge Rules (example)
  same_type_aa(R1,R2,T) ← typeaa(T,R1) AND typeaa(T,R2)
  different_type_mut_t(MutID,Pos) ← mut(MutID,R1,Pos,R2) AND NOT same_type_aa(R1,R2,T)
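The two example rules can be mirrored in a short Python sketch. The TYPEAA table below is a simplified, hypothetical stand-in for the typeaa facts (the actual type assignments used in the background knowledge are not listed on the slide), and a mutation is represented as a plain (R1, Pos, R2) tuple. The negated literal is read as negation as failure: no type T is shared by both residues.

```python
# Hypothetical, simplified typeaa(T, AA) facts; the real background knowledge may differ.
TYPEAA = {
    "aliphatic": set("ILV"),
    "polar": set("STNQ"),
    "aromatic": set("FWYH"),
}


def same_type_aa(r1, r2):
    """True if typeaa(T,R1) AND typeaa(T,R2) holds for at least one type T."""
    return any(r1 in aas and r2 in aas for aas in TYPEAA.values())


def different_type_mut_t(mut):
    """True if the mutation replaces R1 with a residue R2 sharing no type with it."""
    r1, _pos, r2 = mut
    return not same_type_aa(r1, r2)
```

For instance, with these toy facts `different_type_mut_t(("Y", 181, "I"))` holds because Y is listed only as aromatic and I only as aliphatic.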

Learned Hypothesis
Model for the resistance to NNRTI
[Figure: wild-type (wt) RT sequence excerpts ...AGLKKKKSVTVLDVG...YQYMDDLYVG...WETWWTEY...WIPEWEFVN... with D and W residues highlighted and positions 98, 112, 181, 190, 398, 405, 410 and 418 marked]
  mut(A,B,C,D) AND position(C,190)
  mut(A,B,C,D) AND position(C,190) AND typeaa(polar,D)
  mut(A,y,C,D) AND typeaa(aliphatic,D)
  mut(A,B,C,a) AND position(C,106)

Experimental Setting
Aleph ILP system (one-class classification setting)
30 random training/test set splits (70/30) for each of the 2 learning tasks
enrichment in the test mutations (recall)
comparison against the random generator

Experimental Results
Mean recall % on 30 splits
           Algorithm    Random Generator
  NNRTI    86 •         58
  NRTI     55 •         46
           Mean n. generated mutations    n. test mutations
  NNRTI    5201                           17
  NRTI     5548                           28
(•) significant improvement, evaluated with a paired Wilcoxon test (α = 0.01)
[Plot: mean recall % (y-axis, 0 to 100) versus number of satisfied clauses per generated mutation (x-axis, 1 to 10); series: NNRTI, NNRTI (rand), NRTI, NRTI (rand)]
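The recall measure reported above can be sketched as follows, assuming recall is the fraction of held-out test mutations that appear among the generated ones, averaged over splits. The mutation labels and the two toy splits are purely illustrative and are not the experimental data.

```python
def recall(generated, test_mutations):
    """Fraction of held-out test mutations that appear among the generated ones."""
    test = set(test_mutations)
    if not test:
        raise ValueError("test set is empty")
    return len(test & set(generated)) / len(test)


def mean_recall(splits):
    """Average recall over (generated, test_mutations) pairs, one pair per split."""
    values = [recall(g, t) for g, t in splits]
    return sum(values) / len(values)


# Hypothetical toy splits: (generated mutations, held-out test mutations).
splits = [
    ({"K103N", "Y181C", "G190A"}, {"K103N", "Y181C"}),
    ({"K103N", "V106A"}, {"Y181C", "V106A"}),
]
avg = mean_recall(splits)
```

The same function applied to the output of a random generator of equal size would give the baseline against which the learned model is compared.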

Learning settings: Learning from mutants
Mutant-based learning
Input examples: mutants, resistant or not, to a class of drugs
  aa(Pos,AA)
  mut(MutantID,AA,Pos,AA1)
Target concept: a model (i.e. a set of rules) describing a mutant resistant to a certain class of drugs
  res_against(MutantID,Drug)
Learning setting: binary classification setting
Output: generated resistant mutants with a single amino acid mutation

Experimental Setting
Aleph ILP system (binary classification setting)
30 random training/test set splits (for each of the 2 learning tasks)
enrichment in test set mutations as performance measure (recall)
comparison against the random generator
