Transfer Learning and Applications in Computational Biology

  1. Transfer Learning and Applications in Computational Biology. Christian Widmer, Marius Kloft, Gunnar Rätsch, Gabriele Schweikert, Nico Görnitz. Affiliations: 1 Memorial Sloan-Kettering Cancer Center, NY, USA; 2 Technical University of Berlin, Germany; 3 New York University, NY, USA

  2. Memorial Sloan-Kettering Cancer Center: Frequent words of abstracts from publications 1998-2004. [wordle.net] © Gunnar Rätsch (cBio@MSKCC), Transfer Learning in Computational Biology, Courant Institute@NYU, February 7, 2013

  3. Frequent words of abstracts from publications 2005-2012. [wordle.net]

  4. Learning About the Central Dogma of Biology. Goal: learn to predict what these processes accomplish. Given the DNA, ..., predict all gene products: f(DNA, ...) = RNA and g(RNA, ...) = protein. Estimating f and g amounts to cracking the codes of transcription, epigenetics, splicing, ...

  8. Learning About the Central Dogma of Biology. Three things will be crucial: (1) biological insights; (2) many observations of the system: (DNA, ..., RNA)_i for i = 1, ..., N; (3) empirical inference to estimate Θ: f_Θ(DNA, ...) = RNA
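
The "empirical inference" step on this slide can be read as regularized empirical risk minimization. The formula below is a generic sketch under stated assumptions, not something taken from the slides: the loss ℓ, the regularizer R and the trade-off parameter λ are introduced here for illustration only, with x_i standing for the inputs (DNA, ...) and y_i for the observed RNA of the i-th of the N examples.

```latex
\hat{\Theta} \;=\; \arg\min_{\Theta}\;
  \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_{\Theta}(x_i),\, y_i\bigr)
  \;+\; \lambda\, R(\Theta)
```

The first term measures how well f_Θ reproduces the observed outputs; the second penalizes overly complex parameter settings, which matters when the outputs are only partially known.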

  9. Learning About the Central Dogma. Goal: estimate f to predict RNAs. Need: a good inference method, inputs (DNA, ...), outputs (complete transcriptome). Challenges: (1) RNA only partially known; (2) factors only partially known; (3) improved inference methods

  10. Recent Machine Learning Work: develop fast, accurate and interpretable learning methods. (1) Large scale sequence classification [Rätsch et al., 2006a; Sonnenburg et al., 2010, 2007; Toussaint et al., 2010]; (2) Analysis and explanation of learning results [Rätsch et al., 2006b; Sonnenburg et al., 2008; Zien et al., 2009]; (3) Sequence segmentation & structure prediction [Rätsch et al., 2007; Schweikert et al., 2009; Zeller et al., 2008]; (4) Transfer & multitask learning [Schweikert et al., 2008a; Widmer et al., 2011, 2012, 2010a; Widmer and Rätsch, 2011; Widmer et al., 2010c]. [Figures: k-mer length (1 to 8) against position (−30 to 30); log-intensity along a transcript]
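
The slide does not say which features the sequence classifiers use. One common choice for DNA sequence classification is to compare sequences by their k-mer counts (the so-called spectrum kernel). The following is a minimal sketch under that assumption; the function names `kmer_counts` and `spectrum_kernel` are hypothetical and are not taken from the talk or from any toolbox.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int) -> int:
    """Inner product of the k-mer count vectors of two sequences.

    Illustrative similarity measure only (an assumption of this sketch).
    """
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # A Counter returns 0 for k-mers it has not seen, so unshared k-mers drop out.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


if __name__ == "__main__":
    print(kmer_counts("ACGTACG", 3))
    print(spectrum_kernel("ACGTACG", "ACGT", 3))
```

A kernel of this kind can be plugged into any kernel-based classifier; position-specific variants additionally record where in the sequence each k-mer occurs.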

  14. Many algorithms are implemented in the Shogun toolbox (GPL, ≥ 1000 users)

  15. Roadmap: (1) Motivation from computational biology [Figure: signals along the DNA: TSS, TIS, donor, acceptor, stop, polyA/cleavage]; (2) Empirical comparison of domain adaptation algorithms; (3) Algorithms for hierarchical multi-task learning; (4) Algorithms for learning task relations; (5) Fast(er) algorithms; (6) Discussion & conclusion

  16. A Core CompBio Problem: Gene Finding. Given a piece of DNA sequence: predict gene products including intermediate processing steps; predict signals used during processing; predict the correct corresponding label sequence with labels. [Figure: DNA (genic, intergenic) → pre-mRNA (exon, intron) → mRNA (5' UTR, 3' UTR, cap, polyA) → protein, annotated with the signals TSS, TIS, splice donor, splice acceptor, stop, polyA/cleavage]
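
To make the "label sequence" and "signals" of the gene finding slide concrete, here is a small sketch. It is an illustration under stated assumptions, not the method from the talk: the one-letter labels (`N` intergenic, `E` exon, `I` intron) and the function names are hypothetical, and the signal search only lists the canonical GT (donor) and AG (acceptor) dinucleotides, whereas a real gene finder would score each candidate with a learned model.

```python
def label_sequence(length, exons):
    """Return one label per base: 'N' intergenic, 'E' exon, 'I' intron.

    `exons` is a sorted list of half-open (start, end) intervals of one gene.
    """
    labels = ["N"] * length
    for start, end in exons:
        labels[start:end] = ["E"] * (end - start)
    # Everything between two consecutive exons of the same gene is intron.
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        labels[prev_end:next_start] = ["I"] * (next_start - prev_end)
    return "".join(labels)


def candidate_splice_sites(dna):
    """List positions of GT (donor) and AG (acceptor) consensus dinucleotides."""
    dna = dna.upper()
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return donors, acceptors


if __name__ == "__main__":
    print(label_sequence(12, [(2, 5), (8, 10)]))
    print(candidate_splice_sites("AAGTCAGG"))
```

Most consensus matches are not real splice sites, which is why the prediction of signals is treated as a learning problem in its own right.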
