Not Just a Black Box: Interpretable Deep Learning for Genomics - PowerPoint PPT Presentation

Not Just a Black Box: Interpretable Deep Learning for Genomics. Presented by: Avanti Shrikumar.


  1. Not Just a Black Box: Interpretable Deep Learning for Genomics. Presented by: Avanti Shrikumar

  2. With great power comes really poor interpretability… [Figure: power vs. interpretability; points labeled Deep Learning, Traditional machine learning, Classical statistics]

  3. With great power comes really poor interpretability… [Figure: power vs. interpretability; points labeled Deep Learning, Interpretable Deep Learning, Traditional machine learning, Classical statistics]

  4. Questions for the model: • Which parts of the input are the most important for making a given prediction?

  5. Questions for the model: • Which parts of the input are the most important for making a given prediction? • What are the recurring patterns in the input?

  7. How can we find the important parts of the input for a given prediction? Idea 1: perturbation. [Figure: network diagram with the output at the top; yellow = inputs]

  14. How can we find the important parts of the input for a given prediction? Idea 1: perturbation. Drawbacks: 1) Computational efficiency: requires one forward prop for each perturbation. [Figure: network diagram with the output at the top; yellow = inputs]

  15. How can we find the important parts of the input for a given prediction? Idea 1: perturbation. Drawbacks: 1) Computational efficiency: requires one forward prop for each perturbation. 2) Saturation. [Figure: network diagram with the output at the top; yellow = inputs]
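
The computational drawback can be made concrete with a minimal sketch. Everything here is illustrative rather than taken from the slides: `toy_model` is a hypothetical stand-in for a trained network, and "perturbation" is assumed to mean replacing one input with zero.

```python
def toy_model(inputs):
    """Hypothetical stand-in for a trained network: a fixed weighted sum."""
    weights = [0.5, -1.0, 2.0, 0.0]
    return sum(w * x for w, x in zip(weights, inputs))


def perturbation_scores(model, inputs, replacement=0.0):
    """Score each input by how much the output changes when it is replaced.

    Returns (scores, number_of_forward_passes).
    """
    original_output = model(inputs)  # one forward pass on the unperturbed input
    forward_passes = 1
    scores = []
    for position in range(len(inputs)):
        perturbed = list(inputs)
        perturbed[position] = replacement
        # Each perturbation costs one additional forward pass.
        scores.append(original_output - model(perturbed))
        forward_passes += 1
    return scores, forward_passes


scores, n_passes = perturbation_scores(toy_model, [1.0, 1.0, 1.0, 1.0])
print(scores, n_passes)
```

The number of forward passes grows with the number of perturbations tried, so scoring every position of a long input this way can become expensive.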

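  16. Saturation problem illustrated. [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]

  17. Saturation problem illustrated. [Figure: inputs i_1 = 1 and i_2 = 1 feed an output y = 1; plot of y against i_1 + i_2]

  18. Saturation problem illustrated. [Figure: inputs i_1 = 1 and i_2 = 1 feed an output y = 1; plot of y against i_1 + i_2; a value of 0 is added to the diagram]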
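
A minimal sketch of why perturbation fails under saturation. The exact function on the slide is not recoverable from the transcript, so `saturating_output` below is an assumed form: y rises with i_1 + i_2 and plateaus at 1.

```python
def saturating_output(i1, i2):
    # Assumed form of the plotted function: linear in i1 + i2, capped at 1.
    return min(i1 + i2, 1.0)


original = saturating_output(1.0, 1.0)

# Perturb one input at a time by setting it to zero.
effect_of_i1 = original - saturating_output(0.0, 1.0)
effect_of_i2 = original - saturating_output(1.0, 0.0)

print(original, effect_of_i1, effect_of_i2)
```

Under this assumption, zeroing either input alone leaves the output unchanged, so single-input perturbation would assign both inputs no importance even though together they fully determine the output.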

  19. How can we find the important parts of the input for a given prediction? Idea 2: backpropagate importance. [Figure: network diagram with the output at the top; yellow = inputs]

  22. How can we find the important parts of the input for a given prediction? Idea 2: backpropagate importance. Examples: - Gradients (Simonyan et al.) - Deconvolutional Networks (Zeiler & Fergus) - Guided Backpropagation (Springenberg et al.) - Layerwise Relevance Propagation (Bach et al.) - Integrated Gradients (Sundararajan et al.) [Figure: network diagram with the output at the top; yellow = inputs]

  23. How can we find the important parts of the input for a given prediction? Idea 2: backpropagate importance. Examples: - Gradients (Simonyan et al.) - Deconvolutional Networks (Zeiler & Fergus) - Guided Backpropagation (Springenberg et al.) - Layerwise Relevance Propagation (Bach et al.) - Integrated Gradients (Sundararajan et al.) - DeepLIFT (Learning Important FeaTures) - https://github.com/kundajelab/deeplift, ICML 2017 - With Peyton Greenside and Anshul Kundaje [Figure: network diagram with the output at the top; yellow = inputs]

  24. Saturation revisited. [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]

  25. Saturation revisited. When (i_1 + i_2) >= 1, gradient is 0. [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]
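
The zero-gradient claim can be checked numerically with a short sketch. As an assumption, the saturating function is modeled as `min(i1 + i2, 1)`, and the gradient is estimated by a forward finite difference; neither choice comes from the slides.

```python
def saturating_output(i1, i2):
    # Assumed form of the plotted function: linear in i1 + i2, capped at 1.
    return min(i1 + i2, 1.0)


def numerical_gradient(i1, i2, step=1e-6):
    """Forward finite-difference estimate of (dy/di1, dy/di2)."""
    base = saturating_output(i1, i2)
    d_i1 = (saturating_output(i1 + step, i2) - base) / step
    d_i2 = (saturating_output(i1, i2 + step) - base) / step
    return d_i1, d_i2


saturated_gradient = numerical_gradient(1.0, 1.0)      # i1 + i2 = 2, in the plateau
unsaturated_gradient = numerical_gradient(0.25, 0.25)  # i1 + i2 = 0.5, on the slope

print(saturated_gradient, unsaturated_gradient)
```

In the plateau the estimated gradient is zero for both inputs, so a gradient-based importance score would also report that neither input matters.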

  26. The DeepLIFT solution: difference from reference. Reference: i_1 = 0 & i_2 = 0. [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]

  27. The DeepLIFT solution: difference from reference. Reference: i_1 = 0 & i_2 = 0. y = 0 when (i_1 + i_2) = 0 (reference). [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]

  28. The DeepLIFT solution: difference from reference. Reference: i_1 = 0 & i_2 = 0. y = 0 when (i_1 + i_2) = 0 (reference). At (i_1 + i_2) = 2, the “difference from reference” is +1, NOT 0. [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]

  29. The DeepLIFT solution: difference from reference. Reference: i_1 = 0 & i_2 = 0. y = 0 when (i_1 + i_2) = 0 (reference). At (i_1 + i_2) = 2, the “difference from reference” is +1, NOT 0. DeepLIFT addresses other failure modes besides saturation (see paper). [Figure: inputs i_1 and i_2 feed an output y; plot of y against i_1 + i_2]
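
A simplified sketch of the difference-from-reference idea on this toy example. It is not the DeepLIFT library's implementation: the function `min(i1 + i2, 1)` is an assumed form of the plotted curve, and the whole function is treated as a single nonlinearity applied to i_1 + i_2, with the output difference shared among inputs in proportion to their input differences.

```python
def saturating_output(i1, i2):
    # Assumed form of the plotted function: linear in i1 + i2, capped at 1.
    return min(i1 + i2, 1.0)


def difference_from_reference_scores(inputs, reference):
    """Split the output's difference-from-reference across the inputs.

    Uses one multiplier, delta_output / delta_sum, for every input, which
    is a simplification chosen for this two-input toy example.
    """
    delta_inputs = [x - r for x, r in zip(inputs, reference)]
    delta_output = saturating_output(*inputs) - saturating_output(*reference)
    delta_sum = sum(delta_inputs)
    if delta_sum == 0:
        return [0.0 for _ in inputs], delta_output
    multiplier = delta_output / delta_sum
    return [multiplier * d for d in delta_inputs], delta_output


scores, delta_output = difference_from_reference_scores([1.0, 1.0], [0.0, 0.0])
print(scores, delta_output)
```

With the all-zeros reference, the output difference is +1 and each input receives half of it, so the scores sum to the difference from reference instead of collapsing to zero in the saturated regime.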

  30. Reference matters! CIFAR10 model, class = “ship”. [Figure panel: Original]

  31. Reference matters! CIFAR10 model, class = “ship”. [Figure panels: Original, Reference, DeepLIFT scores]

  32. Reference matters! CIFAR10 model, class = “ship”. [Figure panels: Original, Reference, DeepLIFT scores] Suggestions on how to pick a reference: - MNIST: all zeros (background)

  33. Reference matters! CIFAR10 model, class = “ship”. [Figure panels: Original, Reference, DeepLIFT scores] Suggestions on how to pick a reference: - MNIST: all zeros (background) - Consider using a distribution of references - E.g. multiple references generated by shuffling a genomic sequence
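
One way to picture "a distribution of references" for a genomic sequence is sketched below. The helper names are hypothetical, the shuffle is a plain per-base shuffle (it preserves base composition only; other shuffling schemes are possible), and `mismatch_score` is a placeholder for a real per-position scoring method.

```python
import random


def shuffled_references(sequence, num_references=10, seed=0):
    """Generate references by shuffling the bases of the input sequence."""
    rng = random.Random(seed)
    references = []
    for _ in range(num_references):
        bases = list(sequence)
        rng.shuffle(bases)
        references.append("".join(bases))
    return references


def average_over_references(score_fn, sequence, references):
    """Average per-position scores computed against each reference."""
    per_reference = [score_fn(sequence, ref) for ref in references]
    return [sum(column) / len(per_reference) for column in zip(*per_reference)]


def mismatch_score(sequence, reference):
    # Placeholder scoring function (an assumption for illustration only):
    # 1.0 where the sequence differs from the reference, else 0.0.
    return [float(a != b) for a, b in zip(sequence, reference)]


sequence = "ACGTGTAACTGATAATGCCGATATT"
refs = shuffled_references(sequence, num_references=5)
avg = average_over_references(mismatch_score, sequence, refs)
print(refs[0], avg[:5])
```

Averaging over several shuffled references reduces the dependence of the scores on any single, arbitrary reference choice.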

  34. E.g.: morphing 8 to a 3 or a 6. [Figure labels: original, 8->3, 8->6; Backprop, Guided, Integrated gradients, DeepLIFT]

  35. Example biological problem: understanding stem cell differentiation. [Figure labels: fertilized egg, liver cells, cardiac cells, blood cells] Cell types are different because different genes are turned on.

  36. Example biological problem: understanding stem cell differentiation. [Figure labels: fertilized egg, liver cells, cardiac cells, blood cells] Cell types are different because different genes are turned on. How is cell-type-specific gene expression controlled?

  37. Example biological problem: understanding stem cell differentiation. [Figure labels: fertilized egg, liver cells, cardiac cells, blood cells] Cell types are different because different genes are turned on. How is cell-type-specific gene expression controlled? Ans: “control elements” act like switches to turn genes on.

  38. “Control Elements” are switches that turn genes [Figure labels: DNA sequence of a gene, Control element]

  39. “Control Elements” are switches that turn genes. Sequences contain “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT [Figure labels: DNA sequence of a gene, Control element]

  40. “Control Elements” are switches that turn genes. Sequences contain “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT. Controller proteins bind to DNA words. [Figure labels: DNA sequence of a gene, Control element]

  41. “Control Elements” are switches that turn genes. Control element + controller proteins loop over… [Figure labels: DNA sequence of a gene]

  42. “Control Elements” are switches that turn genes. Control element + controller proteins loop over… …and activate nearby genes. [Figure labels: DNA sequence of a gene]

  43. 89%* of disease-associated mutations are outside genes! [Figure labels: DNA sequence of a gene, Controller proteins] *Stranger et al., Genet., 2011

  44. 89%* of disease-associated mutations are outside genes! Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT [Figure labels: DNA sequence of a gene, Controller proteins] *Stranger et al., Genet., 2011

  47. 89%* of disease-associated mutations are outside genes! Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT. Many positions in a control element are not essential for its function! [Figure labels: DNA sequence of a gene, Controller proteins] *Stranger et al., Genet., 2011

  48. 89%* of disease-associated mutations are outside genes! Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT. Many positions in a control element are not essential for its function! → Which positions in controller elements matter? [Figure labels: DNA sequence of a gene, Controller proteins] *Stranger et al., Genet., 2011

  49. Q: Which positions in control elements matter?

  50. Q: Which positions in control elements matter? Experimentally measure control elements in different tissues.

  51. Q: Which positions in control elements matter? Experimentally measure control elements in different tissues. Predict tissue-specific activity of control elements from sequence using deep learning.
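
Predicting activity from sequence requires turning DNA into numbers first. The slides do not say which encoding is used; one-hot encoding is a common choice and is assumed here purely for illustration, using the example sequence from the earlier slides.

```python
BASES = "ACGT"


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Column order follows BASES (A, C, G, T). Characters outside BASES,
    such as N, are left as all-zero rows.
    """
    encoding = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        if base in BASES:
            row[BASES.index(base)] = 1
        encoding.append(row)
    return encoding


encoded = one_hot_encode("ACGTGTAACTGATAATGCCGATATT")
print(len(encoded), encoded[:4])
```

With this layout, a per-position importance score lines up naturally with one row of the encoding, which is what makes position-level interpretation of a sequence model possible.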
