
Workshop: Machine Learning and Deep Learning, Mark Hoffman PhD


  1. AIMed North America, California, 11–14 December 2019. Workshop: Machine Learning and Deep Learning. Mark Hoffman PhD; Robert Hoyt MD, FACP, FAMIA, ABPM-CI; Kevin Lyman. www.aimed.events/northamerica-2019/

  2. Speaker #1: Mark Hoffman • Presentation: The Promise and Perils of Real-World EHR Data • Title: Chief Research Information Officer, Children’s Mercy Hospital and Children’s Research Institute, Kansas City, MO • Bio: Dr Hoffman worked for Cerner Corp. for 16 years as Vice President for Genomics and Research before joining Children’s Mercy Hospital in 2016. He is also on the faculty of the University of Missouri Kansas City and is the principal investigator on a CDC grant. His goal is to improve capabilities in genomics, public health and big data. He has delivered a TED talk and is an inventor with 19 issued patents.

  3. Speaker #2: Robert Hoyt • Presentation: Machine Learning for Non-Data Scientists • Title: Associate Clinical Professor, Internal Medicine Department, Virginia Commonwealth University, Richmond, VA • Bio: Dr Hoyt has taught Health Informatics for many years and is the co-editor and author of Health Informatics: Practical Guide, seventh edition. His second textbook, Introduction to Biomedical Data Science, will be published in December. His goal is to help educate clinicians and informatics students about new trends in data science, including machine learning and artificial intelligence.

  4. Speaker #3: Kevin Lyman • Presentation: Practical Applications in Clinical AI • Title: CEO, Enlitic Corp., San Francisco, CA • Bio: Kevin Lyman is an engineer and entrepreneur who received a BS in Computer Science from RPI. Prior to working at Enlitic he was employed at Hasbro, SpaceX and Microsoft. As CEO of Enlitic, his focus is on integrating AI into Radiology workflow. Enlitic was twice named one of MIT Technology Review’s 50 smartest companies. He is also the founder of The Inventor’s Guild and is a highly sought-after speaker on AI.

  5. Machine Learning for Non-Data Scientists. Robert Hoyt MD, FACP, FAMIA, ABPM-CI

  6. Learning Objectives. After viewing, participants should be able to: • Discuss the importance of machine learning for clinicians • Enumerate the challenges of learning a programming language such as R or Python for machine learning • List some of the open source machine learning programs that do not require higher math or programming skills • Use RapidMiner as an example of ML software

  7. Disclaimer: I have no conflicts of interest to report.

  8. Why Clinicians Should Understand Machine Learning • Machine learning is commonly employed for predictive analytics, in addition to statistical approaches • Some knowledge of ML is important in order to intelligently read or review medical articles today • Understanding ML is a logical step towards also understanding deep learning and artificial intelligence

  9. So You Want To Be a Data Scientist?

  10. Machine Learning Challenges • Learning machine learning through a programming language probably means 1-2 years of education and experience • Fully understanding AI implies comfort with calculus and linear algebra • Prerequisites for some data science Master’s degrees include a programming language and higher math

  11. Machine Learning Challenges: Caveats • Because data scientists spend 60-80% of their time on data preparation and exploration, some knowledge of spreadsheets, visualization and biostatistics is mandatory • You must understand little data before big data and shallow learning before deep learning • Machine learning software provides the algorithmic phase of data analysis, but there is much more to know • However, machine learning software promotes the “democratization of data science”

  12. Is This Our Current Status?

  13. What Is the Path Forward? • Master’s in Data Science or Biomedical Data Science? • Take multiple online courses on your own: Coursera, Udacity, etc.? • Learn Python or R? • Use machine learning software?

  14. Open Source or Free for Academic Use
      Name | Dependency | Uniqueness | Limitations
      WEKA | Windows, Mac, Linux | GUI based. Associated with courses and textbook | Outdated appearance
      KNIME | Windows, Mac, Linux | Visual operators | Mild-moderate learning curve
      Orange | Windows, Mac, Linux | Python based. Visual operators. Intuitive | Limited community forum
      H2O.ai | Web-based | Advanced | Mild-moderate learning curve
      BigML | Web-based | Advanced | Mild-moderate learning curve
      BlueSky Statistics | Windows only | R based | Does not include neural networks
      RapidMiner | Windows, Mac, Linux | Visual operators and GUI based. Automated analysis | None. “Best of breed”?

  15. RapidMiner • Web based. Free for academic use. Free 30-day trial; after that, visual operators only • Comprehensive: data preparation, visualization, statistics, machine learning and deep learning • Excellent algorithm performance matrices • Automated steps: TurboPrep® and AutoModel® • Runs multiple algorithms simultaneously • Embedded help • User community

  16. RapidMiner
      Turbo Prep • Transform (filter, sort, split) • Clean (auto clean, PCA, normalize, remove low quality, highly correlated variables and duplicates, create dummy codes) • Merge datasets • Create pivot tables • Extensive data visualization • Extensions: NLP, DL, Stats, and link to Hadoop • Extensive algorithm library
      Auto Model • Screens variables for quality • Select the column of interest and it selects the appropriate algorithms for classification or regression • Clustering (k-means, x-means) • Runs multiple algorithms at same time • Output: AUC, accuracy, F score, sensitivity, specificity, precision, recall, classification errors
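The deck's point is that tools like RapidMiner report these outputs without any programming. For readers who want to see what the numbers mean, the following is a minimal Python sketch of how the listed metrics are commonly defined for a two-class problem. It is not RapidMiner's implementation; the function names, the confusion-matrix counts, and the example scores are all hypothetical, and the code assumes no denominator is zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn = true positives, false positives, true negatives,
    false negatives. Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
        "classification_error": 1 - accuracy,
    }


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. labels are 1 (positive) or 0 (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
example_auc = auc(labels=[1, 1, 0, 0], scores=[0.9, 0.4, 0.5, 0.1])

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"auc: {example_auc:.3f}")
```

Note that sensitivity and recall are the same quantity under two names, which is why the slide lists both.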

  17. TurboPrep - General

  18. TurboPrep - Transform

  19. TurboPrep - Charts

  20. TurboPrep - Cleanse
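Two of the cleansing steps named earlier in the deck, normalizing a numeric column and creating dummy codes for a categorical one, can be illustrated in a few lines of Python. This is a hedged sketch of the general techniques (min-max scaling and one-hot style dummy coding), not a description of what RapidMiner does internally; the function names and the tiny example columns are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numbers to the range 0..1 (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def dummy_code(values):
    """Turn one categorical column into 0/1 indicator columns."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Hypothetical columns, for illustration only.
ages = [20, 30, 40]
smoker = ["yes", "no", "yes"]

normalized_ages = min_max_normalize(ages)
smoker_dummies = dummy_code(smoker)

print(normalized_ages)
print(smoker_dummies)
```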

  21. AutoModel - General

  22. AutoModel - Predict

  23. AutoModel – Select Class

  24. AutoModel – Select Inputs

  25. AutoModel – Select Algorithms

  26. AutoModel - Results

  27. AutoModel - Performance

  28. AutoModel - Weights

  29. Naïve Bayes - Simulator Weighted Predictors

  30. AutoModel – Descriptive Stats

  31. AutoModel – Correlation Matrix
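As background for the correlation matrix screen, here is a minimal Python sketch of a Pearson correlation matrix computed by hand. Whether AutoModel uses Pearson correlation specifically is an assumption, and the function names and the small dataset are hypothetical. The code also assumes no column is constant, since that would make the denominator zero.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict of column name -> list of numbers."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical data: systolic BP rises with age, HDL falls with age.
data = {
    "age": [30, 40, 50, 60],
    "sbp": [110, 120, 130, 140],
    "hdl": [60, 55, 50, 45],
}
matrix = correlation_matrix(data)

for row_name, row in matrix.items():
    print(row_name, {k: round(v, 2) for k, v in row.items()})
```

Values near +1 or -1 flag pairs of variables that carry largely the same information, which is the reason a cleansing step may remove highly correlated variables.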

  32. Processes Running in the Background

  33. Unsupervised Learning – Clustering
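The deck names k-means as one of the clustering options. The following is a minimal Python sketch of the basic k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is an illustration under stated assumptions, not RapidMiner's implementation: the function name and the six example points are hypothetical, and the starting centroids are simply the first k points so that the result is repeatable.

```python
def kmeans(points, k, iterations=10):
    """Basic k-means on a list of numeric tuples.

    Starts from the first k points (a simplification chosen for
    repeatability) and returns (centroids, clusters).
    """
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-variable data with two visibly separate groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, clusters = kmeans(points, k=2)

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No outcome column is supplied here, which is what makes this unsupervised: the algorithm groups the rows using only the input variables.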

  34. Conclusions • Machine learning software allows clinicians to use supervised and unsupervised machine learning to model data, without programming languages or higher math • Supplemental reading in stats, visualization, performance, etc. is important • Collaboration with experts is always advised
