Workshop: Machine Learning and Deep Learning



slide-1
SLIDE 1

Workshop: Machine Learning and Deep Learning

Mark Hoffman PhD; Robert Hoyt MD, FACP, FAMIA, ABPM-CI; Kevin Lyman


www.aimed.events/northamerica-2019/

AIMed NORTH AMERICA, CALIFORNIA 11–14 DECEMBER 2019

slide-2
SLIDE 2

Speaker #1: Mark Hoffman

  • Presentation: The Promise and Perils of Real-World EHR Data
  • Title: Chief Research Information Officer, Children’s Mercy Hospital and Children’s Research Institute, Kansas City, MO
  • Bio: Dr. Hoffman worked for Cerner Corp. for 16 years as Vice President for Genomics and Research before joining Children’s Mercy Hospital in 2016. He is also on the faculty at the University of Missouri-Kansas City and is the principal investigator on a CDC grant. His goal is to improve capabilities in genomics, public health and big data. He has delivered a TED talk and is an inventor with 19 issued patents.

slide-3
SLIDE 3

Speaker #2: Robert Hoyt

  • Presentation: Machine Learning for Non-Data Scientists
  • Title: Associate Clinical Professor, Internal Medicine Department, Virginia Commonwealth University, Richmond, VA
  • Bio: Dr. Hoyt has taught Health Informatics for many years and is the co-editor and author of Health Informatics: Practical Guide, seventh edition. His second textbook, Introduction to Biomedical Data Science, will be published in December. His goal is to help educate clinicians and informatics students about new trends in data science, including machine learning and artificial intelligence.

slide-4
SLIDE 4

Speaker #3: Kevin Lyman

  • Presentation: Practical Applications in Clinical AI
  • Title: CEO, Enlitic Corp., San Francisco, CA
  • Bio: Kevin Lyman is an engineer and entrepreneur who received a BS in Computer Science from RPI. Prior to working at Enlitic he was employed at Hasbro, SpaceX and Microsoft. As CEO of Enlitic, his focus is on integrating AI into Radiology workflow. Enlitic was twice named one of MIT Technology Review’s 50 smartest companies. He is also the founder of The Inventor’s Guild and is a highly sought-after speaker on AI.

slide-5
SLIDE 5

Machine Learning for Non-Data Scientists

Robert Hoyt MD, FACP, FAMIA, ABPM-CI


slide-6
SLIDE 6

Learning Objectives

After viewing, participants should be able to:

  • Discuss the importance of machine learning for clinicians
  • Enumerate the challenges of learning a programming language such as R or Python for machine learning
  • List some of the open source machine learning programs that do not require higher math or programming skills
  • Use RapidMiner as an example of ML software

slide-7
SLIDE 7

Disclaimer

I have no conflicts of interest to report

slide-8
SLIDE 8

Why clinicians should understand machine learning

  • Machine learning is commonly employed for predictive analytics, in addition to statistical approaches
  • Some knowledge of ML is important in order to intelligently read or review medical articles today
  • Understanding ML is a logical step towards also understanding deep learning and artificial intelligence

slide-9
SLIDE 9

So You Want To Be a Data Scientist?

slide-10
SLIDE 10

Machine Learning Challenges

  • Learning machine learning through a programming language probably means 1-2 years of education and experience
  • Fully understanding AI implies comfort with calculus and linear algebra
  • Prerequisites for some data science Master’s degrees include a programming language and higher math

slide-11
SLIDE 11

Machine Learning Challenges

Caveats

  • Because data scientists spend 60-80% of their time on data preparation and exploration, some knowledge of spreadsheets, visualization and biostatistics is mandatory
  • You must understand little data before big data and shallow learning before deep learning
  • Machine learning software provides the algorithmic phase of data analysis, but there is much more to know
  • However, machine learning software promotes the “democratization of data science”

slide-12
SLIDE 12

Is This Our Current Status?

slide-13
SLIDE 13

What is the Path Forward?

  • Master’s in Data Science or Biomedical Data Science?
  • Take multiple online courses on your own: Coursera, Udacity, etc.?
  • Learn Python or R?
  • Use Machine Learning software?
slide-14
SLIDE 14

Open Source or Free for Academic Use

Name               | Dependency          | Uniqueness                                           | Limitations
WEKA               | Windows, Mac, Linux | GUI based. Associated with courses and textbook      | Outdated appearance
KNIME              | Windows, Mac, Linux | Visual operators                                     | Mild-moderate learning curve
Orange             | Windows, Mac, Linux | Python based. Visual operators. Intuitive            | Limited community forum
H2O.ai             | Web-based           | Advanced                                             | Mild-moderate learning curve
BigML              | Web-based           | Advanced                                             | Mild-moderate learning curve
BlueSky Statistics | Windows only        | R based                                              | Does not include neural networks
RapidMiner         | Windows, Mac, Linux | Visual operators and GUI based. Automated analysis   | None. “Best of breed”?
slide-15
SLIDE 15

RapidMiner

  • Web based. Free for academic use. Free 30-day trial; after that, visual operators only

  • Comprehensive: data preparation, visualization, statistics,

machine learning and deep learning

  • Excellent algorithm performance matrices
  • Automated steps: TurboPrepⓇ and AutoModelⓇ
  • Runs multiple algorithms simultaneously
  • Embedded help
  • User community
slide-16
SLIDE 16

RapidMiner

  • Turbo Prep
      • Transform (filter, sort, split)
      • Clean (auto clean, PCA, normalize, remove low quality, highly correlated variables and duplicates, create dummy codes)
      • Merge datasets
      • Create pivot tables
  • Extensive data visualization
  • Extensions: NLP, DL, Stats, and link to Hadoop
  • Extensive algorithm library
  • Auto Model
      • Screens variables for quality
      • Select the column of interest and it selects the appropriate algorithms for classification or regression
      • Clustering (k-means, x-means)
      • Runs multiple algorithms at the same time
      • Output: AUC, accuracy, F score, sensitivity, specificity, precision, recall, classification errors
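
To make the Turbo Prep cleaning steps concrete, the following is a minimal pure-Python sketch of three of them: removing duplicates, normalizing a numeric column, and creating dummy codes. It is not RapidMiner's implementation. The sample records, column names and function names are hypothetical and exist only for illustration.

```python
# Hypothetical patient records; the values are made up for illustration only.
rows = [
    {"age": 40, "sex": "F"},
    {"age": 60, "sex": "M"},
    {"age": 40, "sex": "F"},  # exact duplicate of the first row
    {"age": 80, "sex": "F"},
]


def remove_duplicates(records):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def min_max_normalize(records, column):
    """Rescale a numeric column to the range 0..1 (min-max normalization)."""
    values = [rec[column] for rec in records]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [
        {**rec, column: (rec[column] - lo) / span if span else 0.0}
        for rec in records
    ]


def dummy_code(records, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({rec[column] for rec in records})
    coded = []
    for rec in records:
        new_rec = {k: v for k, v in rec.items() if k != column}
        for cat in categories:
            new_rec[f"{column}_{cat}"] = 1 if rec[column] == cat else 0
        coded.append(new_rec)
    return coded


cleaned = dummy_code(min_max_normalize(remove_duplicates(rows), "age"), "sex")
for rec in cleaned:
    print(rec)
```

Min-max scaling is only one of several ways to normalize; it is used here as an assumption because the slide does not say which method the software applies.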

slide-17
SLIDE 17

TurboPrep - General

slide-18
SLIDE 18

TurboPrep - Transform

slide-19
SLIDE 19

TurboPrep - Charts

slide-20
SLIDE 20

TurboPrep - Cleanse

slide-21
SLIDE 21

AutoModel - General

slide-22
SLIDE 22

AutoModel - Predict

slide-23
SLIDE 23

AutoModel - Select Class

slide-24
SLIDE 24

AutoModel - Select Inputs

slide-25
SLIDE 25

AutoModel - Select Algorithms

slide-26
SLIDE 26

AutoModel - Results

slide-27
SLIDE 27

AutoModel - Performance
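
Most of the performance outputs listed earlier for Auto Model (accuracy, F score, sensitivity, specificity, precision, recall) can be derived from the four cells of a confusion matrix. The sketch below shows the arithmetic under stated assumptions: the counts are hypothetical, the function name is invented for illustration, and AUC is left out because it requires predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common performance measures from confusion-matrix counts.

    tp/fp/tn/fn = true positives, false positives, true negatives,
    false negatives.
    """
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)    # also called positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F score (F1): harmonic mean of precision and recall
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "recall": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "classification_error": 1 - accuracy,
    }


# Hypothetical counts for a binary classifier evaluated on 100 patients.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and recall are the same quantity under two names, which is why both appear in the output with identical values.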

slide-28
SLIDE 28

AutoModel - Weights

slide-29
SLIDE 29

Weighted Predictors

Naïve Bayes - Simulator

slide-30
SLIDE 30

AutoModel - Descriptive Stats

slide-31
SLIDE 31

AutoModel - Correlation Matrix
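
A correlation matrix holds the pairwise correlation between every pair of numeric columns, and it is what makes it possible to spot the highly correlated variables mentioned under Turbo Prep. The following pure-Python sketch computes Pearson correlations. The column names and values are hypothetical, and the use of Pearson's r is an assumption, since the slides do not state which coefficient the software reports.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of a with b."""
    names = list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in names}
        for a in names
    }


# Hypothetical data; the values are made up for illustration only.
data = {
    "age": [30, 40, 50, 60],
    "systolic_bp": [110, 120, 130, 140],  # rises linearly with age
    "heart_rate": [80, 70, 75, 65],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this made-up data, age and systolic_bp are perfectly correlated, so one of the two would be a candidate for removal before modeling.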

slide-32
SLIDE 32

Processes Running in the Background

slide-33
SLIDE 33

Clustering

Unsupervised Learning – Clustering
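
As a rough illustration of what k-means clustering does, here is a minimal pure-Python sketch: each point is assigned to its nearest centroid, and each centroid is then moved to the mean of its assigned points. The data points and function name are hypothetical. Seeding the centroids with the first k points is a simplification made here so the result is repeatable; real tools typically use randomized seeding, and x-means (which also chooses the number of clusters) is not shown.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [
                (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            (
                sum(p[0] for p in members) / len(members),
                sum(p[1] for p in members) / len(members),
            )
            if members
            else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical 2-D observations forming two visibly separate groups.
points = [
    (1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
    (2.0, 1.0), (8.0, 9.0), (9.0, 8.0),
]

centroids, clusters = kmeans(points, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No outcome column is supplied to the function, which is what makes this unsupervised: the groups emerge from the distances between observations alone.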

slide-34
SLIDE 34

Conclusions

  • Machine learning software allows clinicians to use supervised and unsupervised machine learning to model data, without programming languages or higher math
  • Supplemental reading in stats, visualization, performance, etc. is important
  • Collaboration with experts is always advised