Machine Learning for Healthcare 6.S897, HST.S53, Lecture 1: What makes healthcare unique?



SLIDE 1

Machine Learning for Healthcare 6.S897, HST.S53

  • Prof. David Sontag, MIT EECS, CSAIL, IMES

Lecture 1: What makes healthcare unique?

SLIDE 2

Outline for today’s class

  • 1. Brief history of AI and ML in healthcare
  • 2. Why now?
  • 3. Examples of machine learning in healthcare
  • 4. What is unique about ML in healthcare?
  • 5. Overview of class syllabus and projects
SLIDE 3

1970’s: MYCIN expert system

  • 1970’s (Stanford): MYCIN expert system for identifying bacteria causing severe infections
  • Proposed a good therapy in ~69% of cases, better than infectious disease experts

[Scanned pages from the MYCIN literature ("The Context of the MYCIN Experiments"):

FIGURE 1-1 Major parts of an expert system. Arrows indicate information flow (user, user interface, inference engine, knowledge base). The knowledge base is the program’s store of facts and associations it "knows" about a subject area such as medicine. For MYCIN, knowledge was represented mostly as conditional statements, or rules, of the form: IF there is evidence that A and B are true, THEN conclude there is evidence that C is true (abbreviated: If A and B, then C). The antecedent of a rule is the premise or left-hand side (LHS); the consequent is the action or right-hand side (RHS). The inference engine may chain rules together: If A, then B (Rule 1); If B, then C (Rule 2); A (Data); therefore C (Conclusion).

FIGURE 33-1 Short sample dialogue. The physician’s inputs appear in capital letters after the double asterisks. For example, "** THIS IS A 26 YEAR OLD MALE PATIENT" is parsed into structured facts: the age of the patient is 26; the sex of the patient is male.]

Dialogue interface
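The rule form and chaining described above can be sketched in a few lines. This is a minimal illustrative forward-chaining engine, not MYCIN's actual implementation (MYCIN also attached certainty factors to its rules, which are omitted here):

```python
def forward_chain(rules, facts):
    """Repeatedly fire any rule whose premises are all known facts,
    adding its conclusion, until nothing new can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)  # fire the rule: premise -> action
                changed = True
    return facts

# The chaining example from the figure: If A, then B; If B, then C; A; therefore C.
rules = [({"A"}, "B"), ({"B"}, "C")]
print(forward_chain(rules, {"A"}))  # derives B, and from B derives C
```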

SLIDE 4

1980’s: INTERNIST-1/QMR model

  • 1980’s (Univ. of Pittsburgh): INTERNIST-1/Quick Medical Reference
  • Diagnosis for internal medicine

[Diagram: bipartite graph relating diseases (flu, diabetes, pneumonia) to symptoms (fatigue, chest pain, cough, high A1C)]

Probabilistic model relating:
  • 570 binary disease variables
  • 4,075 binary symptom variables
  • 45,470 directed edges

Elicited from doctors: 15 person-years of work

Led to advances in ML & AI (Bayesian networks, approximate inference) [Miller et al., ‘86; Shwe et al., ‘91]

Problems:
  1. Clinicians entered symptoms manually
  2. Difficult to maintain, difficult to generalize
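The disease-symptom edges of QMR were later given a noisy-OR parameterization [Shwe et al., ‘91]: each present disease independently causes the symptom with some probability theta, plus a small "leak" for unmodeled causes. A toy sketch with hypothetical numbers, not the real QMR-DT parameters:

```python
def p_symptom_present(present_diseases, theta, leak):
    """P(symptom = 1 | diseases) under a noisy-OR with a leak probability."""
    p_off = 1.0 - leak                    # chance the background cause does not fire
    for d in present_diseases:
        p_off *= 1.0 - theta.get(d, 0.0)  # each present disease independently fails
    return 1.0 - p_off

# Hypothetical edge strengths for the symptom "fatigue".
theta = {"flu": 0.6, "diabetes": 0.3}
print(p_symptom_present({"flu", "diabetes"}, theta, leak=0.05))
# 1 - 0.95 * 0.4 * 0.7 = 0.734
```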
SLIDE 5

1980’s: automating medical discovery

Discovers that prednisone elevates cholesterol (Annals of Internal Medicine, ‘86)

[Robert Blum, “Discovery, Confirmation and Incorporation of Causal Relationships from a Large Time-Oriented Clinical Data Base: The RX Project”. Dept. of Computer Science, Stanford. 1981]

SLIDE 6

1990’s: neural networks in medicine

  • Neural networks with clinical data took off in 1990, with 88 new studies that year
  • Small number of features (inputs)
  • Data often collected by chart review

[FIGURE 2. A multilayer perceptron. This is a two-layer perceptron with four inputs, four hidden units, and one output unit.]

[Penny & Frost, Neural Networks in Clinical Medicine. Med Decis Making, 1996]
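The network in the figure is small enough to write out directly. A minimal sketch of its forward pass, with random placeholder weights rather than anything trained on clinical data:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 4))  # input -> hidden weights (4 inputs, 4 hidden units)
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))  # hidden -> output weights (1 output unit)
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Two-layer perceptron: 4 features in, one score in (0, 1) out."""
    h = sigmoid(x @ W1 + b1)        # hidden layer activations
    return sigmoid(h @ W2 + b2)[0]  # single output unit

print(forward(np.array([1.0, 0.0, 0.5, -0.2])))  # a value strictly between 0 and 1
```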

Problems:
  1. Did not fit well into clinical workflow
  2. Poor generalization to new places
SLIDE 7

[Table 1: 25 Neural Network Studies in Medical Decision Making*]

* For reference citations, see the reference list.
† P = prior probability of most prevalent category.
‡ D = ratio of training examples to weights per output.
§ A single integer in the accuracy column denotes percentage overall classification rate; a single real number between 0 and 1 indicates the AUROCC value. Neural = accuracy of neural net; Other = accuracy of best other method.
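Many entries in the accuracy column are AUROCC values. The area under the ROC curve equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one, which gives a simple way to compute it (a sketch, not necessarily how the cited studies computed it):

```python
def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly
    (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0: every positive outranks every negative
```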
SLIDE 8

Outline for today’s class

  • 1. Brief history of AI and ML in healthcare
  • 2. Why now?
  • 3. Examples of machine learning in healthcare
  • 4. What is unique about ML in healthcare?
  • 5. Overview of class syllabus and projects
SLIDE 9

DATA

Why now?

SLIDE 10

[Bar chart: Percentage of hospitals in the US.
  Basic EHR: 9.4% (2008), 12.2% (2009), 15.6% (2010), 27.6%* (2011), 44.4%* (2012), 59.4%* (2013), 75.5%* (2014), 83.8%* (2015)
  Certified EHR: 71.9% (2011), 85.2%* (2012), 94%* (2013), 96.9%* (2014), 96% (2015)]

Adoption of Electronic Health Records (EHR) has increased 9x since 2008

[Henry et al., ONC Data Brief, May 2016]

SLIDE 11

Large datasets

Laboratory for Computational Physiology

De-identified health data from ~40K critical care patients Demographics, vital signs, laboratory tests, medications, notes, …

SLIDE 12

Large datasets

“Data on nearly 230 million unique patients since 1995” $$$

SLIDE 13

President Obama’s initiative to create a 1 million person research cohort

[Precision Medicine Initiative (PMI) Working Group Report, Sept. 17, 2015]

THE PRECISION MEDICINE INITIATIVE

Large datasets

Core data set:

  • Baseline health exam
  • Clinical data derived from electronic health records (EHRs)
  • Healthcare claims
  • Laboratory data
SLIDE 14

Diversity of digital health data

genomics, imaging, phone, lab tests, vital signs, proteomics, devices, social media

SLIDE 15

Standardization

  • Diagnosis codes: ICD-9 and ICD-10 (International Classification of Diseases)

[https://blog.curemd.com/the-most-bizarre-icd-10-codes-infographic/] [https://en.wikipedia.org/wiki/List_of_ICD-9_codes]

SLIDE 16

Standardization

  • Diagnosis codes: ICD-9 and ICD-10 (International Classification of Diseases)
  • Laboratory tests: LOINC codes
  • Pharmacy: National Drug Codes (NDCs)
  • Unified Medical Language System (UMLS): millions of medical concepts

[http://oplinc.com/newsletter/index_May08.htm]

SLIDE 17

ALGORITHMS

Why now?

SLIDE 18

Advances in machine learning

  • Major advances in ML & AI
    – Learning with high-dimensional features (e.g., l1-regularization)
    – Semi-supervised and unsupervised learning
    – Modern deep learning techniques (e.g., convnets, variants of SGD)
  • Democratization of machine learning
    – High-quality open-source software, such as Python’s scikit-learn, TensorFlow, Torch, Theano
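The first advance listed above, learning with high-dimensional features via l1-regularization, is itself a one-liner in scikit-learn. A sketch on synthetic data (a stand-in for real clinical features; the feature count and regularization strength are arbitrary):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))           # 200 "patients", 50 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # only 2 features actually matter

# The l1 penalty drives coefficients of irrelevant features toward exactly zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print(np.count_nonzero(clf.coef_), "of", clf.coef_.size, "coefficients are nonzero")
```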

SLIDE 19

Industry interest in AI & healthcare

SLIDE 20

Outline for today’s class

  • 1. Brief history of AI and ML in healthcare
  • 2. Why now?
  • 3. Examples of machine learning in healthcare
  • 4. What is unique about ML in healthcare?
  • 5. Overview of class syllabus and projects
SLIDE 21

Emergency Department:

  • Limited resources
  • Time sensitive
  • Critical decisions
SLIDE 22

[Figure: timeline of an ED visit, from T=0 through 30 min and 2 hrs to disposition: triage information (free text), repeated vital signs (continuous values, measured every 30 s), MD comments (free text), lab results (continuous valued), specialist consults, physician documentation]

Data in Emergency Department (ED)

Collaboration with Steven Horng, MD

Electronic records for over 300,000 ED visits

SLIDE 23

Opportunities for machine learning

  • Triggering clinical pathways
  • Context-specific displays
  • Risk stratification
  • Improving clinical documentation

Pathways have been shown to reduce in-hospital complications without increasing costs [Rotter et al., 2010]

[Figure: BIDMC Cellulitis Clinical Pathway Flowchart]

SLIDE 24

Opportunities for machine learning

Automating triggers: don’t rely on the user’s knowledge that the pathway exists!

  • Triggering clinical pathways
  • Context-specific displays
  • Risk stratification
  • Improving clinical documentation

Our task: Determine whether a patient has or is suspected to have cellulitis

SLIDE 25

Opportunities for machine learning

Automatically place specialized order sets on patient displays

Our task: Determine whether patient complained of chest pain, or is a psych patient

  • Triggering clinical pathways
  • Context-specific displays
  • Risk stratification
  • Improving clinical documentation

SLIDE 26

Opportunities for machine learning

  • Triggering clinical pathways
  • Context-specific displays
  • Risk stratification
  • Improving clinical documentation

Ex 1: Likelihood of mortality or admission to ICU
Ex 2: Early detection of severe sepsis (topic of next week’s lecture)

SLIDE 27

Real-time predictions in BIDMC emergency department

Conditions: Acute abdominal pain, Allergic reaction, Ankle fracture, Back pain, Bicycle accident, Cardiac etiology, Cellulitis, Chest pain, Cholecystitis, Cerebrovascular accident, Deep vein thrombosis, Employee exposure, Epistaxis, Gastroenteritis, Gastrointestinal bleed, Geriatric fall, Headache, Hematuria, Intracerebral hemorrhage, Infection, Kidney stone, Laceration, Motor vehicle accident, Pancreatitis, Pneumonia, Psych, Obstruction, Septic shock, Severe sepsis, Sexual assault, Suicidal ideation, Syncope, Urinary tract infection

History: Alcoholism, Anticoagulated, Asthma/COPD, Cancer, Congestive heart failure, Diabetes, HIV+, Immunosuppressed, Liver malfunction

[Halpern, Horng, Choi, Sontag, JAMIA ‘16]

SLIDE 28

Opportunities for machine learning

  • Triggering clinical pathways
  • Context-specific displays
  • Risk stratification
  • Improving clinical documentation

SLIDE 29

Improving documentation: Chief complaints

[Screenshot: triage note with predicted chief complaints shown via contextual auto-complete]

In use for all 55,000 patients/year that present at the BIDMC ED. Changed workflow to have chief complaints assigned last, and predict them.

SLIDE 30

[Chart: percentage of standardized chief complaints (per week), 0% to 100%, over time. Annotations mark when a drop-down list (no predictions) was used, when predictions were enabled for a few triage nurses, when they were enabled for all nurses, and when e-mail notifications began]

Improving documentation: Chief complaints

SLIDE 31

Zooming out…

Patient:

Demographic data:
  • Age/gender
  • Socioeconomic status, lifestyle
  • Company code

Medical Claims:
  • ICD9 diagnosis code
  • CPT code (procedure)
  • Specialty
  • Location of service
  • Date of service

Lab Tests:
  • LOINC code (urine or blood test name)
  • Results (actual values)
  • Lab ID
  • Range high/low
  • Date

Medications:
  • NDC code (drug name)
  • Days of supply
  • Quantity
  • Service Provider ID
  • Date of fill

[Timeline: 10 years of data per patient]

Collaboration with:
SLIDE 32

Temporal modeling of disease progression

  • Find markers of disease stage and progression, statistics of what to expect when
    – What is the “typical trajectory” of a female diagnosed with Sjögren’s syndrome at the age of 19?
  • Estimate a patient’s future disease progression
    – When will a specific individual with smoldering multiple myeloma (a rare blood cancer) transition to full-blown multiple myeloma?
    – Which second-line diabetes treatment should we give to a patient?
SLIDE 33

[Figure: 20-year timelines for “Me”, Patient 1, and Patient 2; my future trajectory is unknown (?????)]
SLIDE 34

[Figure: the same 20-year timelines for “Me”, Patient 1, and Patient 2]
SLIDE 35

[Figure: treatment timelines. My timeline asks “Drug A or Drug B?” (???); Patient 1 received Drug A then Drug C; Patient 2 received Drug B]
SLIDE 36

Outline for today’s class

  • 1. Brief history of AI and ML in healthcare
  • 2. Why now?
  • 3. Examples of machine learning in healthcare
  • 4. What is unique about ML in healthcare?
  • 5. Overview of class syllabus and projects
SLIDE 37

What makes healthcare different?

  • Life or death decisions
    – Need robust algorithms
    – Checks and balances built into ML deployment
    – (Also arises in other applications of AI such as autonomous driving)
    – Need fair and accountable algorithms
  • Many questions are about unsupervised learning
    – Discovering disease subtypes, or answering questions such as “characterize the types of people that are highly likely to be readmitted to the hospital”
  • Many of the questions we want to answer are causal
    – Naïve use of supervised machine learning is insufficient
SLIDE 38

What makes healthcare different?

  • Often very little labeled data (e.g., for clinical NLP)
    – Motivates semi-supervised learning algorithms
  • Sometimes small numbers of samples (e.g., a rare disease)
    – Learn as much as possible from other data (e.g., healthy patients)
    – Model the problem carefully
  • Lots of missing data, varying time intervals, censored labels
SLIDE 39

What makes healthcare different?

  • Difficulty of de-identifying data
    – Need for data sharing agreements and sensitivity
  • Difficulty of deploying ML
    – Commercial electronic health record software is difficult to modify
    – Data is often in silos; everyone recognizes the need for interoperability, but progress is slow
    – Careful testing and iteration is needed
SLIDE 40

Outline for today’s class

  • 1. Brief history of AI and ML in healthcare
  • 2. Why now?
  • 3. Examples of machine learning in healthcare
  • 4. What is unique about ML in healthcare?
  • 5. Overview of class syllabus and projects
SLIDE 41

Course staff

  • David Sontag (instructor)
    – Assistant professor in EECS, joint IMES & CSAIL
    – PhD MIT, then 5 years as professor at NYU
    – Leads clinical machine learning research group
  • Maggie Makar (teaching assistant)
    – PhD student with John Guttag, studying ML for healthcare
    – Before PhD, worked for 2.5 yrs as a researcher at Brigham and Women’s Hospital
  • We prefer Piazza to e-mail. If e-mail is necessary, please send to 6.s897hst.s53@gmail.com
SLIDE 42

Prerequisites

  • Must submit pre-req quiz (on course website) by 11:59PM EST today
  • We assume a previous undergraduate-level ML class, and comfort with:
    – Machine learning methodology (e.g., generalization, cross-validation)
    – Supervised machine learning techniques (e.g., L1-regularized logistic regression, SVMs, decision trees)
    – Optimization for ML (e.g., stochastic gradient descent)
    – Clustering (e.g., k-means)
    – Statistical modeling (e.g., Gaussian mixture models)
SLIDE 43

Logistics

  • Course website: http://people.csail.mit.edu/dsontag/courses/mlhc17/
  • All announcements made via Piazza – make sure you are signed up for it!
  • Office hours will be announced next week
  • Grading:
    – 25% homework (2-3 problem sets)
    – 25% participation
    – 50% course project
  • Because of space limitations, auditors must obtain permission of course staff (e-mail 6.s897hst.s53@gmail.com)
SLIDE 44

Homework (tentative)

  • PS0 (this week): CITI “Data or Specimens Only Research” training https://mimic.physionet.org/gettingstarted/access/
  • PS1: Supervised ML on real-world clinical data, survival analysis, causal inference
  • PS2: Neural nets for diagnosis from medical images and/or time series
  • PS3: Disease progression modeling
SLIDE 45

Readings

  • 2-4 required readings most weeks
    – Research articles, ranging from applied to theoretical
    – Required response to readings (short questions; fast) that you submit prior to next class
  • Background videos (optional)
    – Neural networks (convnets, recurrent neural nets)
    – Bayesian networks
    – We will assume that you have watched these before the relevant lecture
SLIDE 46

Projects

  • This will be the most interesting part of class, and where you will learn the most
  • Teams of 4-5 students
  • Use real-world clinical data!
  • Two types of projects:
    – 6-8 projects proposed by clinical mentors, working closely with them on their data
    – Your own design, using publicly available data
SLIDE 47

#1: When does deployed ML break?

Clinical mentor: Adam Wright, PhD, Brigham and Women’s Hospital; Associate Professor of Medicine, Harvard Medical School

[Wright A, et al. “Analysis of clinical decision support system malfunctions: a case series and survey.” J Am Med Inform Assoc (2016) 23 (6): 1068-1076]

Goal: anomaly detection system to identify clinical decision support malfunctions

SLIDE 49

#2: Improving accuracy of CDS alerts

Clinical mentor: Adam Wright, PhD, Brigham and Women’s Hospital; Associate Professor of Medicine, Harvard Medical School

  • Most clinical decision support (CDS) systems are simple & rule-based (“If the patient is over 65 and has not received a vaccination, suggest one”)
  • Once deployed, we gather data on when CDS alerts are ignored or overridden by users
  • Goal: use machine learning to improve accuracy of alerts. Other angles we might consider:
    – Clustering to understand why alerts were overridden
    – Tackling the false negatives, i.e. broadening the alerts
    – Deep learning on clinical text
    – Learning interpretable models
SLIDE 50

#3 Predicting antibiotic resistance

  • Culture results can take up to 6 days
  • Patients are started on empiric antibiotics based on population-level resistance patterns
  • Critical patients, if started on wrong antibiotics, may not survive that long
  • Can we predict a patient’s personalized antibiotic resistance profile even before their culture is available?

Clinical mentors:
  • Steven Horng, MD MMSc, and Eugene Kim, MD (Beth Israel Deaconess Medical Center, Dept. of Emergency Medicine)
  • Sanjat Kanjilal, MD MPH (Massachusetts General Hospital, Div. of Infectious Diseases)
SLIDE 51

#4 Progression of Congestive Heart Failure

  • Heart unable to pump enough blood to meet body’s demands
  • Heart failure hospitalizations cost the US over $17 billion/year
    – Physicians struggle to diagnose & treat heart failure exacerbations before patients require hospitalization
  • Patients with heart failure progress at different rates. It is unclear when patients will worsen, and the gold standard test is infrequently performed
  • Goal: predict heart failure progression using frequently collected data in the electronic medical record
    – Vitals, medications, orders, laboratory tests, echocardiography & chest x-ray reports

Clinical mentors:
  • Steven Horng, MD MMSc (Beth Israel Deaconess Medical Center, Dept. of Emergency Medicine)
  • Sandeep Gangireddy, MD (Beth Israel Deaconess Medical Center, Cardiologist, Informatics Research Fellow)
SLIDE 52

PUBLICLY AVAILABLE DATASETS

Projects

SLIDE 53

Critical care (~40K patients)

SLIDE 54

Multiple Myeloma (975 patients)

SLIDE 55

Parkinson’s disease (400+ subjects)

SLIDE 56

Mammography (86K subjects)

Competitive Period Launch: Nov 18, 2016; Competitive Period Close: May 9, 2017

Out of 1000 women screened, only 5 will have breast cancer.
Goal: develop algorithms for risk stratification of screening mammograms that can be used to improve breast cancer detection

SLIDE 57

Pathology (200 patients)

Competitive Period Launch: Nov 20, 2016; Competitive Period Close: April 1, 2017

[Images: normal vs. metastasis] Whole slide images with lesion-level annotations of metastases
SLIDE 58

Diabetic retinopathy

https://www.kaggle.com/c/diabetic-retinopathy-detection

Enter Competition By: Mar 31, 2017; Competitive Period Close: April 12, 2017

Lung cancer

(Last year’s challenge was on diagnosing heart disease – data also available, via Kaggle)