Applying Machine Learning Methods to Predicting Reliance on VA Primary Care
Edwin S. Wong, PhD
VA Puget Sound Health Care System
Department of Health Services, University of Washington
2018 AcademyHealth Annual Research Meeting, June 25, 2018
VETERANS HEALTH ADMINISTRATION
Acknowledgements
- Sources of funding
– VA Career Development Award (Wong, CDA 13-024)
- Disclosure:
– Dr. Wong reports ownership of common stock in UnitedHealth Group Inc. totaling less than $15,000 in market value
Why Machine Learning?
- Increased capabilities not possible with traditional methods
– Broader range of health care applications
– Support analysis of data of greater size and complexity
– Ability to develop models of greater complexity
– Richer insights
– Improvement in model performance
Example Machine Learning Applications in Health Services Research
- Predicting health and health care outcomes
- Detecting outliers
– High cost patients
- Classification
- Subgroup analysis
– Phenotyping, risk stratification
- Measuring heterogeneous treatment effects
Application: Dual Use of VA and Non-VA Health Care
- Veterans Affairs Health Care System (VA)
– Large, nationally integrated health system
– 8.4 million Veteran enrollees in FY2016
- VA enrollees are not precluded from obtaining care through non-VA sources
– ~80% have at least one other non-VA health insurance source
– Nearly all enrollees age 65+ are dually enrolled in Medicare
Research Objective
- To examine how best to predict which VA enrollees will be mostly reliant on VA primary care in the next year, using predictor variables from the current year
- Policy Relevance:
– VA reliance is an input to projection models used to inform VA health care budget requests submitted to Congress
– Better predictions of reliance may improve the accuracy of these requests
Data Sources
- VA Corporate Data Warehouse
– Comprehensive administrative data on all users of VA health care
- Medicare Claims
– Utilization of outpatient services through fee-for-service Medicare
- 2012 VA Survey of Healthcare Experiences of Patients
– Random sample of Veterans receiving care at VA outpatient facilities
- Area Health Resource File
– Characteristics of Veterans’ county of residence
Population Studied
- Sample of 83,825 VA patients responding to the 2012 VA SHEP
– Dually enrolled in fee-for-service Medicare in FY2012 and FY2013
– Alive at end of FY2013
– Weighted to a population of 4.6 million VA patients
Definition of VA Reliance
- Counts of face-to-face office visits in primary care
- VA Reliance = Proportion of all visits from VA
– # visits in VA ÷ (# visits in VA + # visits via Medicare)
- Dichotomous measure denoting whether Veterans were mostly reliant on VA
– VA reliance ≥ 0.5
1Burgess JF, et al. (2011). Health Econ 20(2).
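The reliance measure and its dichotomization are simple enough to sketch directly. A minimal illustration (in Python rather than the R used later in the talk; the function names are hypothetical):

```python
def va_reliance(va_visits, medicare_visits):
    """Proportion of all primary care visits obtained through VA."""
    total = va_visits + medicare_visits
    if total == 0:
        return None  # reliance is undefined for patients with no visits
    return va_visits / total

def mostly_va_reliant(va_visits, medicare_visits, threshold=0.5):
    """Binary outcome: VA reliance >= 0.5 classifies a Veteran as mostly VA reliant."""
    r = va_reliance(va_visits, medicare_visits)
    return r is not None and r >= threshold
```

For example, a Veteran with 3 VA visits and 1 Medicare visit has reliance 0.75 and is classified as mostly VA reliant.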
Predictor Variables (Features)
Group: Example Variables
Demographics: Age, gender, marital status, race/ethnicity
Access to Care: Distance to nearest VA, copayment exemption
Comorbidities: Heart failure, hypertension, diabetes, liver disease
Patient-Reported Experiences: Provider rating, ability to receive immediate care, parking availability, office cleanliness
Local Area Factors: Poverty rate, unemployment rate, hospital beds per 1,000
- 59 features in 5 categories
Machine Learning Framework for Classification
- Analytic Objective: Learn a target classifier function C that best assigns input variables X to an output variable y:
– y <- C(X)
– Binary classification: y = 0 (not VA reliant), 1 (VA reliant)
– X = matrix of predictor variables, or features
- Policy Goal: Make accurate predictions of Veterans’ future reliance classification given observed features in the present
Machine Learning Objective
- Goal: Assess properties of the model “out-of-sample”
– How would the model perform in practice?
– Causality deemphasized; focus on performance and fit
- Use a training sample to estimate the model
- Assess model performance on a separate validation sample
- Consider multiple algorithms or models
– The “best” model will depend on the research question and analytical data
– No single model is always superior
Road Map for Classifying VA Reliance
- Model set-up
– Pre-processing of data (cleaning and transforming)
– Identify performance metric (loss function)
– Resampling methods (validation set generation)
– Identify candidate algorithms
Road Map for Classifying VA Reliance
- Build Models
– Estimate model parameters
– Determine best values of tuning parameters
– Assess model fit
– Calculate performance of the final model
- Identify “best” of candidate models
Visual Roadmap
[Figure: pipeline from raw data to best model. Steps: gathering data and preprocessing; train/test/validation split; building and tweaking candidate models on train/validation data; comparing best models against test data; applying the best model to out-of-sample data.]
Preprocessing
- Data may have irregularities that influence model stability and performance
- Models differ in their assumptions and data requirements
- Common preprocessing tasks:
– Correcting inconsistent data
– Addressing missing data
– Centering and scaling
– Transformations of individual predictors or groups of predictors
– Discretizing continuous predictors
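Two of these tasks, centering/scaling and mean imputation, can be sketched minimally (a Python illustration with hypothetical function names; in the talk's actual R workflow, caret's preProcess handles such steps):

```python
from statistics import mean, stdev

def center_and_scale(values):
    """Standardize a predictor: subtract the mean, divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def impute_mean(values):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]
```

For example, center_and_scale([1, 2, 3]) returns [-1.0, 0.0, 1.0], and impute_mean([1, None, 3]) fills the gap with 2.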
Variable Selection
- A more parsimonious model may be preferred
– More complex models may achieve higher performance at the cost of overfitting
– Computational limitations
– Simpler models are easier to interpret
- Assess the gain in performance from a complex model against a simpler, lower-variance model
Performance Metrics for Classification Models
- Several common metrics to assess performance:
Metric: Description
Accuracy: Proportion correctly classified by the model
Kappa Statistic: Inter-rater agreement; performance adjusting for agreement due to random chance
Sensitivity: True positive (TP) rate: TP / [TP + FN]
Specificity: True negative (TN) rate: TN / [TN + FP]
Area Under ROC Curve (AUROC): Average value of sensitivity over all possible values of specificity
FP = false positive, FN = false negative
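All of these metrics except AUROC fall directly out of a 2x2 confusion matrix; a minimal sketch (Python for illustration; the function name is hypothetical):

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and Cohen's kappa from confusion counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    # Agreement expected by chance, from the marginal frequencies
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((tn + fn) / n) * ((tn + fp) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "kappa": kappa}
```

With 40 true positives, 10 false positives, 40 true negatives, and 10 false negatives, accuracy, sensitivity, and specificity are each 0.8 and kappa is 0.6.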
Resampling Methods
- Facilitate estimation of model performance on data “unseen” during training
- Allow assessment of the variability and stability of the model
- Help protect against model overfitting
– Overfit models will “memorize” the data
– Good performance on training data does not necessarily generalize “out-of-sample”
Resampling Methods
- Repeat for a given number of resampling iterations:
– Construct validation sample by holding out observations
– Fit model on remaining observations (i.e., the training sample)
– Predict on validation sample
– Calculate performance using the specified metric
- Assess model performance across all iterations
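The iteration above can be sketched as a generic k-fold loop (a minimal Python illustration; all names are hypothetical, and real analyses would use caret's built-in resampling):

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(data, k, fit, predict, metric):
    """Hold out each fold in turn, train on the rest, score the held-out
    predictions with the supplied metric, and collect the scores."""
    scores = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [row for i, row in enumerate(data) if i not in held_out]
        valid = [row for i, row in enumerate(data) if i in held_out]
        model = fit(train)
        preds = [predict(model, row) for row in valid]
        scores.append(metric(valid, preds))
    return scores
```

The spread of the per-fold scores is what allows the variability and stability of the model to be assessed.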
Resampling Methods
- Several commonly applied methods to define validation samples:
Method: Brief Description
Simple Cross-Validation: Partition data into training and test samples
K-fold Cross-Validation: Split data into K equally sized blocks
Repeated K-fold Cross-Validation: Create multiple versions of K folds
Leave Group Out: Define a random proportion of the data to train the model; repeat multiple times
Bootstrapping: Construct a random sample, with replacement, of the same size as the original data set
Unified Prediction Framework in R
- caret: Classification And REgression Training
– R package supporting modeling and prediction across 200+ models
– Streamlined workflow
- Build models using the train function, specifying:
– Resampling method
– Pre-processing functions
– Performance metric
– Grid of tuning parameters
- Assess model performance out-of-sample using the predict function
- Collect and visualize resampling results
- Supports parallel processing
Selecting Candidate Algorithms
- More than 200 machine learning algorithms and models supported by caret
- Documentation on GitHub:
– https://topepo.github.io/caret/available-models.html
- Considerations:
– Computational limitations
– Training time
– Predictive performance
– Model assumptions
– Number of features
– Scalability
Candidate Models
Model: Brief Description
Logistic Regression: Parametric regression for binary outcomes
Lasso: Regularization method that shrinks some coefficients to zero
Single Tree: Repeated partitioning of the predictor space into the subspaces producing the biggest performance improvement
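The "biggest performance improvement" criterion behind a single tree can be illustrated with a greedy search for one threshold split (a hypothetical Python sketch scoring splits by misclassification error; CART-style implementations typically use Gini impurity or entropy instead):

```python
def best_split(xs, ys):
    """Find the threshold on one numeric predictor that minimizes
    misclassification error when each side predicts its majority class."""
    def errors(labels):
        # misclassifications if we predict the majority class for this node
        return min(labels.count(0), labels.count(1)) if labels else 0
    best = (None, errors(ys))  # baseline: no split, predict overall majority
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = errors(left) + errors(right)
        if err < best[1]:
            best = (t, err)
    return best  # (threshold, misclassification count); threshold None if no split helps
```

A tree grows by applying this search recursively within each resulting subspace until a stopping rule is met.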
Candidate Models
Model: Brief Description
Random Forest: Ensemble method constructing multiple trees on bootstrapped training sets, splitting over a random subset of predictors and averaging predictions across trees
Gradient Boosting Machine: Sequentially constructs trees using information from previously grown trees; the prediction is a weighted average of predictions across trees
Artificial Neural Network: Directed graph comprising a network of nodes connected by weights that are updated using predefined rules
Steps in caret
- Set up parameters for training models
cv10Ctrl <- trainControl(method="cv", number=10, classProbs=TRUE, summaryFunction = twoClassSummary)
- Specify custom tuning parameters
parmGrid <- expand.grid(n.trees=(2:10)*1000, interaction.depth=c(1,3,5), shrinkage=0.001, n.minobsinnode=10)
Steps in caret (cont’d)
- Train Models
gbm_tune <- train(y ~ age + sex + educ_cat + …, data=trainData, method="gbm", distribution="bernoulli", metric="ROC", preProcess=c("center"), tuneGrid=parmGrid, trControl=cv10Ctrl)
- Predict out-of-sample
phat <- predict(gbm_tune, newdata=newData[,covars], type="prob")
Predictive Performance of Candidate Models
Model                       AUROC   Sensitivity   Specificity
Logistic Regression         0.740   0.912         0.321
Lasso                       0.741   0.914         0.318
Single Tree                 0.709   0.894         0.355
Random Forest               0.756   0.934         0.298
Gradient Boosting Machine   0.759   0.918         0.339
Neural Network              0.718   0.872         0.362
Visualizing Resample Results
Model Performance vs. Tuning Parameters
Variable Importance
Summary and Concluding Remarks
- Example of machine learning methods for classification within a unified framework, applied to a policy-relevant VA metric
- Performance varied across the 6 candidate machine learning models
- The best performing model (gradient boosting machine) exhibited performance considered “fair” based on prior thresholds
- Economic theory suggests other features (e.g., attributes of non-VA options) may further improve predictive performance
References
- Machine Learning
– Hastie T, Tibshirani R, Friedman J (2008). The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer.
– Kuhn M, Johnson K (2013). Applied Predictive Modeling. Springer.
– Wu X, Kumar V, Quinlan JR, et al. (2008). “Top 10 Algorithms in Data Mining.” Knowledge and Information Systems; 14: 1-37.
References
- Predictive Modeling Using Caret
– Kuhn M (2008). “Building Predictive Models in R Using the caret Package.” Journal of Statistical Software; 28(5).
– Kuhn M (2013). Predictive Modeling with R and the caret Package. https://www.r-project.org/conferences/useR-2013/Tutorials/kuhn/user_caret_2up.pdf
– Kuhn M (2018). The caret Package. http://topepo.github.io/caret/index.html
– caret – Available Models. https://topepo.github.io/caret/available-models.html
References
- Health Service Applications
– Rose S (2014). “A Machine Learning Framework for Plan Payment Risk Adjustment.” Health Services Research; 51(6 Part 1): 2358-2374.
– Park S, Basu A (2018). “Alternative Evaluation Metrics for Risk Adjustment Methods.” Health Economics; 27(6): 984-1010.
– Weng SF, Reps J, Kai J, et al. (2017). “Can machine-learning improve cardiovascular risk prediction using routine clinical data?” PLOS ONE; doi: 10.1371/journal.pone.0174944.
Drawbacks of Machine Learning Approaches
- Often difficult or impossible to interpret
- No single algorithm performs best across different applications
- Primarily designed for cross-sectional designs (though this is changing)