Optimizing Risk Prediction for High Utilizers in a Safety Net System



slide-1
SLIDE 1

Optimizing Risk Prediction for High Utilizers in a Safety Net System

Zeyu (Zach) Li, MPH; Kathleen Tatem, MPH; Spriha Gogia, PhD, MPH; Jeremy Ziring*; Remle Newton-Dame, MPH; Jesse Singer, DO, MPH; Dave Chokshi, MD, MSc, FACP

* NYU School of Medicine, New York, NY

The presenting author and co-authors have no relevant financial relationships to disclose.

slide-2
SLIDE 2

INTRODUCTION


slide-3
SLIDE 3

About New York City Health + Hospitals

  • Largest municipal health system in the country
  • Safety-net system: mandate to care for the uninsured/underserved in New York City
  • 11 hospitals, 6 diagnostic and treatment centers, 5 long-term care facilities, 70+ ambulatory care centers, correctional health services
  • >1 million patients, 6 million visits per year

[Map of New York City boroughs: Staten Island, Manhattan, Brooklyn, Queens, the Bronx]

slide-4
SLIDE 4
Risk Stratification Approach

Stratification → Segmentation → Targeting

  • Define “risk”
  • Stratify population into high, medium and low risk with risk score algorithm
  • Segment high risk population into intervenable groups
  • Target drivers of high risk in each segment with effective programming

slide-5
SLIDE 5

METHODS


slide-6
SLIDE 6

Study Population

  • N = 833,969 adult patients with encounters at H+H during the measurement year (Q3 2016 – Q2 2017)
  • Excluded:
      • Pregnant women
      • Actively incarcerated
      • Only ancillary care (e.g., radiology visits, blood draws)
      • Missing name, date of birth or sex in patient record

slide-7
SLIDE 7

Outcomes of Interest

  • Limited consensus on the definition of “high risk” in the literature: visits, re-visits, days
  • Goal: predict high ED/inpatient (“acute”) utilization in the prediction year (Q3 2017 – Q2 2018) from data in the measurement year (Q3 2016 – Q2 2017)
  • Outcomes tested:
      • 10+ acute days: outcome of the original algorithm (logistic regression)* (top 1%)
      • 5+ acute days: to capture medium/high risk populations (top 5%)
      • No. of acute days: continuous outcome

* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910357/
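A minimal sketch of how the three tested outcomes could be derived from one patient's acute day count in the prediction year. The function name and dictionary keys are illustrative assumptions, not the authors' code.

```python
def acute_day_outcomes(acute_days: int) -> dict:
    """Derive the three tested outcomes from a prediction-year acute day count.

    `acute_days` is the number of ED/inpatient days in the prediction year.
    The key names below are hypothetical labels chosen for this sketch.
    """
    if acute_days < 0:
        raise ValueError("acute_days cannot be negative")
    return {
        "ten_plus_acute_days": acute_days >= 10,   # binary outcome, original algorithm
        "five_plus_acute_days": acute_days >= 5,   # binary outcome, medium/high risk
        "acute_days": acute_days,                  # continuous outcome
    }
```

A patient with 7 acute days would count as a positive case for the 5+ outcome but not for the 10+ outcome.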

slide-8
SLIDE 8

Data Prep and Flow

[Flow diagram. Labels: Data Sources; Candidate Predictors: 70+; Modeling Dataset; Final Dataset; Clinical, Utilization, Social Determinants, Demographics]

slide-9
SLIDE 9

Candidate Predictors

Demographics

  • Continuous Binned Age
  • Age Category (18-44, 45-64, 65-80, 81+)
  • Sex
  • Marital status
  • Race/ethnicity
  • Preferred spoken language
  • Payer group
      • Medicare
      • Medicaid
      • Self Pay
      • Commercial
      • Other

Clinical

  • Count of chronic conditions (Elixhauser)
  • Individual chronic conditions from Elixhauser, including:
      • HTN Complicated
      • HTN Uncomp
      • DM Complicated
      • DM Uncomp
      • Substance Use
      • Alcohol Use
      • Solid Tumor
      • Met. Cancer
      • Obesity
      • Renal Failure
      • Liver Disease
      • Congestive Heart Failure
      • 31 in total
  • Sickle Cell (CCS grouper)
  • Frailty indicator
  • Antipsychotic Rx
  • Anticoagulant Rx

Utilization

  • No. of ED visits
  • No. of inpatient (IP) visits
  • No. of ED/IP days
  • 10+ ED/IP days
  • 90+ ED/IP days
  • No. of outpatient visits
  • No. of primary care visits
  • 1+ non-emergent ED visits
  • 1+ primary care-treatable ED visits

Social Determinants

  • No. of zip code changes
  • Neighborhood poverty level
  • No. of payer changes
  • Missed visits
  • Homelessness
  • History of incarceration
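As an illustration of how one of the listed predictors could be encoded, the sketch below maps an age in years to the four age categories named on this slide. The function name is an assumption, and rejecting ages under 18 reflects the adult-only study population rather than any stated implementation detail.

```python
def age_category(age: int) -> str:
    """Map age in years to the slide's age bands: 18-44, 45-64, 65-80, 81+."""
    if age < 18:
        # Assumption: the study population is adult patients only.
        raise ValueError("age must be 18 or older")
    if age <= 44:
        return "18-44"
    if age <= 64:
        return "45-64"
    if age <= 80:
        return "65-80"
    return "81+"
```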

slide-10
SLIDE 10

Model Development & Validation Strategy

[Diagram: Data split into Training Set (70%) and Validation Set (30%); Train/Test folds; Tuned Candidate Models; Final Model]

  • 5-fold cross-validation for all models
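The split described on this slide can be sketched with the standard library alone. This is a hedged illustration: it assumes the 5 folds are drawn from the 70% training set, which the slide suggests but does not state, and the function names, seed, and shuffling scheme are all hypothetical.

```python
import random


def split_and_folds(ids, train_fraction=0.7, n_folds=5, seed=0):
    """Shuffle ids, split them 70/30, and cut the training part into folds."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    train, validation = shuffled[:n_train], shuffled[n_train:]
    # Every n_folds-th element goes to the same fold, giving near-equal sizes.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return train, validation, folds


def cv_rounds(folds):
    """Yield (fit_ids, held_out_ids) pairs, holding out one fold per round."""
    for i, held_out in enumerate(folds):
        fit = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield fit, held_out
```

In each round a candidate model would be fit on four folds and evaluated on the fifth; the 30% validation set stays untouched until the tuned candidates are compared.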

slide-11
SLIDE 11

RESULTS


slide-12
SLIDE 12

Evaluating Model Performance

Software: SAS, R
Algorithm: Logistic, LASSO, CART

Outcome                     10+ Days   10+ Days   5+ Days   No. of Days   No. of Days
No. of Variables            33         30         34        17            18

Overall Model Performance
  AUROC                     0.86       0.83       0.79      N/A           N/A
  RMSE                      N/A        N/A        N/A       3.32          3.33

Top 1% Model Statistics
  PPV                       44.6%      44.6%      44.8%     47.6%         48.4%
  Sensitivity               16.2%      16.2%      16.2%     17.3%         15.0%
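The top 1% statistics can be read as follows: rank patients by model score, flag the top 1%, then compute PPV (the share of flagged patients who had the outcome) and sensitivity (the share of all patients with the outcome who were flagged). As a rough consistency check, if PPV refers to the 10+ acute days outcome, whose prevalence in the cohort tables is 2.76%, then 0.446 × 0.01 / 0.0276 ≈ 16.2%, which matches the reported sensitivity. The sketch below is an assumed implementation of these two statistics, not the authors' SAS or R code.

```python
import math


def top_fraction_stats(scores, outcomes, fraction=0.01):
    """Return (PPV, sensitivity) when the top `fraction` of scores is flagged.

    scores: model scores, higher meaning higher predicted risk.
    outcomes: booleans, True if the patient had the outcome.
    """
    n = len(scores)
    if n == 0 or n != len(outcomes):
        raise ValueError("scores and outcomes must be non-empty and equal length")
    # Round before ceil so float noise (e.g. 7.000000000000001) adds no extra patient.
    k = max(1, math.ceil(round(fraction * n, 9)))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = order[:k]
    true_positives = sum(1 for i in flagged if outcomes[i])
    positives = sum(1 for o in outcomes if o)
    ppv = true_positives / k
    sensitivity = true_positives / positives if positives else float("nan")
    return ppv, sensitivity
```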

slide-13
SLIDE 13

Final Model Coefficients

LASSO Model Predicting No. of Acute Days

Demographics
  Medicare Pt                                       0.06
  Self-pay Pt                                     • 0.04
  Other Payer Pt                                  • 0.08

Utilization
  No. of ED Visits                                  0.35
  No. of IP Visits                                  0.36
  No. of ED/IP days (<30)                           0.08
  Pt with avoidable ER visit                      • 0.08

Clinical Indicators
  Alcohol Use Dx                                    0.19
  Psychoses Dx                                      1.17
  Substance Use Dx                                  0.38
  Congestive Heart Failure Dx                       0.05
  Renal Failure Dx                                  0.02
  No. of Elixhauser (count chronic conditions)      0.21
  Pt with antipsychotic Rx                          0.40

Social Determinants
  No. of Zip changes                                0.08
  No. of Payer changes                              0.004
  Pt w/Missed visits (2+)                           0.04
  Homeless                                          0.28
  History of Incarceration                          0.47

Note: continuous variables are more influential than they appear
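The note about continuous variables can be illustrated with a small sketch. A binary indicator contributes its coefficient at most once, while a count variable contributes its coefficient times the count. The code below uses a handful of coefficients copied from this slide; entries whose value is preceded by a "•" mark in the transcript are omitted because their sign is unclear. The intercept and any variable scaling are not shown on the slide, so the result is only a partial contribution and not a predicted number of acute days. The feature names are hypothetical.

```python
# Coefficients copied from the slide (subset only; intercept not available).
COEFFICIENTS = {
    "ed_visits": 0.35,         # No. of ED Visits (count)
    "ip_visits": 0.36,         # No. of IP Visits (count)
    "elixhauser_count": 0.21,  # No. of Elixhauser chronic conditions (count)
    "psychoses_dx": 1.17,      # Psychoses Dx (0/1 indicator)
    "homeless": 0.28,          # Homeless (0/1 indicator)
}


def partial_contribution(features: dict) -> float:
    """Sum coefficient * value over the supplied features.

    Raises KeyError for a feature name that is not in COEFFICIENTS.
    """
    return sum(COEFFICIENTS[name] * value for name, value in features.items())
```

Under these assumptions, a patient with 10 ED visits accumulates about 3.5 from that one variable, roughly three times the 1.17 contributed by a psychoses diagnosis, even though the ED visit coefficient looks much smaller.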

slide-14
SLIDE 14

From Data to Patient Care: Risk Stratification

[Figure: Predicted vs. Observed Avg. Acute Days, Q3 2017-Q2 2018; axis label: Predicted Acute Days]

Predicted acute days vs. average observed days: near-linear relationship

slide-15
SLIDE 15

DISCUSSION


slide-16
SLIDE 16

From Data to Patient Care – Model Deployment

  • High risk patient lists
      • Lists of patients predicted to be high risk are shared with facilities quarterly
  • Epic integration
      • High risk flag integrated in Epic to promote care coordination of patients before/during their visit
  • Model Deployment:
      • High Risk definition in our system
          • Top 5% of scored patients (translates to 3.6 days)
          • 57.5% of high risk patients return for an ED or inpatient visit
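A minimal sketch of the "top 5% of scored patients" definition, assuming each patient has one predicted acute days score. It returns both the flagged patient IDs and the lowest score among them, which plays the role of the cutoff the slide reports as 3.6 days. The function name and the tie handling (ties broken by input order) are assumptions.

```python
import math


def flag_top_percent(scores: dict, percent: float = 5.0):
    """Return (flagged_ids, cutoff_score) for the top `percent` of patients.

    scores: mapping of patient id -> predicted acute days.
    """
    if not scores:
        return set(), None
    # Round before ceil to avoid float noise adding an extra patient.
    k = max(1, math.ceil(round(len(scores) * percent / 100, 9)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    flagged = ranked[:k]
    cutoff = scores[flagged[-1]]   # lowest score that still counts as high risk
    return set(flagged), cutoff
```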

slide-17
SLIDE 17

From Data to Patient Care - Segmentation


slide-18
SLIDE 18

THANK YOU!

Contact us: PopHealthHighRisk@nychhc.org

slide-19
SLIDE 19

APPENDIX


slide-20
SLIDE 20

Risk Scoring at NYC H+H: A Brief History

  • 2015: Algorithm to identify high risk ACO (Medicare) patients (Risk Score 1.0)
  • 2016/2017: Risk Score 2.0, a payer-agnostic algorithm to predict high utilization (10+ acute days) at NYC Health + Hospitals
      • Prioritizes utilization and behavioral diagnoses such as schizophrenia and alcohol diagnoses
      • Quarterly lists of top 1% high risk patients sent to facilities
      • Algorithm published in JGIM earlier this year*
  • 2017/2018: Distribution of high risk patient lists and dashboard with action segments
      • This includes summary information on utilization, diagnoses, payer and segments of high risk patients at NYC H+H overall and by facility
  • 2018/2019: Risk Score 3.0 Development
      • Risk score re-optimization with augmented social determinants and machine learning methods

* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910357/

slide-21
SLIDE 21

Risk Score 3.0: Literature Review

81+ Articles Reviewed

n=31 n=24 n=9 n=9

EPIC EHR INTEGRATION

  • A few organizations have successfully integrated home-grown risk scores into Epic
  • Epic Healthy Planet can be leveraged for Risk Stratification

RISING RISK

  • Difficult to predict
  • Various approaches:
      • Bloomers and Persisters
      • Binary Outcomes
      • Population Segmentation
      • Preventable Costs
      • Impactability

SEGMENTATION

  • Mix of statistical methods available for segmentation:
      • CART
      • K-means clustering
      • Latent class analysis (Mplus needed)
      • Expert-led segment development

RISK SCORE CREATION

  • Common risk prediction modeling methods:
      • Multivariable regression
      • LASSO
      • Random forest
      • CART
  • Different outcomes of interest:
      • Service utilization, clinical, mortality, and hospitalization
      • Prediction on general utilization is rare
  • Numerous predictors (see next slide)

slide-22
SLIDE 22

Homelessness Identification

  • Patients identified as homeless based on structured data documentation:
      • Demographics (Registration)
          • Homeless/Undomiciled/Shelter listed in address field
          • Hospital or Shelter address listed in address field
          • ‘Person is homeless’ box checked in registration system
          • 10+ zip code changes within previous 12 months
      • Clinical diagnoses (Medical Record)
          • ICD-10 homelessness code (Z59.0) on problem list
      • Billing data (Bill / Claim)
          • ICD-10 homelessness code (Z59.0) in billing data
  • This does not include any free text information, such as Social Work psychosocial assessments
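The structured-data rules above can be sketched as a single rule-based check in which any one criterion is enough to flag a patient. This is an illustrative sketch only: the record layout, the field names, and the idea of passing a set of known hospital or shelter addresses are all assumptions, and real address matching would likely need more normalization than the lowercase comparison used here.

```python
HOMELESS_KEYWORDS = ("homeless", "undomiciled", "shelter")
HOMELESS_ICD10 = "Z59.0"


def is_homeless(record: dict, facility_addresses: frozenset = frozenset()) -> bool:
    """Return True if any structured-data criterion from the slide is met.

    record: hypothetical patient record with optional keys
        address, homeless_box_checked, zip_changes_12mo,
        problem_list_codes, billing_codes.
    facility_addresses: lowercase hospital/shelter addresses (hypothetical list).
    """
    address = record.get("address", "").strip().lower()
    # Registration: keyword in the address field.
    if any(keyword in address for keyword in HOMELESS_KEYWORDS):
        return True
    # Registration: address is a known hospital or shelter address.
    if address and address in facility_addresses:
        return True
    # Registration: 'Person is homeless' box checked.
    if record.get("homeless_box_checked", False):
        return True
    # Registration: 10+ zip code changes within the previous 12 months.
    if record.get("zip_changes_12mo", 0) >= 10:
        return True
    # Medical record: Z59.0 on the problem list.
    if HOMELESS_ICD10 in record.get("problem_list_codes", ()):
        return True
    # Billing: Z59.0 in billing data.
    return HOMELESS_ICD10 in record.get("billing_codes", ())
```

Free-text sources such as social work assessments are deliberately not consulted, in line with the last bullet above.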

slide-23
SLIDE 23

Cohort Characteristics

Characteristics                           Development Cohort     Validation Cohort
                                          N=583,778, No. (%)     N=250,191, No. (%)

10+ Acute Days (Prediction Year)          16,109 (2.76%)         6,904 (2.76%)

Past Utilizations
  ED Visits:
    0                                     235,315 (40.3%)        100,670 (40.2%)
    1-2                                   299,943 (51.4%)        128,817 (51.5%)
    3-4                                   33,385 (5.72%)         14,315 (5.72%)
    4+                                    15,135 (2.59%)         6,389 (2.55%)
  IP Visits:
    0                                     507,558 (86.9%)        217,556 (87.0%)
    1                                     58,982 (10.1%)         25,368 (10.1%)
    2                                     10,735 (1.84%)         4,587 (1.83%)
    3                                     3,478 (0.60%)          1,384 (0.55%)
    4+                                    3,025 (0.52%)          1,296 (0.52%)
  1+ Emergent PC Treatable ER visits      12,711 (2.18%)         5,424 (2.17%)

Patient Demographics
  Marital Status:
    Married, Life Partner, Missing        151,869 (26.0%)        64,576 (25.8%)
    Single                                386,952 (66.3%)        166,509 (66.6%)
    Separated, Widowed, Divorced          44,957 (7.70%)         19,106 (7.64%)
  Male                                    252,796 (43.3%)        108,339 (43.3%)
  Age:
    18-44                                 298,053 (51.1%)        127,695 (51.0%)
    45-64                                 202,976 (34.8%)        87,131 (34.8%)
    65-80                                 67,022 (11.5%)         28,618 (11.4%)
    81+                                   15,727 (2.69%)         6,747 (2.70%)
  Ethnicity/Race:
    Non-Hispanic White                    54,031 (9.26%)         22,893 (9.15%)
    Hispanic                              197,699 (33.9%)        85,274 (34.1%)
    Non-Hispanic Black                    204,564 (35.0%)        87,524 (35.0%)
    Other                                 127,484 (21.8%)        54,500 (21.8%)
  Speaks Non-English                      177,324 (30.4%)        76,779 (30.7%)
  Payer (Most Recent):
    Medicaid                              222,352 (38.1%)        95,455 (38.2%)
    Medicare                              72,792 (12.5%)         31,234 (12.5%)
    Self-pay                              194,071 (33.2%)        83,216 (33.3%)
    Other                                 94,563 (16.2%)         40,286 (16.1%)

slide-24
SLIDE 24

Cohort Characteristics (Continued)

Characteristics                           Development Cohort     Validation Cohort
                                          N=583,778, No. (%)     N=250,191, No. (%)

Diagnoses
  Alcohol Use                             34,772 (5.96%)         14,815 (5.92%)
  Psychoses                               24,258 (4.16%)         10,373 (4.15%)
  Depression                              51,789 (8.87%)         22,227 (8.88%)
  Substance Use                           30,232 (5.18%)         12,920 (5.16%)
  Congestive Heart Failure                12,896 (2.21%)         5,575 (2.23%)
  Diabetes Complicated                    47,037 (8.06%)         20,131 (8.05%)
  Diabetes Uncomplicated                  66,088 (11.3%)         28,359 (11.3%)
  Renal Failure                           15,761 (2.70%)         6,748 (2.70%)
  Sickle Cell (CCS Grouper)               2,641 (0.45%)          1,102 (0.44%)
  Hypertension Complicated                14,983 (2.57%)         6,487 (2.59%)
  Hypertension Uncomplicated              139,021 (23.8%)        59,585 (23.8%)
  Obesity                                 59,719 (10.2%)         25,689 (10.3%)
  No. of Chronic Conditions               1.27 (1.77)            1.26 (1.77)
  Frail Elderly                           7,012 (1.20%)          3,009 (1.20%)

Social Proxies
  Zip Code Change:
    0                                     554,997 (95.1%)        237,998 (95.1%)
    1-2                                   22,137 (3.79%)         9,444 (3.77%)
    3+                                    6,644 (1.14%)          2,749 (1.10%)
  Payer Change:
    0                                     470,102 (80.5%)        201,491 (80.5%)
    1-2                                   82,303 (14.1%)         35,191 (14.1%)
    3-4                                   18,276 (3.13%)         7,912 (3.16%)
    5+                                    13,097 (2.24%)         5,597 (2.24%)
  Missed Visits:
    0-1                                   424,078 (72.6%)        181,918 (72.7%)
    2+                                    159,700 (27.4%)        68,273 (27.3%)
  Homelessness                            11,024 (1.89%)         4,735 (1.89%)
  History of Incarceration                16,762 (2.87%)         7,052 (2.82%)
  In Zip Code Where 30%+ of Neighborhood
  Living Under Federal Poverty Level      355,428 (60.9%)        152,024 (60.8%)

Medication Prescriptions
  Anticoagulant Prescription              2,866 (0.49%)          1,168 (0.47%)
  Antipsychotic Prescription              17,838 (3.06%)         7,638 (3.05%)

slide-25
SLIDE 25