 
Distinguishing Performance Using Patient-Reported Outcome Measures
Adam Rose, MD MSc
RAND Corporation
Interest in Patient-Reported Outcome Measures (PROs) as Performance Measures (PMs)
• Most quality-of-care measures focus on technical quality of care
• PROs, such as health-related quality of life (HRQoL), measure something that inherently matters to patients
• PROs are particularly relevant among older patients with multiple chronic conditions (MCCs)
• With limited life expectancy, maximizing HRQoL may be more relevant than measures of technical quality or many outcomes of care
Little is known about the feasibility of using PROs as performance measures
• We know quite a lot about how well hemoglobin A1c distinguishes performance between providers, clinics, or health plans at a given sample size
• Much less is known about potential PRO-based PMs
• If we are to bring PRO-based PMs into widespread use, we need to know more about the sample sizes needed to reliably distinguish between entities
• Thus, we undertook this study
Objective
• To examine the minimum sample sizes, per entity profiled, necessary to achieve acceptable reliability when comparing entities on PRO-based PMs
Data Sources: Phase 1
• Health Outcomes Survey (HOS), administered by CMS to track the health status of individuals enrolled in Medicare Advantage Organizations (MAOs)
• Measures changes in HRQoL over a two-year period, using serial administrations of the Veterans RAND 12-Item Short Form (VR-12, similar to the SF-12)
Data Sources: Phase 2
• Primary data collected from primary care patients enrolled with Kaiser Permanente Colorado (KPCO)
• Measured changes in HRQoL over a six-month period, using serial administrations of the Patient-Reported Outcomes Measurement Information System 29-Item Profile Measure (PROMIS-29)
Study Eligibility
• For both Phases 1 and 2, participants needed to be age 65 or older and have at least 2 out of 13 chronic health conditions
• In Phase 1
  – conditions were assessed by self-report
• In Phase 2
  – conditions were assessed from KPCO clinical data (ICD-10 codes)
  – participants needed to be enrolled with KPCO, living at home, and enrolled with a KPCO primary care provider
13 Chronic Conditions
• Arthritis
• Cancer
• Chronic Lung Disease
• Congestive Heart Failure
• Depression
• Diabetes
• Hypertension
• Inflammatory Bowel Disease
• Ischemic Heart Disease
• Osteoporosis
• Other Heart Problems
• Sciatica
• Stroke
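The eligibility rule (age 65 or older, at least 2 of the 13 conditions) can be sketched as a simple filter. This is an illustration only: the `Participant` record, its field names, and the lowercase condition labels are hypothetical and do not come from the HOS or KPCO data.

```python
from dataclasses import dataclass, field

# The 13 chronic conditions named on the slide, as lowercase labels
# (the labels themselves are an illustrative encoding).
CHRONIC_CONDITIONS = frozenset({
    "arthritis", "cancer", "chronic lung disease",
    "congestive heart failure", "depression", "diabetes",
    "hypertension", "inflammatory bowel disease",
    "ischemic heart disease", "osteoporosis",
    "other heart problems", "sciatica", "stroke",
})


@dataclass
class Participant:
    """Hypothetical record: age in years plus a set of condition labels."""
    age: int
    conditions: set = field(default_factory=set)


def is_eligible(p: Participant, min_age: int = 65, min_conditions: int = 2) -> bool:
    """True if the participant meets the age and condition-count criteria."""
    # Only conditions on the 13-item list count toward the threshold.
    qualifying = p.conditions & CHRONIC_CONDITIONS
    return p.age >= min_age and len(qualifying) >= min_conditions
```

The Phase 2 enrollment criteria (enrolled with KPCO, living at home, KPCO primary care provider) are not modeled here.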
Candidate Performance Measures
• Alive with stable or improved score (binary measure)
  – Defined at 0 SD, ¼ SD, ½ SD, and 1 SD
• Mean change in absolute score (continuous measure)
• For VR-12, scores were based on PCS (physical component score) and MCS (mental component score)
• For PROMIS-29, scores were based on PHS (physical health score) and MHS (mental health score)
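A minimal sketch of how the two kinds of measure might be computed for one entity. Several details are assumptions, since the slides do not spell them out: a respondent is treated as "stable or improved" if alive at follow-up with a decline no larger than the threshold times a reference SD; deaths are encoded as a missing follow-up score and count against the binary measure; and mean change is taken over respondents with a follow-up score. The function names and record layout are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

# (baseline_score, followup_score); followup_score is None when the
# respondent died before follow-up (hypothetical encoding).
Record = Tuple[float, Optional[float]]


def stable_or_improved_rate(records: Sequence[Record], sd: float,
                            threshold_sd: float = 0.25) -> float:
    """Share of respondents alive with a decline of at most threshold_sd * sd."""
    allowed_decline = threshold_sd * sd
    met = sum(
        1 for base, follow in records
        if follow is not None and follow >= base - allowed_decline
    )
    return met / len(records)


def mean_change(records: Sequence[Record]) -> float:
    """Mean of (follow-up - baseline) among respondents with a follow-up score."""
    return mean(follow - base for base, follow in records if follow is not None)
```

Setting `threshold_sd` to 0, 0.25, 0.5, or 1 corresponds to the four definitions listed on the slide.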
Whom did we profile?
• In Phase 1, we profiled 466 Medicare Advantage Organizations (MAOs) on ability to prevent decreases in HRQoL over time
• In Phase 2, we profiled 13 KPCO primary care clinics on ability to prevent decreases in HRQoL over time
Reliability of performance measures
• Reliability is the proportion of variation in the PM attributable to real differences across the entities profiled, rather than random error
• We began by calculating the intraclass correlation (ICC) for each PM to determine the proportion of variance explained at the clinic or MAO level
• We then used the Spearman-Brown prophecy formula to calculate the reliability achieved on each PM, as well as the minimum sample size per entity profiled that would have been needed to achieve a reliability of 0.7
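The Spearman-Brown step can be sketched as follows, using the standard form of the formula: reliability of an entity-level score based on n respondents is n·ICC / (1 + (n − 1)·ICC), and solving for n at a target reliability R gives n = R·(1 − ICC) / (ICC·(1 − R)). The function names are illustrative, and the study's exact computation may differ in detail.

```python
import math


def reliability(icc: float, n: float) -> float:
    """Spearman-Brown reliability of an entity-level score from n respondents."""
    return n * icc / (1 + (n - 1) * icc)


def n_for_reliability(icc: float, target: float = 0.7) -> float:
    """Respondents per entity needed to reach the target reliability."""
    if icc <= 0:
        # With no between-entity variance, no sample size reaches the target.
        return math.inf
    return target * (1 - icc) / (icc * (1 - target))
```

For example, `reliability(0.001, 160)` is about 0.14, in line with the first row of the Phase 1 table. `n_for_reliability(0.001)` gives about 2,331 rather than the reported 2,399, presumably because the ICCs shown on the slides are rounded.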
Results: Phase 1
• Data from 79,972 HOS respondents, distributed among 466 MAOs
• Mean of 160 respondents per MAO
• Mean age 74.7
• 39% male
• 74% White, non-Hispanic
• 28% with two of the 13 chronic conditions, 26% with three, 20% with four, and 27% with five or more
Reliability Analyses by MAO – Phase 1

Performance Measure     | ICC    | Reliability | Number Needed for Reliability 0.7
Stable or Improved PCS  | 0.001  | 0.14        | 2399
Stable or Improved MCS  | 0.0026 | 0.32        | 882
Mean Change in PCS      | 0.0016 | 0.22        | 1445
Mean Change in MCS      | 0.001  | 0.12        | 2912
Results: Phase 2
• Data from 337 KPCO patients, distributed among 13 primary care clinics
• Mean of 27 respondents per clinic
• Mean age 79.0, with 21% age 85 or older
• 50% male; 93% White, non-Hispanic
• 44% with two of the 13 chronic conditions, 31% with three, 14% with four, and 11% with five or more
• Mean baseline PHS score 44; mean MHS score 50
Reliability Analyses by Clinic – Phase 2

Performance Measure              | ICC   | Reliability | Number Needed for Reliability 0.7
Stable or Improved PHS (1/4 SD)  | 0.007 | 0.16        | 341
Stable or Improved MHS (1/4 SD)  | 0     | 0           | n/a
Mean Change in PHS               | 0.001 | 0.03        | 2317
Mean Change in MHS               | 0     | 0           | n/a
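An ICC of 0, as shown for the two MHS measures, usually means the estimated between-entity variance was zero or negative and was truncated at 0, in which case no sample size can reach the target reliability. The sketch below uses the one-way ANOVA estimator of the ICC to show how that happens. The choice of estimator is an assumption (the slides do not say how the ICCs were estimated), and the function name and input layout are hypothetical.

```python
from typing import Dict, Sequence


def icc_oneway(groups: Dict[str, Sequence[float]]) -> float:
    """One-way ANOVA estimate of the ICC, truncated at 0.

    `groups` maps an entity label (e.g. a clinic) to its respondents' scores.
    Requires at least two entities and more respondents than entities.
    """
    k = len(groups)
    sizes = [len(v) for v in groups.values()]
    total_n = sum(sizes)
    means = {name: sum(v) / len(v) for name, v in groups.items()}
    grand_mean = sum(sum(v) for v in groups.values()) / total_n

    # Between-entity and within-entity sums of squares.
    ss_between = sum(len(v) * (means[name] - grand_mean) ** 2
                     for name, v in groups.items())
    ss_within = sum((x - means[name]) ** 2
                    for name, v in groups.items() for x in v)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total_n - k)

    # Adjusted average group size for unbalanced designs.
    n0 = (total_n - sum(s * s for s in sizes) / total_n) / (k - 1)

    denom = ms_between + (n0 - 1) * ms_within
    if denom <= 0:
        return 0.0
    # Negative estimates (entities no more different than chance) become 0.
    return max(0.0, (ms_between - ms_within) / denom)
```

When every entity has the same mean score, the between-entity mean square is 0, the raw estimate is negative, and the function returns 0.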
Discussion
• In this study, none of the candidate PMs achieved a reliability of 0.7 with these sample sizes
• The measure with the smallest necessary sample size for health plans (stable or improved MCS) would have required at least 882 respondents per MAO (or 5.5 times as many as we had)
• The measure with the smallest necessary sample size for clinics (stable or improved PHS) would have required at least 341 respondents per clinic (or 13 times as many as we had)
• Future efforts to test and validate PRO-based PMs should assume a sample size at least that large, per entity profiled
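The "times as many as we had" figures follow from dividing the required sample size by the mean number of respondents per entity reported in the results (160 per MAO, 27 per clinic). A short check, with an illustrative function name:

```python
def shortfall_multiple(required_n: float, observed_mean_n: float) -> float:
    """How many times larger the required sample is than the observed mean."""
    return required_n / observed_mean_n


# Required sample sizes and mean respondents per entity, from the slides.
mao_multiple = shortfall_multiple(882, 160)     # health plans, Phase 1
clinic_multiple = shortfall_multiple(341, 27)   # clinics, Phase 2

print(f"MAO: {mao_multiple:.1f}x, clinic: {clinic_multiple:.1f}x")
```

The clinic ratio is about 12.6, which the slide rounds to 13.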
What would decrease the necessary sample size?
• Greater illness burden: we required 2 of 13 conditions; requiring more could decrease the necessary sample size but would make that sample size harder to achieve
• We already sampled an older population
• Choosing a disease state with an expectation of decline over time, such as severe heart failure
• Longer follow-up time, although this could erode the immediacy of the measure
Important questions going forward
• How much of HRQoL decline is within the control of healthcare providers?
• What structures or processes of care are linked to better performance at preventing HRQoL decline at the provider, clinic, or health plan level?
Limitations
• It is not clear that either level of measurement (health plan, clinic) is the ideal match for the ability to prevent HRQoL decline over time; provider-level measurement may be better
• It remains unproven that higher-quality care can prevent HRQoL decline over time, the hypothesized relationship that forms the premise of using a PRO as a PM
Conclusions
• No candidate PM achieved acceptable reliability with the sample sizes we used here
• Adding risk adjustment would further increase the sample size required
• Future efforts to test PRO-based PMs should plan on using sample sizes at least this large (882 per health plan, 341 per clinic)
Acknowledgements
• Co-Authors
  – Elizabeth Bayliss (Kaiser Permanente Colorado)
  – Lesley Baseman (RAND)
  – Emily Butcher (RAND)
  – Wenjing Huang (RAND)
  – Maria Orlando Edelen (RAND)
• Funding: contract #HHSN271201500064C, NIH NIA (PI: Edelen)