DATA QUALITY ASSESSMENT FRAMEWORK
LISA SCHILLING, MD, MSPH ACADEMY HEALTH ANNUAL RESEARCH MEETING JUNE 27, 2017
LOTS OF ACKNOWLEDGMENTS
Funding
AHRQ 1R01HS019908 (SAFTINet, PI Schilling)
AHRQ 1R01HS019912 (SPAN, PI Steiner)
AHRQ
Weber, G. M., Mandl, K. D. & Kohane, I. S. Finding the missing link for big biomedical data. JAMA 311, 2479–2480 (2014).
Blood Pressure

Measure Name                  Patients Used on   Times used
BLOOD PRESSURE                         538,647   13,869,327
R AN NIBP                               63,576    2,949,877
CARD BP 3                               14,631       26,889
ABP INVASIVE PRESSURE                    9,031    3,382,825
BLOOD PRESSURE (ED SEDATION)             7,402       41,498
EDU STAND BP                             6,950       33,876
EDU LYING BP                             6,941       32,609
CARD BP 2                                6,878        9,934
ED PRE HOSP BP                           6,323        7,117
BP #2                                    5,529       40,592
EDU SIT BP                               4,957        6,152
CARD BP 4                                4,452        6,806
R AN IBP ART                             4,430    1,181,368
BP #3                                    4,330       24,675
BP - STANDING                            4,098        6,120
BP - LYING                               4,068        5,753
BP - SITTING                             3,920        5,292
BP #4                                    3,477       15,898
BP PRE SEDATION                          1,831        2,246
PAP                                      1,793      218,931
BLOOD PRESSURE (CS)                      1,322        8,290
ART PRESSURE #2                            404      136,579
R AN IBP PAP                                71        6,488
R AN IBP P1                                 60        4,562
ECMO BLOOD PRESSURE                         57       85,037
CARD BP 1                                   55           56
R AN IBP AO                                 53        4,129
RV PRESSURE                                 50          124
R AN IBP FAP                                37        3,997
R AN IBP UAP                                27        1,021
R AN IBP P2                                 13          339
R AN IBP P                                  11          282
R AN IBP LAP                                11          634
BP #2                                        8            8
R AN IBP P4                                  2            2
R AN IBP BAP                                 1            1
CHCO Slides from Maggie Massary. Used with permission
COMMUNITY-DRIVEN CONSENSUS RECOMMENDATIONS FOR DQ REPORTING
knowledge) you have on hand.
standards, recognized benchmarks/comparators
align together as expected”
COMPLETENESS: ARE THE DATA PRESENT?
VERIFICATION (Definition, Example) / VALIDATION (Definition, Example)
Density
against a denominator are expected based on internal knowledge.
against a time-
are expected based
knowledge. Includes total missingness measures.
patient observations between ETLs.
emergency room visits during flu season.
data density against a denominator are expected based on external knowledge.
data density against a time-oriented denominator are expected based on external knowledge. Includes total missingness measures.
patient observations across network data partners.
monthly emergency room visits during flu season are similar to health department reports.
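A minimal sketch of two completeness checks in the spirit of the rows above: a total-missingness measure, and a comparison of data density between two ETL loads. The function names, the table names, the counts, and the 10% tolerance are illustrative assumptions, not part of the framework.

```python
from typing import Dict, List, Optional


def missing_fraction(values: List[Optional[float]]) -> float:
    """Share of entries that are absent (None): a total-missingness measure."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def density_changes(
    previous: Dict[str, int],
    current: Dict[str, int],
    tolerance: float = 0.10,  # assumed threshold; choose one from local knowledge
) -> Dict[str, float]:
    """Return tables whose row count changed by more than `tolerance`
    (as a fraction of the previous count) between two ETL loads."""
    flagged: Dict[str, float] = {}
    for table, old in previous.items():
        new = current.get(table, 0)
        if old == 0:
            # Rows appearing in a previously empty table are always flagged.
            change = float("inf") if new > 0 else 0.0
        else:
            change = abs(new - old) / old
        if change > tolerance:
            flagged[table] = change
    return flagged


# Hypothetical row counts from two consecutive ETL loads.
previous_load = {"visit": 1000, "measurement": 5000}
current_load = {"visit": 1020, "measurement": 3500}
flagged = density_changes(previous_load, current_load)

# Hypothetical weight column with two absent values.
weights = [70.5, None, 82.0, None]
weight_missingness = missing_fraction(weights)
```

The same comparison becomes a validation check when the denominator comes from an external source, such as counts reported by another network data partner.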
FIDELITY: ARE THE DATA DEPENDABLE?
VERIFICATION (Definition, Example) / VALIDATION (Definition, Example)
Metadata
conform to internal formatting constraints.
conform to relational constraints.
character.
tables as required.
to representational constraints based on external standards.
primary language variable in the demographics table conforms to ISO standards.
Measure
measurement of the same fact show expected variability.
measurements are similar when taken by two separate nurses within the same facility.
databases (e.g., database 1 abstracted from database 2) yield similar results for identical measurements.
consistent between EHR data and registry data for the same facility.
Derivation
conform to computational or programming specifications.
calculated Body Mass Index values are identical.
provided with identical specifications and identical data sets report identical results for derived values.
implemented in SAS and R yield identical results
Uniqueness
absent of duplicate measurements.
merged objects are
under a single MRN.
via EHR and claims data are only counted once.
in a source database is uniquely represented in a target database.
in a source database is represented by its components in a target database.
claims database represents a single encounter in the EHR.
EHR database is represented by its ingredients in a pharmacy database.
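A minimal sketch of two fidelity checks suggested by the Derivation and Uniqueness rows: recomputing a derived Body Mass Index and comparing it with the stored value, and finding keys that occur more than once. The record layout, the sample values, and the 0.05 tolerance are assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / (height_m ** 2)


def derivation_mismatches(
    rows: Sequence[Tuple[float, float, float]],
    tolerance: float = 0.05,  # assumed rounding allowance
) -> List[int]:
    """Return the indices of rows (weight_kg, height_m, stored_bmi) whose
    stored BMI differs from the recomputed BMI by more than `tolerance`."""
    return [
        i
        for i, (weight_kg, height_m, stored) in enumerate(rows)
        if abs(bmi(weight_kg, height_m) - stored) > tolerance
    ]


def duplicate_keys(keys: Sequence[Hashable]) -> List[Hashable]:
    """Return, in sorted order, every key that occurs more than once."""
    counts = Counter(keys)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical records: the second stored BMI does not match its inputs.
rows = [(70.0, 1.75, 22.86), (80.0, 1.80, 30.0)]
mismatches = derivation_mismatches(rows)

# Hypothetical (MRN, measurement date) keys; the first pair appears twice.
keys = [("MRN001", "2017-01-05"), ("MRN002", "2017-01-05"), ("MRN001", "2017-01-05")]
duplicates = duplicate_keys(keys)
```

Running the same derivation in a second implementation and comparing the outputs would turn the first check into the validation variant described in the table.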
PLAUSIBILITY: ARE THE DATA BELIEVABLE?
VERIFICATION (Definition, Example) / VALIDATION (Definition, Example)
Measure
distributions agree with an internal measurement or local knowledge.
measurements of the same fact are in agreement.
between variables and subgroups agree with local or common knowledge (Includes "expected" missingness).
values for height and weight.
measurement is similar to finger stick glucose measurement.
temperatures are similar.
specific contexts (pregnancy, prostate cancer).
not associated with
distributions (including subgroup distributions) agree with trusted reference standards or external knowledge.
identical measurements are obtained from two independent databases representing the same
credibility.
hospital and national reference lab are statistically similar under the same conditions.
codes are similar between two independent claims databases serving similar populations.
Time
values conform to expected temporal properties.
transitions conform to expected properties.
per year conforms to expectations.
precedes a booster immunization.
values have similar temporal properties across one or more external comparators or gold standards.
transitions are similar to external comparators or gold standards.
conforms to Medicare data for similar populations.
match the state immunization registry sequence.
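A minimal sketch of a temporal plausibility check, based on the example that one immunization precedes a booster immunization. The record layout, the field names "initial" and "booster", and the sample dates are hypothetical.

```python
from datetime import date
from typing import Dict, List, Optional


def sequence_violations(
    records: Dict[str, Dict[str, Optional[date]]]
) -> List[str]:
    """Return sorted patient ids whose booster date is recorded without an
    earlier initial immunization date."""
    violations: List[str] = []
    for patient_id, dates in records.items():
        initial = dates.get("initial")
        booster = dates.get("booster")
        if booster is None:
            # No booster recorded, so there is no sequence to check.
            continue
        if initial is None or initial >= booster:
            violations.append(patient_id)
    return sorted(violations)


# Hypothetical immunization dates for four patients.
records = {
    "p1": {"initial": date(2016, 1, 10), "booster": date(2016, 7, 10)},
    "p2": {"initial": date(2016, 9, 1), "booster": date(2016, 3, 1)},
    "p3": {"initial": None, "booster": date(2016, 5, 1)},
    "p4": {"initial": date(2016, 2, 2), "booster": None},
}
violations = sequence_violations(records)
```

Here a same-day initial and booster is also treated as a violation, which is a design choice rather than something the table specifies.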
www.pcori.org 21
http://dododas.github.io/dqa-viz/dashboards.html
https://sigfried.github.io/parcoords/