Publicly Available Large Data Sets for Health Outcomes Research: Pearls, Pitfalls, Prices & More
Lakshika Tennakoon, MD, MSc, MPhil, DTM&H
Research Scientist, Trauma, Acute Care and Critical Care Surgery
Aims
- To encourage the use of public data for research
- To characterize existing large clinical databases
Best Currently Available Databases
Database | Dates | Source
Nationwide Inpatient Sample (NIS) | 1988-2016 | HCUP
Nationwide Emergency Department Sample (NEDS) | 2006-2016 | HCUP
Nationwide Readmissions Database (NRD) | 2010-2016 | HCUP
Kids' Inpatient Database (KID) | 1997, 2000, 2003, 2006, 2009, 2012, 2016 | HCUP
National Trauma Data Bank (NTDB) | 2002-2016 | ACS
National Surgical Quality Improvement Program (NSQIP) | 2005-2016 | ACS
National Ambulatory Medical Care Survey (NAMCS) | 1993-2015 | CDC
National Health and Nutrition Examination Survey (NHANES) | 1999-2015 | CDC
National Hospital Ambulatory Medical Care Survey (NHAMCS) | 1992-2015 | CDC
Medicare/SEER | 1991-2015 | Government
MarketScan | 2002-2011 | Private
Hospital-based registry data | | Hospital-based
Nationwide Inpatient Sample (NIS)
- The largest publicly available all-payer inpatient care database in the United States
- Sample includes all discharges from a 20% stratified sample of US hospitals
- NIS data can be weighted to generate national estimates
- Years available: 1988 to 2016
- Contains about 8 million hospital stays a year
- The NIS_2015_CORE data file has 7,153,989 records
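The weighting step mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than an official HCUP procedure: it assumes each record carries a discharge weight in a field named DISCWT, and the four records are invented for the example.

```python
# Sketch: turn sampled discharges into weighted national point estimates.
# Assumption: the discharge weight is stored in a field named DISCWT.
# The records below are invented, not real NIS data.

def weighted_total(records, weight_key="DISCWT"):
    """Each sampled discharge stands for `weight` discharges nationally."""
    return sum(r[weight_key] for r in records)

def weighted_mean(records, value_key, weight_key="DISCWT"):
    """Weighted mean of value_key, skipping records where the value is missing."""
    usable = [r for r in records if r[value_key] is not None]
    total_weight = sum(r[weight_key] for r in usable)
    return sum(r[value_key] * r[weight_key] for r in usable) / total_weight

sample = [
    {"LOS": 2, "DISCWT": 5.0},
    {"LOS": 4, "DISCWT": 5.0},
    {"LOS": 10, "DISCWT": 4.0},
    {"LOS": None, "DISCWT": 6.0},  # length of stay missing
]

national_discharges = weighted_total(sample)   # 5 + 5 + 4 + 6 = 20.0
mean_los = weighted_mean(sample, "LOS")        # (10 + 20 + 40) / 14 = 5.0
print(national_discharges, mean_los)
```

The sketch gives point estimates only. Standard errors additionally require the survey design (strata and hospital clusters), which is normally handled by survey procedures in Stata, SAS, or similar software.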
NIS Requirements: Cost & Data Load Software
- Cost of the 2016 NIS: $625
- Original data come as CSV or ASCII files
- Load programs are available for Stata, SAS, and SPSS
- Data storage: large databases need a server or Box
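For readers working outside Stata, SAS, or SPSS, the job of a load program can be sketched in Python: slice each fixed-width ASCII line into named fields and recode missing values. The column layout and the rule that negative codes mean "missing" are assumptions made for illustration; the real positions and missing-value codes come from the file specifications supplied with each data year.

```python
# Sketch of what a load program does with a fixed-width ASCII file.
# LAYOUT is hypothetical: real column positions come from the file
# specification distributed with the data.

LAYOUT = [
    # (field name, 0-based start, exclusive end)
    ("AGE", 0, 3),
    ("LOS", 3, 8),
    ("TOTCHG", 8, 18),
]

def parse_line(line, layout=LAYOUT):
    """Return a dict of integer fields; blank or negative values become None."""
    row = {}
    for name, start, end in layout:
        text = line[start:end].strip()
        value = int(text) if text else None
        # Assumption: negative codes mark missing or invalid values.
        row[name] = None if value is not None and value < 0 else value
    return row

# Two invented records, right-aligned to the widths in LAYOUT.
raw_lines = [
    f"{67:>3}{4:>5}{25310:>10}",
    f"{-9:>3}{2:>5}{8120:>10}",
]
rows = [parse_line(line) for line in raw_lines]
print(rows)
```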
Citing HCUP Databases
- In the abstract and manuscript, include the database name, HCUP, and AHRQ, as specified in the HCUP DUAs. For example:
- HCUP Nationwide Inpatient Sample (NIS). Healthcare Cost and Utilization Project (HCUP). 2007-2009. Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/nisoverview.jsp
What Data Elements Are in the NIS?
Data Files
▪ Core Data
▪ Hospital Data
▪ Illness Severity Data
▪ Cost-to-Charge Ratio Data
▪ Diagnosis & Procedure Groups Data
▪ https://www.hcup-us.ahrq.gov/db/nation/nis/nisdde.jsp
Core Data File
- Age at admission
- Gender of patient
- Race of patient
- Location of patient's residence
- Median household income for patient's ZIP code
- ICD-9-CM diagnoses: primary and secondary diagnoses, number of diagnoses, diagnosis coding system
- External causes of injury and poisoning: ECODE 1-4, number of external causes of injury
- ICD-9-CM procedures: primary and secondary procedures, number of procedures, procedure systems, duration of primary and secondary procedures
- Total charges
- Disposition
- Length of stay
Hospital Data File
- Hospital bed size
- Type of hospital: government or private; government, nonfederal, public; private, non-profit; private, investor-owned
- Hospital location: rural or urban
- Location/teaching status of hospital: rural, urban non-teaching, urban teaching
- Region of hospital: Northeast, Midwest, South, West
- Hospital weights: weight to hospitals in the AHA universe, weight to hospitals in the state
Severity of Illness Data
- Severity of Illness Subclass
- Risk of Mortality Subclass
- 29 comorbid conditions: alcohol abuse, depression, drug abuse, liver disease, renal failure, obesity, ...
- Defined by the Elixhauser comorbidity scale
- https://www.hcup-us.ahrq.gov/toolssoftware/comorbidityicd10/comorbidity_icd10.jsp
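One common use of the comorbidity flags is a simple per-discharge count. The Python sketch below assumes the flags are stored as 0/1 indicator columns with names such as CM_ALCOHOL; those names, the KEY record identifier, and the sample records are assumptions for illustration, and only six of the 29 conditions are listed.

```python
# Sketch: count comorbidity flags per discharge.
# Assumptions: flags are 0/1 columns named CM_*, KEY identifies the record,
# and only 6 of the 29 conditions are listed here.

COMORBIDITY_FLAGS = [
    "CM_ALCOHOL", "CM_DEPRESS", "CM_DRUG",
    "CM_LIVER", "CM_RENLFAIL", "CM_OBESE",
]

def comorbidity_count(record, flags=COMORBIDITY_FLAGS):
    """Number of flags equal to 1; absent flags are treated as 0."""
    return sum(1 for flag in flags if record.get(flag) == 1)

# Invented records.
discharges = [
    {"KEY": 1, "CM_ALCOHOL": 1, "CM_DEPRESS": 0, "CM_LIVER": 1},
    {"KEY": 2, "CM_OBESE": 1},
    {"KEY": 3},
]
counts = {d["KEY"]: comorbidity_count(d) for d in discharges}
print(counts)
```

An unweighted count treats every condition as equally important; it is a descriptive summary, not a validated risk score.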
Cost-to-Charge Ratio Data
- Year
- Hospital unique identifier
- Wage index
- CCR_NIS (the cost-to-charge ratio variable, for NIS 2012 to current)
- Calculate “Total Cost” from the above data and the “Total Charges” (TOTCHG) variable, which is available in the NIS core data file
- Formula (Stata): gen Total_COSTS = TOTCHG*CCR_NIS
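The same calculation can be sketched in Python, including the step the one-line formula leaves implicit: attaching each hospital's ratio to its discharges before multiplying. The linking field name HOSP_NIS, the ratios, and the charges below are assumptions made for illustration.

```python
# Sketch: total cost = total charges x hospital cost-to-charge ratio.
# Assumption: HOSP_NIS is the hospital identifier shared by both files.
# All numbers are invented.

ccr_by_hospital = {101: 0.25, 102: 0.5}   # hospital id -> CCR_NIS

core = [
    {"HOSP_NIS": 101, "TOTCHG": 20000},
    {"HOSP_NIS": 102, "TOTCHG": 15000},
    {"HOSP_NIS": 103, "TOTCHG": 9000},    # hospital has no ratio on file
    {"HOSP_NIS": 101, "TOTCHG": None},    # charge missing
]

def total_cost(record, ccr_lookup):
    """Return TOTCHG * CCR_NIS, or None when either piece is missing."""
    ratio = ccr_lookup.get(record["HOSP_NIS"])
    charge = record["TOTCHG"]
    if ratio is None or charge is None:
        return None
    return charge * ratio

costs = [total_cost(r, ccr_by_hospital) for r in core]
print(costs)   # [5000.0, 7500.0, None, None]
```

Returning None instead of zero keeps discharges without a ratio or a charge out of any later average.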
Nationwide Emergency Department Sample (NEDS)
- The largest all-payer ED database in the United States
- Sample is a 20% stratified sample of US hospital-based emergency departments
- Years available: 2006-2016
- Number of ED visits: between 25 and 30 million (unweighted) records from 950 hospitals
- Cost of NEDS 2016: $1000
What Data Elements Are in the NEDS?
Four data files per year
▪ Core data
▪ Emergency department data
▪ Inpatient data
▪ Hospital weights data
Examples of NEDS-Based Research
Nationwide Readmissions Database (NRD)
- Calculate national readmission rates for all payers and the uninsured
- Provides nationally representative information on hospital readmissions for all ages
- Unweighted data cover approximately 12 million discharges each year
- Has core data, hospital data, illness severity data, and cost-to-charge ratio data
- Years available: 2010-2016
- Cost of NRD 2016 data: $1000
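A readmission rate of the kind described above can be sketched in Python as follows. The sketch assumes three fields per stay: a patient linkage number (NRD_VisitLink), a timing variable counting days to admission from a patient-specific reference date (NRD_DaysToEvent), and length of stay (LOS). It is deliberately simplified: every stay is treated as an index admission, no weights are applied, and the usual exclusions (in-hospital deaths, discharges too late in the year to observe 30 days of follow-up) are omitted. The records are invented.

```python
# Sketch: unweighted 30-day readmission rate from linked stays.
# Assumed fields: NRD_VisitLink (patient link), NRD_DaysToEvent (admission
# timing), LOS (length of stay). Simplification: every stay is an index stay.
from collections import defaultdict

def readmission_rate(stays, window=30):
    """Share of stays followed by another admission within `window` days of discharge."""
    by_patient = defaultdict(list)
    for stay in stays:
        by_patient[stay["NRD_VisitLink"]].append(stay)

    index_stays = 0
    readmitted = 0
    for visits in by_patient.values():
        visits.sort(key=lambda s: s["NRD_DaysToEvent"])
        for current, following in zip(visits, visits[1:] + [None]):
            index_stays += 1
            if following is None:
                continue  # last observed stay for this patient
            discharge_day = current["NRD_DaysToEvent"] + current["LOS"]
            if following["NRD_DaysToEvent"] - discharge_day <= window:
                readmitted += 1
    return readmitted / index_stays

stays = [
    {"NRD_VisitLink": "a", "NRD_DaysToEvent": 100, "LOS": 5},
    {"NRD_VisitLink": "a", "NRD_DaysToEvent": 120, "LOS": 3},  # 15 days after discharge
    {"NRD_VisitLink": "b", "NRD_DaysToEvent": 200, "LOS": 2},
    {"NRD_VisitLink": "b", "NRD_DaysToEvent": 300, "LOS": 1},  # 98 days after discharge
    {"NRD_VisitLink": "c", "NRD_DaysToEvent": 50, "LOS": 4},
]
rate = readmission_rate(stays)   # 1 readmission out of 5 stays
print(rate)
```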
Kids’ Inpatient Database (KID)
- The only all-payer pediatric inpatient care database in the United States
- Contains 2-3 million hospital stays
- Helps to develop national and regional estimates on diseases
- Data available for demographics, injury characteristics, diagnosis, hospital characteristics, outcomes, and healthcare cost
- Need to sign a DUA
- Cost of KID 2016 data: $500
National Trauma Data Bank (NTDB)
- The largest registry of trauma patients admitted to trauma centers in the United States
- Data are not weighted
- No DUA (data user agreement)
- Samples are obtained from trauma center registries
▪ In 2011, 747 trauma centers were included
- Years available: 2002-2016
- Data files are in CSV format
- Cost of 2016 NTDB data: $300
NTDB Data
- Demographic data
- Injury severity data
- Emergency department data
- Mechanisms of Injury data
- ICD9 and ICD10 Procedure data
- ICD9 and ICD10 Diagnosis data
- Discharge disposition data
- Facility data
- Vital signs data
- Protective devices & transportation data
- Comorbid and complications data
National Surgical Quality Improvement Program (NSQIP)
- A nationally validated, risk-adjusted, outcomes-based program
- NSQIP has prospective and outcomes data
- Years available: 2005-2011
- NSQIP measures and improves the quality of surgical care across surgical specialties
- 680 hospitals were participating in NSQIP in 2017
What Data Elements Are in the NSQIP?
- Preoperative risk factors
- Intraoperative variables
- 30-day postoperative mortality and morbidity outcomes
- Demographic data
- Current Procedural Terminology (CPT) data
- Health and behavior data
- Physical examination data
- Data are free for NSQIP participating hospitals
- Data request process: www.facs.org/quality-programs/acs-nsqip
- Need to sign a DUA (Data User Agreement)
- Download the data: www.facs.org/quality-programs/acs-nsqip
- Data files are available in 3 formats: text, SPSS, SAS
MarketScan Data
- Private database
- Broadly representative of the commercially insured population of the United States
- High-quality, longitudinal, patient-level data
- Low percentage of missing data
- Years available: 2002-2011
- Need to sign a DUA (Data User Agreement)
- Cost: around $50,000/year
What Data Elements Are in MarketScan?
- Patient socio-demographic data
- Admission date and type
- Diagnosis code (principal and secondary)
- Discharge status
- Procedure code (principal and secondary)
- Length of stay
- Place of service
- Provider ID
- Data on drugs/medications
SEER-Medicare Data
- SEER-Medicare Linked Database
- Medicare beneficiaries with cancer
- Data derived from the Surveillance, Epidemiology and End Results (SEER) program
- Diagnosis & procedure codes: ICD9, ICD10, CPT, HCPCS (Healthcare Common Procedure Coding System)
- Patient Demographic and Socioeconomic Characteristics
- Comorbidity
- Breast, Colorectal, and Prostate Cancer Screening
- Radiation Therapy (includes codes to identify radiation therapy)
- Chemotherapy Use (includes codes to identify chemotherapy)
- Complications of Cancer Treatment
- Surveillance After Cancer Treatment
- Data sets available from 1991-2015
- Need to sign a DUA (Data User Agreement)
- Physician Characteristics
- Hospital Characteristics
- Health Care Costs Related to Cancer Treatment
National Health and Nutrition Examination Survey (NHANES)
- Cross-sectional, high-quality survey data on adults and children in the United States
- Data available on a nationally representative sample of about 5,000 persons each year
- Years available:
▪ 1971-75: NHANES I
▪ 1976-80: NHANES II
▪ 1982-84: Hispanic Health and Nutrition Examination Survey (HHANES)
▪ 1988-94: NHANES III
▪ 1999-present: National Health and Nutrition Examination Survey (Continuous NHANES)
- Free to download from the CDC website
What Data Elements Are in the NHANES?
- Socio-demographic data
- Dietary data
- Clinical examination data (medical, dental, and physiological measurements)
- Laboratory data
- Questionnaire data
- Genetic data
- Mortality data
- NHANES Medicare Utilization and Expenditure Linked Files (restricted data)
- NHANES Linked Mortality Files
- NHANES Linked Social Security Administration Files (restricted data)
National Ambulatory Medical Care Survey (NAMCS)
- National survey of ambulatory medical care services in the United States
- Data represent a sample of visits to non-federally employed, office-based physicians who are primarily engaged in direct patient care
- NAMCS has high-quality cross-sectional data
- Years available: 1973-current
- Free to download from the CDC website
What Data Elements Are in the NAMCS?
- Socio-demographic data
- Source of payment and number of past visits
- Patient’s primary care physician information
- Diagnosis
- Chronic disease checklist and disease management programs
- Screening and diagnostic services
- Treatments and drugs prescribed
- Physician specialty
- EMR use and practice parameters
- Sources of revenue
- Providers seen and duration of care under those providers
Evidence Based Research-NAMCS
National Hospital Ambulatory Medical Care Survey (NHAMCS)
- National probability sample survey of visits to emergency and outpatient departments in nonfederal, general, and short-stay hospitals in the United States
- Records-based survey data, producing annual estimates of the number and attributes of visits to hospital emergency departments (EDs) in the U.S.
- The survey is visit-based, so prevalence and incidence rates cannot be calculated
- Years available: 1992-current
- Free to download from the CDC website