NEC METHODS: MATCHING, DEDUPLICATION, ANALYSIS & RESPONSE RATES



slide-1
SLIDE 1

28 October 2014


NEC METHODS: MATCHING, DEDUPLICATION, ANALYSIS & RESPONSE RATES

slide-2
SLIDE 2

Matching & Deduplication


slide-3
SLIDE 3

Purpose of the Merged Analytic Cross-Region Datasets

• PIF-ER Merged Dataset
  - Analyses of the types of trainees who attended particular events
• PIF-ER-ACRE Merged Dataset
  - Analyses of outcomes of AETC training programs related to self-assessed changes in provider behavior and clinical practice

slide-4
SLIDE 4

Analytic Dataset Creation Overview

1. Collect regional process and evaluation data
2. Convert data from the submitted format (Excel, CSV, SPSS) to SAS
3. Reformat regional datasets to match expected data file specifications (e.g., character/numeric type)
   - Process data: HRSA data manual
   - Evaluation data: ACRE implementation manual
4. Create all-region ER, PIF, ACRE IP, ACRE FUP, and FTCC PIF datasets by concatenating/appending regional files of the same type
5. Create analytic PIF-ER merged dataset
6. Create analytic PIF-ER-ACRE datasets
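Steps 3 and 4 amount to conforming each regional file to a common specification and then stacking files of the same type. The original processing was done in SAS; the Python sketch below is only an illustration of the idea. The `EXPECTED_COLUMNS` specification and the sample records are hypothetical and are not taken from the HRSA data manual or the ACRE implementation manual.

```python
# Hypothetical file specification: column name -> required type.
EXPECTED_COLUMNS = {"AETC": int, "PROG_ID": str, "PIF_ID": int}


def conform(record):
    """Coerce one regional record to the expected column types (step 3)."""
    return {name: cast(record[name]) for name, cast in EXPECTED_COLUMNS.items()}


def concatenate_regions(regional_files):
    """Append regional files of the same type into one all-region list (step 4)."""
    combined = []
    for records in regional_files:
        combined.extend(conform(rec) for rec in records)
    return combined


# Two made-up regional PIF files that arrive with inconsistent types.
region_a = [{"AETC": "13", "PROG_ID": "A001", "PIF_ID": "1150042"}]
region_b = [{"AETC": 10, "PROG_ID": 7, "PIF_ID": 3221234}]
all_region_pif = concatenate_regions([region_a, region_b])
```

Conforming before appending keeps a character/numeric mismatch in one region from propagating into the all-region file.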

slide-5
SLIDE 5

Cross-Region Analytic Data

Steps 1, 2, 3, 4: Collect, convert, and reformat data. Create all-region ER, PIF, ACRE IP, and FUP datasets.
Step 5: Create analytic ER-PIF dataset
Step 6: Create analytic ER-PIF-ACRE dataset

slide-6
SLIDE 6

Creating the Analytic PIF-ER Merged Dataset

• Check to see which regions have repeats on PROG_ID by LPS
• Merge PIF and ER
  - For the 1-2 regions with repeated PROG_ID, sort and merge the PIF and ER by AETC, LPS, and PROG_ID
  - For all other regions, which have distinct PROG_ID, sort and merge the PIF and ER by AETC and PROG_ID only

[Figure: bottom of the PIF, showing the PROG_ID, AETC, and LPS fields]
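The key-selection rule above can be sketched as follows. This is a Python illustration rather than the SAS code actually used, and it assumes that "repeats on PROG_ID by LPS" means the same PROG_ID is used by more than one LPS within a region. Events without any PIF are kept with a missing PIF_ID, and PIFs that match no event are dropped; both choices are assumptions of this sketch.

```python
from collections import defaultdict


def has_repeated_prog_id(er_records):
    """True if any PROG_ID appears under more than one LPS."""
    lps_by_prog = defaultdict(set)
    for er in er_records:
        lps_by_prog[er["PROG_ID"]].add(er["LPS"])
    return any(len(lps) > 1 for lps in lps_by_prog.values())


def merge_pif_er(pif_records, er_records):
    """Merge one region's PIFs onto its event records using the narrowest safe key."""
    fields = ("AETC", "PROG_ID")
    if has_repeated_prog_id(er_records):
        fields = ("AETC", "LPS", "PROG_ID")
    pifs_by_event = defaultdict(list)
    for pif in pif_records:
        pifs_by_event[tuple(pif[f] for f in fields)].append(pif)
    merged = []
    for er in er_records:
        matches = pifs_by_event.get(tuple(er[f] for f in fields))
        if matches:
            merged.extend({**er, **pif} for pif in matches)
        else:
            merged.append({**er, "PIF_ID": None})  # event with no PIF
    return merged


# Made-up region in which PROG_ID "P1" is reused by two LPS.
ers = [
    {"AETC": 13, "LPS": "A", "PROG_ID": "P1"},
    {"AETC": 13, "LPS": "B", "PROG_ID": "P1"},
    {"AETC": 13, "LPS": "B", "PROG_ID": "P2"},
]
pifs = [{"AETC": 13, "LPS": "B", "PROG_ID": "P1", "PIF_ID": 1150042}]
merged = merge_pif_er(pifs, ers)
```

Without LPS in the key, the single PIF in this example would attach to both "P1" events.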

slide-7
SLIDE 7

Creating the Analytic PIF-ER-ACRE Merged Dataset (1)

• Select eligible ACRE IP data
  - Check to see which regions have repeats on PROG_ID by LPS
  - Exclude records where all 4 IP questions are missing/blank
  - Exclude records where the PIF_ID is . [missing], 0, or 99999999
  - De-duplicate IP records by AETC, LPS (if applicable), PROG_ID, PIF_ID, AIP1, AIP2
• Select eligible records from the previously created ER-PIF merged dataset
  - Include only records where there is at least 1 PIF record included (e.g., there are some ERs without any PIFs)
  - Exclude records where the PIF_ID is . [missing], 0, or 99999999

(continued)
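A minimal Python sketch of the ACRE IP selection rules, for illustration only. Missing values are represented as `None`. The names AIP1 and AIP2 appear in the slides; AIP3 and AIP4 are assumed names for the other two IP questions, and the sample records are made up.

```python
INVALID_PLACEHOLDERS = {None, 0, 99999999}
# AIP1 and AIP2 come from the slides; AIP3 and AIP4 are assumed names.
IP_QUESTIONS = ("AIP1", "AIP2", "AIP3", "AIP4")


def select_eligible_ip(ip_records, use_lps=False):
    """Drop blank or placeholder-ID records, then de-duplicate by key."""
    key_fields = ["AETC"] + (["LPS"] if use_lps else [])
    key_fields += ["PROG_ID", "PIF_ID", "AIP1", "AIP2"]
    seen = set()
    kept = []
    for rec in ip_records:
        if all(rec.get(q) is None for q in IP_QUESTIONS):
            continue  # all 4 IP questions are missing
        if rec.get("PIF_ID") in INVALID_PLACEHOLDERS:
            continue  # PIF_ID is missing, 0, or 99999999
        key = tuple(rec.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


def ip_record(pif_id, aip1=None, aip2=None):
    """Build a made-up IP record for the example."""
    return {"AETC": 13, "PROG_ID": "P1", "PIF_ID": pif_id,
            "AIP1": aip1, "AIP2": aip2, "AIP3": None, "AIP4": None}


records = [
    ip_record(1150042, 4, 5),
    ip_record(1150042, 4, 5),  # exact duplicate on the key
    ip_record(0, 3, 3),        # placeholder PIF_ID
    ip_record(2010077),        # all IP questions blank
]
eligible_ip = select_eligible_ip(records)
```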

slide-8
SLIDE 8

Creating the Analytic PIF-ER-ACRE Merged Dataset (2)

• Sort the ER-PIF and the ACRE IP data by AETC, LPS (if applicable), PROG_ID, and PIF_ID. The ER-PIF dataset is further sorted by PIFDATE
• Merge the ER-PIF and IP data by AETC, LPS, PROG_ID, and PIF_ID
• De-duplicate the data based on the key variables AETC, LPS (if applicable), PROG_ID, PIF_ID [Note: this deletes <200 records]
• Sort the all-region ACRE FUP by AETC, LPS (if applicable), PROG_ID, PIF_ID
• Sort the previously created ER-PIF-IP dataset by AETC, LPS (if applicable), PROG_ID, PIF_ID
• Merge the ER-PIF-IP with the ACRE FUP by these key variables
• Restrict the analytic dataset to records with a valid, non-missing PIF_ID with a PIF available [Note: approx. 20K records removed]
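The sort, merge, and de-duplicate sequence can be pictured as a keyed join. The Python sketch below is a simplification under stated assumptions: it keeps the first record per key after sorting by PIFDATE (the slides do not say which duplicate is retained), it treats the ER-PIF records as the base so that only records with a PIF survive, and the `HAS_IP` and `HAS_FUP` flags are hypothetical helper fields.

```python
def key_of(rec, use_lps=False):
    """Key variables: AETC, LPS (if applicable), PROG_ID, PIF_ID."""
    fields = ("AETC", "LPS", "PROG_ID", "PIF_ID") if use_lps else ("AETC", "PROG_ID", "PIF_ID")
    return tuple(rec[f] for f in fields)


def merge_acre(er_pif, acre_ip, acre_fup, use_lps=False):
    """Join ACRE IP and FUP onto ER-PIF records, one row per key."""
    ip_by_key = {key_of(r, use_lps): r for r in acre_ip}
    fup_by_key = {key_of(r, use_lps): r for r in acre_fup}
    merged = {}
    ordered = sorted(er_pif, key=lambda r: (key_of(r, use_lps), r["PIFDATE"]))
    for rec in ordered:
        k = key_of(rec, use_lps)
        if k in merged:
            continue  # de-duplicate on the key variables (first kept: an assumption)
        merged[k] = {**rec, **ip_by_key.get(k, {}), **fup_by_key.get(k, {}),
                     "HAS_IP": k in ip_by_key, "HAS_FUP": k in fup_by_key}
    return list(merged.values())


# Made-up records; dates are ISO strings so they sort chronologically.
er_pif = [
    {"AETC": 13, "PROG_ID": "P1", "PIF_ID": 1150042, "PIFDATE": "2012-03-02"},
    {"AETC": 13, "PROG_ID": "P1", "PIF_ID": 1150042, "PIFDATE": "2012-03-01"},
    {"AETC": 13, "PROG_ID": "P2", "PIF_ID": 2010077, "PIFDATE": "2012-04-10"},
]
acre_ip = [{"AETC": 13, "PROG_ID": "P1", "PIF_ID": 1150042, "AIP1": 4}]
analytic = merge_acre(er_pif, acre_ip, acre_fup=[])
```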

slide-9
SLIDE 9

PIF ID

• PIF ID is available on the PIF, ACRE IP, and ACRE FUP data
• Though not on the ER form, the Program ID on the PIF and ER allows PIF IDs to be associated with events
• PIF ID is used for matching
  - Across training events (repeat trainees)
  - Across evaluation forms (ACRE IP and FUP)

PIF_ID = month of birth + day of birth + last 4 digits of SSN

slide-10
SLIDE 10

NEC valid PIF ID algorithm

• A valid PIF ID contains:
  - Valid month of birth (1-12)
  - Valid day of birth (1-31)
  - Valid last 4 digits of SSN (≥1 and not 9999)
• A valid PIF ID is a numeric value <99999999
• Examples of invalid PIF IDs:
  - 99999999
  - 0
  - . [missing]
  - 12345678
  - 04049999
  - 1122420932
• Records with invalid PIF IDs are excluded from regression analyses
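The validity rules can be expressed as a short check. The Python sketch below is an illustration, not the NEC's SAS implementation. It assumes the ID is laid out as two digits for month, two for day, and four for the SSN suffix (MMDDSSSS), and that leading zeros are lost when the ID is stored as a number.

```python
def is_valid_pif_id(pif_id):
    """Apply the valid PIF ID rules to a number or numeric string."""
    try:
        value = int(pif_id)
    except (TypeError, ValueError):
        return False  # covers None and the "." missing marker
    if value <= 0 or value >= 99999999:
        return False
    digits = f"{value:08d}"  # restore leading zeros, e.g. 4049999 -> "04049999"
    month, day, ssn4 = int(digits[:2]), int(digits[2:4]), int(digits[4:])
    return 1 <= month <= 12 and 1 <= day <= 31 and ssn4 >= 1 and ssn4 != 9999


invalid_examples = [99999999, 0, ".", None, 12345678, "04049999", 1122420932]
flags = [is_valid_pif_id(x) for x in invalid_examples]
```

Each invalid example fails for a different reason: 12345678 has day 34, 04049999 has the SSN suffix 9999, and 1122420932 exceeds the numeric limit. A made-up ID such as 01150042 (January 15, suffix 0042) passes.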

slide-11
SLIDE 11

De-Duplication Examples

• For overall ACRE regression analyses:
  - ER-PIF-ACRE dataset is restricted to records with a valid PIF ID and with a linked PIF
  - Restricted dataset is sorted by combined AETC region, PIF ID, eligibility for ACRE IP, having an associated IP record, and PIF date
  - Last record is output
• For MAI ACRE regression analyses, similarly:
  - ER-PIF-ACRE dataset is restricted to records with a valid PIF ID and with a linked PIF
  - Restricted ER-PIF-ACRE dataset is sorted by combined AETC region, PIF ID, having an MAI training record, eligibility for ACRE IP, having an associated IP record, and PIF date
  - Last record is output
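The "sort, then output the last record" pattern can be sketched in Python as below. The field names (`REGION`, `IP_ELIGIBLE`, `HAS_IP`, `PIFDATE`) are hypothetical, and an ascending sort is assumed, so that records that are IP eligible and have an IP record sort after those that are not and are therefore the ones retained.

```python
def last_record_per_trainee(records):
    """Sort, then keep the final record for each (region, PIF ID) pair."""
    ordered = sorted(records, key=lambda r: (
        r["REGION"], r["PIF_ID"], r["IP_ELIGIBLE"], r["HAS_IP"], r["PIFDATE"]))
    last = {}
    for rec in ordered:
        last[(rec["REGION"], rec["PIF_ID"])] = rec  # later rows overwrite earlier ones
    return list(last.values())


# Three made-up records for one trainee in the combined PAMA region.
records = [
    {"REGION": "PAMA", "PIF_ID": 1150042, "IP_ELIGIBLE": True,
     "HAS_IP": True, "PIFDATE": "2012-01-05"},
    {"REGION": "PAMA", "PIF_ID": 1150042, "IP_ELIGIBLE": False,
     "HAS_IP": False, "PIFDATE": "2012-06-30"},
    {"REGION": "PAMA", "PIF_ID": 1150042, "IP_ELIGIBLE": True,
     "HAS_IP": True, "PIFDATE": "2012-02-11"},
]
kept = last_record_per_trainee(records)
```

The June record is the most recent but is not IP eligible, so the February record is retained. For the MAI variant, an MAI indicator would be added to the sort key ahead of `IP_ELIGIBLE`.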

slide-12
SLIDE 12

Recoding & Analysis


slide-13
SLIDE 13

Eligible Records for ACRE Regression Analyses

• The last eligible record among repeat trainees is used
• “Eligible” means the PIF_ID is not an invalid code according to the NEC algorithm and there is truly an associated PIF in the linked dataset
• Analytic population includes:
  - For IP: targeted IP trainee (i.e., attended Level 1, 2, or 3 training), who has an associated PIF and IP record, and is a direct HIV provider (PIF13=1)
  - For FUP: targeted FUP trainee (i.e., attended Level 2 training and topic included clinical management [ER4_1-16] or prevention and behavior change [ER4_29-31] topics), who has an associated PIF and FUP record, and is a direct HIV provider (PIF13=1)

slide-14
SLIDE 14

ACRE IP Eligible Trainings

ACRE immediate post (IP) questions are asked immediately after the training event.

Eligible trainings, per the Event Record form: ER9_1>0 -OR- ER9_2>0 -OR- ER9_3>0

slide-15
SLIDE 15

ACRE FUP Eligible Trainings

ACRE follow-up (FUP) is asked 6 weeks after training through a web-based survey.

Eligible trainings, per the Event Record form: ER9_2>0 -AND- ANY of ER4_1=1 or ER4_2=1 or etc. … or ER4_31=1
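The IP and FUP training eligibility rules can be sketched as two predicates on an event record. This Python version is illustrative only. For the FUP topic list it uses the ranges named on the "Eligible Records" slide (ER4_1-16 and ER4_29-31) and exposes the list as a parameter, since the exact set of topic items is elided here.

```python
# Topic items named on the "Eligible Records" slide: ER4_1-16 and ER4_29-31.
FUP_TOPICS = [f"ER4_{i}" for i in list(range(1, 17)) + list(range(29, 32))]


def is_ip_eligible(er):
    """IP target: any of ER9_1, ER9_2, ER9_3 is greater than 0."""
    return any((er.get(f"ER9_{level}") or 0) > 0 for level in (1, 2, 3))


def is_fup_eligible(er, topics=FUP_TOPICS):
    """FUP target: ER9_2 > 0 and at least one targeted topic is checked."""
    return (er.get("ER9_2") or 0) > 0 and any(er.get(t) == 1 for t in topics)


# Made-up event records.
level2_clinical = {"ER9_1": 0, "ER9_2": 3.5, "ER4_7": 1}
level1_only = {"ER9_1": 2, "ER4_7": 1}
checks = [
    is_ip_eligible(level2_clinical), is_fup_eligible(level2_clinical),
    is_ip_eligible(level1_only), is_fup_eligible(level1_only),
]
```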

slide-16
SLIDE 16

FY 11/12 AETC Cross-Region Trainees in IP Analyses

Data source: cross-region ER-PIF and ACRE IP FY11-12.

• N = 72,642 ACRE IP records received by NEC
• N = 108,687 FY 11-12 trainees (based on linked AETC PIF and ER)
• n = 45,452 linked ER-PIF-ACRE IP
  - n = 42,465 linked records and a targeted IP training
  - n = 2,987 linked records and NOT a targeted IP training
• n = 30,331 linked records, IP targeted, and trainee’s last record in FY 11-12

N = 108,687 excludes n = 2,459 event records without a PIF associated and n = 5,736 records with an invalid PIF ID. This number includes repeat trainees. Though n = 93,756 records fulfilled the IP target criteria, only n = 42,465 (45.3%) ER-PIF-IP records linked and fulfilled the target. Of these, n = 15,979 (52.7%) indicated they were direct HIV providers on the PIF.

slide-17
SLIDE 17

FY 11/12 AETC Cross-Region Trainees in FUP Analyses

Data source: cross-region ER-PIF and ACRE FUP FY11-12.

• N = 3,847 ACRE FUP records received by NEC
• N = 108,687 FY 11-12 trainees (based on linked AETC PIF and ER)
• n = 2,620 linked ER-PIF-ACRE FUP
  - n = 2,018 linked records and a targeted FUP training
  - n = 602 linked records and NOT a targeted FUP training
• n = 1,707 linked records, FUP targeted, and trainee’s last record in FY 11-12

N = 108,687 excludes n = 2,459 event records without a PIF associated and n = 5,736 records with an invalid PIF ID. This number includes repeat trainees. Though n = 61,647 records fulfilled the FUP target criteria, only n = 2,018 (3.3%) ER-PIF-FUP records linked and fulfilled the target. Of these, n = 1,014 (59.4%) indicated they were direct HIV providers on the PIF and FUP survey.
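The counts and percentages reported for the IP and FUP analyses can be cross-checked with simple arithmetic. The Python sketch below uses only the figures stated above; the dictionary keys are labels chosen for this example. It also shows that the direct-provider percentages (52.7% and 59.4%) are reproduced when the denominator is the count of trainees' last records (30,331 and 1,707) rather than all linked, targeted records.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


ip = {"targeted": 93756, "linked": 45452, "linked_targeted": 42465,
      "linked_not_targeted": 2987, "last_record": 30331, "direct": 15979}
fup = {"targeted": 61647, "linked": 2620, "linked_targeted": 2018,
       "linked_not_targeted": 602, "last_record": 1707, "direct": 1014}

for d in (ip, fup):
    # Linked records split into targeted and non-targeted trainings.
    assert d["linked_targeted"] + d["linked_not_targeted"] == d["linked"]

ip_link_rate = pct(ip["linked_targeted"], ip["targeted"])
ip_direct_rate = pct(ip["direct"], ip["last_record"])
fup_link_rate = pct(fup["linked_targeted"], fup["targeted"])
fup_direct_rate = pct(fup["direct"], fup["last_record"])
```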

slide-18
SLIDE 18

Analytic Variables

• Regression models have included the following predictors:
  - Big 6
  - Worked in Ryan White funded setting
  - Minority provider
  - Minority serving
  - Provider experience
  - HIV+ clients per month
  - Repeat trainee
• All of the above predictors come directly from the PIF except for Repeat trainee status, which is based on the linked PIF-ER
• Regression models are restricted to direct providers of HIV+
  - ACRE FUP web survey is targeted to direct providers

slide-19
SLIDE 19

Analytic Variable: Clinical Providers “BIG 6”

• Comes from PIF question 3 (PIF3; response options are mutually exclusive)
• Clinical providers encompass 7 professional categories, though we often refer to them as “big 6”
• All other non-missing responses are coded as non-clinical providers

[Figure: Participant Information Form, question PIF3]

slide-20
SLIDE 20

Analytic Variable: Ryan White-Funded

• From the RWFUND administrative variable on the bottom of the PIF
• Exceptions apply: some regions have advised the NEC to use PIF8A for this information

[Figure: Participant Information Form, showing RWFUND and PIF8A with response codes =1, =0, =1, =0, =9]

slide-21
SLIDE 21

Analytic Variable: Minority Provider

• A minority provider is Hispanic, multiracial, AI/AN, Asian, Native Hawaiian or Pacific Islander, or Black
• A non-minority provider is a non-Hispanic White provider with only a single race indicated
• Those without any race indicated are left as missing

[Figure: Participant Information Form, showing PIF9 (=0, =1; mutually exclusive) and PIF10_1 to PIF10_5 (not mutually exclusive)]
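A minimal Python sketch of the minority provider recode, for illustration. The slides do not say which PIF10 item corresponds to which race, so the function takes already decoded race labels rather than raw PIF10_1 to PIF10_5 values. Treating a Hispanic provider with no race indicated as missing is also an assumption of this sketch.

```python
def minority_provider(hispanic, races):
    """Return 1 (minority), 0 (non-minority), or None (missing)."""
    if not races:
        return None  # no race indicated: left as missing
    if hispanic or len(races) > 1:
        return 1  # Hispanic or multiracial
    return 0 if races == {"White"} else 1


examples = [
    minority_provider(False, {"White"}),
    minority_provider(True, {"White"}),
    minority_provider(False, {"White", "Asian"}),
    minority_provider(False, {"Black"}),
    minority_provider(False, set()),
]
```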

slide-22
SLIDE 22

Analytic Variable: Minority Serving

• Among providers with direct service experience to HIV-infected clients (PIF12_1=1 and PIF13=1):
  - “Minority serving” (i.e., serves greater than half minorities): PIF12B = 3 or 4
  - Not minority serving (i.e., serves fewer than half minorities): PIF12B = 0, 1, or 2

[Figure: Participant Information Form, PIF12_2, with response codes =0, =1, =2, =3, =4]

Skip pattern: This question should only be answered if PIF12_1=1 and PIF13=1

slide-23
SLIDE 23

Analytic Variable: Provider Experience

• Among providers with direct service experience to HIV-infected clients (PIF12_1=1 and PIF13=1):
  - Novice: 0 to <2 years of experience
  - New: 2 to <3 years of experience
  - Experienced: 3 or more years of experience

[Figure: Participant Information Form, PIF14 = continuous numeric variable]

Skip pattern: This question should only be answered if PIF12_1=1 and PIF13=1
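The experience categories are half-open intervals on the continuous PIF14 value. A short Python sketch of the recode follows; returning `None` for records outside the skip pattern or with missing or negative years is an assumption of this sketch.

```python
def experience_category(pif12_1, pif13, pif14_years):
    """Categorize years of experience for direct HIV providers."""
    if pif12_1 != 1 or pif13 != 1:
        return None  # outside the skip pattern
    if pif14_years is None or pif14_years < 0:
        return None  # missing or implausible value
    if pif14_years < 2:
        return "Novice"
    if pif14_years < 3:
        return "New"
    return "Experienced"


categories = [experience_category(1, 1, y) for y in (0, 1.9, 2, 2.5, 3, 20)]
```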

slide-24
SLIDE 24

Analytic Variable: HIV+ Clients per Month

• Categories for HIV+ clients per month:
  - 0/month: PIF13 = 0 (No direct HIV+ services provided) or PIF15 = 0
  - 1-19/month: PIF15 = 1 or 2
  - 20+/month: PIF15 = 3 or 4

[Figure: Participant Information Form, PIF15, with response codes =0, =1, =2, =3, =4]

Skip pattern: This question should only be answered if PIF12_1=1 and PIF13=1
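The client-volume recode combines two PIF items, because a provider with no direct HIV+ services (PIF13 = 0) falls in the 0/month category regardless of PIF15. A small Python sketch follows; returning `None` for any other combination is an assumption.

```python
def clients_per_month(pif13, pif15):
    """Collapse PIF13 and PIF15 into the three client-volume categories."""
    if pif13 == 0 or pif15 == 0:
        return "0/month"
    if pif15 in (1, 2):
        return "1-19/month"
    if pif15 in (3, 4):
        return "20+/month"
    return None  # missing or out-of-range response


volumes = [
    clients_per_month(0, None),
    clients_per_month(1, 0),
    clients_per_month(1, 2),
    clients_per_month(1, 4),
    clients_per_month(1, None),
]
```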

slide-25
SLIDE 25

Special Initiatives


slide-26
SLIDE 26

Repeat Trainees

• Repeat trainee status is relative to the last eligible record during the analysis period
  - An individual who attended multiple AETC trainings with only 1 MAI training would not be categorized as a repeat trainee in an MAI analysis, since the last eligible MAI training record is the first and only MAI training
  - However, this same individual would be considered a repeat trainee for a cross-region analysis during this time period
• A trainee is considered non-unique if s/he has the same PIF ID within a combined AETC region (e.g., AETC 13, 39, and 51 are considered the combined PAMA region)
• Assumption: An individual took trainings within one region only. For example, a trainee who moved from CA to NY with training records in both regions would be counted as two separate individuals in the cross-site data.

slide-27
SLIDE 27

Repeat Trainees – Combined AETC Codes

Regional AETC Name    Combined AETC Codes
Delta                 1, 30
Florida/Caribbean     2, 31, 57, 61
Midwest               4, 32
Mountain Plains       5, 33, 56
New England           8, 35
NY/NJ                 10, 36
Northwest             11, 37, 52
Pacific               12, 38, 50, 68
PAMA                  13, 39, 51
Southeast             15, 40, 58
TX/OK                 16, 41

slide-28
SLIDE 28

Repeat Trainees- Example

• If PIF_ID 12345678 were truly a valid ID and the records below are all event data for this trainee in the fiscal year, sorted by event date:

PIF_ID      AETC   Funding Type   Training event (any type) during analytic period
12345678    13     MAI            1
12345678    13     Base           2
12345678    39     CDC testing    3
12345678    13     Base           4

• In an MAI analysis, the latest MAI record would be retained. This trainee is not a repeat trainee in the MAI analysis.
• In an overall analysis, this trainee is a repeat trainee. The fourth training record is retained for the analysis.
• Notes: AETC=39 is grouped with AETC=13 for the PA/MA region. Repeat trainee analyses are coupled with the de-duplication process.
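The worked example can be reproduced with a small Python sketch that maps AETC codes to combined regions, groups records by region and PIF_ID, and reports the last record together with a repeat flag. The field names `FUNDING` and `EVENT` and the `keep` parameter are hypothetical; the combined codes are those listed in the Combined AETC Codes table.

```python
COMBINED_AETC = {
    "Delta": (1, 30), "Florida/Caribbean": (2, 31, 57, 61), "Midwest": (4, 32),
    "Mountain Plains": (5, 33, 56), "New England": (8, 35), "NY/NJ": (10, 36),
    "Northwest": (11, 37, 52), "Pacific": (12, 38, 50, 68),
    "PAMA": (13, 39, 51), "Southeast": (15, 40, 58), "TX/OK": (16, 41),
}
REGION_OF = {code: name for name, codes in COMBINED_AETC.items() for code in codes}


def last_record_and_repeat(records, keep=lambda rec: True):
    """For each (region, PIF_ID): the last kept record and a repeat flag."""
    groups = {}
    for rec in records:  # assumed to be sorted by event date already
        if keep(rec):
            key = (REGION_OF[rec["AETC"]], rec["PIF_ID"])
            groups.setdefault(key, []).append(rec)
    return {key: (recs[-1], len(recs) > 1) for key, recs in groups.items()}


events = [
    {"PIF_ID": 12345678, "AETC": 13, "FUNDING": "MAI", "EVENT": 1},
    {"PIF_ID": 12345678, "AETC": 13, "FUNDING": "Base", "EVENT": 2},
    {"PIF_ID": 12345678, "AETC": 39, "FUNDING": "CDC testing", "EVENT": 3},
    {"PIF_ID": 12345678, "AETC": 13, "FUNDING": "Base", "EVENT": 4},
]
overall = last_record_and_repeat(events)
mai = last_record_and_repeat(events, keep=lambda rec: rec["FUNDING"] == "MAI")
```

In the overall analysis all four records fall in the PAMA group, so event 4 is retained and the trainee is flagged as a repeat trainee. In the MAI analysis only event 1 qualifies, so it is retained and the flag is false.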

slide-29
SLIDE 29

MAI Initiative Events

• We identified data to include by limiting records to those identified as MAI on the ER (Event Record form): ER5_3=1 (not mutually exclusive)

slide-30
SLIDE 30

HIV Testing Events

• We identified data to include by limiting records to those identified as “HIV Testing” on the ER (Event Record form) and through the code associated with CDC funding used by AETC Regions:
  ER4_7=1 -OR- ER4_31=1 -OR- AETC = 30-41 (CDC testing code)
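The two special-initiative selections are simple record filters. The Python sketch below is illustrative and assumes each event record is a dictionary keyed by the ER item names, with AETC stored as an integer code.

```python
def is_mai_event(er):
    """MAI events: ER5_3 = 1 on the Event Record."""
    return er.get("ER5_3") == 1


def is_hiv_testing_event(er):
    """HIV testing events: ER4_7 = 1, ER4_31 = 1, or a CDC testing AETC code (30-41)."""
    return er.get("ER4_7") == 1 or er.get("ER4_31") == 1 or 30 <= er["AETC"] <= 41


# Made-up event records.
sample_events = [
    {"AETC": 13, "ER5_3": 1, "ER4_7": 0},
    {"AETC": 39, "ER5_3": 0},
    {"AETC": 13, "ER4_31": 1},
    {"AETC": 13},
]
mai_flags = [is_mai_event(e) for e in sample_events]
testing_flags = [is_hiv_testing_event(e) for e in sample_events]
```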

slide-31
SLIDE 31

ACRE Rescaled Outcomes

• For ease of interpretation, all outcome responses were rescaled from 1-5 to 0-100 so that the results could be interpreted as percent change:

Original Scale   IP Meanings                                 FUP Meanings                   New Scale
1                “Novice” “Poor” “Disagree Strongly”         “Strongly Disagree”            0
2                                                            “Disagree”                     25
3                                                            “Neither Agree or Disagree”    50
4                                                            “Agree”                        75
5                “Expert” “Excellent” “Agree Strongly”       “Strongly Agree”               100

• Original scale values of 0 or >5 are recoded to missing. Decimal values between 1-5 are rounded down to a whole number.
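The rescaling is a linear map from the 1-5 scale to 0-100 in steps of 25, which can be written as

$$\text{new} = (\lfloor \text{original} \rfloor - 1) \times 25.$$

A minimal Python sketch follows. Treating values below 1 (not only exactly 0) as missing is an assumption, since the rule only names 0 and values above 5.

```python
import math


def rescale(value):
    """Map a 1-5 response to 0-100; out-of-range values become missing (None)."""
    if value is None or value < 1 or value > 5:
        return None
    return (math.floor(value) - 1) * 25  # decimals are rounded down first


rescaled = [rescale(v) for v in (1, 2, 3.7, 4, 5, 0, 6, None)]
```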

slide-32
SLIDE 32

Response Rates (ACRE-FUP)


slide-33
SLIDE 33

Response Rates - Background

• Over a wide range of disciplines, email response rates average 20-30%
• Factors hypothesized to influence response rates
  - Number of questions
  - Pre-notification
  - Follow-up
  - Salience

slide-34
SLIDE 34

2013 Response Rates*

• 2013 response rates: 5% - 64%, avg: 30%
• Top responders:
  - University of Hawaii (Pacific) – 64.1%
  - YVFWC (Northwest) – 47.7%
  - UNC Chapel Hill (SEATEC) – 42.5%
  - AARTH (Northwest) – 42.1%
  - Indiana (MATEC) – 38.5%

*Response rates by LPS for VF users with a minimum of 20 total participants

slide-35
SLIDE 35

2014 Response Rates*,**

• 2014 response rates: 11% - 61%, avg: 27%
• Top responders:
  - UK (SEATEC) – 60.5%
  - USC (SEATEC) – 55.8%
  - AZ AIDS ETC (Pacific) – 47.4%
  - SPIPA (Northwest) – 44.0%
  - Pittsburgh (PA/MA) – 41.0%

*Response rates by LPS for VF users with a minimum of 20 total participants
**Response rates through October 1, 2014

slide-36
SLIDE 36

Response Rates

• LPS with >1 event per year have higher response rates: 31% vs 23%
• Average response rate for LPS with 10+ events: 35%
• Average response rate for LPS with 50+ attendees/event: 28%
• Average response rate for LPS with <20 attendees/event: 35%

slide-37
SLIDE 37

Response Rates

• Email comments from top responders:
  - Online registration (UK & USC)
  - Participant buy-in (UK, USC, SPIPA)
  - Cultural awareness of participants (SPIPA)
  - Monthly audits from central office (UK & USC)

slide-38
SLIDE 38

Response Rates

• Additional comments?
• Questions/concerns?