
A Robust Recursive Partitioning Algorithm for Mining Multiple Populations
Jose Alvir (1), Javier Cabrera (2), Frank Caridi (1), Ha Nguyen (1)
(1) Pfizer Inc & (2) Rutgers University
Rutgers Biostatistics Day, 4/25/2008

The Challenge of Personalized Medicine
• Drugs do not work for everybody
• Certain drugs may work better than other drugs for certain individuals
• Individuals may need more or less of a drug than other individuals

The Challenge of Personalized Medicine
• Shift from individuals to groups of individuals with similar characteristics
• Search for subgroups where response is maximal
• Classification techniques like CART are available

Pima Indians Diabetes Data Set
768 females at least 21 years old of Pima Indian heritage

Variable                         Mean     SD
Number of times pregnant          3.8     3.4
Plasma glucose concentration    120.9    32
Diastolic blood pressure         69.1    19.4
Triceps skin fold thickness      20.5    16
2-Hour serum insulin             79.8   115.2
Body Mass Index                  32       7.9
Diabetes pedigree function        0.5     0.3
Age                              33.2    11.8
Diabetes                        268/768

Classic Example of CART: Pima Indians & Diabetes
• 768 Pima Indian females, 21+ years old; 268 tested positive for diabetes
• 8 predictors: PRG, PLASMA, BP, THICK, INSULIN, BODY, PEDIGREE, AGE
[Figure: CART tree. Root split: PLASMA<127.5. Other splits: AGE<28.5, BODY<29.95, BODY<30.95, BODY<26.35, PLASMA<145.5, PLASMA<157.5, PLASMA<99.5, AGE<30.5, PEDIGREE<0.561, BP<61. Leaf proportions of diabetes: 0.01325, 0.17500, 0.04878, 0.14630, 0.51430, 0.86960, 0.18180, 0.72310, 0.40480, 0.73530, 1.00000, 0.32500.]

ARF – Activity Region Finder
• Identifies high activity regions
• Finds regions where the concentration of "success" is highest, unlike other classification trees (e.g. CHAID, CART) that aim to predict response across the entire range
• Splits a node when there is substantial evidence that the response is higher/lower in the child node (compared to the parent node)
• Written in R
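The idea of searching for a region with the highest concentration of "success" can be sketched as follows. This is a simplified illustration in Python, not the authors' R implementation: the function name best_interval, the min_n parameter, and the toy data are assumptions made for the example, and the search is a plain exhaustive scan over one predictor that omits the statistical evidence test ARF applies before splitting.

```python
def best_interval(x, y, min_n=3):
    """Return (lo, hi, n, rate) for the closed interval [lo, hi] of x
    with the highest success rate in y, among intervals that contain
    at least min_n observations. y holds 0/1 outcomes.
    Hypothetical helper for illustration only."""
    pairs = list(zip(x, y))
    values = sorted(set(x))
    best = None
    for i, lo in enumerate(values):
        for hi in values[i:]:
            inside = [s for v, s in pairs if lo <= v <= hi]
            n = len(inside)
            if n < min_n:
                continue  # too few observations to trust the rate
            rate = sum(inside) / n
            # Prefer the higher rate; break ties with the larger region.
            if best is None or (rate, n) > (best[3], best[2]):
                best = (lo, hi, n, rate)
    return best


# Toy data (made up): successes are concentrated where x is between 3 and 5.
toy_x = [1, 2, 3, 4, 5, 6, 7, 8]
toy_y = [0, 0, 1, 1, 1, 0, 0, 0]
region = best_interval(toy_x, toy_y, min_n=3)
print(region)
```

The min_n floor is what keeps the scan from returning a tiny region that is pure only by accident; a real region finder would replace it with a formal test against the parent node.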

Alvir J, Cabrera J, Caridi F, Nguyen H. Mining Clinical Trial Data. In Knowledge Discovery and Data Mining: Challenges and Realities with Real World Data, edited by Xingquan (Hill) Zhu and Ian Davidson, 2007

ARF applied to Pima Indian data

DATASET n=768; p=35%
  PLASMA [155,199] n=122; p=80%
    BODY [29.9,45.7] n=92; p=88%
      PEDIGREE [0.344,1.394] n=55; p=96%
  PLASMA [128,152] n=153; p=49%
    BODY [30.3,67.1] n=99; p=64%
      PEDIGREE [0.439,1.057] n=38; p=82%
  AGE [29,56] n=199; p=35% (within PLASMA [0,127])

Subset                                                                      %Success    n
1  PLASMA in [155,199] & BODY in [29.9,45.7] & PEDIGREE in [0.344,1.394]     96.364     55
2  PLASMA in [128,152] & BODY in [30.3,67.1] & PEDIGREE in [0.439,1.057]     81.579     38
3  PLASMA in [0,127] & AGE in [29,56]                                        35.176    199

Differences between CART & ARF trees
• Best node for the CART tree has 9 observations with 100% diabetes
• ARF tree has a node of 55 observations with 96% rate of diabetes
• The node from CART has a high probability of occurring by chance
• ARF tree produces sketches that summarize only important information and downplay less interesting information
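One way to see why a pure node of 9 observations is weaker evidence than a 96% node of 55 is to compare lower confidence bounds on the two rates. The slides do not say how "by chance" was assessed, so the Wilson score bound below is an assumption chosen for illustration; the counts 9/9 and 53/55 come from the node sizes and rates quoted above (53/55 = 96.364%).

```python
import math


def wilson_lower(successes, n, z=1.96):
    """Lower end of the Wilson score interval for a binomial proportion.
    z=1.96 corresponds to a two-sided 95% interval."""
    p = successes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


cart_node = wilson_lower(9, 9)    # CART: 9 observations, 100% diabetes
arf_node = wilson_lower(53, 55)   # ARF: 55 observations, 96.364% diabetes
print(round(cart_node, 3), round(arf_node, 3))
```

Under this bound the CART node's rate is only supported down to about 0.70, while the ARF node's is supported down to about 0.88, even though its observed rate is lower.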

Ziprasidone Placebo-Controlled Trials
4- & 6-week U.S. trials
• Protocol 104 – 4 weeks, N=195
• Protocol 106 – 4 weeks, N=132
• Protocol 114 – 6 weeks, N=299
• Protocol 115 – 6 weeks, N=325
85 subjects on haloperidol excluded
Total N = 951

Ziprasidone Data
N by dose (mg/day) & protocol #

Dose    104   106   114   115
PBO      47    47    92    80
10       46     -     -     -
40       55    43     -    86
80       47     -   104     -
120       -    42     -    76
160       -     -   103     -
200       -     -     -    83

Ziprasidone Data Mining Variables
Outcomes: change in BPRS total score
Predictors: age, sex, race, protocol, dose, baseline clinical ratings (positive Sx, CGI-S, anergia, depressive Sx, AIMS), duration of illness in years, current smoking status

Patient Characteristics (Total = 951)

                  N     %
Male            700    74
Race
  White         620    65
  Black         234    25
  Other          97    10
Smoker          716    75

Data Definitions
• AIMS = mean of AIMS total/5 and TD severity
• BPRS total & Sx scores (positive, depression, anergia) – absolute minimum is zero (items scored with minimum = 0 and not 1)
• Positive Sx score – sum of conceptual disorganization, hallucinatory behavior, unusual thought content, suspiciousness
• Depression – sum of anxiety, guilt feelings, depressive mood
• Anergia – sum of blunted affect, emotional withdrawal, motor retardation
• Residual BPRS change – residual (observed minus predicted) from LOCF BPRS total regressed on baseline BPRS
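The residual BPRS change defined above can be sketched as a simple least-squares fit followed by observed minus predicted. This is a minimal illustration assuming a single-predictor linear regression with an intercept; the function name residual_change and the toy scores are made up for the example and are not taken from the trial data.

```python
def residual_change(baseline, endpoint):
    """Regress endpoint (e.g. LOCF BPRS total) on baseline with ordinary
    least squares and return observed minus predicted for each subject."""
    n = len(baseline)
    mean_x = sum(baseline) / n
    mean_y = sum(endpoint) / n
    sxx = sum((x - mean_x) ** 2 for x in baseline)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(baseline, endpoint))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(baseline, endpoint)]


# Toy scores (made up): baseline BPRS and LOCF endpoint BPRS for 5 subjects.
toy_baseline = [30, 35, 40, 45, 50]
toy_endpoint = [28, 27, 38, 36, 47]
toy_residuals = residual_change(toy_baseline, toy_endpoint)
print([round(r, 2) for r in toy_residuals])
```

Because the fit includes an intercept, the residuals average to zero, which matches the mean of 0 reported for residual change in the patient characteristics.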

Patient Characteristics

Variable                Mean    S.D.   Range
BPRS change             -5.1    13.4   -58, 55
Residual change          0      13.1   -45, 65
Baseline BPRS           35.9    11.0   14, 86
Age                     38.7    10.1   18, 72
Duration of illness     16.0     9.6   0, 54
Baseline Positive Sx    12.7     3.4   4, 24
Baseline Depression      5.5     3.3   0, 17
Baseline CGI-S           4.8     0.8   3, 7
Baseline Anergia         6.0     3.4   0, 18
Baseline AIMS            0.4     0.6   0, 4

The Challenge of Personalized Medicine revisited
• Can we identify subgroups for which the drug is more effective than placebo or other drugs?
• Are there subsets for which a low dose is better than placebo?
• Are there subsets for which a high dose is better than a low dose or vice versa?

Conventional tree methods can only answer these questions indirectly
In conventional modelling:
- The X space is defined by one sample.
- We estimate the conditional mean of a response variable given a set of predictors.

Comparative efficacy
Subsets where:
• the drug is more effective than placebo or other drugs
• a low dose is better than placebo
• a high dose is better than a low dose or vice versa
In these situations:
- The X space is defined by two or more samples.
- We estimate the conditional difference of means or, in general, a function of the conditional means.
- We extend ARF to the differences between two or more means.
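The quantity being estimated here, a conditional difference of means within a region of the predictor space, can be sketched for a single interval as below. This is only an illustration of the estimand, not the extended ARF algorithm: the function name subgroup_effect, the record layout, the arm labels, and the toy values are all assumptions made for the example.

```python
def mean(values):
    return sum(values) / len(values)


def subgroup_effect(rows, var, lo, hi):
    """Difference in mean response (drug minus placebo) among subjects
    whose predictor `var` falls in the closed interval [lo, hi].
    Returns None if either arm is empty inside the region."""
    inside = [r for r in rows if lo <= r[var] <= hi]
    drug = [r["y"] for r in inside if r["arm"] == "drug"]
    placebo = [r["y"] for r in inside if r["arm"] == "placebo"]
    if not drug or not placebo:
        return None  # the comparison is undefined without both samples
    return mean(drug) - mean(placebo)


# Toy records (made up): y is a change score, lower means more improvement.
toy_rows = [
    {"arm": "drug", "basedep": 1, "y": -2.0},
    {"arm": "placebo", "basedep": 2, "y": -1.0},
    {"arm": "drug", "basedep": 6, "y": -12.0},
    {"arm": "drug", "basedep": 8, "y": -8.0},
    {"arm": "placebo", "basedep": 7, "y": -3.0},
    {"arm": "placebo", "basedep": 9, "y": -1.0},
]
print(subgroup_effect(toy_rows, "basedep", 5, 10))
```

A region finder for comparative efficacy would search over such intervals for the largest difference, in the same way the single-sample search looks for the highest success rate.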

47th Interscience Conference on Antimicrobial Agents and Chemotherapy, Chicago, September 17-20, 2007
Symptom Resolution with Azithromycin Extended Release Versus Amoxicillin/Clavulanate in Patients with Acute Sinusitis in a General Practice Physician Environment
J. F. Piccirillo (1), B. F. Marple (2), C. S. Roberts (3), J. R. Frytak (4), V. F. Schabert (5), J. C. Wegner (4), H. Bhattacharyya (3), S. P. Sanchez (3)
(1) Washington University School of Medicine, St Louis, MO
(2) University of Texas Southwestern Medical Center, Dallas, TX
(3) Pfizer Inc, New York, NY
(4) i3 Innovus, Eden Prairie, MN
(5) Integral Health Decisions Inc, Santa Barbara, CA

Sample Characteristics

Ziprasidone: 120 mg/160 mg Vs Placebo
MULTIRESPONSE CART
[Figure: CART tree. Root split: BASEDEP<2.5. Other splits: DURATILL>=16.5, BASEPOS<15.5, BCGIS<5.5, DURATILL<7.5, DURATILL>=3.5, RACE=bde, DURATILL>=13, ANERGIA>=8.5. Leaf values: -6.9380 (n=37), 5.4806 (n=57), 12.5481 (n=86), 14.3053 (n=31), 0.8727 (n=62), -8.5192 (n=30), 7.8737 (n=27), 13.8646 (n=37), -1.0445 (n=26), 7.1425 (n=94).]

Ziprasidone: 120 mg/160 mg Vs Placebo
TOP THREE SPLITS
[Figure: three panels plotting the response (y) against BASEDEP, DURATILL and BASEPOS.]

Software
• These two ARF applications are being incorporated into PfarMineR, a suite of statistical methods for EDA and data mining
• ARF is available at: http://www.rci.rutgers.edu/~cabrera/dm/DM.html
