SLIDE 1

Introduction ADWIN-DT Decision Tree Experiments Conclusions

Learning Decision Trees Adaptively from Data Streams with Time Drift

Albert Bifet and Ricard Gavaldà

LARCA: Laboratori d’Algorísmica Relacional, Complexitat i Aprenentatge Departament de Llenguatges i Sistemes Informàtics Universitat Politècnica de Catalunya

September 2007

SLIDE 2

Introduction: Data Streams

Data Streams
- Sequence is potentially infinite
- High amount of data: sublinear space
- High speed of arrival: sublinear time per example
- Once an element from a data stream has been processed, it is discarded or archived

Example Puzzle: Finding Missing Numbers
- Let π be a permutation of {1, . . . , n}.
- Let π−1 be π with one element missing.
- π−1[i] arrives in increasing order.
- Task: determine the missing number.

SLIDE 3

Introduction: Data Streams (cont.)

Naive solution: use an n-bit vector to memorize all the numbers (O(n) space).

SLIDE 4

Introduction: Data Streams (cont.)

Data-stream solution: O(log n) space.

SLIDE 5

Introduction: Data Streams (cont.)

Data-stream solution, O(log n) space: after seeing the first i elements, store

n(n + 1)/2 − Σ_{j≤i} π−1[j]

Once the whole stream has arrived, this running value is exactly the missing number.
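The running-sum trick above can be sketched in a few lines (Python here, purely illustrative):

```python
# Streaming solution to the missing-number puzzle.
# The stream is a permutation of {1, ..., n} with one element missing;
# we keep only n and a running sum, i.e. O(log n) bits of state,
# instead of an n-bit presence vector.

def find_missing(stream, n):
    total = n * (n + 1) // 2   # sum of 1..n
    for x in stream:           # single pass; each item is then discarded
        total -= x
    return total               # what remains is the missing number

print(find_missing([1, 2, 4, 5], 5))  # missing number is 3
```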

SLIDE 6

Data Streams

Data Streams
- At any time t in the data stream, we would like the per-item processing time and storage to be simultaneously polylogarithmic, i.e. O(log^k(N, t)).

Approximation algorithms
- Small error rate with high probability
- An algorithm (ε, δ)-approximates F if it outputs an estimate F̃ for which Pr[|F̃ − F| > εF] < δ.

SLIDE 7

Data Streams: Approximation Algorithms

Frequency moments

Frequency moments of a stream A = {a1, . . . , aN}:

F_k = Σ_{i=1}^{v} f_i^k

where f_i is the frequency of i in the sequence, and k ≥ 0.

- F0: number of distinct elements in the sequence
- F1: length of the sequence
- F2: self-join size, the repeat rate, or Gini's index of homogeneity

Sketches can approximate F0, F1, F2 in O(log v + log N) space.

Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. 1996

SLIDE 8

Classification

Data set that describes e-mail features for deciding if it is spam.

Contains "Money" | Domain type | Has attach. | Time received | spam
yes              | com         | yes         | night         | yes
yes              | edu         | no          | night         | yes
no               | com         | yes         | night         | yes
no               | edu         | no          | day           | no
no               | com         | no          | day           | no
yes              | cat         | no          | day           | yes

Assume we have to classify the following new instance:

Contains "Money" | Domain type | Has attach. | Time received | spam
yes              | edu         | yes         | day           | ?

SLIDE 9

Classification (cont.)

Assume we have to classify the following new instance:

Contains "Money" | Domain type | Has attach. | Time received | spam
yes              | edu         | yes         | day           | ?

SLIDE 10

Decision Trees

Basic induction strategy:
1. A ← the "best" decision attribute for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant of the node
4. Sort training examples to the leaf nodes
5. If the training examples are perfectly classified, then STOP; else iterate over the new leaf nodes
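A minimal sketch of step 1, choosing the "best" attribute on the spam toy data. Information gain is used here as an illustrative criterion; the attribute names are shorthand for the table's columns, and the slide does not commit to a specific impurity measure:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    # Entropy reduction obtained by splitting on attribute `attr`.
    n, base = len(rows), entropy(labels)
    split = {}
    for row, y in zip(rows, labels):
        split.setdefault(row[attr], []).append(y)
    return base - sum(len(ys) / n * entropy(ys) for ys in split.values())

attrs = ["money", "domain", "attach", "time"]
rows = [dict(zip(attrs, r)) for r in [
    ("yes", "com", "yes", "night"), ("yes", "edu", "no", "night"),
    ("no", "com", "yes", "night"), ("no", "edu", "no", "day"),
    ("no", "com", "no", "day"), ("yes", "cat", "no", "day")]]
labels = ["yes", "yes", "yes", "no", "no", "yes"]

best = max(attrs, key=lambda a: info_gain(rows, labels, a))
```

On this toy table, "money" and "time" tie for the highest gain, so either is a reasonable root split.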

SLIDE 11

VFDT / CVFDT

Very Fast Decision Tree: VFDT

Pedro Domingos and Geoff Hulten. Mining high-speed data streams. 2000

- With high probability, constructs a model identical to the one a traditional (greedy) batch learner would produce
- With theoretical guarantees on the error rate
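VFDT's guarantee rests on the Hoeffding bound. A sketch of how such a bound can gate split decisions follows; the function names and the exact split rule are illustrative, not VFDT's actual code:

```python
import math

def hoeffding_bound(value_range, delta, n):
    # With probability 1 - delta, the true mean of a random variable with
    # range `value_range` differs from the mean of n observations by at
    # most this epsilon.
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(gain_best, gain_second, value_range, delta, n):
    # VFDT-style decision (sketch): split once the observed advantage of
    # the best attribute over the runner-up exceeds the Hoeffding epsilon,
    # so the choice would match the batch learner's with high probability.
    return gain_best - gain_second > hoeffding_bound(value_range, delta, n)
```

The bound shrinks as n grows, so a leaf that keeps accumulating examples eventually either splits or shows the candidate attributes are genuinely tied.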

SLIDE 12

VFDT / CVFDT

Concept-adapting Very Fast Decision Tree: CVFDT

G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. 2001

- Keeps its model consistent with a sliding window of examples
- Constructs "alternative branches" as preparation for changes
- If an alternative branch becomes more accurate, the tree branches are switched

SLIDE 13

Decision Trees: CVFDT

No theoretical guarantees on the error rate of CVFDT.

CVFDT parameters:
1. W: the example window size.
2. T0: number of examples used to check at each node whether the splitting attribute is still the best.
3. T1: number of examples used to build the alternate tree.
4. T2: number of examples used to test the accuracy of the alternate tree.

SLIDE 14

Decision Trees: ADWIN-DT

ADWIN-DT's improvements:
- replace frequency-statistics counters by estimators
- no window of stored examples is needed, since the required statistics are maintained by the estimators
- change the way alternate subtrees are checked for substitution, using a change detector with theoretical guarantees

Summary:
1. Theoretical guarantees
2. No parameters

SLIDE 15

Time Change Detectors and Predictors: A General Framework

[Diagram: the input x_t feeds an Estimator, which outputs an Estimation.]

SLIDE 16

Time Change Detectors and Predictors: A General Framework (cont.)

[Diagram: x_t feeds the Estimator; a Change Detector watches the Estimation and outputs an Alarm.]

SLIDE 17

Time Change Detectors and Predictors: A General Framework (cont.)

[Diagram: the framework adds a Memory module connected to both the Estimator and the Change Detector.]
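The diagrams can be read as a small object design. A minimal sketch follows; the EWMA estimator and the fixed-threshold detector are illustrative stand-ins, not the components the paper uses:

```python
# General framework sketch: input x_t feeds an Estimator; a Change Detector
# watches the estimation and raises an alarm; a Memory module (omitted here)
# could hold recent data for the estimator. EWMA and the threshold rule are
# placeholder choices for illustration only.

class EWMAEstimator:
    def __init__(self, alpha=0.1):
        self.alpha, self.estimation = alpha, 0.0

    def update(self, x):
        # Exponentially weighted moving average of the input.
        self.estimation += self.alpha * (x - self.estimation)
        return self.estimation

class DriftDetector:
    def __init__(self, threshold=0.3):
        self.threshold, self.baseline = threshold, None

    def detect(self, estimation):
        # Alarm when the estimation drifts far from the frozen baseline,
        # then re-anchor the baseline at the new level.
        if self.baseline is None:
            self.baseline = estimation
        if abs(estimation - self.baseline) > self.threshold:
            self.baseline = estimation
            return True   # raise an alarm
        return False
```

Feeding a stream of 0s followed by 1s through `EWMAEstimator` and `DriftDetector` raises an alarm shortly after the switch.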

SLIDE 18

Window Management Models

W = 101010110111111

- Equal & fixed-size subwindows: 1010 1011011 1111 [Kifer+ 04]
- Equal-size adjacent subwindows: 1010101 1011 1111 [Dasu+ 06]
- Total window against subwindow: 10101011011 1111 [Gama+ 04]
- ADWIN: all adjacent subwindows, e.g. 1 | 01010110111111

(In the talk's animation, the ADWIN split point steps through every adjacent pair of subwindows: 1|01010110111111, 10|1010110111111, ..., 10101011011111|1.)

SLIDE 32

Algorithm ADWIN

Example: W = 101010110111111, W0 = 1

ADWIN: ADAPTIVE WINDOWING ALGORITHM

1 Initialize window W
2 for each t > 0
3   do W ← W ∪ {xt} (i.e., add xt to the head of W)
4      repeat drop elements from the tail of W
5      until |μ̂_W0 − μ̂_W1| < εc holds
6      for every split of W into W = W0 · W1
7      output μ̂_W
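A direct, deliberately inefficient transcription of this pseudocode. The linear scan over all splits and the Hoeffding-style cut threshold with a union bound over splits are a sketch of the idea; the actual ADWIN2 implementation replaces the scan with exponential histograms:

```python
import math

class SimpleADWIN:
    # Keep a window W; after each arrival, drop elements from the tail while
    # some split W = W0 . W1 shows |mean(W0) - mean(W1)| >= eps_cut.
    def __init__(self, delta=0.01):
        self.delta = delta
        self.window = []          # index 0 = tail (oldest), end = head (newest)

    def _eps_cut(self, n0, n1):
        # Hoeffding-style threshold; delta is shared over all splits.
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        dp = self.delta / len(self.window)
        return math.sqrt((1.0 / (2 * m)) * math.log(4.0 / dp))

    def add(self, x):
        self.window.append(x)     # add x_t to the head of W
        changed = False
        while self._must_shrink():
            self.window.pop(0)    # drop an element from the tail of W
            changed = True
        return changed

    def _must_shrink(self):
        w, total, head = self.window, sum(self.window), 0.0
        for i in range(1, len(w)):            # every split W = W0 . W1
            head += w[i - 1]
            n0, n1 = i, len(w) - i
            mu0, mu1 = head / n0, (total - head) / n1
            if abs(mu0 - mu1) >= self._eps_cut(n0, n1):
                return True
        return False

    def estimate(self):
        return sum(self.window) / len(self.window) if self.window else 0.0
```

On a stream of 0s followed by 1s, the window shrinks shortly after the change and the estimate tracks the new mean.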

SLIDES 33-43 (animation)

The example advances the split point: W0 = 1, 10, 101, ..., 10101011. When W0 = 101010110 and W1 = 111111, |μ̂_W0 − μ̂_W1| ≥ εc holds: CHANGE DETECTED. Elements are then dropped from the tail of W, leaving W = 01010110111111.

SLIDE 44

Algorithm ADWIN [BG07]

ADWIN has rigorous guarantees (theorems):
- on the ratio of false positives
- on the ratio of false negatives
- on the relation between the size of the current window and the change rate

Other methods in the literature ([Gama+ 04], [Widmer+ 96], [Last 02]) do not provide rigorous guarantees.

SLIDE 45

Data Streams: Algorithm ADWIN2 [BG07]

ADWIN2, using a data-stream sliding-window model:
- can provide the exact counts of 1's in O(1) time per point
- tries O(log W) cutpoints
- uses O((1/ε) log W) memory words
- processing time per example is O(log W) (amortized and worst-case)

Essentially the same guarantees as ADWIN (up to a multiplicative O(. . .) factor depending on ε).

SLIDE 46

Decision Trees: ADWIN-DT

ADWIN-DT's improvements:
- replace frequency-statistics counters by ADWIN
- no window of stored examples is needed, since the required statistics are maintained by ADWINs
- change the way alternate subtrees are checked for substitution, using ADWIN as the change detector

Summary:
1. Theoretical guarantees
2. No parameters needed
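The first improvement, replacing counters with adaptive estimators, can be sketched as follows. The EWMA below is a simple stand-in for ADWIN, used only to show how per-leaf class counters become drift-adaptive; the class names are illustrative:

```python
# Sketch: instead of raw frequency counters at a leaf, keep one adaptive
# estimator per class. Recent data then dominates the class distribution,
# and no window of examples needs to be stored. EWMA stands in for ADWIN.

class EWMA:
    def __init__(self, alpha=0.01):
        self.alpha, self.value = alpha, 0.0

    def update(self, hit):
        self.value += self.alpha * ((1.0 if hit else 0.0) - self.value)

class AdaptiveLeaf:
    """Estimated class frequencies at a leaf, maintained by estimators."""
    def __init__(self, classes, alpha=0.01):
        self.freq = {c: EWMA(alpha) for c in classes}

    def update(self, label):
        # Each arriving example nudges its class estimate up, others down.
        for c, est in self.freq.items():
            est.update(c == label)

    def distribution(self):
        total = sum(e.value for e in self.freq.values()) or 1.0
        return {c: e.value / total for c, e in self.freq.items()}
```

If the label stream shifts from one class to another, the leaf's distribution follows the shift without any window parameter.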

SLIDE 47

Experiments

[Figure: error rate (%) vs. number of examples, comparing ADWIN-DT and CVFDT.]

Figure: Learning curve of SEA Concepts using continuous attributes

SLIDE 48

Experiments

[Figure: memory (MB) used by ADWIN-DT Det, CVFDT (w = 1,000 / 10,000 / 100,000), ADWIN-DT Est, and ADWIN-DT Det+Est.]

Figure: Memory used on SEA Concepts experiments

SLIDE 49

Experiments

[Figure: running time (sec) for ADWIN-DT Det, CVFDT (w = 1,000 / 10,000 / 100,000), ADWIN-DT Est, and ADWIN-DT Det+Est.]

Figure: Time on SEA Concepts experiments

SLIDE 50

Experiments

[Figure: on-line error (10%-22%) vs. CVFDT window width (1,000-25,000), comparing CVFDT and ADWIN-DT.]

Figure: On-line error on UCI Adult dataset, ordered by the education attribute.

SLIDE 51

Conclusions

ADWIN-DT's improvements:
- replace frequency-statistics counters by ADWIN
- no window of stored examples is needed, since the required statistics are maintained by ADWINs
- change the way alternate subtrees are checked for substitution, using ADWIN as the change detector

Summary:
1. Theoretical guarantees
2. No parameters needed
3. Higher accuracy
4. Less space needed