 
Adaptive Learning and Mining for Data Streams and Frequent Patterns
Albert Bifet
Laboratory for Relational Algorithmics, Complexity and Learning (LARCA)
Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya
Ph.D. dissertation, 24 April 2009
Advisors: Ricard Gavaldà and José L. Balcázar
Future Data Mining
- Structured data
- Find interesting patterns
- Predictions
- On-line processing
Mining Evolving Massive Structured Data
The basic problem: finding interesting structure in data
- Mining massive data
- Mining time-varying data
- Mining in real time
- Mining structured data
[Image: The Disintegration of the Persistence of Memory, 1952-54, Salvador Dalí]
Data Streams
- Sequence is potentially infinite
- High amount of data: sublinear space
- High speed of arrival: sublinear time per example
- Once an element from a data stream has been processed, it is discarded or archived
Approximation algorithms
- Small error rate with high probability
- An algorithm $(\varepsilon, \delta)$-approximates $F$ if it outputs $\tilde{F}$ for which $\Pr[\,|\tilde{F} - F| > \varepsilon F\,] < \delta$.
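To make the $(\varepsilon, \delta)$ definition concrete, here is a minimal Python sketch that estimates the frequency $F$ of 1s in a Bernoulli stream from a bounded prefix and measures how often the relative error exceeds $\varepsilon F$. The function names, the Bernoulli stream, and the Chernoff-style sample size are illustrative assumptions, not part of the slides.

```python
import math
import random


def sample_size(eps, delta, p_min):
    """Chernoff-style sample size (an assumption of this sketch):
    n >= 3 ln(2/delta) / (eps^2 p) gives relative error eps w.p. >= 1 - delta."""
    return math.ceil(3.0 * math.log(2.0 / delta) / (eps * eps * p_min))


def bernoulli_stream(p, rng):
    """Hypothetical endless stream of 0/1 items with Pr[1] = p."""
    while True:
        yield 1 if rng.random() < p else 0


def estimate_frequency(stream, n):
    """Estimate the fraction of 1s from the first n items using two counters."""
    ones = 0
    seen = 0
    for x in stream:
        ones += x
        seen += 1
        if seen == n:
            break
    return ones / seen


def empirical_failure_rate(p, eps, delta, trials, seed=0):
    """Fraction of trials in which |F~ - F| > eps * F."""
    rng = random.Random(seed)
    n = sample_size(eps, delta, p)
    failures = 0
    for _ in range(trials):
        est = estimate_frequency(bernoulli_stream(p, rng), n)
        if abs(est - p) > eps * p:
            failures += 1
    return failures / trials
```

If the measured failure rate stays below $\delta$, the estimator behaves as an $(\varepsilon, \delta)$-approximation on this stream; the memory used is constant regardless of the stream length.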
Tree Pattern Mining
Given a dataset of trees, find the complete set of frequent subtrees.
- Frequent Tree Pattern (FT): include all the trees whose support is no less than min_sup
- Closed Frequent Tree Pattern (CT): include no tree which has a super-tree with the same support
- CT ⊆ FT
"Trees are sanctuaries. Whoever knows how to listen to them, can learn the truth." (Hermann Hesse)
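The relation between FT and CT can be sketched in a few lines of Python. As a loud simplification, this sketch uses item sets with the subset relation as a stand-in for trees with the subtree relation (real subtree testing is considerably more involved); all names here are hypothetical.

```python
from itertools import combinations


def support(pattern, dataset):
    """Number of transactions that contain the pattern."""
    return sum(1 for t in dataset if pattern <= t)


def frequent_patterns(dataset, min_sup):
    """FT stand-in: every non-empty pattern with support >= min_sup (brute force)."""
    items = sorted(set().union(*dataset))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            p = frozenset(combo)
            s = support(p, dataset)
            if s >= min_sup:
                result[p] = s
    return result


def closed_patterns(frequent):
    """CT stand-in: frequent patterns with no proper super-pattern of equal support."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q and s == sq for q, sq in frequent.items())
    }
```

Because a super-pattern with the same support is itself frequent, checking closedness inside the frequent set is enough, and the result is a subset of it by construction, which mirrors CT ⊆ FT.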
Outline
Mining Evolving Data Streams
  1 Framework
  2 ADWIN
  3 Classifiers
  4 MOA
  5 ASHT
Tree Mining
  6 Closure Operator on Trees
  7 Unlabeled Tree Mining Methods
  8 Deterministic Association Rules
  9 Implicit Rules
Mining Evolving Tree Data Streams
  10 Incremental Method
  11 Sliding Window Method
  12 Adaptive Method
  13 Logarithmic Relaxed Support
  14 XML Classification
Outline
  1 Introduction
  2 Mining Evolving Data Streams
  3 Tree Mining
  4 Mining Evolving Tree Data Streams
  5 Conclusions
Data Mining Algorithms with Concept Drift
[Figure, two builds: a DM algorithm mapping input to output, shown without and with concept drift. First build: a static model (no concept drift) versus Counters 1-5 monitored by a change detector (concept drift). Second build: Counters 1-5 (no concept drift) versus Estimators 1-5 (concept drift).]
Time Change Detectors and Predictors
(1) General Framework
Problem: given an input sequence $x_1, x_2, \ldots, x_t, \ldots$, we want to output at instant $t$
- a prediction $\hat{x}_{t+1}$ minimizing the prediction error $|\hat{x}_{t+1} - x_{t+1}|$
- an alert if change is detected
considering that the distribution changes over time.
[Figure, built up over three slides: the input $x_t$ feeds an Estimator, which outputs the Estimation; a Change Detector, which outputs the Alarm; and a Memory module linked to both.]
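The three modules of the framework can be sketched as a small Python class. This is only an illustration under assumptions of our own: the memory is a fixed-size sliding window, the estimator is the window mean, and the change detector compares the older and newer halves of the window against a fixed threshold. The class name, `size`, and `threshold` are hypothetical and do not come from the slides.

```python
from collections import deque


class WindowedPredictor:
    """Toy estimator + change detector + memory (illustrative assumptions only)."""

    def __init__(self, size=10, threshold=0.5):
        self.size = size
        self.threshold = threshold
        self.memory = deque(maxlen=size)  # Memory: the most recent items

    def estimate(self):
        """Estimator: the mean of the memory, used as the prediction for x_{t+1}."""
        return sum(self.memory) / len(self.memory)

    def update(self, x):
        """Feed x_t; return (estimation, alarm)."""
        self.memory.append(x)
        items = list(self.memory)
        half = len(items) // 2
        alarm = False
        if half >= 1:
            older, newer = items[:half], items[half:]
            gap = abs(sum(newer) / len(newer) - sum(older) / len(older))
            if gap > self.threshold:
                # Change detector fires: forget the older half of the memory.
                alarm = True
                self.memory = deque(newer, maxlen=self.size)
        return self.estimate(), alarm
```

The fixed `size` and `threshold` are exactly the kind of parameters an ideal detector would avoid; they are kept here only to keep the sketch short.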
Optimal Change Detector and Predictor
(1) General Framework
- High accuracy
- Fast detection of change
- Low false positive and false negative ratios
- Low computational cost: minimum space and time needed
- Theoretical guarantees
- No parameters needed
- Estimator with Memory and Change Detector
Algorithm ADaptive Sliding WINdow
(2) ADWIN

ADWIN: ADAPTIVE WINDOWING ALGORITHM
  Initialize window W
  for each t > 0
    do W ← W ∪ {x_t} (i.e., add x_t to the head of W)
       repeat drop elements from the tail of W
       until $|\hat{\mu}_{W_0} - \hat{\mu}_{W_1}| < \varepsilon_c$ holds
             for every split of W into W = W_0 · W_1
       output $\hat{\mu}_W$

Example
W = 101010110111111
Splits W = W_0 · W_1 examined in turn:
  W_0 = 1           W_1 = 01010110111111
  W_0 = 10          W_1 = 1010110111111
  W_0 = 101         W_1 = 010110111111
  W_0 = 1010        W_1 = 10110111111
  W_0 = 10101       W_1 = 0110111111
  W_0 = 101010      W_1 = 110111111
  W_0 = 1010101     W_1 = 10111111
  W_0 = 10101011    W_1 = 0111111
  W_0 = 101010110   W_1 = 111111      $|\hat{\mu}_{W_0} - \hat{\mu}_{W_1}| \geq \varepsilon_c$: CHANGE DETECTED!
Drop elements from the tail of W:
W = 01010110111111
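The pseudocode can be turned into a naive Python sketch that stores the whole window and tests every split on each arrival. The slide does not define $\varepsilon_c$; as an assumption, the sketch uses a Hoeffding-style threshold, $\varepsilon_c = \sqrt{\frac{1}{2m}\ln\frac{4}{\delta'}}$ with $m = 1/(1/n_0 + 1/n_1)$ and $\delta' = \delta/n$. The class and method names are hypothetical, and with this threshold a window as short as the 15-bit example above would not necessarily trigger a cut.

```python
import math
from collections import deque


def eps_cut(n0, n1, delta):
    """Hoeffding-style threshold (assumed; the slide leaves eps_c undefined)."""
    n = n0 + n1
    m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-mean term of the two sizes
    delta_prime = delta / n          # confidence shared among the n splits
    return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))


class NaiveAdwin:
    """Naive adaptive window: O(|W|) memory, every split checked exhaustively."""

    def __init__(self, delta=0.01):
        self.delta = delta
        self.window = deque()  # left end = tail (oldest), right end = head (newest)

    def _some_split_differs(self):
        """True if some split W = W0 . W1 has means at least eps_cut apart."""
        n = len(self.window)
        total = sum(self.window)
        s0 = 0.0
        for n0, x in enumerate(self.window, start=1):
            if n0 == n:
                break  # W1 would be empty
            s0 += x
            n1 = n - n0
            if abs(s0 / n0 - (total - s0) / n1) >= eps_cut(n0, n1, self.delta):
                return True
        return False

    def update(self, x):
        """Add x at the head, shrink from the tail while a split differs.
        Returns True if at least one element was dropped (change detected)."""
        self.window.append(x)
        changed = False
        while len(self.window) > 1 and self._some_split_differs():
            self.window.popleft()
            changed = True
        return changed

    def mean(self):
        """Current estimate: the mean of the window."""
        return sum(self.window) / len(self.window)
```

Feeding a run of 0s followed by a run of 1s should raise alarms shortly after the switch and leave a window made up almost entirely of 1s. Storing the full window keeps the sketch close to the pseudocode at the cost of linear memory and time per check.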