
Discovering Interesting Patterns Through User's Interactive Feedback - PowerPoint PPT Presentation



  1. Title and Outline

Discovering Interesting Patterns Through User's Interactive Feedback
"Well begun is half done." - Aristotle
Authors: Dong Xin, Xuehua Shen, Qiaozhu Mei, Jiawei Han
Presented by: Jeff Boisvert, April 11, 2007
This paper was presented at KDD '06

Outline
• Introduction and Background
• The Algorithm
• Examples
• Conclusions/Future Work
• Critique of Paper

Introduction and Background
• Motivation
– Discover 'interesting' patterns in data
– 'Interestingness' is subjective: it depends on the user
– There are often too many patterns to assess manually
• Setting
– Assume an available set of candidate patterns (frequent item sets, etc.)
– Have the user rank a subset of the candidate patterns
– Learn from the user's ranking
– Have the user rank more patterns, learn again, and so on
• SVM
– "I think we have been presented with this enough" (used later as a black box)
• Clustering
– k clusters: minimize the maximum distance of each pattern to the nearest sample in a cluster
• Distance measure
– Jaccard distance between two patterns (a sketch follows below):
  D(P1, P2) = 1 - |T(P1) ∩ T(P2)| / |T(P1) ∪ T(P2)|
  where T(P) is the set of transactions containing pattern P
• Ranking
– Linear: e.g. 2 < 3 (difference in ranking is 3 - 2 = 1)
– Log-linear: e.g. log(2) < log(3) (difference in ranking is 0.176)
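A minimal sketch of the Jaccard distance above, assuming each pattern is represented by the set of IDs of the transactions that support it (the function name is illustrative, not from the paper):

```python
def jaccard_distance(t1: set, t2: set) -> float:
    """Jaccard distance between two patterns, given their supporting
    transaction sets T(P1) and T(P2)."""
    if not t1 and not t2:
        return 0.0  # convention: two unsupported patterns are identical
    inter = len(t1 & t2)
    union = len(t1 | t2)
    return 1.0 - inter / union

# Example: two patterns whose transaction sets overlap in 2 of 6 transactions.
print(jaccard_distance({1, 2, 3, 4}, {3, 4, 5, 6}))  # 1 - 2/6 = 0.667
```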

  2. The Algorithm

"An algorithm must be seen to be believed." - Donald Knuth

• Overview (each iteration: cluster N patterns into k clusters → user ranks k patterns → refine model → re-rank all N patterns → N = aN)
1. Prune candidate patterns and micro-cluster
2. Cluster N patterns into k clusters
3. Present k patterns to the user for ranking
4. Refine the model with the new user rankings
5. Re-rank all N patterns with the new model
6. Reduce N = a*N
7. Go to step 2

• Areas to discuss
– (1) Preprocessing: pruning and micro-clustering (clustering: see introduction)
– (2) Selecting the k patterns to present to the user
– (3) Modeling the user's knowledge/ranking ***

The Algorithm (Preprocessing)
• Pruning
– Get representative patterns from the candidates
– Start with the maximal patterns and merge candidates into them (representative pattern = maximal)
– Discard the merged patterns; keep the micro-clusters (maximals)
• Micro-clustering
– Two patterns are merged if D(P1, P2) < epsilon
– D is the Jaccard distance
– Epsilon is provided by the user (e.g. 0.1)

The Algorithm (k patterns)
• Clustering patterns
– We really have N micro-clusters, but which k patterns should be presented to the user?
• Selecting patterns
– Criterion 1: the presented patterns should not be redundant. Redundant patterns (same composition/frequency) often rank close to each other
– Criterion 2: the selection should help refine the model of the user's knowledge of interesting patterns (not uninteresting patterns)
• Method [Gonzalez, 1985, Clustering to minimize the maximum intercluster distance] (sketched below; figure credit: Zaiane, COMPUT 695 notes)
– Randomly select the first pattern
– Second pattern: maximum distance from the first pattern
– Third pattern: maximum distance to the nearer of the first and second patterns
– ...
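A minimal sketch of the Gonzalez farthest-point selection described above, assuming a pairwise distance function such as the Jaccard sketch earlier (names are illustrative, not from the paper):

```python
import random

def select_k_patterns(patterns, k, dist):
    """Greedy farthest-point selection (Gonzalez, 1985): pick a random seed,
    then repeatedly pick the pattern farthest from everything chosen so far."""
    chosen = [random.choice(patterns)]
    # min_d[i] tracks pattern i's distance to its nearest chosen pattern
    min_d = [dist(p, chosen[0]) for p in patterns]
    while len(chosen) < k:
        idx = max(range(len(patterns)), key=lambda i: min_d[i])
        chosen.append(patterns[idx])
        for i, p in enumerate(patterns):
            min_d[i] = min(min_d[i], dist(p, patterns[idx]))
    return chosen
```

Each newly selected pattern is at least as far from the previous picks as any unchosen pattern, which is what keeps the k presented patterns non-redundant (Criterion 1).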

  3. The Algorithm (refine model 1)

• How to model the user's knowledge?
– So far the user has ranked only k of the N patterns...
• Interestingness
– The difference between a pattern's observed frequency f_o(P) and its expected frequency f_e(P)
– f_o(P) is observed from the input data
– f_e(P) is calculated from the model of the user's knowledge: f_e(P) = M(P, θ)
– If f_o(P) and f_e(P) differ, the pattern is interesting
• Ranking
– If the user ranks P_i as more interesting than P_j:
  R[f_o(P_i), f_e(P_i)] > R[f_o(P_j), f_e(P_j)]
– This is a constraint on the model optimization
– Log-linear model: R[f_o(P), f_e(P)] = log f_o(P) - log f_e(P)

The Algorithm (refine model 2) *** main contribution of the paper
• Log-Linear Model
– For a pattern P over a data set of s items, f_e(P) is given by
  log f_e(P) = u_0 + Σ_{j=1..s} u_j x_j,  where x_j = 1 if item j occurs in P
– Recall the user's ordering of patterns as a constraint:
  log f_o(P_1) - log f_e(P_1) > log f_o(P_2) - log f_e(P_2)
– Define a weight vector and a new representation of the constraint above:
  w = [c, u_0, u_1, ..., u_s],  v(P) = [log f_o(P), -1, -x_1, ..., -x_s]
  so that the constraint becomes w^T v(P_1) > w^T v(P_2)
– The user's ranking of k patterns yields k constraints

The Algorithm (re-rank all N patterns)
• Log-Linear Model (cont.)
– Feed the constraints w^T v(P_i) > w^T v(P_j) to an SVM, used as a black box, to learn w (see the sketch after this slide)
– All N patterns can now be ranked with the interestingness measure R[f_o(P), f_e(P)] = K[v(P), w]
• Biased belief model
– Not presented; identical formulation to the log-linear model, but a user belief probability is assigned to each transaction:
  w = [p_1, ..., p_m],  v(P) = (1/f_o(P)) [x_1(P), ..., x_m(P)]
  where m = number of transactions and x_k(P) = 1 if transaction k contains P

The Algorithm (reduce N)
• Reduce the number of patterns
– Discard some patterns: N = aN, where a is specified by the user
– This reduces the number of patterns presented to the user at the end
– Stop when the maximum number of iterations (also user-specified) is reached
END OF ALGORITHM ☺
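A minimal sketch of the refinement step under the formulation above. The paper treats the SVM as a black box, so the concrete solver here is an assumption: pairwise ranking constraints are reduced to difference vectors and fed to a linear SVM, a standard RankSVM-style reduction. All names (feature_vector, refine_model) are illustrative, and scikit-learn is assumed available:

```python
import numpy as np
from sklearn.svm import LinearSVC

def feature_vector(log_fo, item_flags):
    """v(P) = [log f_o(P), -1, -x_1, ..., -x_s], per the slide's formulation."""
    return np.concatenate(([log_fo, -1.0], -np.asarray(item_flags, float)))

def refine_model(ranked_vs):
    """ranked_vs: list of v(P) vectors, most interesting first.
    Each adjacent pair gives a constraint w.v(P_i) > w.v(P_j), encoded
    as a difference vector; mirrored copies supply the second class."""
    diffs, labels = [], []
    for vi, vj in zip(ranked_vs, ranked_vs[1:]):
        diffs.append(vi - vj); labels.append(+1)
        diffs.append(vj - vi); labels.append(-1)
    svm = LinearSVC(fit_intercept=False, C=1.0).fit(np.array(diffs), labels)
    return svm.coef_.ravel()  # the learned weight vector w

# Re-rank all N patterns by the learned interestingness score w.v(P):
# scores = all_vs @ w; order = np.argsort(-scores)
```

The mirrored pairs exist only so the classifier sees two classes; the separating direction it learns is the ranking weight vector w.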

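With the algorithm fully described, a sketch of the whole interactive loop may help tie the pieces together. It reuses the hypothetical helpers sketched earlier (select_k_patterns, refine_model); the ask_user callback and all parameter names are illustrative, not from the paper:

```python
import numpy as np

def interactive_loop(patterns, vectors, dist, ask_user, k=5, a=0.9, niter=3):
    """patterns: candidate patterns after pruning/micro-clustering;
    vectors: dict mapping pattern -> v(P); ask_user: callback that returns
    the k presented patterns reordered most-interesting first."""
    for _ in range(niter):
        shown = select_k_patterns(patterns, k, dist)        # steps 2-3
        ranking = ask_user(shown)                            # user feedback
        w = refine_model([vectors[p] for p in ranking])      # step 4
        scores = {p: float(np.dot(w, vectors[p])) for p in patterns}  # step 5
        patterns = sorted(patterns, key=scores.get, reverse=True)
        # step 6: N = aN, keeping at least k so the next round can present k
        patterns = patterns[:max(k, int(a * len(patterns)))]
    return patterns
```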
  4. The Algorithm (summary)

"Few things are harder to put up with than the annoyance of a good example." - Mark Twain

• Overview
1. Pre-process: prune / micro-cluster
2. Cluster N patterns into k clusters, present them to the user
3. Refine the model with the new user rankings, re-rank the patterns
4. Reduce N = a*N
5. Stop when the maximum number of iterations is reached

• Input parameters
– a = shrinking ratio
– k = number of user feedback patterns
– niter = number of iterations (controls the number of patterns in the output)
– epsilon = micro-clustering parameter
– Model type: log-linear vs. biased belief
– Ranking type: linear vs. log

Examples 1 and 2 (worked example; the slides' full transaction and distance tables do not survive extraction)
– Start from 35 transactions; micro-clustering yields 19 micro-clusters
– Pick patterns #1, #2, ..., #k by farthest-point selection; if k = 2, present 2 patterns to the user
– Refine the log-linear model with the user's ranking; with the new f_e, use the SVM to rank all 19 patterns (the slide shows Jaccard distances such as 0.333, 0.5, and 0.667)
– Reduce N: sort by rank and keep the top aN; with a = 0.9, keep the top 17 (19 × 0.9)

• Their results on item sets
– Use data to simulate a person's prior knowledge
– Partition the data into 2 subsets: a background set (the user's prior) and an observed set
– Accuracy is measured by the top-k overlap between the background ranking and the learned ranking (sketched below):
  Accuracy = |top_background(k) ∩ top_learned(k)| / k
– Data set: 49,046 transactions, 2,113 items, average transaction length of 74
– The first 1,000 transactions are the observed set
– 8,234 closed frequent item sets; micro-clustering reduces these to 769
– Compare the top-k ranked patterns
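A minimal sketch of the accuracy measure above, assuming two rankings over the same pattern set (the inputs are illustrative):

```python
def top_k_accuracy(background_rank, learned_rank, k):
    """Fraction of the background model's top-k patterns that the
    learned model also places in its top k."""
    top_bg = set(background_rank[:k])
    top_learned = set(learned_rank[:k])
    return len(top_bg & top_learned) / k

# Example: 3 of the top-5 patterns agree -> accuracy 0.6
print(top_k_accuracy(["a", "b", "c", "d", "e"],
                     ["b", "x", "a", "y", "e"], 5))
```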

  5. Example 3

• Their results on sequences
– 1,609 sentences
– 967 closed sequential patterns
– Full feedback: use k = 967

Example 4
• Their results compared to other algorithms
– Same data as Example 3 (1,609 sentences)
– They claim theirs is better than Selective Sampling [Yu, KDD '05] and Top-N [Shen and Zhai, SIGIR '05]

Conclusions

"I would never die for my beliefs because I might be wrong." - Bertrand Russell

• Conclusions
– Interactive with the user
– Tries to learn the user's knowledge
– Flexible (but flexible = many parameters)
– Does not work well with sparse data
• Proposed future work
– Study different models for sparse data
– Better feedback strategies to maximize learning
– Apply to other data types/sets
