Profiling user belief in BI exploration for measuring subjective - PowerPoint PPT Presentation

Profiling user belief in BI exploration for measuring subjective interestingness
Alexandre Chanson, Ben Crulis, Krista Drushku, Nicolas Labroche, Patrick Marcel
DOLAP 2019 - 26 March 2019
University of Tours

What is Alice's best next move?
In fact, it depends!

A very subjective question?
We would need to “brain dump” analysts

What is subjective interestingness?
• Objective interestingness
  • user agnostic, based only on data
  • generality, reliability, peculiarity, diversity and conciseness
  • directly measurable evaluation metrics: support, confidence, lift or chi-squared measures in the case of association rules
  • summaries: compact descriptions of raw data at different concept levels (Geng & Hamilton)
• Subjective interestingness
  • characterizes the patterns' surprise and novelty when compared to previous user knowledge or expected data distribution
  • user adaptive exploration
  • subjective interestingness for explorative data mining

De Bie’s framework
• a pattern p ≈ restriction of data space
• a belief(p) ≈ prior knowledge as a probability distribution over the pattern space
• surprise(p) = −log(belief(p))
Interestingness(p) = surprise(p) / |p|
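The two formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the natural logarithm is an assumption (the slide does not fix the base), and |p| is read here as the description length of the pattern.

```python
import math


def surprise(belief_p: float) -> float:
    """Surprise of a pattern p: -log(belief(p)). Natural log is an assumption."""
    if not 0.0 < belief_p <= 1.0:
        raise ValueError("belief(p) must be a probability in (0, 1]")
    return -math.log(belief_p)


def interestingness(belief_p: float, description_length: int) -> float:
    """Interestingness(p) = surprise(p) / |p|, with |p| taken as description length."""
    if description_length <= 0:
        raise ValueError("|p| must be a positive integer")
    return surprise(belief_p) / description_length


# A pattern the user barely expects is more interesting than an expected one,
# and a longer description lowers the score for the same surprise.
unexpected = interestingness(0.05, description_length=2)
expected = interestingness(0.80, description_length=2)
```

Under these assumptions, a pattern with belief 1 has zero surprise, so it is never interesting whatever its description length.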

How to translate subjective interestingness to BI?
Two main problems:
• Define the “pattern”
  • Cell?
  • Query?
  • Query parts?
• Learn the belief function
  • how to take into account the specificities of BI?
  • how can we decide that two pieces of information are related in BI?
  • do we consider the usage (the query logs)?
  • do we consider the structure (the DB schema)?

Our proposal

Belief expressed over query parts
Classically, a query part is either:
• A group-by set attribute
• A measure
• A selection predicate
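One possible way to represent the three kinds of query parts listed above is shown below. It is only a sketch: the class names, the `query_parts` helper and the example attribute names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable


class PartKind(Enum):
    GROUP_BY = "group_by"      # a group-by set attribute
    MEASURE = "measure"        # an aggregated measure
    SELECTION = "selection"    # a selection predicate


@dataclass(frozen=True)
class QueryPart:
    kind: PartKind
    value: str


def query_parts(group_by: Iterable[str],
                measures: Iterable[str],
                selections: Iterable[str]) -> FrozenSet[QueryPart]:
    """Decompose a query, given by its three components, into a set of query parts."""
    parts = [QueryPart(PartKind.GROUP_BY, g) for g in group_by]
    parts += [QueryPart(PartKind.MEASURE, m) for m in measures]
    parts += [QueryPart(PartKind.SELECTION, s) for s in selections]
    return frozenset(parts)


# Hypothetical query: total sales by year and country, restricted to France.
parts = query_parts(["year", "country"], ["sum(sales)"], ["country = 'France'"])
```

Making `QueryPart` frozen keeps it hashable, so query parts can serve directly as graph vertices or dictionary keys when a belief value is attached to each of them.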

Query parts as patterns
Figure 1: Query as a restriction of the data space

Our recipe so far
Figure 2: Caption
What ingredients do we want to use, knowing that the question is then: what is the probability that someone

Random walk for learning the distribution
• consider a graph where vertices are query parts and edges are relations (precedence, co-occurrence) between them
• the user does a random walk over this graph
• the long-term distribution of the user gives a measure of importance of the query parts
• it can be computed with PageRank
• or better, with a Topic-Specific PageRank: a PageRank where the user’s query parts are more important than the others
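The random walk described above can be sketched as a power iteration in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the function name, the damping factor of 0.85, the fixed iteration count, the uniform teleport over the user's query parts and the toy graph are all assumptions made for the example.

```python
def topic_specific_pagerank(edges, user_parts, damping=0.85, iterations=100):
    """Stationary distribution of a random walk that teleports only to user_parts.

    edges: iterable of (source, target) pairs between query parts.
    user_parts: query parts found in the user's log (the "topic").
    """
    user_parts = set(user_parts)
    nodes = sorted({n for edge in edges for n in edge} | user_parts)
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Teleport vector: uniform over the user's parts (over all nodes if empty).
    topic = user_parts or set(nodes)
    teleport = {n: (1.0 / len(topic) if n in topic else 0.0) for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            if out_links[n]:
                share = damping * rank[n] / len(out_links[n])
                for dst in out_links[n]:
                    new_rank[dst] += share
            else:
                # Dangling node: send its mass back through the teleport vector.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * teleport[m]
        rank = new_rank
    return rank


# Toy graph over three hypothetical query parts forming a cycle.
edges = [("year", "sum(sales)"), ("sum(sales)", "country"), ("country", "year")]
belief = topic_specific_pagerank(edges, user_parts={"year"})
```

The resulting values sum to 1, so they can be read as a belief distribution over query parts. In this toy graph the part used by the user ("year") receives the largest share, which is the bias the topic-specific variant is meant to introduce.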

Baking the pie

Experiments

Our “Users”
• Artificial data generated with CubeLoad [1]
• mimic prototypical explorations
• More “consistent” than real users
• Less noisy
• Only 4 profiles
Figure 3: CubeLoad templates

Protocol of the qualitative experiment
• determine if there is a belief profile that is representative of each CubeLoad template

Different users, different beliefs

Protocol of the quantitative experiment
Introducing a user-agnostic recommender in the loop
Robustness to logs exploring different regions (of the cube)

Observing a cognitive bubble
Average Hellinger distance values on 10 runs when log files are identical
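For reference, the Hellinger distance between two discrete belief distributions P and Q is H(P, Q) = sqrt(Σ (√p_i − √q_i)²) / √2. The sketch below computes it for distributions stored as dictionaries over query parts. The function name and the dictionary representation are assumptions made for illustration, and query parts missing from one distribution are treated as having probability 0.

```python
import math


def hellinger(p: dict, q: dict) -> float:
    """Hellinger distance between two discrete distributions given as dicts."""
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(squared / 2.0)


# Hypothetical belief distributions over three query parts.
before = {"year": 0.5, "country": 0.3, "sum(sales)": 0.2}
after = {"year": 0.6, "country": 0.3, "sum(sales)": 0.1}
distance = hellinger(before, after)
```

The distance is 0 for identical distributions and 1 for distributions with disjoint support, which makes values from different runs directly comparable.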

Conclusions
• First attempt to model belief in BI
  • Capture potential relations between user knowledge as a graph
  • ⇒ use well-known PageRank for estimating probabilities
• Experiments
  • Different simulated user templates == different belief distributions
  • Possible detection of the cognitive bubble phenomenon

On-going and Future work
• What about belief distribution over cell contents?
  • theoretically appealing but computationally painful...
  • (but we’re on it)
• What about belief evolution along the exploration?
• Subjective interestingness is a trade-off between surprise and complexity of description
  • how to measure complexity of description in BI?
• How to validate a user “brain dump”?
• Perform a user study based on an improved query recommender system with interestingness

Long term vision

Questions?

References
[1] S. Rizzi and E. Gallinucci. CubeLoad: A parametric generator of realistic OLAP workloads. In Advanced Information Systems Engineering - 26th International Conference, CAiSE 2014, Thessaloniki, Greece, June 16-20, 2014. Proceedings, pages 610–624, 2014.
