

SLIDE 1

ADBIS’2011

Mining Preferences from OLAP Query Logs for Proactive Personalization

Julien Aligon (1), Matteo Golfarelli (2), Patrick Marcel (1), Stefano Rizzi (2), Elisa Turricchia (2)

(1) Université François Rabelais Tours, Laboratoire Informatique, France

(2) University of Bologna, DEIS, Italy

Session 3.A – September 26th 2011

SLIDE 2

Motivation

MDX query

SLIDE 3

Motivation

MDX query
MDX query with PREFERRING clause (myMDX)

[TKDE 2011] [ICDE 2011]

  • Formulation effort
  • Prescriptiveness
  • Proactiveness
  • Expressiveness

SLIDE 4

Motivation

MDX query
MDX query with PREFERRING clause (myMDX)

[TKDE 2011] [ICDE 2011]

  • Formulation effort: profile inferred from the context and/or past actions
  • Prescriptiveness
  • Proactiveness
  • Expressiveness

SLIDE 5

Motivation

MDX query
MDX query with PREFERRING clause (myMDX)

[TKDE 2011] [ICDE 2011]

  • Formulation effort
  • Prescriptiveness: facts are ordered according to preferences
  • Proactiveness
  • Expressiveness

SLIDE 6

Motivation

MDX query
MDX query with PREFERRING clause (myMDX)

[TKDE 2011] [ICDE 2011]

  • Formulation effort
  • Prescriptiveness
  • Proactiveness: anticipate the user's preference query
  • Expressiveness

SLIDE 7

Motivation

MDX query
MDX query with PREFERRING clause (myMDX)

[TKDE 2011] [ICDE 2011]

  • Formulation effort
  • Prescriptiveness
  • Proactiveness
  • Expressiveness: use of a rich language for expressing preferences

SLIDE 8

Motivation

MDX query
MDX query with PREFERRING clause (myMDX)

  • Formulation effort
  • Prescriptiveness
  • Proactiveness
  • Expressiveness

SLIDE 9

Proposition

SLIDE 10

Issue #1 How to model the query log?

SLIDE 11

Issue #1 How to model the query log?

  • A query is a set of fragments (Qf-set)

MDX query:

SELECT AvgIncome ON COLUMNS,
  Crossjoin(OCCUPATION.members,
    Crossjoin(Descendants(RACE.AllRaces, RACE.Mrn),
      Descendants(RESIDENCE.AllCities, RESIDENCE.Region))) ON ROWS
FROM CENSUS
WHERE TIME.Year.[2009]

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

  • A log is a set of Qf-sets
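
A minimal Python sketch of this model, assuming fragments can be represented as tagged strings. The `Fragment` class and the `qf_set` helper are hypothetical names introduced for illustration; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """One query fragment (hypothetical representation)."""
    kind: str   # "measure", "level" or "selection"
    value: str


def qf_set(measures, levels, selections):
    """Build the Qf-set of a query as a frozenset of fragments."""
    return frozenset(
        [Fragment("measure", m) for m in measures]
        + [Fragment("level", l) for l in levels]
        + [Fragment("selection", s) for s in selections]
    )


# Qf-set of the CENSUS query shown on this slide.
census_query = qf_set(
    measures=["AvgIncome"],
    levels=["AllCities", "Region", "AllRaces", "Mrn", "Occ", "Year", "AllSex"],
    selections=["Year=2009"],
)

# A log is a collection of Qf-sets, one per logged query.
log = [census_query]
```

Using frozensets keeps each Qf-set hashable and makes the later subset tests (is a rule's context contained in the query?) a single `<=` comparison.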

SLIDE 12

Issue #2 What preferences can be extracted from the log?

SLIDE 13

Issue #2 What candidate preferences can be extracted from the log?

What candidate preferences to extract?

  • Rules of the form: context → candidate preference
  • The context is a Qf-set (part of a query); the candidate preference is a single fragment

How to extract these candidate preferences?

  • Off-line extraction of association rules, using a classical algorithm (e.g., Apriori)
  • Confidence and support thresholds are adjusted automatically, so that the set of extracted rules covers the whole log
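
The extraction loop can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: a brute-force enumeration stands in for Apriori, fragments are plain strings, "covers the log" is taken to mean that every logged query supports at least one rule, and the function names, the starting threshold, the step and the floor are all hypothetical.

```python
from itertools import combinations


def mine_rules(log, min_support, min_confidence, max_context=2):
    """Return rules (context, fragment, confidence) with a single-fragment head."""
    n = len(log)

    def support(items):
        return sum(1 for q in log if items <= q) / n

    fragments = sorted(set().union(*log))
    rules = []
    for size in range(1, max_context + 1):
        for context in combinations(fragments, size):
            ctx = frozenset(context)
            s_ctx = support(ctx)
            if s_ctx < min_support:
                continue
            for f in fragments:
                if f in ctx:
                    continue
                s_rule = support(ctx | {f})
                if s_rule >= min_support and s_rule / s_ctx >= min_confidence:
                    rules.append((ctx, f, s_rule / s_ctx))
    return rules


def covers(rules, log):
    """Assumed coverage test: each query supports at least one rule."""
    return all(any(ctx | {f} <= q for ctx, f, _ in rules) for q in log)


def mine_with_auto_thresholds(log, start=0.9, step=0.1, floor=0.1):
    """Lower both thresholds together until the rules cover the log."""
    support = confidence = start
    while True:
        rules = mine_rules(log, support, confidence)
        if covers(rules, log) or support <= floor:
            return rules, support, confidence
        support = round(max(floor, support - step), 2)
        confidence = round(max(floor, confidence - step), 2)


toy_log = [
    frozenset({"Year=2009", "Region", "AvgIncome"}),
    frozenset({"Year=2009", "Region"}),
    frozenset({"Year=2009", "Mrn"}),
]
toy_rules, used_support, used_confidence = mine_with_auto_thresholds(toy_log)
```

In practice a real Apriori implementation (the deck lists Weka among its tools) would replace `mine_rules`; only the outer threshold loop is specific to the idea on this slide.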

SLIDE 14

Issue #2 What candidate preferences can be extracted from the log?

SLIDE 15

Issue #3 What preferences are relevant for the current query?

SLIDE 16

Issue #3 What preferences are relevant for the current query?

How are candidate preferences from the log found relevant?

  • By matching the rules of the log with the fragments of the user’s query q

Not every rule is relevant for the user’s query:

  • Pertinent rule: the context is in the Qf-set of the query
  • Effective rule: the candidate preference is in the Qf-set of the query and makes it possible to order the facts
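
The two tests can be sketched in Python as below. The encoding is an assumption made for illustration: a rule is a `(context, preference)` pair, each fact is described by the set of fragments it satisfies, and "makes it possible to order the facts" is read as "some facts satisfy the preferred fragment and some do not". The function names are hypothetical.

```python
def is_pertinent(rule, query_fragments):
    """A rule is pertinent when its whole context appears in the query's Qf-set."""
    context, _preference = rule
    return context <= query_fragments


def is_effective(rule, query_fragments, facts):
    """A rule is effective when its head is in the Qf-set and separates the facts."""
    _context, preference = rule
    if preference not in query_fragments:
        return False
    matches = [preference in fact for fact in facts]
    # If every fact (or no fact) satisfies the fragment, no ordering results.
    return any(matches) and not all(matches)


query = frozenset({
    "AvgIncome", "AllCities", "Region", "AllRaces", "Mrn",
    "Occ", "Year", "AllSex", "Year=2009",
})
# Fragments satisfied by two facts of the answer (assumed encoding).
facts = [
    frozenset({"AllCities", "AllRaces", "Occ", "Year", "AllSex", "Year=2009"}),
    frozenset({"Region", "AllRaces", "Occ", "Year", "AllSex", "Year=2009"}),
]

rule_a = (frozenset({"AllSex"}), "Year=2009")   # head true for every fact
rule_b = (frozenset({"Year=2009"}), "Region")   # head true for one fact only
```

With this encoding, `rule_a` is pertinent but not effective, while `rule_b` is both pertinent and effective.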

SLIDE 17

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

Answer to the query:

(AllCities, AllRaces, Actors, 2009, AllSex, 15000)
(Pacific, AllRaces, Actors, 2009, AllSex, 20000)

SLIDE 18

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

Non-effective rule: AllSex → Year=2009

Answer to the query:

(AllCities, AllRaces, Actors, 2009, AllSex, 15000)
(Pacific, AllRaces, Actors, 2009, AllSex, 20000)

NO PREFERENCE

SLIDE 19

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

Pertinent and effective rule: Year=2009 → Region

Answer to the query:

(Pacific, AllRaces, Actors, 2009, AllSex, 20000) PREFERRED
(AllCities, AllRaces, Actors, 2009, AllSex, 15000)

SLIDE 20

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

Extracted rules of the log: [table of extracted rules]

SLIDE 21

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

1st step: remove the non-pertinent rules (r1) and the non-effective rules (r5, r7)

SLIDE 22

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

Number of preferences to add to the query: α = 2

2nd step: group by candidate preference

  • Region: 0.70
  • AllCities: 0.60
  • AvgIncome ∈ [500, 1000]: 0.55
  • Mrn: 0.45
  • Year: 0.40
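
A short Python sketch of the grouping step. The slide does not say how the per-fragment score is aggregated from the rules in a group; this sketch assumes the mean confidence, which is an assumption and not something the deck states. `score_candidates` and the sample confidences are hypothetical.

```python
from collections import defaultdict


def score_candidates(rules):
    """Group rules by their head fragment and rank the heads by score.

    Assumption: the score of a head is the mean confidence of its rules.
    """
    groups = defaultdict(list)
    for _context, preference, confidence in rules:
        groups[preference].append(confidence)
    scores = {p: sum(c) / len(c) for p, c in groups.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up rules (context, head, confidence), for illustration only.
sample_rules = [
    (frozenset({"Year=2009"}), "Region", 0.8),
    (frozenset({"AllSex"}), "Region", 0.6),
    (frozenset({"Occ"}), "Mrn", 0.45),
]
ranked = score_candidates(sample_rules)
```

The ranked list is the input to the fragment selection step, whose exact criterion is not spelled out in the slide text and is therefore left out of the sketch.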

SLIDE 23

Issue #3 What preferences are relevant for the current query?

Qf-set:

Measure: AvgIncome
Levels: AllCities, Region; AllRaces, Mrn; Occ; Year; AllSex
Selection: Year=2009

3rd step: select relevant fragment

  • Region: 0.70
  • AllCities: 0.60
  • AvgIncome ∈ [500, 1000]: 0.55
  • Mrn: 0.45
  • Year: 0.40

SLIDE 27

Issue #4 How to apply the relevant preferences to the query?

SLIDE 28

Issue #4 How to apply the relevant preferences to the query?

How to translate the relevant candidate preference fragments into the query?

  • By using the preference constructor defined by myMDX

SLIDE 29

Issue #4 How to apply the relevant preferences to the query?

4th step: translate the fragments

  • AvgIncome ∈ [500, 1000]
  • Mrn: 0.45

BETWEEN(AvgIncome, 500, 1000)
CONTAIN(Race, Mrn)
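
The translation of the two fragments above can be sketched in Python. Only the two constructors shown on this slide are handled; the textual fragment encoding, the `to_constructor` function and the `hierarchy_of` lookup are assumptions made for illustration.

```python
import re

# Matches a measure-range fragment such as "AvgIncome ∈ [500, 1000]".
RANGE_FRAGMENT = re.compile(r"(\w+)\s*∈\s*\[(\d+),\s*(\d+)\]")


def to_constructor(fragment, hierarchy_of):
    """Translate one fragment into a preference constructor string.

    hierarchy_of maps a level name to its hierarchy (hypothetical lookup).
    """
    m = RANGE_FRAGMENT.fullmatch(fragment)
    if m:
        measure, low, high = m.groups()
        return f"BETWEEN({measure}, {low}, {high})"
    # Any other fragment is treated here as a level of some hierarchy.
    return f"CONTAIN({hierarchy_of[fragment]}, {fragment})"


hierarchy_of = {"Mrn": "Race"}
constructors = [
    to_constructor("AvgIncome ∈ [500, 1000]", hierarchy_of),
    to_constructor("Mrn", hierarchy_of),
]
```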

SLIDE 30

Issue #4 How to apply the relevant preferences to the query?

How to combine preference constructors?

  • By an AND clause between successive constructors (Pareto combination)
  • Each preference has the same importance

SLIDE 31

Issue #4 How to apply the relevant preferences to the query?

5th step: annotate the query

Constructors:

BETWEEN(AvgIncome, 500, 1000)
CONTAIN(Race, Mrn)

Original query:

SELECT AvgIncome ON COLUMNS,
  Crossjoin(OCCUPATION.members,
    Crossjoin(Descendants(RACE.AllRaces, RACE.Mrn),
      Descendants(RESIDENCE.AllCities, RESIDENCE.Region))) ON ROWS
FROM CENSUS
WHERE TIME.Year.[2009]

Annotated query:

SELECT AvgIncome ON COLUMNS,
  Crossjoin(OCCUPATION.members,
    Crossjoin(Descendants(RACE.AllRaces, RACE.Mrn),
      Descendants(RESIDENCE.AllCities, RESIDENCE.Region))) ON ROWS
FROM CENSUS
WHERE TIME.Year.[2009]
PREFERRING AvgIncome BETWEEN 500 AND 1000 AND RACE CONTAIN Mrn
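
The annotation step can be sketched in Python as a string operation. This assumes the PREFERRING clause is appended after the MDX query and that the preferences are already written in clause syntax; the `annotate` function is a hypothetical helper, not part of myMDX.

```python
def annotate(mdx_query, preferences):
    """Append a PREFERRING clause that combines the preferences with AND."""
    if not preferences:
        return mdx_query
    return f"{mdx_query.rstrip()} PREFERRING {' AND '.join(preferences)}"


base_query = (
    "SELECT AvgIncome ON COLUMNS, "
    "Crossjoin(OCCUPATION.members, "
    "Crossjoin(Descendants(RACE.AllRaces,RACE.Mrn), "
    "Descendants(RESIDENCE.AllCities,RESIDENCE.Region))) ON ROWS "
    "FROM CENSUS WHERE TIME.Year.[2009]"
)
annotated = annotate(
    base_query,
    ["AvgIncome BETWEEN 500 AND 1000", "RACE CONTAIN Mrn"],
)
```

Joining with AND gives every preference the same importance, which matches the Pareto combination described on the previous slide.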

SLIDE 32

Experimental results

  • Tests are based on a synthetic log of about 1000 queries.

This log simulates N sessions as follows:

  • For each session:
      • Generate a random query
      • Search for 20 interesting drill-down queries (in the sense of Sarawagi, VLDB 1999)
  • 8 queries to be personalized are randomly extracted.
  • Tools:
      • Weka
      • Mondrian server
      • Oracle database (real data from IPUMS USA)

SLIDE 33

Experimental results

SLIDE 34

Experimental results

SLIDE 35

Conclusion

  • Automatic extraction of relevant candidate preferences based on association rules
  • A proactive approach based on log mining
  • Reducing formulation effort

SLIDE 36

Perspectives

  • Extend the approach to better match myMDX expressiveness
      • Include prioritization in addition to Pareto
      • Take into account other preference constructors defined for myMDX
  • Extend the approach to recommendation in a collaborative context, with a multi-user log

SLIDE 37

Thanks for your attention.

Any questions?