Modeling Recurrent Distributions in Streams using Possible Worlds

SLIDE 1

Modeling Recurrent Distributions in Streams using Possible Worlds

Michael Geilke, Andreas Karwath, and Stefan Kramer

Johannes Gutenberg-Universität Mainz, Germany

October 20, 2015

SLIDE 7

[Figure; surviving labels: "Smart", EDDO, density estimate f]

Query 1 [marginalize]

What is the data distribution of the sensors in the living room?
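
A marginalization query of this kind can be illustrated with a minimal sketch. It assumes the density estimate is available as an explicit joint probability table over a few binary sensor variables. The variable names and all probabilities are invented for illustration, and EDDO does not necessarily store its estimate this way.

```python
from collections import defaultdict

# Hypothetical sensor variables and joint probabilities (made up, not from the talk).
VARIABLES = ("living_motion", "living_tv", "office_light")
JOINT = {
    (1, 1, 0): 0.25,
    (1, 1, 1): 0.05,
    (1, 0, 0): 0.10,
    (1, 0, 1): 0.10,
    (0, 0, 0): 0.20,
    (0, 0, 1): 0.30,
}


def marginalize(joint, variables, keep):
    """Sum out every variable that is not listed in `keep`."""
    positions = [variables.index(name) for name in keep]
    marginal = defaultdict(float)
    for assignment, probability in joint.items():
        marginal[tuple(assignment[i] for i in positions)] += probability
    return dict(marginal)


# Distribution of the living-room sensors only; the office sensor is summed out.
living_room = marginalize(JOINT, VARIABLES, ("living_motion", "living_tv"))
```

The result maps each combination of living-room sensor values to its probability, and the entries still sum to one.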

SLIDE 8

Query 2 [setting hard evidence]

Two residents are in the living room. What is the probability that they watch TV?
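
Setting hard evidence amounts to conditioning the estimate on an observed value. The sketch below assumes a small explicit joint table over two hypothetical variables (number of residents in the living room, TV on or off). The numbers are made up and only serve to show the renormalization step.

```python
# Hypothetical joint distribution over (residents_in_living_room, tv_on).
JOINT = {
    (0, 0): 0.30,
    (0, 1): 0.05,
    (1, 0): 0.15,
    (1, 1): 0.10,
    (2, 0): 0.10,
    (2, 1): 0.30,
}


def condition(joint, evidence):
    """Keep assignments consistent with the hard evidence and renormalize.

    `evidence` maps a variable position to its observed value.
    """
    consistent = {
        assignment: probability
        for assignment, probability in joint.items()
        if all(assignment[i] == value for i, value in evidence.items())
    }
    total = sum(consistent.values())
    if total == 0.0:
        raise ValueError("evidence has probability zero under this estimate")
    return {a: p / total for a, p in consistent.items()}


# Evidence: two residents are in the living room (position 0 has value 2).
posterior = condition(JOINT, {0: 2})
p_tv_given_two = sum(p for a, p in posterior.items() if a[1] == 1)
```

With these invented numbers the query evaluates to 0.30 / 0.40 = 0.75.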

SLIDE 9

[Figure; surviving labels: "Smart", density estimate f, EDDO, POEt, Alg 1, Alg 2, Out 1, Out 2, Itemsets]

SLIDE 13

Recurrences

  • day and night
  • working days and weekends
  • seasons
SLIDE 14

Recurrences

  • pattern could be more complex
  • may only affect a part of the house
SLIDE 15

Goal: a representation that

  • is constantly updated
  • represents current and historical data distributions
  • is able to represent recurrences
  • provides a query mechanism
SLIDE 16

Tasks for the proposed method

  1. recognize regions of drift
  2. represent density of data stream segments
  3. identify recurrences on the density level
  4. identify recurrences between parts of different densities

All of these tasks are performed in an online fashion.
SLIDE 22

Recognize Regions of Drift

SLIDE 28

Recognize Regions of Drift

[Figure; surviving labels: A, B, C, density estimate f]

Window-based approach

  • extension of an approach by Dries and Rückert
  • compute density values with the current estimate f
  • perform drift detection with the Wilcoxon rank-sum test
  • update f with clean instances only
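
The window-based steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `FrequencyEstimate` is a hypothetical stand-in for the estimate f, the two-window layout and the threshold `alpha` are assumptions, and the rank-sum test uses the standard normal approximation with mid-ranks for ties.

```python
import math
from collections import Counter


def rank_sum_p_value(xs, ys):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, mid-ranks for ties)."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    n, m, total = len(xs), len(ys), len(xs) + len(ys)
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < total:
        j = i
        while j < total and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    mean = n * (total + 1) / 2.0
    var = n * m / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var <= 0.0:
        return 1.0  # every value is tied, so there is no evidence of a difference
    z = (rank_sum_x - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2.0))


class FrequencyEstimate:
    """Hypothetical stand-in for f: Laplace-smoothed relative frequencies."""

    def __init__(self, domain_size):
        self.counts = Counter()
        self.n = 0
        self.domain_size = domain_size

    def update(self, x):
        self.counts[x] += 1
        self.n += 1

    def density(self, x):
        return (self.counts[x] + 1) / (self.n + self.domain_size)


def process_window(f, reference, recent, alpha=0.01):
    """Return True if drift is detected; otherwise update f with the clean window."""
    ref_values = [f.density(x) for x in reference]
    new_values = [f.density(x) for x in recent]
    drift = rank_sum_p_value(ref_values, new_values) < alpha
    if not drift:
        for x in recent:
            f.update(x)  # only instances without detected drift reach the estimate
    return drift


reference = [0] * 30 + [1] * 20
f = FrequencyEstimate(domain_size=4)
for x in reference:
    f.update(x)
drift_on_similar = process_window(f, reference, [0] * 28 + [1] * 22)
drift_on_shifted = process_window(f, reference, [2] * 25 + [3] * 25)
```

The test compares density values rather than raw instances, so it stays one-dimensional regardless of how many variables an instance has.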
SLIDE 31

Recurrences of Densities

[Figure; surviving labels: C, f_1, f_2, f_3, f_4, f_5, f_j]

Recurrent or new?

  • compare with the pool of existing density estimates
  • use the statistical test we proposed earlier
  • reactivate an estimate if a matching one is found
  • initialize a new one otherwise

SLIDE 35

Recurrences of Densities

[Figure; surviving labels: C, f_1, f_2, f_3, f_4, f_5, Wilcoxon]

SLIDE 36

Recurrences of Density Parts

$f(Y_1, Y_2, \ldots, Y_8) = f_1(Y_1, Y_3, Y_8) \cdot f_2(Y_2, Y_4, Y_5) \cdot f_3(Y_6) \cdot f_4(Y_7)$

Introduction of modules

If the $f_j$ cannot be decomposed any further, then $f_1, f_2, f_3, f_4$ are called the modules of $f$.
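
A factorized density of this shape can be evaluated module by module. The sketch below uses the variable grouping {1, 3, 8}, {2, 4, 5}, {6}, {7} from the factorization above and assumes binary variables. The module tables are generated from made-up weight functions and `make_module` is a hypothetical helper.

```python
from itertools import product


def make_module(n_vars, weight):
    """Build a normalized table over n_vars binary variables from a made-up weight."""
    raw = {a: weight(a) for a in product((0, 1), repeat=n_vars)}
    normalizer = sum(raw.values())
    return {a: w / normalizer for a, w in raw.items()}


# (1-based variable indices, probability table) for each module.
MODULES = [
    ((1, 3, 8), make_module(3, lambda a: 1 + sum(a))),
    ((2, 4, 5), make_module(3, lambda a: 1 + 2 * a[0])),
    ((6,), make_module(1, lambda a: 1 + 3 * a[0])),
    ((7,), make_module(1, lambda a: 1.0)),
]


def density(values):
    """Evaluate the joint density of 8 values as the product of its module densities."""
    result = 1.0
    for indices, table in MODULES:
        result *= table[tuple(values[i - 1] for i in indices)]
    return result


all_zero = density((0,) * 8)
total_mass = sum(density(v) for v in product((0, 1), repeat=8))
```

Because the variable groups are disjoint and each module is normalized, the product is itself a normalized density, and each module can be compared or reused independently of the others.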

SLIDE 39

Query Mechanism

[Figure; surviving labels: p1, p2, p3, p4]

  • probabilistic extension of possible worlds semantics
  • requires density estimators supporting inference tasks

Query 3 [over multiple worlds]

Given world W, what is the probability that the resident will switch on the light in the office room?
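
One possible reading of such a query is sketched below. The slides do not define the semantics in detail, so the sketch assumes that each possible world carries a probability (loosely following the labels p1 to p4) and its own estimate of the queried event. World names and all numbers are hypothetical.

```python
# Hypothetical worlds: a world probability and a per-world estimate of
# P(office light is switched on). All values are made up for illustration.
WORLDS = {
    "W1": {"p": 0.4, "light_on": 0.9},
    "W2": {"p": 0.3, "light_on": 0.5},
    "W3": {"p": 0.2, "light_on": 0.2},
    "W4": {"p": 0.1, "light_on": 0.1},
}


def query_given_world(worlds, name, event="light_on"):
    """Answer the query inside one given world, using that world's estimate."""
    return worlds[name][event]


def query_over_worlds(worlds, event="light_on"):
    """Answer the query across all worlds, weighting each by its probability."""
    return sum(world["p"] * world[event] for world in worlds.values())


p_in_w2 = query_given_world(WORLDS, "W2")
p_overall = query_over_worlds(WORLDS)
```

In a fuller version each per-world entry would be a density estimator that supports inference tasks such as marginalization and conditioning, rather than a fixed number.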

SLIDE 40

Evaluation: Modules

  • evaluation on synthetic and real-world datasets
  • without modules, performance is better in many cases, but only slightly
  • with modules, the representation is more explicit and enables detection of recurrences

Datasets

  • Synthetic: Bayesian networks with different numbers of nodes, different numbers of instances, and different numbers of variable groups
  • Real-world: Electricity, Shuttle, Waterlevel, Covertype

SLIDE 41

Evaluation: Recurrences

                               Densities   Modules
  Recurrences                  416         1259
  (label lost in extraction)   100%        78%

[Two plots; surviving axis label: distance]

SLIDE 42

Conclusions and Future Work

  • framework to model recurrent densities and recurrent parts of the densities
  • online estimator
  • extension of possible worlds semantics for the query mechanism

Future Work:

  • more sophisticated modeling of density parts (conditional)
  • recycling of modules
  • implementation of the query mechanism

SLIDE 43

Thank you for your attention