Heuristics Miner for Time Intervals, by Andrea Burattin and Alessandro Sperduti, Department of Pure and Applied Mathematics, University of Padua, Italy. April 28th, 2010.


SLIDE 1

Heuristics Miner for Time Intervals

Andrea Burattin and Alessandro Sperduti

Department of Pure and Applied Mathematics University of Padua, Italy

April 28th, 2010

SLIDE 2

Slide 2 of 16

What is Process Mining I

Business process: according to the IEEE Glossary, “a sequence of steps performed for a given purpose; for example, the software development process”, which transforms inputs into outputs.

[Figure: example process with the activities Order received, Goods available, Goods wrapping, Shipping note and Shipping]

Each action performed is recorded in a log.

SLIDE 3

Slide 3 of 16

What is Process Mining II

Main process mining areas

When the model of the process is not available:
Control-flow discovery aims to build a model describing the behaviour of the process.

When the model of the process is available:
Conformance analysis checks how well a log fits the given process model.

Independently of whether a process model is available:
Organizational mining tries to extract a “social network” that relates the people who performed the actions.

SLIDE 4

Slide 4 of 16

What is Process Mining III

Extracted models

SLIDE 5

Slide 5 of 16

Control–flow discovery example run

Instance 1
#  Activity          Completion time
1  Order received    Apr 21, 2010 12:00
2  Payment received  Apr 22, 2010 09:00
3  Goods available   Apr 26, 2010 08:30
4  Shipping          Apr 26, 2010 10:15

Instance 2
#  Activity          Completion time
1  Order received    Apr 23, 2010 15:45
2  Payment reminder  Apr 25, 2010 15:45
3  Payment received  Apr 25, 2010 17:31
4  Goods available   Apr 26, 2010 10:00
5  Shipping          Apr 26, 2010 12:30

SLIDE 6

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received

SLIDE 7

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment received
Relation found in Instance 1: #1 > #2 (Order received > Payment received)

SLIDE 8

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment received, Goods available
Relation found in Instance 1: #2 > #3 (Payment received > Goods available)

SLIDE 9

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment received, Goods available, Shipping
Relation found in Instance 1: #3 > #4 (Goods available > Shipping)

SLIDE 11

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment reminder, Payment received, Goods available, Shipping
Relation found in Instance 2: #1 > #2 (Order received > Payment reminder)

SLIDE 12

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment reminder, Payment received, Goods available, Shipping
Relation found in Instance 2: #2 > #3 (Payment reminder > Payment received)

SLIDE 13

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment reminder, Payment received, Goods available, Shipping
Relation found in Instance 2: #3 > #4 (Payment received > Goods available)

SLIDE 14

Slide 5 of 16

Control–flow discovery example run

Activities in the model so far: Order received, Payment reminder, Payment received, Goods available, Shipping
Relation found in Instance 2: #4 > #5 (Goods available > Shipping)
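The example run above can be reproduced with a short sketch. It is only an illustration: the log layout, the ISO-style timestamps (rewritten from the slide's dates to make parsing easy) and the function name `direct_successions` are assumptions, not part of the original slides.

```python
from collections import Counter
from datetime import datetime

# The two process instances of the example, as (activity, completion time) pairs.
LOG = {
    "Instance 1": [
        ("Order received", "2010-04-21 12:00"),
        ("Payment received", "2010-04-22 09:00"),
        ("Goods available", "2010-04-26 08:30"),
        ("Shipping", "2010-04-26 10:15"),
    ],
    "Instance 2": [
        ("Order received", "2010-04-23 15:45"),
        ("Payment reminder", "2010-04-25 15:45"),
        ("Payment received", "2010-04-25 17:31"),
        ("Goods available", "2010-04-26 10:00"),
        ("Shipping", "2010-04-26 12:30"),
    ],
}


def parse(timestamp):
    """Parse the ISO-style timestamps used in LOG (an assumed format)."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M")


def direct_successions(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for events in log.values():
        # Order each instance by completion time, then pair up neighbours.
        ordered = sorted(events, key=lambda event: parse(event[1]))
        for (a, _), (b, _) in zip(ordered, ordered[1:]):
            counts[(a, b)] += 1
    return counts


successions = direct_successions(LOG)
for (a, b), n in sorted(successions.items()):
    print(f"{a} > {b}: {n}")
```

Each pair with a non-zero count corresponds to one of the `#i > #j` relations highlighted step by step in the run.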

SLIDE 15

Slide 6 of 16

Control–flow discovery example run

Instance 1
#  Activity          Completion time
1  Order received    Apr 21, 2010 12:00
2  Payment received  Apr 22, 2010 09:00
3  Goods available   Apr 26, 2010 08:30
4  Shipping          Apr 26, 2010 10:15

Instance 2
#  Activity          Completion time
1  Order received    Apr 23, 2010 15:45
2  Payment reminder  Apr 25, 2010 15:45
3  Payment received  Apr 25, 2010 17:31
4  Goods available   Apr 26, 2010 12:30
5  Shipping          Apr 26, 2010 12:30

SLIDE 17

Slide 6 of 16

Control–flow discovery example run

Instance 1
#  Activity          Completion time
1  Order received    Apr 21, 2010 12:00
2  Payment received  Apr 22, 2010 09:00
3  Goods available   Apr 26, 2010 08:30
4  Shipping          Apr 26, 2010 10:15

Instance 2
#  Activity          Completion time
1  Order received    Apr 23, 2010 15:45
2  Goods available   Apr 25, 2010 15:45
3  Payment received  Apr 25, 2010 17:31
4  Payment reminder  Apr 26, 2010 12:30
5  Shipping          Apr 26, 2010 12:30

SLIDE 18

Slide 7 of 16

Heuristics Miner, core behaviour I

Heuristics Miner evaluates a “dependency function” between two activities (e.g. X and Y) to decide whether a dependency relation holds between them:

$$X \Rightarrow Y = \frac{|X > Y| - |Y > X|}{|X > Y| + |Y > X| + 1}$$

where:
X > Y holds if X is executed at time t and Y at time t + 1;
|X > Y| is the number of times that X > Y holds in the log.

Using all the relations whose value is above a threshold, the algorithm builds a directed graph of the dependencies.
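As a minimal sketch of the dependency function, the following code computes the measure from a dictionary of direct-succession counts and keeps the pairs above a threshold. The names `dependency` and `dependency_graph`, the counts and the threshold value are hypothetical and chosen only for illustration.

```python
def dependency(counts, x, y):
    """X => Y = (|X > Y| - |Y > X|) / (|X > Y| + |Y > X| + 1)."""
    xy = counts.get((x, y), 0)
    yx = counts.get((y, x), 0)
    return (xy - yx) / (xy + yx + 1)


def dependency_graph(counts, threshold):
    """Return the ordered pairs whose dependency value is above the threshold."""
    activities = {activity for pair in counts for activity in pair}
    return {
        (x, y)
        for x in activities
        for y in activities
        if x != y and dependency(counts, x, y) > threshold
    }


# Hypothetical direct-succession counts, not taken from the slides.
counts = {("A", "B"): 9, ("B", "A"): 1, ("B", "C"): 5}
graph = dependency_graph(counts, threshold=0.5)
print(sorted(graph))
```

The `+ 1` in the denominator keeps the value strictly below 1, so a relation observed only a few times scores lower than one observed many times.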

SLIDE 19

Slide 8 of 16

Heuristics Miner, core behaviour II

We can have both X ⇒ Y and X ⇒ Z:

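[Figure: X with outgoing dependencies to Y and Z]

Y and Z can be executed in mutual exclusion (XOR) or in parallel (in no specific order; AND):

$$X \Rightarrow (Y \wedge Z) = \frac{|Y > Z| + |Z > Y|}{|X > Y| + |X > Z| + 1}$$

If the value is above a threshold, Y and Z are in an AND relation; otherwise they are in an XOR relation.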
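The AND/XOR decision can be sketched in the same way. The function names, the two sets of counts and the threshold of 0.1 are assumptions made for this example.

```python
def and_measure(counts, x, y, z):
    """X => (Y and Z) = (|Y > Z| + |Z > Y|) / (|X > Y| + |X > Z| + 1)."""
    yz = counts.get((y, z), 0)
    zy = counts.get((z, y), 0)
    xy = counts.get((x, y), 0)
    xz = counts.get((x, z), 0)
    return (yz + zy) / (xy + xz + 1)


def split_type(counts, x, y, z, threshold):
    """Classify the split after x as AND (parallel) or XOR (exclusive)."""
    return "AND" if and_measure(counts, x, y, z) > threshold else "XOR"


# Hypothetical counts: Y and Z follow each other in both orders (parallel branches).
parallel = {("X", "Y"): 5, ("X", "Z"): 5, ("Y", "Z"): 5, ("Z", "Y"): 5}
# Hypothetical counts: Y and Z never appear next to each other (exclusive branches).
exclusive = {("X", "Y"): 5, ("X", "Z"): 5}

print(split_type(parallel, "X", "Y", "Z", threshold=0.1))
print(split_type(exclusive, "X", "Y", "Z", threshold=0.1))
```

When Y and Z are never observed next to each other the numerator is zero, which is why exclusive branches fall below any positive threshold.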

SLIDE 20

Slide 9 of 16

Time intervals in the logs

Often, an activity is stored in the log as several “sub-activities” that together make up the main one:

[Figure: timeline t on which the main activity spans from Start to End and is composed of Sub-activity 1, Sub-activity 2, ..., Sub-activity n-1, Sub-activity n]

Considering the first and the last sub-activity, we can build a time interval for the main activity.
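A small sketch of this construction, assuming each sub-activity is logged with a single timestamp: the interval of the main activity runs from the earliest to the latest of them. The `Interval` type, the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class Interval(NamedTuple):
    activity: str
    start: datetime
    end: datetime


def main_activity_interval(activity, sub_activity_times):
    """Build the interval spanned by the first and the last sub-activity."""
    if not sub_activity_times:
        raise ValueError("at least one sub-activity timestamp is required")
    return Interval(activity, min(sub_activity_times), max(sub_activity_times))


# Hypothetical timestamps of three sub-activities of one main activity.
times = [
    datetime(2010, 4, 26, 9, 10),
    datetime(2010, 4, 26, 8, 30),
    datetime(2010, 4, 26, 10, 15),
]
interval = main_activity_interval("Goods wrapping", times)
print(interval.start, interval.end)
```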

SLIDE 21

Slide 10 of 16

Information on the intervals vs time spot

Allen’s Interval Algebra: the “overlap relation”

[Figure: the activities A, B, C and D drawn in two ways, labelled “Events as time intervals” and “Instantaneous events”]

SLIDE 22

Slide 11 of 16

New definitions, with time intervals support I

Heuristics Miner: a > b is a direct succession of points
Heuristics Miner++: a > b is a direct succession of intervals

[Figure: intervals A, B and C on a timeline, with the labels “A > B holds”, “A > B does not hold” and “A > C holds”]

SLIDE 23

Slide 12 of 16

New definitions, with time intervals support II

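Direct succession, A > B

[Figure: intervals A and B on a timeline with the time points t_i, t_k and t_j, and a third activity C]

Parallelism (overlap relation), A ∥ B

[Figure: intervals A and B on a timeline with the time points t_i, t_u and t_j]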
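The two relations can be sketched on explicit intervals. The slides give the definitions only as figures, so the conditions below are one plausible reading and should be treated as assumptions: A ∥ B when the two intervals share some time, and A > B when A ends before B starts and no other activity ends in between. The `Interval` type and the numeric time points are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    activity: str
    start: float
    end: float


def overlaps(a, b):
    """A || B: the two intervals share some time (assumed reading)."""
    return a.start < b.end and b.start < a.end


def directly_follows(a, b, trace):
    """A > B: A ends before B starts and no other activity ends in between (assumed reading)."""
    if not a.end < b.start:
        return False
    return not any(
        a.end < c.end < b.start
        for c in trace
        if c is not a and c is not b
    )


# Hypothetical trace: B and C both start after A ends and overlap each other.
A = Interval("A", 0, 2)
B = Interval("B", 3, 6)
C = Interval("C", 4, 7)
trace = [A, B, C]

print(directly_follows(A, B, trace), directly_follows(A, C, trace), overlaps(B, C))
```

Under this reading, both branches that start after A are counted as direct successors of A even though they run at the same time, which is what the AND measure needs.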

SLIDE 25

Slide 13 of 16

New definitions, with time intervals support III

Dependency function, for time intervals:

$$X \Rightarrow Y = \frac{|X > Y| - |Y > X|}{|X > Y| + |Y > X| + 2 \cdot |X \parallel Y| + 1}$$

AND function, for time intervals:

$$X \Rightarrow (Y \wedge Z) = \frac{|Y > Z| + |Z > Y| + 2 \cdot |Y \parallel Z|}{|X > Y| + |X > Z| + 1}$$
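A sketch of the two interval-aware measures, assuming direct-succession counts keyed by ordered pairs and overlap counts keyed by unordered pairs. The function names, the data layout and the numbers are hypothetical.

```python
def dependency_pp(succ, par, x, y):
    """(|X > Y| - |Y > X|) / (|X > Y| + |Y > X| + 2 * |X || Y| + 1)."""
    xy = succ.get((x, y), 0)
    yx = succ.get((y, x), 0)
    overlap = par.get(frozenset((x, y)), 0)
    return (xy - yx) / (xy + yx + 2 * overlap + 1)


def and_measure_pp(succ, par, x, y, z):
    """(|Y > Z| + |Z > Y| + 2 * |Y || Z|) / (|X > Y| + |X > Z| + 1)."""
    yz = succ.get((y, z), 0)
    zy = succ.get((z, y), 0)
    overlap = par.get(frozenset((y, z)), 0)
    xy = succ.get((x, y), 0)
    xz = succ.get((x, z), 0)
    return (yz + zy + 2 * overlap) / (xy + xz + 1)


# Hypothetical counts: Y and Z overlap in 5 traces and follow each other in 3.
succ = {("X", "Y"): 8, ("X", "Z"): 8, ("Y", "Z"): 3}
par = {frozenset(("Y", "Z")): 5}

print(dependency_pp(succ, par, "Y", "Z"))
print(and_measure_pp(succ, par, "X", "Y", "Z"))
```

Observed overlaps lower the dependency between Y and Z and raise the AND measure. With an empty `par` dictionary both functions reduce to the point-based formulas, since the overlap term is then zero.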

SLIDE 27

Slide 14 of 16

Results for a real case

Result for Heuristics Miner

[Figure: mined dependency graph with the nodes Activity 0, Activity 4, Activity 20, Activity 10, Activity 30, Activity 23, Activity 11, Activity 22, Activity 32, Activity 31, Activity 21]

Result for Heuristics Miner++

[Figure: mined dependency graph with the nodes Activity 11, Activity 4, Activity 21, Activity 0, Activity 10, Activity 30, Activity 20, Activity 22, Activity 31, Activity 32, Activity 23]

The log is composed of 1465 traces (for Heuristics Miner, only the starting event of each activity is considered).

SLIDE 28

Slide 15 of 16

Results for an artificial dataset

[Plot: F1 measure (average, min and max), on a scale from 0.1 to 1, against the percentage of activities recorded as intervals, from 10 to 100]

$$F_1 = \frac{2 \cdot p \cdot r}{p + r} \qquad p = \frac{tp}{tp + fp} \qquad r = \frac{tp}{tp + fn}$$

Test data: logs of 100 random processes with 6 activities each
tp: correctly mined dependencies
fp: dependencies present in the mined model but not in the original one
fn: dependencies present in the original model but not in the mined one
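The F1 measure can be computed from two sets of dependencies, as in the sketch below. It follows the usual convention that fp counts dependencies found only in the mined model and fn those found only in the original one; F1 is symmetric in fp and fn, so swapping the two does not change the score. The function name and the two small models are hypothetical.

```python
def f1_score(original, mined):
    """F1 of the mined dependencies against the original model's dependencies."""
    tp = len(original & mined)   # correctly mined dependencies
    fp = len(mined - original)   # mined, but absent from the original model
    fn = len(original - mined)   # in the original model, but not mined
    if tp == 0:
        return 0.0
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)


# Hypothetical models, each given as a set of (source, target) dependencies.
original = {("A", "B"), ("B", "C"), ("B", "D")}
mined = {("A", "B"), ("B", "C"), ("C", "D")}
score = f1_score(original, mined)
print(score)
```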

SLIDE 29

Slide 16 of 16

Conclusions and future work

What we achieved:
We considered each activity as a time interval
We added the notion of time intervals to the Heuristics Miner algorithm
The new version of the algorithm is “backward compatible”

Possible future work:
Testing the algorithm against more (and bigger) processes
Automatic identification of the best parameter values
Support for “noise” in the time intervals, for example:

[Figure: two arrangements of intervals joined by the label “equal to”]
