Process Mining Tutorial: Computational Intelligence in HealthCare


Process Mining Tutorial, Computational Intelligence in HealthCare, 20 - 24 September 2010, Eindhoven, the Netherlands. prof.dr.ir. Wil van der Aalst, www.processmining.org


slide-1
SLIDE 1

Process Mining

prof.dr.ir. Wil van der Aalst

www.processmining.org Tutorial Computational Intelligence in HealthCare 20 - 24 September 2010, Eindhoven, the Netherlands

slide-2
SLIDE 2

Focus of most modeling and analysis techniques is on right-hand side …

PAGE 1

[Figure: BPM life-cycle. Phases: diagnosis/requirements, (re)design, configuration/implementation, enactment/monitoring, adjustment. Artifacts: models, data. Uses of models: insight, discussion, verification, performance analysis, animation, specification, documentation, configuration.]

slide-3
SLIDE 3

Let’s play …

PAGE 2

Play-In: event log → process model

Play-Out: process model → event log

Replay: event log + process model →

  • extended model showing times, frequencies, etc.
  • diagnostics
  • predictions
  • recommendations
slide-4
SLIDE 4

Menu

1. Introduction to process mining
2. Two types of processes:

  • Lasagna processes
  • Spaghetti processes

3. The Alpha algorithm
4. Over/underfitting
5. Replay

  • Conformance checking
  • Predictions

6. Process mining software (BI versus ProM)

PAGE 3
slide-5
SLIDE 5 PAGE 4

http://www.canarypete.be/

slide-6
SLIDE 6 PAGE 5
slide-7
SLIDE 7

Growth of data

PAGE 6
slide-8
SLIDE 8 PAGE 7

Data Mining

[Figure: decision tree with the attributes Smoker, Drinker, and Weight (<81.5 / ≥81.5) and the leaves Short (91/10), Long (30/1), Long (150/20), Short (321/25).]

Process Mining = Process Analysis

[Figure: Petri net with places c1 to c17 and the transitions start, register, initial conditions, check_A needed?, check_A, modify conditions, check_B needed?, check_B, check_C needed?, check_C, assess risk, decline, make offer, handle response, handle payment, send insurance documents, timeout1, timeout2, withdraw offer. Transitions are annotated with pairs such as (RM,RD), (E,SD), (E,RD), (SM,SD), (E,FD), (YE,RD), (FE,FD).]

+

slide-9
SLIDE 9 PAGE 8

Process Mining

  • Process discovery: "What is really happening?"
  • Conformance checking: "Do we do what was agreed upon?"
  • Performance analysis: "Where are the bottlenecks?"
  • Process prediction: "Will this case be late?"
  • Process improvement: "How to redesign this process?"
  • Etc.
slide-10
SLIDE 10

Process Discovery Example

PAGE 9

α

slide-11
SLIDE 11 PAGE 10

>,→,||,# relations

  • Direct succession: x>y iff for some case x is directly followed by y.
  • Causality: x→y iff x>y and not y>x.
  • Parallel: x||y iff x>y and y>x.
  • Choice: x#y iff not x>y and not y>x.

Event log:
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task A
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task D
case 4 : task D

Direct succession: A>B A>C A>E B>C B>D C>B C>D E>D
Causality: A→B A→C A→E B→D C→D E→D
Parallel: B||C C||B

Traces: ABCD ACBD AED
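The four relations can be derived mechanically from the log. The following is a minimal Python sketch, not part of the original slides: the function names `traces_from_events` and `footprint` are hypothetical, and the event list is transcribed from the log on this slide.

```python
from collections import defaultdict
from itertools import product

# Event log from the slide as (case id, task) pairs, in recorded order.
EVENTS = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"), (2, "C"),
    (4, "A"), (2, "B"), (2, "D"), (5, "A"), (5, "E"), (4, "C"), (1, "D"),
    (3, "C"), (3, "D"), (4, "B"), (5, "D"), (4, "D"),
]

def traces_from_events(events):
    """Group events per case, preserving the order within each case."""
    cases = defaultdict(list)
    for case_id, task in events:
        cases[case_id].append(task)
    return {"".join(tasks) for tasks in cases.values()}

def footprint(traces):
    """Return the direct-succession pairs and the relation of every task pair."""
    tasks = sorted({t for trace in traces for t in trace})
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    relation = {}
    for x, y in product(tasks, repeat=2):
        if (x, y) in succ and (y, x) in succ:
            relation[x, y] = "||"   # parallel
        elif (x, y) in succ:
            relation[x, y] = "->"   # causality x -> y
        elif (y, x) in succ:
            relation[x, y] = "<-"   # causality y -> x
        else:
            relation[x, y] = "#"    # choice
    return succ, relation

traces = traces_from_events(EVENTS)
succ, relation = footprint(traces)
print(sorted(traces))        # ['ABCD', 'ACBD', 'AED']
print(relation["B", "C"])    # ||
```

Working on the set of distinct traces is enough here because the relations only depend on whether a succession occurs at least once, not on how often.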

slide-12
SLIDE 12 PAGE 11

Basic Idea Used by α Algorithm (1)

a b (a) sequence pattern: a→b

slide-13
SLIDE 13 PAGE 12

Basic Idea Used by α Algorithm (2)

a b c (b) XOR-split pattern: a→b, a→c, and b#c

a b c (c) XOR-join pattern: a→c, b→c, and a#b


slide-14
SLIDE 14 PAGE 13

Basic Idea Used by α Algorithm (3)

a b c (d) AND-split pattern: a→b, a→c, and b||c

a b c (e) AND-join pattern: a→c, b→c, and a||b


slide-15
SLIDE 15

Example Revisited

PAGE 14

A B C D E p2 end p4 p3 p1 start B#E C#E …

Result produced by α algorithm

A>B A>C A>E B>C B>D C>B C>D E>D A→B A→C A→E B→D C→D E→D B||C C||B

slide-16
SLIDE 16 PAGE 15

Process mining: Linking events to models

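[Figure: process mining overview. The “world” (people, machines, organizations, components, business processes) is supported/controlled by a software system, which records events (e.g., messages, transactions, etc.) in event logs. A (process) model models and analyzes the “world” and specifies, configures, implements, and analyzes the software system. Event logs and models are linked by discovery, conformance, and extension.]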

slide-17
SLIDE 17

Old Toolset

PAGE 16

[Figure: the process mining overview of slide 16 (“world”, software system, event logs, (process) model; discovery, conformance, extension), annotated with:]

ProMimport MXML

slide-18
SLIDE 18

New Toolset

PAGE 17

[Figure: the process mining overview of slide 16 (“world”, software system, event logs, (process) model; discovery, conformance, extension), annotated with:]

XESame 5.2 ►6.0 MXML ►XES

slide-19
SLIDE 19

Motivation for changes

PAGE 18

[Figure: the process mining overview of slide 16 (“world”, software system, event logs, (process) model; discovery, conformance, extension), annotated with:]

XESame 5.2 ►6.0 MXML ►XES

Callouts: distribution; decoupling logic and UI; dealing with hundreds of plug-ins; extendible semantics; no programming; mapping problems

slide-20
SLIDE 20 PAGE 19

Where did we apply process mining?

  • Municipalities (e.g., Alkmaar, Heusden, Harderwijk, etc.)
  • Government agencies (e.g., Rijkswaterstaat, Centraal Justitieel Incasso Bureau, Justice department)
  • Insurance related agencies (e.g., UWV)
  • Banks (e.g., ING Bank)
  • Hospitals (e.g., AMC hospital, Catharina hospital)
  • Multinationals (e.g., DSM, Deloitte)
  • High-tech system manufacturers and their customers (e.g., Philips Healthcare, ASML, Ricoh, Thales)
  • Media companies (e.g., Winkwaves)
  • ...
slide-21
SLIDE 21

Example of a Lasagna Process

slide-22
SLIDE 22

Example: WMO Harderwijk

  • Process related to the execution of the “Wet Maatschappelijke Ondersteuning” (WMO, the Dutch Social Support Act) in Harderwijk
  • Handling WMO applications
  • WMO: supporting citizens of municipalities (illness, handicaps, elderly, etc.).
  • Examples:
  • wheelchair, mobility scooter, ...
  • adaptation of house (elevator), ...
  • household help, ...
PAGE 21
slide-23
SLIDE 23

Event log

(796 applications, 5187 events)

PAGE 22
slide-24
SLIDE 24

Helicopter view of 1.5 years

PAGE 23
slide-25
SLIDE 25

Huge variance in durations

PAGE 24
slide-26
SLIDE 26

Process discovered using Genetic Miner

PAGE 25
slide-27
SLIDE 27

Various representations

PAGE 26
slide-28
SLIDE 28

Fuzzy Miner

PAGE 27
slide-29
SLIDE 29

Seamless abstraction

PAGE 28

more detailed more abstract

slide-30
SLIDE 30

Fuzzy Replay

PAGE 29
slide-31
SLIDE 31

Conformance checking using Replay

PAGE 30

= should not have happened but did = should have happened but did not

slide-32
SLIDE 32

Performance analysis using Replay

PAGE 31
slide-33
SLIDE 33

Performance information

PAGE 32
slide-34
SLIDE 34

Prediction based on Replay
PAGE 33
slide-35
SLIDE 35

Spaghetti Processes

Balancing Between Underfitting and Overfitting

slide-36
SLIDE 36 PAGE 35

Process spectrum

structured (Lasagna) unstructured (Spaghetti)

slide-37
SLIDE 37 PAGE 36
slide-38
SLIDE 38

How can process mining help?

  • Detect bottlenecks
  • Detect deviations
  • Performance measurement
  • Suggest improvements
  • Decision support (recommendation and prediction)
  • Provide mirror
  • Highlight important problems
  • Avoid ICT failures
  • Avoid management by PowerPoint
  • From “politics” to “analytics”

PAGE 41
slide-39
SLIDE 39

Learning processes: The Alpha Algorithm

slide-40
SLIDE 40 PAGE 43

Process Mining: The alpha algorithm

α

algorithm

[Figure: large process model with Dutch activity labels, e.g., “1 start”, “2 collectief of particulier” (collective or private), “3 controleren compleetheid/juistheid” (check completeness/correctness), “10 registreren” (register), “11 afwijzen” (reject), and “22 Opbergen en einde” (file and end).]
slide-41
SLIDE 41 PAGE 44

Alpha algorithm

slide-42
SLIDE 42 PAGE 45

Without transactional information (just completes)

slide-43
SLIDE 43 PAGE 46

Starting point: event logs

event logs, audit trails, databases, message logs, etc. www.xes-standard.org

slide-44
SLIDE 44 PAGE 47

XES (compatible with MXML)

Event log consists of:

  • traces (process instances)

− events

  • Standard extensions:
  • concept (for naming)
  • lifecycle (for transactional properties)
  • org (for the organizational perspective)
  • time (for timestamps)
  • semantic (for ontology references)
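To make the trace/event structure concrete, here is a minimal Python sketch that builds an XES-like document with the standard library. The element names (`log`, `trace`, `event`, `string`, `date`) and attribute keys (`concept:name`, `org:resource`, `lifecycle:transition`, `time:timestamp`) follow my reading of the XES standard; the helper `build_log`, the sample data, and the omission of the extension declarations are assumptions for illustration, so check www.xes-standard.org before relying on the details.

```python
import xml.etree.ElementTree as ET

def build_log(traces):
    """Build an XES-like log.

    traces: {trace name: [(activity, resource, transition, ISO timestamp), ...]}
    """
    log = ET.Element("log", {"xes.version": "1.0"})
    for trace_name, events in traces.items():
        trace = ET.SubElement(log, "trace")
        # concept extension: every trace has a name
        ET.SubElement(trace, "string", {"key": "concept:name", "value": trace_name})
        for activity, resource, transition, timestamp in events:
            event = ET.SubElement(trace, "event")
            # concept extension: name of the event (activity name)
            ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
            # org extension: the resource that executed the event
            ET.SubElement(event, "string", {"key": "org:resource", "value": resource})
            # lifecycle extension: transactional information (start, complete, ...)
            ET.SubElement(event, "string", {"key": "lifecycle:transition", "value": transition})
            # time extension: when the event happened
            ET.SubElement(event, "date", {"key": "time:timestamp", "value": timestamp})
    return log

# Hypothetical sample data, not taken from the slides.
sample = {
    "case 1": [
        ("A", "Pete", "complete", "2010-09-20T09:00:00+02:00"),
        ("B", "Sue", "complete", "2010-09-20T09:30:00+02:00"),
    ]
}
log = build_log(sample)
xml_text = ET.tostring(log, encoding="unicode")
print(xml_text)
```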
slide-45
SLIDE 45 PAGE 48

[Figure: annotated XES fragment. Callouts: extensions loaded; every trace has a name; every event has a name and a transition; classifier = name + transition; start of trace (i.e., process instance); name of trace; name of event (activity name); resource; transition; timestamp.]

slide-46
SLIDE 46 PAGE 49

[Figure: annotated XES fragment. Callouts: start of trace; name of trace; name of event (activity name); resource; data associated to event; timestamp; end of trace (i.e., process instance).]

slide-47
SLIDE 47 PAGE 50

Example log

  • Minimal information in log: case ids and task ids.
  • Additional information: event type, time, resources, and data.
  • Sequences:
  • 1: ABCD
  • 2: ACBD
  • 3: ABCD
  • 4: ACBD
  • 5: EF
  • So in this log there are three possible sequences:
  • ABCD
  • ACBD
  • EF

Event log:
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

slide-48
SLIDE 48 PAGE 51

>,→,||,# relations

  • Direct succession: x>y iff for some case x is directly followed by y.
  • Causality: x→y iff x>y and not y>x.
  • Parallel: x||y iff x>y and y>x.
  • Choice: x#y iff not x>y and not y>x.

Event log:
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

Direct succession: A>B A>C B>C B>D C>B C>D E>F
Causality: A→B A→C B→D C→D E→F
Parallel: B||C C||B

Traces: ABCD ACBD EF

slide-49
SLIDE 49 PAGE 52

Basic idea (1) x y

x→y

slide-50
SLIDE 50 PAGE 53

Basic idea (2)

x→y, x→z, and y||z

x z y

slide-51
SLIDE 51 PAGE 54

Basic idea (3)

x→y, x→z, and y#z

x z y

slide-52
SLIDE 52 PAGE 55

Basic idea (4)

x→z, y→z, and x||y

x y z

slide-53
SLIDE 53 PAGE 56

Basic idea (5)

x→z, y→z, and x#y

x y z

slide-54
SLIDE 54 PAGE 57

It is not that simple! Basic Alpha algorithm

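Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.

  • 1. $T_W = \{\, t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \,\}$,
  • 2. $T_I = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \,\}$,
  • 3. $T_O = \{\, t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \,\}$,
  • 4. $X_W = \{\, (A,B) \mid A \subseteq T_W \wedge A \neq \emptyset \wedge B \subseteq T_W \wedge B \neq \emptyset \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \,\}$,
  • 5. $Y_W = \{\, (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \,\}$,
  • 6. $P_W = \{\, p_{(A,B)} \mid (A,B) \in Y_W \,\} \cup \{ i_W, o_W \}$,
  • 7. $F_W = \{\, (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \,\} \cup \{\, (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \,\} \cup \{\, (i_W, t) \mid t \in T_I \,\} \cup \{\, (t, o_W) \mid t \in T_O \,\}$, and
  • 8. $\alpha(W) = (P_W, T_W, F_W)$.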
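The eight steps translate almost literally into code. The sketch below is a naive Python rendering for small logs only: it enumerates all subsets of tasks, so it is exponential in the number of tasks. The function name `alpha`, the encoding of traces as strings, and the encoding of a place p(A,B) as the tuple `("p", A, B)` are assumptions made for illustration.

```python
from itertools import chain, combinations

def alpha(traces):
    """Return (places, transitions, arcs) for a set of traces given as strings."""
    t_w = {t for trace in traces for t in trace}      # step 1: all tasks
    t_i = {trace[0] for trace in traces}              # step 2: first tasks
    t_o = {trace[-1] for trace in traces}             # step 3: last tasks
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}

    def unrelated(group):
        # every pair in the group, including a task with itself, is in relation #
        return all((x, y) not in succ for x in group for y in group)

    def nonempty_subsets(items):
        items = sorted(items)
        return chain.from_iterable(
            combinations(items, n) for n in range(1, len(items) + 1))

    groups = [frozenset(s) for s in nonempty_subsets(t_w) if unrelated(s)]
    # step 4: pairs (A, B) where every a in A causally precedes every b in B
    x_w = {(a, b) for a in groups for b in groups
           if all((x, y) in causal for x in a for y in b)}
    # step 5: keep only the maximal pairs
    y_w = {(a, b) for (a, b) in x_w
           if not any((a, b) != (a2, b2) and a <= a2 and b <= b2
                      for (a2, b2) in x_w)}
    # step 6: one place per maximal pair, plus source and sink place
    places = {("p", a, b) for (a, b) in y_w} | {"i_W", "o_W"}
    # step 7: arcs into and out of every place
    arcs = ({(x, ("p", a, b)) for (a, b) in y_w for x in a}
            | {(("p", a, b), y) for (a, b) in y_w for y in b}
            | {("i_W", t) for t in t_i}
            | {(t, "o_W") for t in t_o})
    return places, t_w, arcs                          # step 8

places, transitions, arcs = alpha({"ABCD", "ACBD", "AED"})
print(len(places), len(transitions), len(arcs))       # 6 5 14
```

For the traces ABCD, ACBD, and AED this yields four internal places, p(A,{B,E}), p(A,{C,E}), p({B,E},D), and p({C,E},D), plus the source and sink place.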
slide-55
SLIDE 55 PAGE 58

Example revisited

W:

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task F
case 4 : task D

A B C D E F

α(W)

A>B A>C B>C B>D C>B C>D E>F A→B A→C B→D C→D E→F B||C C||B

slide-56
SLIDE 56 PAGE 59

Exercise (1)

  • What does the Alpha algorithm produce for a log consisting only of the following traces?
  • ABCD
  • ACBD
  • AED
  • Direct succession: x>y iff for some case x is directly followed by y.
  • Causality: x→y iff x>y and not y>x.
  • Parallel: x||y iff x>y and y>x.
  • Choice: x#y iff not x>y and not y>x.

slide-57
SLIDE 57 PAGE 60

Another example taken step-by-step ...

slide-58
SLIDE 58 PAGE 61

A>B A>C A>E B>C B>D C>B C>D E>D A→B A→C A→E B→D C→D E→D B||C C||B

slide-59
SLIDE 59 PAGE 62
slide-60
SLIDE 60 PAGE 63

A and B need to be non-empty.

A>B A>C A>E B>C B>D C>B C>D E>D A→B A→C A→E B→D C→D E→D B||C C||B

# #

slide-61
SLIDE 61 PAGE 64
slide-62
SLIDE 62 PAGE 65
slide-63
SLIDE 63 PAGE 66

Exercise (2)

  • What does the Alpha algorithm produce for a log consisting only of the following traces?
  • ACD
  • BCE
  • Direct succession: x>y iff for some case x is directly followed by y.
  • Causality: x→y iff x>y and not y>x.
  • Parallel: x||y iff x>y and y>x.
  • Choice: x#y iff not x>y and not y>x.

slide-64
SLIDE 64 PAGE 67

Exercise (3)

  • What does the Alpha algorithm produce for a log consisting only of the following traces?
  • ACEG
  • AECG
  • BDFG
  • BFDG
  • Direct succession: x>y iff for some case x is directly followed by y.
  • Causality: x→y iff x>y and not y>x.
  • Parallel: x||y iff x>y and y>x.
  • Choice: x#y iff not x>y and not y>x.

slide-65
SLIDE 65

More on Process Discovery

slide-66
SLIDE 66 PAGE 69

Examples of process discovery techniques

  • Algorithmic techniques
    − Alpha miner
    − Alpha+, Alpha++, Alpha#
    − FSM miner
    − Fuzzy miner
    − Heuristic miner
    − Multi phase miner
  • Genetic process mining
    − Single/duplicate tasks
    − Distributed GM
  • Region-based process mining
    − State-based regions
    − Language based regions
  • Classical approaches not dealing with concurrency
    − Inductive inference (Mark Gold, Dana Angluin et al.)
    − Sequence mining
slide-67
SLIDE 67 PAGE 70

Genetic Mining

(Ana Karla Alves de Medeiros et al.)

  • 1. initial population
  • 2. fitness test
  • 3. select best parents
  • 4. crossover
  • 5. children
  • 6. mutation
  • 7. new population
[Figure: candidate process models over the activities A, B, C, D, E, F, H, I, J, K, L, M.]
slide-68
SLIDE 68 PAGE 71

Design choices

  • 1. initial population
  • 2. fitness test
  • 3. select best parents
  • 4. crossover
  • 5. children
  • 6. mutation
  • 7. new population

representation fitness crossover mutation
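The seven steps and the four design choices can be illustrated with a deliberately small toy in Python. This is not the Genetic Miner: as an assumption for illustration, a candidate "model" is just a set of allowed directly-follows pairs, fitness is the fraction of replayable traces minus a size penalty, and the names `fitness`, `crossover`, `mutate`, and `evolve` are hypothetical.

```python
import random
from itertools import product

LOG = ["ABCD", "ACBD", "AED"]
TASKS = sorted(set("".join(LOG)))
ALL_PAIRS = list(product(TASKS, repeat=2))

def fitness(model, log=LOG):
    """Reward traces whose every step is allowed; punish large models."""
    replayable = sum(all(step in model for step in zip(t, t[1:])) for t in log)
    return replayable / len(log) - 0.01 * len(model)

def crossover(mother, father, rng):
    """Keep pairs both parents share; inherit any other pair with probability 0.5."""
    shared = mother & father
    return frozenset(p for p in sorted(mother | father)
                     if p in shared or rng.random() < 0.5)

def mutate(model, rng):
    """Toggle one randomly chosen pair."""
    return model ^ {rng.choice(ALL_PAIRS)}

def evolve(generations=200, size=30, seed=0):
    rng = random.Random(seed)
    # 1. initial population: random sets of eight pairs
    population = [frozenset(rng.sample(ALL_PAIRS, 8)) for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)     # 2. fitness test
        history.append(fitness(population[0]))
        parents = population[: size // 2]              # 3. select best parents
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)  # 4.-6.
            for _ in range(size - len(parents))
        ]
        population = parents + children                # 7. new population
    return max(population, key=fitness), history

best, history = evolve()
print(sorted(best), round(fitness(best), 2))
```

Because the parents are carried over unchanged into the next population, the best fitness never decreases from one generation to the next. A realistic implementation would need a representation that can express concurrency and a fitness based on replay.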

slide-69
SLIDE 69 PAGE 72

Properties of Genetic Mining

  • Requires a lot of computing power.
  • Can deal with noise, infrequent behavior, duplicate tasks, invisible tasks, etc.
  • Allows for incremental improvement and combinations with other approaches (heuristics, post-optimization, etc.).

slide-70
SLIDE 70 PAGE 73

Challenge: Balancing Between Underfitting and Overfitting

slide-71
SLIDE 71 PAGE 74

The essence

A B C D E

ABCD ACBD AED ABCD ABCD AED ACBD ...

slide-72
SLIDE 72 PAGE 75

But ...

A B C D E

Any log containing activities A, B, C, D, and E.

start end

slide-73
SLIDE 73 PAGE 76

Finding a balance

A D C E B A D C E B

ACD BCE ... ACD ACE BCE BCD ...

(a) (b) (c) (d)

more behavior more behavior

slide-74
SLIDE 74 PAGE 77

A D C E B A D C E B

ACD ACE BCE BCD 99 85

slide-75
SLIDE 75 PAGE 78

A D C E B A D C E B

ACD ACE BCE BCD 99 88 85 78

slide-76
SLIDE 76 PAGE 79

A D C E B A D C E B

ACD ACE BCE BCD 99 2 85 3

slide-77
SLIDE 77 PAGE 80

Evaluating process mining results

  • Structure: Is this the simplest model (Occam's Razor)?
  • Fitness: Is the event log possible according to the model?
  • Precision: Is the model not underfitting (allowing for too much)?
  • Generalization: Is the model not overfitting (only allowing for the “accidental” examples)?

slide-78
SLIDE 78 PAGE 81
slide-79
SLIDE 79

Representing process models

PAGE 82
slide-80
SLIDE 80 PAGE 83

[Figure: large process model of travel expense handling with nodes such as “Need for trip has arisen”, “Trip is requested”, “Entry of a travel request”, “Approval of travel request”, “Planned trip is approved”, “Entry of trip facts”, “Approval of trip facts”, “Trip is canceled”, and “Payment must be effected”, shown together with the small model with transitions A, B, C, D, E.]
slide-81
SLIDE 81 PAGE 84

Highlights more important paths More significant nodes are emphasized

slide-82
SLIDE 82 PAGE 85

Aggregation

Clustering of coherent, less significant structures

Abstraction

Removing isolated, less significant structures

More to learn from maps...

slide-83
SLIDE 83 PAGE 86

Fuzzy miner

slide-84
SLIDE 84 PAGE 87

Showing reality

slide-85
SLIDE 85

Back to the future …

slide-86
SLIDE 86

[Figure: the process mining overview of slide 16 (“world”, software system, event logs, (process) model; discovery, conformance, extension).]

PAGE 89
slide-87
SLIDE 87 PAGE 90

Predict: When will I be home? At 11.26!
Recommend: How to get home ASAP? Take a left turn!
Detect: You drive too fast!

slide-88
SLIDE 88

Operational Support: Detect, Predict, and Recommend

PAGE 91

current data historic data (simulation) models learn (discover and enhance) detect predict recommend alerts predictions recommendations

slide-89
SLIDE 89

Operational Support and Conformance Checking Based on Replay

slide-90
SLIDE 90

A B C D E p2 end p4 p3 p1 start

Play Out (Classical use of models)

PAGE 93

ABCD ACBD ABCD AED ACBD ACBD AED AED

slide-91
SLIDE 91

Play In (Process Discovery)

PAGE 94

A B C D E p2 end p4 p3 p1 start

ABCD ACBD AED ACBD AED ABCD … a process discovery algorithm like the α algorithm

slide-92
SLIDE 92

A B C D E p2 end p4 p3 p1 start

Replay

PAGE 95

A B C D

slide-93
SLIDE 93

A B C D E p2 end p4 p3 p1 start

Replay can detect problems

PAGE 96

ACD

Problem! missing token
Problem! token left behind

slide-94
SLIDE 94

A B C D E p2 end p4 p3 p1 start

Replay can extract timing information

PAGE 97

Timed trace: A (t=5), B (t=8), C (t=9), D (t=13)

5 8 9 13

3 4 5 4

3 2 6 5 8 7 6 4 7 7 4 3
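One way to read the numbers on this slide: replaying a timed trace tells us how long each token waited in a place, namely the time between the task that produced it and the task that consumed it. The Python sketch below is a simplification under stated assumptions: the wiring of places p1 to p4 (A→p1→B, A→p2→C, B→p3→D, C→p4→D) is my reading of the example net, and `token_waiting_times` is a hypothetical helper that looks up timestamps directly instead of running a full replay.

```python
# Timed trace from the slide: A at time 5, B at 8, C at 9, D at 13.
TIMED_TRACE = [("A", 5), ("B", 8), ("C", 9), ("D", 13)]

# Assumed wiring of the example net: place -> (producing task, consuming task).
# Task E is left out because it does not occur in this trace.
PLACES = {
    "p1": ("A", "B"),
    "p2": ("A", "C"),
    "p3": ("B", "D"),
    "p4": ("C", "D"),
}

def token_waiting_times(timed_trace, places=PLACES):
    """Time each token spends in a place between production and consumption."""
    when = dict(timed_trace)
    return {
        place: when[consumer] - when[producer]
        for place, (producer, consumer) in places.items()
        if producer in when and consumer in when
    }

waits = token_waiting_times(TIMED_TRACE)
print(waits)  # {'p1': 3, 'p2': 4, 'p3': 5, 'p4': 4}
```

Under these assumptions the result agrees with the row "3 4 5 4" on the slide. Repeating this for every trace in the log gives a sample of waiting times per place, from which averages and variances can be computed.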

slide-95
SLIDE 95

Example: Conformance Checker

slide-96
SLIDE 96 PAGE 99

Conformance checker

(Anne Rozinat et al.)

How to quantify this?

slide-97
SLIDE 97 PAGE 100

Fitness by replay

m=missing,r=remaining,c=consumed,p=produced
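The four counters can be collected by replaying a trace on a Petri net. The Python sketch below makes several assumptions: the net wiring is my reading of the running example (A, B, C, D, E with places start, p1 to p4, end), the function `replay` is hypothetical, and the fitness formula f = ½(1 − m/c) + ½(1 − r/p) is the token-based fitness commonly associated with the Conformance Checker. The slide image with the formula is not preserved here, so treat it as an assumption.

```python
from collections import Counter

# Assumed net: transition -> (input places, output places).
NET = {
    "A": (["start"], ["p1", "p2"]),
    "B": (["p1"], ["p3"]),
    "C": (["p2"], ["p4"]),
    "D": (["p3", "p4"], ["end"]),
    "E": (["p1", "p2"], ["p3", "p4"]),
}

def replay(trace, net=NET, source="start", sink="end"):
    """Replay one trace; return (missing, remaining, consumed, produced, fitness)."""
    marking = Counter({source: 1})
    p, c, m = 1, 0, 0              # the environment produces the initial token
    for task in trace:
        inputs, outputs = net[task]
        for place in inputs:
            if marking[place] == 0:
                m += 1             # missing token: created artificially, then consumed
            else:
                marking[place] -= 1
            c += 1
        for place in outputs:
            marking[place] += 1
            p += 1
    # the environment consumes the final token from the sink place
    if marking[sink] == 0:
        m += 1
    else:
        marking[sink] -= 1
    c += 1
    r = sum(marking.values())      # tokens left behind
    fitness = 0.5 * (1 - m / c) + 0.5 * (1 - r / p)
    return m, r, c, p, fitness

print(replay("ABCD"))   # (0, 0, 6, 6, 1.0)
print(replay("ACD")[:4])  # (1, 1, 5, 5)
```

For the trace ACD, task D finds no token in p3 (one missing token) and the token in p1 is never consumed (one remaining token), which gives a fitness of 0.8 under the assumed formula.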

slide-98
SLIDE 98 PAGE 101

No problem (m=0, r=0)

slide-99
SLIDE 99 PAGE 102

Another (impossible) trace

slide-100
SLIDE 100 PAGE 103
slide-101
SLIDE 101 PAGE 104

Fitness calculation

slide-102
SLIDE 102 PAGE 105

Examples

f=1.000 f=0.995 f=0.540

slide-103
SLIDE 103 PAGE 106

Diagnostics

slide-104
SLIDE 104 PAGE 107

Other Metrics

  • Fitness is not sufficient: hence other metrics are needed, such as behavioral and structural appropriateness, etc.
  • These metrics cover aspects such as:
    − Punishing for "too much" behavior.
    − Punishing for "overly complex" models.
slide-105
SLIDE 105

Another Replay Example: Time-based Operational Support (TOS)

slide-106
SLIDE 106

Architecture

PAGE 109
slide-107
SLIDE 107

Step 1: Learn Transition System

PAGE 110

Prefix-set-activity abstraction is used here. Other abstractions possible.

slide-108
SLIDE 108

Step 2: Replay Log for Time Information

PAGE 111

e=elapsed, r=remaining, and s=sojourn.
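Steps 1 to 3 can be sketched together in Python. Under the prefix-set-activity abstraction named on the previous slide, the state of a running case is taken to be the set of activities seen so far; replaying the log then attaches the observed remaining times to each state. The log with timestamps below is hypothetical, as are the names `annotate` and `predict_remaining`; only the remaining time is tracked, and elapsed and sojourn times could be collected the same way.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical timed log: each trace is a list of (activity, timestamp).
TIMED_LOG = [
    [("A", 0), ("B", 3), ("C", 5), ("D", 9)],
    [("A", 0), ("C", 2), ("B", 6), ("D", 11)],
    [("A", 0), ("E", 4), ("D", 6)],
]

def annotate(timed_log):
    """Map each state (set of activities seen so far) to observed remaining times."""
    remaining = defaultdict(list)
    for trace in timed_log:
        end_time = trace[-1][1]
        seen = set()
        for activity, time in trace:
            seen.add(activity)
            remaining[frozenset(seen)].append(end_time - time)
    return remaining

def predict_remaining(remaining, prefix):
    """Average remaining time over all historic visits to the current state."""
    samples = remaining.get(frozenset(prefix))
    return mean(samples) if samples else None

stats = annotate(TIMED_LOG)
print(predict_remaining(stats, ["A", "B", "C"]))  # 4.5
print(predict_remaining(stats, ["A", "C", "B"]))  # 4.5 (same state under set abstraction)
print(predict_remaining(stats, ["A", "F"]))       # None: state never observed
```

The choice of abstraction trades precision against the amount of history per state: a set abstraction merges ABC and ACB into one state, whereas a full-prefix abstraction would keep them apart and leave fewer samples for each.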

slide-109
SLIDE 109

Step 3: Calculate statistics

PAGE 112
slide-110
SLIDE 110

Step 4: Start Operational Support

PAGE 113
slide-111
SLIDE 111

A B C D

known past unknown future current state

A B A B ? ? A B C ?

detect: B does not fit the model (not allowed, too late, etc.)

predict: some prediction is made about the future (e.g., completion date or outcome)

T=10

recommend: based on past experiences, C is recommended (e.g., to minimize costs)

PAGE 114
slide-112
SLIDE 112

Business Intelligence Tools?

slide-113
SLIDE 113 PAGE 116

Business Intelligence Tools?

  • Business Objects (SAP)
  • Cognos Business Intelligence (IBM)
  • Oracle Business Intelligence
  • Hyperion (Oracle)
  • SAS Business Intelligence
  • Microsoft Business Intelligence
  • SAP Business Intelligence (SAP BI)
  • Jaspersoft (Open Source Business Intelligence)
  • Pentaho BI Suite (Open Source)
  • ....
  • Dashboards, reports, scorecards,
  • Slicing and dicing, data mining, ...
slide-114
SLIDE 114 PAGE 117

Process Mining Software

  • ARIS Process Performance Manager
  • Interstage Automated Business Process Discovery & Visualization
  • Process Discovery Focus
  • Futura Reflect
  • Enterprise Visualization Suite
  • Comprehend
  • BPM|one
  • fluxicon/nitro
  • ProcessGold

slide-115
SLIDE 115

Conclusion

slide-116
SLIDE 116

Process Mining !

slide-117
SLIDE 117

More Information

PAGE 120

IEEE Task Force on Process Mining

  • ProM Software: prom.sourceforge.net
  • Process mining: www.processmining.org
  • ProM 5 series nightly builds: prom.win.tue.nl/tools/prom/nightly5/
  • ProM 6 series nightly builds: prom.win.tue.nl/tools/prom/nightly/
  • Converting logs (MXML-based) promimport.sourceforge.net
  • XES: www.xes-standard.org and www.openxes.org
  • Papers et al.: vdaalst.com
  • IEEE Task Force on Process Mining: www.win.tue.nl/ieeetfpm/