Marlon Dumas University of Tartu, Estonia Estonian Theory Days | - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Marlon Dumas University of Tartu, Estonia

Estonian Theory Days | 3-4 October 2015

slide-2
SLIDE 2

Process Mining

ü/û

event ¡log discovered ¡model Discovery Conformance Deviance Difference diagnos7cs Performance input ¡model Enhanced ¡model event ¡log’

2

slide-3
SLIDE 3

Automated Process Discovery

[Figure: discovered process model with tasks Enter Loan Application, Retrieve Applicant Data, Compute Installments, Approve Simple Application, Approve Complex Application, Notify Rejection, Notify Eligibility]

CID    Task                        Time Stamp           …
13219  Enter Loan Application      2007-11-09T11:20:10  -
13219  Retrieve Applicant Data     2007-11-09T11:22:15  -
13220  Enter Loan Application      2007-11-09T11:22:40  -
13219  Compute Installments        2007-11-09T11:22:45  -
13219  Notify Eligibility          2007-11-09T11:23:00  -
13219  Approve Simple Application  2007-11-09T11:24:30  -
13220  Compute Installments        2007-11-09T11:24:35  -
…      …                           …                    …
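As a minimal sketch of the first step of discovery, the event log can be grouped into one trace per case. The rows are copied from the table on this slide; the names `EVENTS`, `traces_by_case` and `TRACES` are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict

# Rows copied from the event log on this slide: (case id, task, timestamp).
EVENTS = [
    ("13219", "Enter Loan Application", "2007-11-09T11:20:10"),
    ("13219", "Retrieve Applicant Data", "2007-11-09T11:22:15"),
    ("13220", "Enter Loan Application", "2007-11-09T11:22:40"),
    ("13219", "Compute Installments", "2007-11-09T11:22:45"),
    ("13219", "Notify Eligibility", "2007-11-09T11:23:00"),
    ("13219", "Approve Simple Application", "2007-11-09T11:24:30"),
    ("13220", "Compute Installments", "2007-11-09T11:24:35"),
]


def traces_by_case(events):
    """Group events by case id and order each group by timestamp."""
    grouped = defaultdict(list)
    for case_id, task, stamp in events:
        grouped[case_id].append((stamp, task))
    # ISO timestamps sort correctly as plain strings.
    return {cid: [task for _, task in sorted(rows)] for cid, rows in grouped.items()}


TRACES = traces_by_case(EVENTS)
```

Each value in `TRACES` is the ordered task sequence of one case, which is the input that relation-based discovery methods work on.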

slide-4
SLIDE 4

Automated Process Discovery

  • Relations-based

– Alpha


slide-5
SLIDE 5

Alpha Algorithm

  • Direct successors:

A > B, B > C, C > D, A > C, C > B, B > E, E > F, C > E, E > G, B > D

Traces: ABCD, ACBEF

  • Causality:

A → B, C → D, A → C, B → E, C → E, E → F, E → G , B → D

  • Concurrency:

B ║ C

  • Exclusiveness: all other pairs

Traces: ABCEG, ACBD
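The relations on this slide can be derived mechanically from the traces. The sketch below assumes the log consists of the four traces ABCD, ACBEF, ABCEG and ACBD (read off the slide's trace figure); the function and variable names are illustrative.

```python
# Assumed log: the four traces shown on this slide, one string per trace.
LOG = ["ABCD", "ACBEF", "ABCEG", "ACBD"]


def alpha_relations(log):
    """Return (direct successors, causality, concurrency) as sets of pairs."""
    # x > y: x is directly followed by y in at least one trace.
    succ = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    # x -> y: x > y holds but y > x does not.
    causal = {(a, b) for a, b in succ if (b, a) not in succ}
    # x || y: both x > y and y > x hold.
    parallel = {(a, b) for a, b in succ if (b, a) in succ}
    return succ, causal, parallel


SUCC, CAUSAL, PARALLEL = alpha_relations(LOG)
```

Every pair of activities that ends up in neither `CAUSAL` (in either direction) nor `PARALLEL` is exclusive (#).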

slide-6
SLIDE 6

Alpha Relations Matrix

     A    B    C    D    E    F    G
A    #    →    →    #    #    #    #
B    ←    #    ||   →    →    #    #
C    ←    ||   #    →    →    #    #
D    #    ←    ←    #    #    #    #
E    #    ←    ←    #    #    →    →
F    #    #    #    #    ←    #    #
G    #    #    #    #    ←    #    #
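A minimal sketch of how such a matrix can be filled in from the causality and concurrency relations of the previous slide. The relation sets are copied from that slide; `footprint` and `MATRIX` are illustrative names.

```python
ACTIVITIES = "ABCDEFG"
# Causality and concurrency as listed on the Alpha Algorithm slide.
CAUSAL = {("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"),
          ("C", "D"), ("C", "E"), ("E", "F"), ("E", "G")}
PARALLEL = {("B", "C"), ("C", "B")}


def footprint(activities, causal, parallel):
    """Map each ordered pair of activities to '→', '←', '||' or '#'."""
    cell = {}
    for x in activities:
        for y in activities:
            if (x, y) in causal:
                cell[x, y] = "→"
            elif (y, x) in causal:
                cell[x, y] = "←"
            elif (x, y) in parallel:
                cell[x, y] = "||"
            else:
                cell[x, y] = "#"  # exclusiveness: all other pairs
    return cell


MATRIX = footprint(ACTIVITIES, CAUSAL, PARALLEL)
for x in ACTIVITIES:
    print(x, " ".join(MATRIX[x, y] for y in ACTIVITIES))
```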

slide-7
SLIDE 7

Alpha Algorithm – Patterns

     A    B    C    D    E    F    G
A    #    →    →    #    #    #    #
B    ←    #    ||   →    →    #    #
C    ←    ||   #    →    →    #    #
D    #    ←    ←    #    #    #    #
E    #    ←    ←    #    #    →    →
F    #    #    #    #    ←    #    #
G    #    #    #    #    ←    #    #

a → b, a → c, b ║ c
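In the Alpha algorithm, the pattern a → b, a → c, b ║ c is read as a parallel (AND) split after a, while the same pattern with b # c is read as an exclusive (XOR) split. A minimal sketch of that lookup, using the relations from the matrix; `split_type` is a hypothetical helper, not a function from any library.

```python
CAUSAL = {("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"),
          ("C", "D"), ("C", "E"), ("E", "F"), ("E", "G")}
PARALLEL = {("B", "C"), ("C", "B")}


def split_type(a, b, c, causal=CAUSAL, parallel=PARALLEL):
    """Classify the split after a towards b and c, or return None if no pattern applies."""
    if (a, b) not in causal or (a, c) not in causal:
        return None  # a does not cause both b and c
    if (b, c) in parallel:
        return "AND-split"  # a → b, a → c, b ║ c
    if (b, c) in causal or (c, b) in causal:
        return None  # b and c are ordered, so this is not a split
    return "XOR-split"  # a → b, a → c, b # c
```

For the matrix on this slide, `split_type("A", "B", "C")` gives an AND-split and `split_type("E", "F", "G")` gives an XOR-split.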

slide-8
SLIDE 8

Automated Process Discovery

  • Relations-based

– Alpha: lossy (Badouel 2012)
– Alpha++, Alpha#, Alpha$
– Heuristics miner (frequency information)

  • Genetic
  • Region theory
  • Petri net synthesis
  • Integer Linear Programming (ILP)


slide-9
SLIDE 9

Automated Process Discovery

Automated process discovery method:

  • Simplicity: minimal size & structural complexity
  • Precision: does not parse traces not in the log
  • Fitness: parses the traces of the log
  • Generalization: parses traces of the process not included in the log
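Fitness and precision can be illustrated with a toy, trace-level sketch in which the model is assumed to be given as the finite set of traces it can parse. This is a simplification for illustration only, not one of the published metrics; the log, the model and the function names are made up.

```python
def fitness(log, model_traces):
    """Share of log traces that the model can parse."""
    return sum(t in model_traces for t in log) / len(log)


def precision(log, model_traces):
    """Share of model traces that were actually observed in the log."""
    observed = set(log)
    return sum(t in observed for t in model_traces) / len(model_traces)


# Hypothetical log (one string per case) and hypothetical model language.
LOG = ["ABCD", "ACBD", "ABCD", "ABD"]
MODEL = {"ABCD", "ACBD", "ACD", "AD"}

FITNESS = fitness(LOG, MODEL)      # 3 of the 4 log traces are parsed
PRECISION = precision(LOG, MODEL)  # 2 of the 4 model traces occur in the log
```

Generalization cannot be computed this way, because it concerns traces of the process that are not in the log.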

slide-10
SLIDE 10

Conformance Checking

?


slide-11
SLIDE 11

Alignment-Based Conformance Check

[Figure: a log trace aligned with a run of the model; labels: Log, Model, Alignment]

Fitness: How much behavior of the log is captured by the model?
Precision: How accurately does the model describe the log?

Munoz-Gama et al. Petri Nets 2013
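A minimal sketch of what an alignment is, assuming the model is represented by a single candidate run and that every log-only or model-only move costs 1 while synchronous moves cost 0. Real alignment techniques search over all runs of the model; the traces, the `SKIP` symbol and the function name are illustrative assumptions.

```python
SKIP = ">>"  # marks the side that does not move


def align(log_trace, model_trace):
    """Return (cost, moves) of a cheapest alignment between two sequences."""
    n, m = len(log_trace), len(model_trace)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i  # only log moves
    for j in range(m + 1):
        cost[0][j] = j  # only model moves
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + 1
            if log_trace[i - 1] == model_trace[j - 1]:
                best = min(best, cost[i - 1][j - 1])  # synchronous move
            cost[i][j] = best
    # Walk back from the end to recover one optimal sequence of moves.
    moves, i, j = [], n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and log_trace[i - 1] == model_trace[j - 1]
        if same and cost[i][j] == cost[i - 1][j - 1]:
            moves.append((log_trace[i - 1], model_trace[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            moves.append((log_trace[i - 1], SKIP))  # move on log only
            i -= 1
        else:
            moves.append((SKIP, model_trace[j - 1]))  # move on model only
            j -= 1
    moves.reverse()
    return cost[n][m], moves


# Illustrative traces, not necessarily the ones drawn on the slide.
COST, MOVES = align("ABBC", "ABCDE")
```

The non-synchronous moves in `MOVES` are the deviations between the log trace and the model run.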

slide-12
SLIDE 12

Imprecision of Alignment-Based Conformance Checking

  • {ABCD, ACBD} → 100%
  • {ABD, ACBCD} → 100%
  • {ACD, ABCBD} → 100%

slide-13
SLIDE 13

Deviance Mining

T1 <e11[d111:v111, …, d11n:v11n] e12[d121:v121, …, d12m:v12m] … e1p[d1p1:v1p1, …, d1pm:v1pm]>
…
Tq <eq1[dq11:vq11, …, dq1n:vq1n] eq2[dq21:vq21, …, dq2m:vq2m] … eqp[dqp1:vqp1, …, dqpm:vqpm]>

Find a function F: Trace → Boolean (or probability [0…1]) s.t.

  • F is an accurate approximation of the given labeling
  • F is explainable, e.g. set of simple predicates
slide-14
SLIDE 14

Deviance Mining via Sequence Classification

  • Apply discriminative sequence mining methods to extract features characteristic of one class
  • Build classification models (e.g. decision trees)
  • Extract difference diagnostics from classification model

C. Sun et al. Mining explicit rules for software process evaluation. ICSSP’2013.
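As a rough illustration of the idea, the sketch below uses activity counts as features and searches for the single rule "count(activity) > threshold means deviant" with the highest accuracy. It is a stand-in for a one-level decision tree, not the method of Sun et al.; the traces, labels and names are made up.

```python
from collections import Counter


def best_count_rule(traces, labels):
    """Return (accuracy, activity, threshold) of the best single count rule."""
    best = (0.0, None, None)
    activities = sorted({a for t in traces for a in t})
    for act in activities:
        counts = [Counter(t)[act] for t in traces]
        for thr in sorted(set(counts)):
            # Rule under test: a trace is deviant if it contains act more than thr times.
            preds = [c > thr for c in counts]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[0]:
                best = (acc, act, thr)
    return best


# Hypothetical labeled log: True marks a deviant case.
TRACES = ["ABCD", "ABBCD", "ABBBCD", "ACD"]
LABELS = [False, True, True, False]

RULE = best_count_rule(TRACES, LABELS)
```

Here the best rule is "B occurs more than once", which is the kind of simple, explainable predicate the previous slide asks for.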

slide-15
SLIDE 15

No Unified Foundation

Automated process discovery

  • Behavioral relations, theory of regions, ILP, …

Conformance checking

  • Replay, alignments

Deviance mining

  • Model delta analysis, sequence classification


slide-16
SLIDE 16

(Prime) Event Structures

  • Model of concurrency based on events (occurrences of actions) and three relations

– Causality
– Conflict
– Concurrency

[Figure: two example event structures over events labeled A to E]
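A minimal sketch of a prime event structure as a data structure: causality is closed transitively, conflict is inherited by causal successors, and two distinct events are concurrent when they are neither ordered nor in conflict. The class and the small example are illustrative assumptions, not the structures drawn on the slide.

```python
class PrimeEventStructure:
    """Labeled events with transitive causality and inherited conflict."""

    def __init__(self, labels, direct_causes, direct_conflicts):
        self.labels = dict(labels)
        # before[f] = set of events that causally precede f.
        self.before = {e: set() for e in self.labels}
        for e, f in direct_causes:
            self.before[f].add(e)
        changed = True
        while changed:  # transitive closure of causality
            changed = False
            for f in self.labels:
                inherited = set()
                for e in self.before[f]:
                    inherited |= self.before[e]
                if not inherited <= self.before[f]:
                    self.before[f] |= inherited
                    changed = True
        self.conflict = set()
        for e, f in direct_conflicts:  # conflict inheritance
            for e2 in self.up(e):
                for f2 in self.up(f):
                    self.conflict |= {(e2, f2), (f2, e2)}

    def up(self, e):
        """The event e together with every event it causally precedes."""
        return {e} | {f for f in self.labels if e in self.before[f]}

    def relation(self, e, f):
        """Relation between two distinct events."""
        if e in self.before[f] or f in self.before[e]:
            return "causality"
        if (e, f) in self.conflict:
            return "conflict"
        return "concurrency"


# Hypothetical example: A first, then B or C (exclusive), D in parallel, E after B and D.
ES = PrimeEventStructure(
    {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"},
    direct_causes=[(1, 2), (1, 3), (1, 4), (2, 5), (4, 5)],
    direct_conflicts=[(2, 3)],
)
```

In this example event 5 (E) inherits the conflict with event 3 (C) from its cause, event 2 (B).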

slide-17
SLIDE 17

Petri Nets → Event Structures

[Figure: Petri net and corresponding event structure; labels τ, a, b, c, d]

slide-18
SLIDE 18

Nets With Cycles → Prefix Unfolding

[Figure panels: Petri net N; complete prefix unfolding; causality-preserving prefix unfolding]

slide-19
SLIDE 19

Comparison of Event Structures

[Figure: event structures ES1 and ES2 and their Partially Synchronized Product (PSP)]

PSP states: ({},{}), ({A},{A}), ({A,B},{A,B}), ({A,B,D},{A,B,D}), ({A,B,C,D},{A,B,D}), ({A,B,C,D,E},{A,B,D,E})
PSP operations: match(A), match(B), match(D), left_hide(C), match(D)

In ES1, tasks C and B are mutually exclusive, while in ES2, B precedes C

Armas-Cervantes et al. Behavioral Comparison of Process Models Based on […] Event Structures. BPM’2014
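The PSP states on this slide are pairs of configurations, one per event structure, and each operation extends one or both sides. The sketch below only replays a given sequence of operations; it does not search for the best sequence as the PSP construction does. Two assumptions are made: the final step is read as match(E), because the last state pair adds E on both sides, and `right_hide` is included as the symmetric counterpart of `left_hide`.

```python
def step(state, op, label):
    """Apply one PSP operation to a pair (left configuration, right configuration)."""
    left, right = state
    if op == "match":
        return left | {label}, right | {label}  # both sides advance
    if op == "left_hide":
        return left | {label}, right  # only the left side advances
    if op == "right_hide":
        return left, right | {label}  # only the right side advances
    raise ValueError(f"unknown operation: {op}")


def replay(operations):
    """Return the list of states visited, starting from the empty pair."""
    state = (frozenset(), frozenset())
    visited = [state]
    for op, label in operations:
        state = step(state, op, label)
        visited.append(state)
    return visited


# Operation sequence following the states listed on the slide (last step assumed).
OPS = [("match", "A"), ("match", "B"), ("match", "D"), ("left_hide", "C"), ("match", "E")]
STATES = replay(OPS)
```

Each hide operation marks a behavioral difference; here `left_hide(C)` records that C occurs on the left side without a counterpart on the right.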

slide-20
SLIDE 20

Event Structures for Process Mining

Deviance mining Conformance checking Process discovery


slide-21
SLIDE 21

Event Logs → Event Structures

[Figure: Concurrency Oracle (B || C) and Run Merger turning an event log into an event structure; numeric labels 5, 2, 3]
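A rough sketch of the first half of this pipeline, under two assumptions: the concurrency oracle is given as a set of unordered label pairs, and each label occurs at most once per trace. Each trace becomes a run (a partial order), and identical runs are counted together. This only groups equal runs; it is not the Run Merger of the talk, and all names are illustrative.

```python
from collections import Counter


def trace_to_run(trace, concurrent):
    """Partial order of a trace: x before y unless the oracle says x || y."""
    order = {(a, b) for i, a in enumerate(trace) for b in trace[i + 1:]
             if frozenset((a, b)) not in concurrent}
    changed = True
    while changed:  # transitive closure
        changed = False
        for a, b in list(order):
            for c, d in list(order):
                if b == c and (a, d) not in order:
                    order.add((a, d))
                    changed = True
    return frozenset(order)


def count_runs(log, concurrent):
    """Count how many traces of the log collapse into each run."""
    return Counter(trace_to_run(t, concurrent) for t in log)


# Hypothetical log and oracle output.
LOG = ["ABCD", "ACBD", "ABCD", "ABD"]
CONCURRENT = {frozenset("BC")}  # B || C

RUNS = count_runs(LOG, CONCURRENT)
```

With B || C, the traces ABCD and ACBD are two interleavings of the same run, so the four traces yield only two distinct runs.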

slide-22
SLIDE 22

Event Structures for Log Delta Analysis

van Beest et al. Log delta analysis: Interpretable differencing of business process event logs. BPM’2015

slide-23
SLIDE 23

Event Structures for Log Delta Analysis

In L1, task C can be skipped after B, whereas in L2 it cannot

van Beest et al. Log delta analysis: Interpretable differencing of business process event logs. BPM’2015

slide-24
SLIDE 24

Log Delta Analysis vs. Sequence Classification

Logs: 448 cases, 7329 events; 363 cases, 7496 events

Sequence classification: 106-130 statements

IF |“Nursing Progress Notes”| > 7.5 THEN L1
IF |“Nursing Progress Notes”| ≤ 7.5 AND |“Nursing Assessment”| > 1.5 THEN L2
…

Log delta analysis: 48 statements

In L1, “Nursing Primary Assessment” is repeated after “Medical Assign Start” and “Triage Request”, while in L2 it is not.
…

van Beest et al. Log delta analysis: Interpretable differencing of business process event logs. BPM’2015

slide-25
SLIDE 25

Event Structures for Conformance Checking

Receive application Check credit history Assess loan risk Appraise property Assess eligibility

A B C D E

A B D E C

Receive application Appraise property Assess eligibility Check credit history Assess loan risk

A B C D E

A C B D E E

ABDE, ADBE, ACDE, ADCE

slide-26
SLIDE 26

Event Structures for Process Discovery?

ABDE, ACDE, ACDF

Fold, Merge, Synth.

slide-27
SLIDE 27

Process Mining Reloaded
