PAGE 1 How did PM tooling develop Three key over time? When did - - PowerPoint PPT Presentation



slide-1
SLIDE 1
slide-2
SLIDE 2

PAGE 1

slide-3
SLIDE 3

PAGE 2

  • How about data mining and business process management?
  • When did process mining start?
  • What are the main PM developments in this century?
  • How did PM tooling develop over time?
  • Three key observations
  • Why is process discovery so difficult?
  • What are the main research challenges?
  • Conclusion

slide-4
SLIDE 4

PAGE 3


slide-5
SLIDE 5

Positioning Process Mining

[Figure: process mining positioned between Data Mining (DM) and Business Process Management (BPM).]

  • Data Mining (DM): clustering, classification, rule discovery, etc.
  • Business Process Management (BPM): process analysis/modeling, enactment, verification, etc.
  • performance-oriented questions, problems and solutions
  • compliance-oriented questions, problems and solutions

slide-6
SLIDE 6

History and Origins of BPM

[Figure: timeline with the years 1960, 1975, 1985, and 2000. The application is joined over time by a database system, a user interface, and finally a BPM system.]

Related fields shown around WFM and BPM: office automation, data modeling, operations management, scientific management, business intelligence, software engineering, formal methods, business process reengineering.

  • Skip Ellis, Office Talk, 1979
  • Michael Zisman, SCOOP, 1977
  • Anatol Holt, Information Systems Theory Project, 1968
  • Carl Adam Petri, Petri nets, 1962

slide-7
SLIDE 7

History and Origins of Data Mining

PAGE 6

  • Classical statistics (since 500 BC): descriptive statistics (e.g., sample mean) and statistical inference (e.g., confidence interval, regression, hypothesis testing).
  • Artificial intelligence (since 1950): making intelligent machines by applying human-thought-like processing to statistical problems.
  • Machine learning (since 1950): construction and study of systems that can learn from data.

slide-8
SLIDE 8

Data Mining: Supervised Learning

  • Labeled data, i.e., there is a response variable that labels each instance.
  • Goal: explain response variable (dependent variable) in terms of predictor variables (independent variables).
  • Classification techniques (e.g., decision tree learning) assume a categorical response variable and the goal is to classify instances based on the predictor variables.
  • Regression techniques assume a numerical response variable. The goal is to find a function that fits the data with the least error.
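The regression bullet above can be illustrated with a minimal sketch of ordinary least squares for a single predictor. The function name `fit_line` and the data (`hours`, `grade`) are made up for illustration and do not come from the slides.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x for one predictor variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope that minimizes the squared error
    a = y_bar - b * x_bar  # intercept: the line passes through the means
    return a, b


# Hypothetical data: predictor = study hours, response = grade.
hours = [1, 2, 3, 4, 5]
grade = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope = fit_line(hours, grade)
print(intercept, slope)
```

Because the made-up data lie exactly on a line, the fitted function reproduces them without error; with noisy data the same formulas return the line with the smallest sum of squared residuals.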

PAGE 7

slide-9
SLIDE 9

Example: Decision tree learning

PAGE 8

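[Figure: decision tree predicting the outcome (failed, passed, cum laude) from grades for logic, linear algebra, programming, and operations research. Splits use thresholds such as ≥ 8, ≥ 7, and ≥ 6; leaves are labeled failed (79/10), passed (31/7), failed (101/8), cum laude (20/2), passed (82/7), passed (87/11), and failed (20/4).]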

slide-10
SLIDE 10

Unsupervised Learning

  • Unsupervised learning assumes unlabeled data, i.e., the variables are not split into response and predictor variables.
  • Examples: clustering (e.g., k-means clustering and agglomerative hierarchical clustering) and pattern discovery (association rules)
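As a small illustration of pattern discovery, the sketch below computes the two standard measures used to rank an association rule X ⇒ Y: support and confidence. The helper names and the `baskets` data are invented for this example; it does not implement a full miner such as Apriori.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Estimated probability of rhs given lhs: support(lhs ∪ rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)


# Hypothetical, unlabeled transaction data (no response variable).
baskets = [
    {"tea", "muffin"},
    {"tea", "muffin", "latte"},
    {"latte"},
    {"tea"},
]

rule_support = support({"tea", "muffin"}, baskets)
rule_confidence = confidence({"tea"}, {"muffin"}, baskets)
print(rule_support, rule_confidence)
```

Here the rule tea ⇒ muffin holds in two of the four baskets, and in two of the three baskets that contain tea.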

PAGE 9

slide-11
SLIDE 11

Example: Association rules

PAGE 10

slide-12
SLIDE 12

Example: Episode Mining

PAGE 11

[Figure: three episodes E1 (over a, b, c, d), E2 (over b, c), and E3 (over a, b, c, d), and an event sequence on a time axis from 10 to 37: a c b d e c b b c f a e e b c d c b. Occurrences of E1, E2 (16x), and E3 are marked on the sequence.]

slide-13
SLIDE 13

PAGE 12


slide-14
SLIDE 14

Language identification in the limit (E. Mark Gold 1967)

  • Mother uses sentences from some language {aab, ab, ab, abc, …}.
  • "Perfect child" listens to mother and hypothesizes what the full language is like (given all sentences so far).
  • Eventually the perfect child’s hypothesis is correct and never changes again (although the child does not know when this point is reached), i.e., only finitely many wrong hypotheses are generated.
  • A language is learnable in the limit if such a perfect child exists.

PAGE 13

E. Mark Gold. Language identification in the limit. Information and Control, 10(5):447–474, 1967.

slide-15
SLIDE 15

Language identification in the limit (E. Mark Gold 1967)

  • Gold showed that most languages cannot be learned in the limit (including the most simple ones like regular languages, e.g., ab*(c|d)).
  • He noted that it matters whether the child gets positive and negative examples (corrections), whether the mother is evil, etc.

  • Frequencies matter!
  • Representational bias matters!

PAGE 14

sentence ≅ trace in event log
language ≅ process model

slide-16
SLIDE 16

Myhill-Nerode Theorem (1958) and the Biermann/Feldman Algorithm (1972)

  • There is a unique minimal deterministic finite automaton recognizing a regular language L (shown by John Myhill and Anil Nerode in 1958).
  • The equivalence classes defined by ≅ determine the states of the automaton: x ≅ y if there is no z such that exactly one of xz and yz is in L.
  • Cannot be applied directly to example traces: overfitting and no generalization.
  • Alan W. Biermann and Jerome A. Feldman proposed techniques in 1972 to learn finite state machines from examples (e.g., considering k-tails).
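The k-tails idea can be sketched as follows. This is a simplified variant written for illustration, not the exact procedure from the 1972 paper: every prefix seen in the sample is a candidate state, its k-tail is the set of observed continuations truncated to k symbols, and prefixes with identical k-tails are merged into one state. The function name `k_tails` and the sample traces are assumptions.

```python
from collections import defaultdict


def k_tails(traces, k):
    """Group the prefixes of the sample traces by their k-tails."""
    tails = defaultdict(set)
    for trace in traces:
        for i in range(len(trace) + 1):
            prefix, suffix = trace[:i], trace[i:]
            # Record what follows this prefix, cut off after k symbols.
            tails[prefix].add(suffix[:k])
    # Prefixes with the same k-tail set are treated as the same state.
    states = defaultdict(set)
    for prefix, tail_set in tails.items():
        states[frozenset(tail_set)].add(prefix)
    return list(states.values())


# Hypothetical sample from the language ab*c (only three traces observed).
sample = ["abc", "abbc", "abbbc"]

coarse = k_tails(sample, 1)  # short look-ahead: more merging, more generalization
fine = k_tails(sample, 2)    # longer look-ahead: less merging, closer to the sample
print(len(coarse), len(fine))
```

With k = 1 the prefixes "ab" and "abb" are merged, which introduces a loop on b and generalizes beyond the sample. With k = 2 they stay separate, which is closer to the overfitting behavior of applying the Myhill-Nerode construction to the sample itself.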

PAGE 15

A. Nerode. Linear automaton transformations. Proc. Amer. Math. Soc., 9:541–544, 1958.

A. W. Biermann and J. A. Feldman. On the synthesis of finite-state machines from samples of their behaviour. IEEE Transactions on Computers, 21:592–597, 1972.

slide-17
SLIDE 17
slide-18
SLIDE 18

Where/when did process mining start?

  • Myhill/Nerode (1958)?
  • Gold (1967)?
  • Baum/Welch (1970)?
  • Biermann/Feldman (1972)?
  • Rakesh Agrawal (1994)?
    − Apriori algorithm for frequent patterns, later extended to sequences, episodes, …
  • Jonathan Cook and Alexander Wolf (1998)?
    − "Discovering Models of Software Processes from Event-Based Data"
    − using techniques similar to Biermann/Feldman (k-tails) and Baum/Welch (Markov models)
  • Rakesh Agrawal, Dimitrios Gunopulos, Frank Leymann?
    − "Mining Process Models from Workflow Logs" (1998)
    − Flowmark process models without discovering type of splits and joins, no loops, etc.
  • Anindya Datta (1998)?
    − "Automating the Discovery of AS-IS Business Process Models"
    − Biermann/Feldman style work, embedded in BPM

PAGE 17

slide-19
SLIDE 19
slide-20
SLIDE 20

Initial team

PAGE 19

slide-21
SLIDE 21

PAGE 20


slide-22
SLIDE 22

Workflow Mining

PAGE 21

[Figure: life-cycle with the phases diagnosis/requirements, (re)design, configuration/implementation, enactment/monitoring, and adjustment. Models and data are used for insight, discussion, verification, performance analysis, animation, specification, documentation, and configuration.]

slide-23
SLIDE 23

Models, data, and systems coexist

PAGE 22

[Figure: cycle of (re)design, implement/configure, and run & adjust, supported by model-based analysis and data-based analysis.]

slide-24
SLIDE 24
slide-25
SLIDE 25

Team in November 2007

PAGE 24

Some people are missing, e.g., Peter van den Brand.

slide-26
SLIDE 26

Current process mining spectrum

(including alignments, operational support, and multiple perspectives)

PAGE 25

[Figure: the process mining framework.]

  • “world”: people, machines, organizations, business processes, documents
  • information system(s), with provenance recorded as event logs: current data (“pre mortem”) and historic data (“post mortem”)
  • Models: de jure models and de facto models, each covering control-flow, data/rules, and resources/organization
  • cartography: discover, enhance, diagnose
  • auditing: detect, check, compare, promote
  • navigation: explore, predict, recommend

slide-27
SLIDE 27

PAGE 26


slide-28
SLIDE 28

Pre-ProM

(figure from March 2002!)

PAGE 27

[Figure: tool landscape before ProM.]

  • workflow management systems: Staffware, InConcert, MQ Series
  • case handling / CRM systems: FLOWer, Vectus, Siebel
  • ERP systems: SAP R/3, BaaN, Peoplesoft
  • common XML format for storing workflow logs (predecessor of MXML format)
  • mining tools:
    − EMiT: alpha algorithm including time analysis (BvD)
    − Little Thumb: predecessor of ProM's heuristic miner (TW)
    − InWoLvE: mining with duplicate tasks (Joachim Herbst)
    − Process Miner: mining block structured models (Guido Schimm)
    − ExperDiTo: evaluation tool (Laura Maruster)
  • Tobias Blickle (ARIS PPM)

The first tool to support the alpha algorithm for process mining was the MiMo (Mining Module) tool based on ExSpect. Later it was implemented in EMiT and ProM.

slide-29
SLIDE 29

PAGE 28

[Screenshots: EMiT, MiMo, Little Thumb, Process Miner]

slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32
slide-33
SLIDE 33
slide-34
SLIDE 34
slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38

PAGE 37


slide-39
SLIDE 39

How good is my model: Four forces

PAGE 38

  • fitness: ability to explain observed behavior
  • precision: avoiding underfitting
  • generalization: avoiding overfitting
  • simplicity: Occam’s Razor

[Figure: "Process Mining" balanced by the four forces, by analogy with lift, gravity, thrust, and drag.]

Leaving out one of these dimensions during discovery will lead to degenerate cases!
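The tension between two of these forces can be shown with a toy calculation. The measures below are deliberately naive trace-level ratios chosen for illustration, not the fitness and precision metrics used in the process mining literature, and a model is assumed to be given simply as a finite set of allowed traces. All names and data are hypothetical.

```python
def fitness(log, model_traces):
    """Fraction of observed traces that the model allows."""
    return sum(trace in model_traces for trace in log) / len(log)


def precision(log, model_traces):
    """Fraction of the traces allowed by the model that were actually observed."""
    return len(model_traces & set(log)) / len(model_traces)


# Hypothetical event log: each string is one trace, each letter one activity.
log = ["abcd", "acbd", "abcd", "aed"]

strict_model = {"abcd", "acbd"}
loose_model = {"abcd", "acbd", "aed", "abd", "acd", "ad"}

strict_scores = (fitness(log, strict_model), precision(log, strict_model))
loose_scores = (fitness(log, loose_model), precision(log, loose_model))
print(strict_scores, loose_scores)
```

The strict model is precise but cannot explain the trace "aed"; the loose model explains every trace but also allows behavior that was never seen. Optimizing only one dimension pushes discovery toward one of these degenerate extremes.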

slide-40
SLIDE 40
slide-41
SLIDE 41

PAGE 40

  • formal (not just a picture)
  • fast (should not take years)
  • sound (result should at least be free of deadlocks, etc.)
  • ability to balance all conformance dimensions (fitness, precision, generalization, and simplicity) incl. noise
  • provide guarantees (not just a best effort)

slide-42
SLIDE 42

PAGE 41


slide-43
SLIDE 43

PAGE 42

  • conformance checking to diagnose deviations
  • squeezing reality into the model to do model-based analysis

#1 Alignments are essential!
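A minimal sketch of the alignment idea, under strong simplifying assumptions: the model is given as a small, explicitly enumerated set of complete runs, and every log move or model move costs 1 while synchronous moves are free. Real alignment techniques search the state space of the model instead of enumerating runs. The function names and example data are hypothetical.

```python
def alignment_cost(trace, run):
    """Minimal number of log moves plus model moves needed to align trace with run."""
    n, m = len(trace), len(run)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i  # only log moves: events with no counterpart in the run
    for j in range(m + 1):
        cost[0][j] = j  # only model moves: steps of the run that were not observed
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + 1
            if trace[i - 1] == run[j - 1]:
                # Synchronous move: log and model agree, no cost added.
                best = min(best, cost[i - 1][j - 1])
            cost[i][j] = best
    return cost[n][m]


def best_alignment(trace, model_runs):
    """Return (cost, run) for the model run closest to the observed trace."""
    return min((alignment_cost(trace, run), run) for run in model_runs)


# Hypothetical model with two complete runs, and two observed traces.
model_runs = {"abcd", "acbd"}
skipped_c = best_alignment("abd", model_runs)
deviating = best_alignment("aed", model_runs)
print(skipped_c, deviating)
```

The returned run is the "squeezed" version of the observed trace: deviations show up as non-zero cost (conformance checking), and the chosen run can be used for model-based analysis. Ties between runs are broken alphabetically here, which is an arbitrary choice.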

slide-44
SLIDE 44

PAGE 43

#2 Models are like the glasses required to see and understand event data!

slide-45
SLIDE 45
slide-46
SLIDE 46

PAGE 45


slide-47
SLIDE 47

PAGE 46

Finding sheep with five legs

we are getting close…

slide-48
SLIDE 48

PAGE 47

Distributing process mining problems to cope with big data

slide-49
SLIDE 49

PAGE 48

On-the-fly process mining

Operational support

slide-50
SLIDE 50

Concept drift

PAGE 49


slide-51
SLIDE 51

Cross-organizational mining

PAGE 50

cross-organizational / comparative process mining

slide-52
SLIDE 52

PAGE 51

context-aware process mining

slide-53
SLIDE 53

PAGE 52

Supporting the process of process mining

slide-54
SLIDE 54

PAGE 53


slide-55
SLIDE 55
slide-56
SLIDE 56