Accurate and Transparent Path Prediction Using Process Mining


Gaël BERNARD (University of Lausanne, Faculty of Business and Economics) and Periklis ANDRITSOS (University of Toronto, Faculty of Information). 8-11 September 2019, Bled, Slovenia.


slide-1
SLIDE 1

Accurate and Transparent Path Prediction Using Process Mining

Gaël BERNARD, University of Lausanne, Faculty of Business and Economics (HEC), Switzerland
Periklis ANDRITSOS, University of Toronto, Faculty of Information, Canada
8-11 September 2019, Bled, Slovenia

slide-2
SLIDE 2

Trace: Event: A, Event: B, Event: C, Event: D, Event: E

The trace is split into a Prefix and a Suffix.
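The prefix/suffix split on this slide can be illustrated with a minimal sketch. The function name `split_trace` and the cut position used below are assumptions for illustration; the slide does not say where the trace is cut.

```python
def split_trace(trace, prefix_length):
    """Split a trace into the observed prefix and the suffix to predict."""
    if not 0 <= prefix_length <= len(trace):
        raise ValueError("prefix_length must be between 0 and len(trace)")
    return trace[:prefix_length], trace[prefix_length:]


trace = ["A", "B", "C", "D", "E"]
# Hypothetical cut after two events; any other position works the same way.
prefix, suffix = split_trace(trace, 2)
print("prefix:", prefix)
print("suffix:", suffix)
```

Path prediction then means producing the suffix when only the prefix has been observed.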

slide-3
SLIDE 3

Use Case: Call Center

slide-4
SLIDE 4

Use Case: Healthcare

slide-5
SLIDE 5

Use Case: Online Retail

slide-6
SLIDE 6


Definitions

slide-7
SLIDE 7

Input: Events, Trace, Event logs

Example trace:

  • 09/09/2019 - 16:35:37: Open the ticket
  • 09/09/2019 - 16:37:39: Transfer the ticket
  • 20/09/2019 - 13:12:31: Update the information
  • 21/09/2019 - 09:14:32: Inform the customer
  • 21/09/2019 - 09:14:32: Close the ticket
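A minimal sketch of how timestamped events could be grouped into traces and collected into an event log. The case identifier `ticket-1` and the function name `build_event_log` are assumptions; the slide shows only timestamps and activity names.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events: (case id, timestamp, activity), deliberately unordered.
raw_events = [
    ("ticket-1", "09/09/2019 - 16:37:39", "Transfer the ticket"),
    ("ticket-1", "09/09/2019 - 16:35:37", "Open the ticket"),
    ("ticket-1", "20/09/2019 - 13:12:31", "Update the information"),
    ("ticket-1", "21/09/2019 - 09:14:32", "Inform the customer"),
    ("ticket-1", "21/09/2019 - 09:14:32", "Close the ticket"),
]


def build_event_log(events):
    """Group events by case and order each case by timestamp.

    The sort is stable, so events with equal timestamps keep their input order.
    """
    cases = defaultdict(list)
    for case_id, stamp, activity in events:
        when = datetime.strptime(stamp, "%d/%m/%Y - %H:%M:%S")
        cases[case_id].append((when, activity))
    return {
        case_id: [activity for _, activity in sorted(items, key=lambda e: e[0])]
        for case_id, items in cases.items()
    }


event_log = build_event_log(raw_events)
print(event_log["ticket-1"])
```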

slide-8
SLIDE 8

Process Mining

Event Logs: <abdef>, <bdaegef>, <dcefeg>, <cdeg>

Discovery Algorithm

  • Inductive Miner
  • Process Tree

slide-9
SLIDE 9

Related Works

  • LSTM [1]
  • Process Mining based approach [2]

[1] Tax, N., Verenich, I., La Rosa, M., Dumas, M.: Predictive business process monitoring with LSTM neural networks. In: International Conference on Advanced Information Systems Engineering. pp. 477–492. Springer (2017)
[2] Polato, M., Sperduti, A., Burattin, A., de Leoni, M.: Time and activity sequence prediction of business process instances. arXiv preprint arXiv:1602.07566 (2016)

slide-10
SLIDE 10

LaFM

Loop-aware Footprint Matrix

slide-11
SLIDE 11

LaFM

Loop-aware Footprint Matrix

  • Step 1: Discover a process model
  • Step 2: Build a footprint
  • Step 3: Make a prediction using the footprint

slide-12
SLIDE 12

Step 1: Discover a process model

Event Logs => Inductive Miner

slide-13
SLIDE 13

Capturing the Behaviors

  • Parallel: order of execution
  • Exclusive choice: branch executed
  • Loop: number of times loops are executed

slide-14
SLIDE 14

Step 2: Build the footprint

To record:

  • Parallel
  • Exclusive choice
  • Loop

[Footprint matrix, filled in column by column: one row per trace (ABDEF, BDAEGEF, DCEFEG, …) and one column each for and2(1), and2(2), and2(3), xor3, and4(1), and4(2), loop5, xor7|loop5{1}, xor7|loop5{2}. Each cell holds 1, 2, or ∅.]
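A minimal sketch of how the loop-related footprint columns could be recorded for a single trace. It is not the authors' implementation. It assumes a hypothetical loop body of "E, then F or G" at the end of the trace and an assumed branch numbering (F = 1, G = 2), so its output is illustrative and not taken from the matrix on the slide. Column names follow the slide's `loop5` and `xor7|loop5{k}` style.

```python
def loop_footprint(trace, max_iterations):
    """Record the loop count and the branch taken in each loop iteration.

    Assumes (hypothetically) that the loop body is "E, then F or G" and
    that the loop is the last part of the trace.
    """
    branch_ids = {"F": 1, "G": 2}  # assumed numbering of the xor branches
    start = trace.index("E")       # the first E marks where the loop begins
    tail = trace[start:]
    if len(tail) % 2 != 0:
        raise ValueError("loop part must consist of pairs: E, then F or G")
    row = {}
    iterations = 0
    for i in range(0, len(tail), 2):
        if tail[i] != "E" or tail[i + 1] not in branch_ids:
            raise ValueError(f"unexpected loop body in {trace!r}")
        iterations += 1
        row[f"xor7|loop5{{{iterations}}}"] = branch_ids[tail[i + 1]]
    row["loop5"] = iterations
    # Iterations that never happened are recorded as None (shown as ∅).
    for k in range(iterations + 1, max_iterations + 1):
        row[f"xor7|loop5{{{k}}}"] = None
    return row


for trace in ["ABDEF", "BDAEGEF", "DCEFEG"]:
    print(trace, loop_footprint(trace, max_iterations=2))
```

Recording parallel blocks (order of execution) would need the discovered process tree and is left out of this sketch.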

slide-15
SLIDE 15

Predict

  • Prefix: D

[Footprint matrix from Step 2: rows ABDEF, BDAEGEF, DCEFEG, …; columns and2(1), and2(2), and2(3), xor3, and4(1), and4(2), loop5, xor7|loop5{1}, xor7|loop5{2}.]

A EFEF B

slide-16
SLIDE 16
  • Prefix: DCEFEFEFEFE..
  • xor7|loop5{6} => ?

[Footprint matrix from Step 2: rows ABDEF, BDAEGEF, DCEFEG, …; the columns stop at xor7|loop5{2}, so there is no xor7|loop5{6} column.]

Abstract and predict
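One possible reading of "abstract and predict", sketched under assumptions: pick the most frequent value recorded in a footprint column, and when the requested loop iteration was never observed (for example `xor7|loop5{6}`), drop the iteration index and pool the values of every recorded iteration. The function name `predict_decision` and the toy footprint are hypothetical; the slides do not spell out the exact rule.

```python
import re
from collections import Counter


def predict_decision(footprint, column):
    """Return the most frequent recorded value for a decision column."""
    values = [row[column] for row in footprint if row.get(column) is not None]
    if not values:
        # Abstract: drop the iteration index "{k}" and pool all iterations.
        base = re.sub(r"\{\d+\}$", "", column)
        pattern = re.compile(re.escape(base) + r"\{\d+\}")
        values = [
            value
            for row in footprint
            for name, value in row.items()
            if pattern.fullmatch(name) and value is not None
        ]
    if not values:
        return None  # nothing recorded for this decision point at all
    return Counter(values).most_common(1)[0][0]


# Toy footprint with made-up values (None stands for ∅).
toy_footprint = [
    {"xor7|loop5{1}": 1, "xor7|loop5{2}": None},
    {"xor7|loop5{1}": 2, "xor7|loop5{2}": 1},
    {"xor7|loop5{1}": 1, "xor7|loop5{2}": 2},
]
print(predict_decision(toy_footprint, "xor7|loop5{1}"))  # seen column
print(predict_decision(toy_footprint, "xor7|loop5{6}"))  # unseen iteration
```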

slide-17
SLIDE 17

Evaluation Procedure

  • 30 synthetic datasets, publicly available [1]
  • 2/3 => training, 1/3 => test [2]
  • Algorithms tested: LaFM, Markov Chain, LSTM
  • Metric used for accuracy: Damerau–Levenshtein similarity [3]

[1] https://data.4tu.nl/repository/uuid:7455 4e7-8cc0-45b8-8a89-93e9c9dfab05
[2] Tax, N., Verenich, I., La Rosa, M., Dumas, M.: Predictive business process monitoring with LSTM neural networks. In: International Conference on Advanced Information Systems Engineering. pp. 477–492. Springer (2017)
[3] Damerau, F.J.: A technique for computer detection and correction of spelling errors. Communications of the ACM 7(3), 171–176 (1964)
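A minimal sketch of a Damerau–Levenshtein similarity between a predicted and an actual suffix. Two points are assumptions, since the slide names only the metric: the distance uses the optimal string alignment variant (adjacent transpositions, no further edits on a transposed pair), and the similarity is normalized by the length of the longer sequence.

```python
def damerau_levenshtein(a, b):
    """Edit distance with insertions, deletions, substitutions, and
    transpositions of adjacent symbols (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                # transposition of two adjacent symbols
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


def dl_similarity(predicted, actual):
    """Similarity in [0, 1]; 1.0 means the two sequences are identical."""
    longest = max(len(predicted), len(actual))
    if longest == 0:
        return 1.0
    return 1.0 - damerau_levenshtein(predicted, actual) / longest


print(dl_similarity("EFEG", "EFGE"))  # one transposition over four symbols
```

A swap of two adjacent activities counts as a single edit, which suits parallel behavior where the order of two activities may vary.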

slide-18
SLIDE 18

Results


slide-19
SLIDE 19


C-LaFM

Clustered LaFM

slide-20
SLIDE 20


slide-21
SLIDE 21

C-LaFM

  • Intuition: Complex datasets can be described well using several process models.
  • C-LaFM: Clustered LaFM
  • Based on n-grams
  • Clustering using HDBSCAN [1]

[1] McInnes, L., Healy, J., Astels, S.: hdbscan: Hierarchical density based clustering. Journal of Open Source Software 2(11) (2017)
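A minimal sketch of the n-gram step: turning traces into count vectors that a clustering algorithm can consume. The n-gram size (2) and the function names are assumptions; the slide says only that the clustering is based on n-grams. The clustering call itself is omitted to keep the example dependency-free.

```python
from collections import Counter


def ngram_counts(trace, n=2):
    """Count the n-grams (windows of n consecutive activities) in a trace."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def vectorize(traces, n=2):
    """Turn traces into count vectors over a shared, sorted n-gram vocabulary."""
    counts = [ngram_counts(trace, n) for trace in traces]
    vocabulary = sorted(set().union(*counts)) if counts else []
    return vocabulary, [[c[gram] for gram in vocabulary] for c in counts]


# Event log from the Process Mining slide.
traces = ["abdef", "bdaegef", "dcefeg", "cdeg"]
vocabulary, vectors = vectorize(traces)
print(len(vocabulary), "distinct bigrams")
for trace, vector in zip(traces, vectors):
    print(trace, vector)
```

The resulting vectors could then be handed to a density-based clusterer such as the hdbscan library cited above.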

slide-22
SLIDE 22

C-LaFM


Strong representatives

Discover BPM Replay

Weak representatives

slide-23
SLIDE 23

C-LaFM: Classifier

  • SGD: Stochastic Gradient Descent classifier
  • Training phase: train the classifier on the strong representatives, using all prefix lengths
  • Prediction phase: apply the SGD classifier to assign the prefix to a cluster
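A minimal sketch of what "all prefix lengths" could mean for the training set: every strong representative contributes one (prefix, cluster) pair per prefix length. The traces, cluster labels, and the choice to start at length 1 are assumptions for illustration. Encoding the prefixes and fitting the classifier are not shown.

```python
def prefix_training_pairs(representatives):
    """Yield (prefix, cluster label) for every prefix length of every trace."""
    for trace, cluster in representatives:
        for length in range(1, len(trace) + 1):
            yield trace[:length], cluster


# Hypothetical strong representatives with made-up cluster labels.
strong = [("abdef", 0), ("cdeg", 1)]
pairs = list(prefix_training_pairs(strong))
for prefix, cluster in pairs:
    print(prefix, "=> cluster", cluster)
```

Training on every prefix length lets the classifier assign a running case to a cluster however few events have been observed so far.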

slide-24
SLIDE 24

Evaluation


slide-25
SLIDE 25

Evaluation


slide-26
SLIDE 26

Evaluation


slide-27
SLIDE 27


Conclusion

slide-28
SLIDE 28

Conclusion

  • Black-Box vs White-Box
  • Limitations:
  • Pieces missing for Explainable AI: an intelligible way to present the prediction
  • Alternatives to Inductive Miner, HDBSCAN, and SGD not tested