Data-Centric Workflow and Business Processes
Victor Vianu and Jianwen Su
Outline
- Introduction (Jianwen)
- Theoretical work on analysis of data-driven workflows (Victor)
- Survey of issues in practical data-driven workflows (Jianwen)
- Discussions of Further Research Challenges (Jianwen & Victor)
20160413 2 Foundations of Data Management
Business Processes & Workflow Management
- A BP is an assembly of (human) tasks to accomplish an objective
  - E.g., obtaining a permit
- Each workflow model matches a BP model
- Each workflow activity is a software program that interfaces with one task in the BP
- A WfM system manages executions, resources, documents, etc.
- We will be sloppy: BP ≈ Process ≈ Workflow, and BPMS ≈ WfMS
[Figure: a permit BP (Application, Init review, Review, Approval, Fee, Certificate, Delivery) and its workflow, run by a Workflow Management (WfM) System]
Rising Demand of WfMSs
- BPs / workflows are ubiquitous
  - from old (traditional business apps) to new (e-science, healthcare, digital governmental apps)
  - all sizes (by # of tasks, # of enactments, …)
  - the number of workflow apps is rapidly increasing
- Digitization/IT causes pressure to scale and automate
  - E-documents, ability to get a lot of inputs: enhanced productivity → raised expectation of workflow
  - Rapidly changing environments: shortens idea-to-system time
- Rapidly growing market: “Suites” alone $2.7B in 2015 (Gartner)
- Workflow management remains an art
  - design, development (implementation), and making changes are mostly ad hoc and rely on human creativity
5 Findings [2009 NSF Workshop on Data-Centric Workflows]
- The need for workflow management is ubiquitous
- Current workflow technologies provide inadequate support for a variety of essential functionalities
- For transactional workflow, a key inhibitor is the lack of intuitively clear ways to combine the various aspects of workflow
- There is a “long tail” phenomenon in applications that need and/or use workflow management technologies
- Application areas of business, digital government, healthcare, and scientific workflow face many common/overlapping problems, but are developing paradigms, techniques and tools largely in isolation
http://dcw2009.cs.ucsb.edu/
Key Application Challenges
- Complexity
  - Detailed complexity, dynamic complexity
- Facilitate Human interactions
  - Design stakeholders, client/performer
- Extra-functional aspects
  - Compliance, guarantees, security/privacy, …
- Variation, evolution, and long tail
  - E.g., legacy systems, …
- Workflow Interoperation
Research Challenges
- Unifying holistic conceptual model for Wf/BP management: CHEVI
- Design and runtime issues: CEVI
- Reasoning: CVI
- Workflow analytics/discovery/improvement: CHEV
- Provenance: CE
- Process mining: CHEV
- Interoperation: CI
http://dcw2009.cs.ucsb.edu/
Five Classes of Biz Process Models
- Data agnostic: data mostly not present
  - WF (Petri) nets, BPMN, UML activity diagrams, …
- Data-aware: data (variables) present but missing storage and management
  - BPEL, YAWL, BPMN 2.0 (?), …
- Storage-aware: persistent stores, but mappings to/from biz process data managed ad hoc
  - jBPM, Activiti, JTang, …
- Artifact-centric: entity (biz data) and lifecycle
  - GSM (Barcelona), (D)EZ-Flow, …
- Universal artifact: add automation for modeling all five types, data-storage mapping
  - Universal Artifacts (UA)
Outline
- Introduction (Jianwen)
- Theoretical work on analysis of data-driven workflows (Victor)
- Survey of issues in practical data-driven workflows (Jianwen)
  - Independence of Data and Execution Management
  - Workflow Management
  - Workflow Interoperation
- Discussions of Further Research Challenges (Jianwen & Victor)
Enterprise System Architectures
- Evolved around information/database systems
- Multiple applications with overlapping data
[Figure: enterprise architectures of the 1970’s (application on OS) and the 1980’s (application on DBMS on OS), with HR, Purchase Order Management, and Warehouse Management applications using an HR DB, a POM DB, and WH data files] [Weske 2012]
Enterprise Resource Planning (ERP)
- Composed of (extended) database systems and application-specific software for applications since the 2000’s
- Typically too complex to integrate or interoperate: very challenging
[Figure: ERP architecture (1990’s): application systems (product planning, marketing & sales, inventory management, ...) on an Enterprise Resource Planning layer and a database system (standard DBMS technology); also shown: Supply Chain Management, Customer Relation Management]
Key Obstacle: Ad Hoc Workflow Management in ERP
- Most enterprise applications include business processes/workflows
- WfMSs: handcrafted, out-sourced, or standalone (jBPM, Activiti, …)
- BPs are key assets/functions of enterprises
[Figure: the ERP architecture (application systems, Enterprise Resource Planning, database system), with workflow management marked as ad hoc, handcrafted, with many context-dependent decisions]
Typical Workflow Management System Architecture
- Used in YAWL, jBPM, Activiti, Barcelona, …, and possibly systems from major vendors
- During execution, data can be held in each of the shaded shapes → creates many problems
[Figure: a WfMS consisting of an execution engine, a local data store, and task wrappers, connected to an enterprise database; annotation: “Includes all data required for control flow decisions, correlations, …”] [van der Aalst-van Hee 2004]
Example: Enterprise Database Fails
- The DBMS does recovery, but the data may not be consistent with data in the local store, engine, and wrappers
- Other failures are similar: local data store
  - Engine (workflow “log”?)
  - Wrapper (more trouble if keeping persistent data, e.g., MVC)
- Other difficulties: process change, compliance, …
[Figure: the WfMS architecture (execution engine, local data store, task wrappers) with a failed enterprise database; labels: “ShippingAddress: undefined”, “Update ShippingAddress”] [S.-Yang EVL-BP15]
Independence of Data Management and Execution Management
- Clean separation of responsibilities
  - WfMS: Execution
  - DBMS: Data
- Allows divide-and-conquer for management functions
  - Helps in many aspects
Execution Independence: the freedom of changing the process execution system while leaving conceptual BP models unchanged, and vice versa [Sun-S.-Yang BPM14 TMIS16]
Intersections of Databases & Software Engineering
- Could benefit a wide range of data-intensive or data-centric software applications
Currently lacks:
- Models, frameworks, and principles
- Theoretical foundations
- Tools and techniques
Conceptualizing Running Workflows (or Other Software)
- Each workflow (BP) instance consists of a universal artifact and a lifecycle
- Data mappings are ad hoc in current development practice
- Primitive mapping support: ADO.NET Entity Framework [Melnik-Adya-Bernstein TODS08]
[Figure: workflow instances mapped to a database]
Example: Enterprise Database (& Lifecycle)
- Includes keys, foreign keys, and a cardinality specification on each foreign key
Lifecycle tasks (w = write, r = read):
- Repair Application: w(ID), w(Customer Name), r(Customer Address), ...
- Application Review: ...
- Repairperson Assignment: w(Service ID), w(Repairperson Name), w(Repairperson Phone), ...
- Document Archive: ...
- On-site Repair: w(Material ID), w(Material), ...
- Post-repair Visit: ...
Database tables:
- tUser(tLastName, tFirstName, tPhone, tAddress, ...)
- tRepair(tRepairID, tCustomerLN, tCustomerFN, tReason, tDate, ...)
- tServiceInfo(tServiceID, tRerpairID_SI, tTime, ...)
- tRepairPerson(tServiceID_P, tRerpairpersonLN, tRepairpersonFN, ...)
- tMaterialInfo(tMaterialID, tServiceID_MI, tMaterial, ...)
- tReview(tReviewID, tServiceID_R, tReviewResult, ...)
Foreign-key cardinalities shown in the figure: *, *, *, *, +, ?
Example: The (Universal) Artifact
Tuple and (nested) set constructs
[Figure: nested attribute tree of the artifact. Attributes shown: aID (unique), aRepairInfo, aCustomer, aRepair, aReason, aDate, aCust_Name, aCust_Addr, aCust_Last_Name, aCust_First_Name, aService (unique), aTime, aRepairPerson, aService_Info, aRP LastName, aRP FirstName, aRP Phone, aMaterial (unique in aMaterial_Info), aMaterial_Info, aReviewID, aResult, aReview_Info (unique)]
Data Mapping Idea
ServiceInfo
SIID  Date   RepairID
S01   11/15  R101
S02   11/29  R102
S03   12/17  R101

RepairInfo
RIID  CustName
R101  David
R102  Peter

User
UserName  Address
Peter     A2
David     A1
James     A3

Artifact Repair: RID, Customer (CName, Addr), Services { SID, Date }
R101, (David, A1), { (S01, 11/29), ..., (S03, 12/17) }

Cross-reference path for Addr:
Addr = Addr.Customer.RID@RepairInfo(RIID).CustName@User(UserName).Address
[Sun-S.-Wu-Yang ICDE14]
Example: Cross Reference Paths
- aID : tRepair.tRepairID
- aReason = aReason.aRepairInfo.aID@tRepair(tRepairID).tReason
- aCustAddr = aCustAddr.aCust_Name.[aCust_Last_Name, aCust_First_Name]@tUser(tLastName, tFirstName).tAddress
[Figure: artifact attributes (aID, aRepairInfo, aCustomer, aRepair, aReason, aDate, aCust_Name, aCust_Addr, aCust_Last_Name, aCust_First_Name) linked to the tables tUser(tLastName, tFirstName, tPhone, tAddress, ...) and tRepair(tRepairID, tCustomerLN, tCustomerFN, tReason, tDate, ...)]
[Sun-S.-Wu-Yang ICDE14]
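One possible reading of a cross-reference path such as aReason = aReason.aRepairInfo.aID@tRepair(tRepairID).tReason is a keyed lookup: take the artifact's aID, find the tRepair row whose tRepairID equals it, and return tReason. The sketch below illustrates only that reading; the helper `lookup` and all row values are hypothetical and are not taken from [Sun-S.-Wu-Yang ICDE14].

```python
def lookup(db, table, key_cols, key_vals, target_col):
    """Return target_col of the unique row of `table` whose key_cols equal key_vals."""
    matches = [
        row for row in db[table]
        if tuple(row[c] for c in key_cols) == tuple(key_vals)
    ]
    if len(matches) != 1:
        raise LookupError(f"expected one row in {table}, found {len(matches)}")
    return matches[0][target_col]


# Hypothetical database state using the table and column names from the slides.
db = {
    "tRepair": [
        {"tRepairID": "R101", "tCustomerLN": "Doe", "tCustomerFN": "Jane",
         "tReason": "leak"},
    ],
    "tUser": [
        {"tLastName": "Doe", "tFirstName": "Jane", "tAddress": "A1"},
    ],
}

# Hypothetical artifact instance with its key attributes already known.
artifact = {"aID": "R101", "aCust_Last_Name": "Doe", "aCust_First_Name": "Jane"}

# aReason: follow aID into tRepair(tRepairID) and read tReason.
artifact["aReason"] = lookup(
    db, "tRepair", ["tRepairID"], [artifact["aID"]], "tReason")

# aCust_Addr: follow the (last name, first name) pair into tUser and read tAddress.
artifact["aCust_Addr"] = lookup(
    db, "tUser", ["tLastName", "tFirstName"],
    [artifact["aCust_Last_Name"], artifact["aCust_First_Name"]], "tAddress")
```

The composite key in the second lookup mirrors the bracketed pair [aCust_Last_Name, aCust_First_Name] in the path expression.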
Updatability
- Database updatability: for each update Δd on d, there is an update Δe such that Δe(µ(d)) = µ(Δd(d))
- Artifact updatability: for each update Δe on µ(d), there is an update Δd such that µ(Δd(d)) = Δe(µ(d))
[Figure: commuting square relating the DB d, the workflow e = µ(d), the updates Δd and Δe, and the results Δd(d) and Δe(µ(d)) via µ]
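The commuting condition Δe(µ(d)) = µ(Δd(d)) can be checked on a toy instance. The sketch below reuses the RepairInfo and User rows from the Data Mapping slide; the mapping `mu`, the two update functions, and the new address "A9" are illustrative assumptions, not the formal definitions from the cited papers.

```python
import copy


def mu(d):
    """Map a database state to artifact instances, one per repair."""
    return {
        rid: {"CName": cust, "Addr": d["User"][cust]}
        for rid, cust in d["RepairInfo"].items()
    }


def delta_d(d):
    """Database update: David's address changes to the hypothetical value A9."""
    d2 = copy.deepcopy(d)
    d2["User"]["David"] = "A9"
    return d2


def delta_e(e):
    """Candidate artifact-side update that mirrors delta_d."""
    return {
        rid: {**a, "Addr": "A9"} if a["CName"] == "David" else dict(a)
        for rid, a in e.items()
    }


d = {
    "RepairInfo": {"R101": "David", "R102": "Peter"},
    "User": {"Peter": "A2", "David": "A1", "James": "A3"},
}

# Database updatability for this one update: both paths around the square agree.
commutes = delta_e(mu(d)) == mu(delta_d(d))
```

This checks a single update on a single state; the property on the slide quantifies over all updates.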
Updatability & View Update
- Database updatability: forward, can always be done
- Artifact updatability: backward, often not possible
- Very closely related to the database view update problem [Bancilhon-Spyratos TODS81]
  - View complement [BS81] [Lechtenbörger et al PODS03]
  - Clean source [Dayal-Bernstein TODS82] [Wang et al DKE06]
- For cross-ref-path expressions: every non-overlapping UA map is updatable [Sun-S.-Wu-Yang ICDE14]
Isolation of BP Instances
- µ is isolating if each update on a single artifact (instance) will not affect write (and/or read) attributes of other artifact instances
- Main result: isolation can be tested
  - Testing “conflicting” updates
  - EXPTIME with conditional updates
[Figure: artifact instances mapped by µ to a snapshot of DB d] [Sun-S.-Wu-Yang ICDE14]
Connecting Artifacts and Databases
- Fundamentals
  - What are these mappings?
  - Updatability, what else?
  - Mapping languages
- Design principles
  - Isolation, for lifecycles? runtime mechanisms?
  - Data design completeness, needs ontology
  - Implementability: translating IOPEs on artifact to DB
- Transactions
  - Workflow vs databases
Four Stages of Biz Process Life Cycle
- Need to manage process models (schemas)
[Figure: life cycle with four stages: (re-)design, implementation/configuration, enactment/monitoring, diagnosis/requirements]
Workflow Management
- Examples of what’s needed:
  - Design tools, discovery, analytics, runtime support, changes, composition, constructing views, …
- A rough classification of management operations:

Output \ Input    | A repository of workflow schemas | (Raw) log (a sequence of records)
Workflow schemas  | Model manipulations              | Process mining
Execution traces  | Static analysis                  | Log query
Data tables etc.  | Resource planning                | Process mining, BI
Model Manipulations
- Why query a model repository?
  - Find workflows (schemas) that do “this” from a repository (e.g., ~200K schemas in CNR)
  - “This” may be a single activity, a sequence of activities, a graph of activities, or even a graph with cycles
    - Could involve data contents, e.g., purchases including an X-ray machine, without manager’s approval, …
- Views/model translations
  - Behavior preserving, and: quality of models, understandability
- Workflow change/evolution
Repository Queries
- A classification of about 20 approaches: [Wang-Jin-Wong-Wen WWWJ13]
  - Exact match vs similarity based
  - Graph vs behaviors
    - e.g., [Deutch-Milo JCSS12], VisTrail [Scheidegger-Vo-Koop-Freire-Silva SIGMOD08]
  - Operation semantics (ontology)
- May return models/schemas or “top” traces
- Overlapping vs containment (equivalence)
- Data not included
Types of Process Mining
- The following types of process mining can be distinguished:
  - Determine basic performance metrics
  - Determine process model
  - Determine organizational model
  - Analyze social network (i.e., relations between actors)
  - Analyze performance characteristics (i.e., derive rules explaining performance)
[Figure: example order process with activities Start, Register order, Prepare shipment, Ship goods, (Re)send bill, Receive payment, Contact customer, Archive order, End, and a derived “If … then …” rule]
Determine Basic Performance Metrics
- Process/control-flow perspective: flow time, waiting time, processing time, synchronization time, e.g.
  - What is the average flow time of orders?
  - What is the maximum waiting time for activity approve?
  - What percentage of requests is handled within 10 days?
  - What is the minimum processing time of activity reject?
  - What is the average time between scheduling an activity and actually starting it?
- Resource perspective: frequencies, time, utilization, and variability, e.g.
  - How many times did Sue complete activity reject claim?
  - How many times did John withdraw activity go shopping?
  - How many times did Clare suspend some running activity?
  - How much time did Peter work on instances of activity reject claim?
  - How much time did people with role Manager work on this process?
  - What is the utilization of John?
  - What is the average utilization of people with role Manager?
  - How many times did John work for more than 2 hours without interruption?
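Two of the control-flow questions above (average flow time, and the share of cases handled within 10 days) can be answered directly from a timestamped log. The sketch below assumes that flow time is the span between a case's first and last event; the event data and function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical log: (case id, activity, timestamp).
EVENTS = [
    ("o1", "register", datetime(2016, 4, 1, 9)),
    ("o1", "archive", datetime(2016, 4, 5, 9)),
    ("o2", "register", datetime(2016, 4, 2, 9)),
    ("o2", "archive", datetime(2016, 4, 14, 9)),
]


def flow_times(events):
    """Flow time per case, taken as last timestamp minus first timestamp."""
    first, last = {}, {}
    for case, _activity, ts in events:
        first[case] = min(first.get(case, ts), ts)
        last[case] = max(last.get(case, ts), ts)
    return {case: last[case] - first[case] for case in first}


def average_flow_time(events):
    """Mean flow time over all cases in the log."""
    times = flow_times(events)
    return sum(times.values(), timedelta()) / len(times)


def share_within(events, limit):
    """Fraction of cases whose flow time does not exceed `limit`."""
    times = flow_times(events)
    return sum(t <= limit for t in times.values()) / len(times)
```

Waiting and processing times would need separate schedule, start, and complete events per activity, which this minimal log does not record.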
Example (ARIS PPM)
- IDS Scheer’s ARIS Process Performance Manager
Determine Organizational Model
- Discover the organizational model (i.e., roles, departments, etc.) without prior knowledge about the structure of the organization
Analyze Social Network
- Social Network Analysis (SNA)
- Based on:
  - Handover of work
  - Subcontracting
  - Working together
  - Reassignments
  - Doing similar tasks
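As an illustration of the first basis, handover of work, one simple counting scheme treats every pair of consecutive events in a case that were performed by different resources as one handover. The sketch below assumes that definition and a log already in time order; the events are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical log in time order: (case id, activity, resource).
EVENTS = [
    (1, "A", "John"), (1, "B", "Mike"), (2, "A", "Clare"),
    (1, "Z", "Anne"), (2, "B", "Jim"), (3, "A", "John"), (3, "B", "Mike"),
]


def handover_counts(events):
    """Count handovers (giver, receiver) between consecutive events of a case."""
    by_case = defaultdict(list)
    for case, _activity, resource in events:
        by_case[case].append(resource)
    counts = Counter()
    for resources in by_case.values():
        for giver, receiver in zip(resources, resources[1:]):
            if giver != receiver:  # same person continuing is not a handover
                counts[(giver, receiver)] += 1
    return counts
```

The resulting counts can serve as edge weights of a directed social network over the resources.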
Analyze Performance Characteristics
- Each case (process/workflow instance) has a number of properties:
  - Resource that worked on a specific activity
  - Value of a characteristic data element (e.g., size of order, age of customer, etc.)
  - Performance metrics of case (e.g., flow time)
- Using machine-learning techniques it is possible to find relevant relations between these properties
- Examples:
  - If John and Mike work together, it takes longer
  - Expensive cases require less processing
  - . . .

caseid | Act A | Act B | ... | Act Z | Data D1 | Data D2 | ... | Data D9 | Proc time | Wait time | Flow time
1      | John  | Mike  | ... | Anne  | $50     | 20y     | ... | 80%     | 12h       | 3d        | 3.5d
2      | Clare | Jim   | ... | Ike   | $75     | 15y     | ... | 75%     | 6h        | 3d        | 3.25d
3      | John  | Mike  | ... | Clare | $55     | 20y     | ... | 80%     | 18h       | 4d        | 4.75d
...
Process Mining: Determine Process Model
- Discover a process model (e.g., in terms of a Petri net or workflow net) without prior knowledge about the structure of the process
- Useful for:
  - Process discovery (What is the process?)
  - Delta analysis (Are we doing what was specified?)
  - Performance analysis (How can we improve?)
Activity log (records in the order logged):
case 1: A; case 2: A; case 3: A; case 3: B; case 1: B; case 1: C;
case 2: C; case 4: A; case 2: B; case 2: D; case 5: E; case 4: C;
case 1: D; case 3: C; case 3: D; case 4: B; case 5: F; case 4: D
[Figure: discovered process model over the tasks A, B, C, D, E, F]
Process Discovery Criteria
- Fitness: the discovered model should allow for the behavior seen in the event log
- Precision: the discovered model should not allow for behavior completely unrelated to what was seen in the event log
- Generalization: the discovered model should generalize the example behavior seen in the event log
- Simplicity: the discovered model should be as simple as possible
Alpha Algorithm
[Figure: a process model produced by α; the task labels are in Dutch and are not recoverable from the extraction]
Example log: the same activity log as on the “Process Mining: Determine Process Model” slide (cases 1 to 5, tasks A to F)
- Minimal information in log: case ids and task ids
- Additional data ignored: event type, time, resources, and data
- This example log has three possible sequences (traces):
  - ABCD
  - ACBD
  - EF
- Alternative log representation: [ABCD², ACBD², EF], where the exponent is the number of cases with that trace
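The step from the interleaved activity log to the multiset of traces can be sketched as a group-by on case id. The log below is the one shown on the slide; the function names are illustrative.

```python
from collections import Counter, defaultdict

# Activity log from the slide: (case id, task) pairs in recorded order.
LOG = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]


def traces_by_case(log):
    """Group tasks by case id, keeping the recorded order within each case."""
    cases = defaultdict(list)
    for case_id, task in log:
        cases[case_id].append(task)
    return {case_id: "".join(tasks) for case_id, tasks in cases.items()}


def trace_multiset(log):
    """Count how many cases follow each distinct trace."""
    return Counter(traces_by_case(log).values())
```

Cases 1 and 3 yield ABCD, cases 2 and 4 yield ACBD, and case 5 yields EF, which is the multiset written on the slide.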
Four Ordering Relations >, →, ||, #
- Direct succession: x > y iff in some case x is directly followed by y
- Causality: x → y iff x > y ∧ y ≯ x
- Parallel: x || y iff x > y ∧ y > x
- Choice: x # y iff x ≯ y ∧ y ≯ x

For the example log [ABCD², ACBD², EF]:
- >: A>B, A>C, B>C, B>D, C>B, C>D, E>F
- →: A→B, A→C, B→D, C→D, E→F
- ||: B||C, C||B
- #: A#D, A#E, …

- Footprint:
    A   B   C   D   E   F
A   #   →   →   #   #   #
B   ←   #   ||  →   #   #
C   ←   ||  #   →   #   #
D   #   ←   ←   #   #   #
E   #   #   #   #   #   →
F   #   #   #   #   ←   #
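The four relations and the footprint can be computed mechanically from the distinct traces. The sketch below follows the definitions on this slide; ASCII `->` and `<-` stand in for → and ←, and the function names are illustrative.

```python
from itertools import product

TRACES = ["ABCD", "ACBD", "EF"]  # distinct traces of the example log


def direct_succession(traces):
    """All pairs (x, y) such that x is directly followed by y in some trace."""
    return {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}


def footprint(traces):
    """Return the sorted task list and the relation for every ordered task pair."""
    succ = direct_succession(traces)
    tasks = sorted({task for t in traces for task in t})
    table = {}
    for x, y in product(tasks, repeat=2):
        forward, backward = (x, y) in succ, (y, x) in succ
        if forward and backward:
            table[(x, y)] = "||"   # parallel
        elif forward:
            table[(x, y)] = "->"   # causality
        elif backward:
            table[(x, y)] = "<-"   # reverse causality
        else:
            table[(x, y)] = "#"    # choice
    return tasks, table


tasks, table = footprint(TRACES)
```

Only the set of distinct traces matters here; the trace frequencies are ignored, consistent with the remark that the α algorithm does not consider frequencies.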
α Algorithm and Process Mining
- Given a sound and structured workflow net N, a log L of N is complete if (1) L contains all possible > relations in N and (2) every transition of N occurs
- Given a sound & structured workflow net N such that no two transitions share input places and share output places, and a complete log L of N, α(L) = N
- Can’t handle short loops (size < 3); an improvement: α+
  - Given a complete log, α/α+ can mine any sound & structured wf-net
- Not considered: frequencies, data
- Other approaches: causal nets, Markov models (state machines)
  - Genetic algorithms, “region-based” mining (e.g., state sets or languages)
[vanderAalst+ TKDE04]
Mining Processes with Data
- The majority of existing process mining methods cannot be applied directly to artifact-centric BPs
  - Existing process mining techniques are tailored to classical monolithic processes, where each process execution can be described just by the flow of activities
  - For processes with n-to-m relationships (among log records), classical process mining techniques yield incomprehensible results
- Possible approach:
  - A chain of methods that can be applied to discover artifact lifecycle models in GSM notation
[Popova-Fahland-Dumas IJCIS15]
Artifact Discovery: The Main Idea
- All events may be recorded together in a single log, without any case/artifact-based grouping
[Popova-Fahland-Dumas IJCIS15]

Timestamp   | EventType        | Data (attribute-value pairs)
11-24,17:12 | ReceivePO        | items=(it0), POrderID=1
11-24,17:13 | CreateMO         | supplier=supp6, items=(it0), POrderID=1, MOrderID=1
11-24,19:56 | ReceiveMO        | supplier=supp6, items=(it0), POrderID=1, MOrderID=1
11-24,19:57 | ReceiveSupplResp | supplier=supp6, items=(it0), POrderID=1, MOrderID=1, answer=accept
11-25,07:20 | ReceiveItems     | supplier=supp6, items=(it0), POrderID=1, MOrderID=1
11-25,08:31 | Assemble         | items=(it0), POrderID=1, MOrderID=1
11-25,08:53 | ReceivePO        | items=(it0,it1,it2,it3), POrderID=2
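To see why grouping matters, the raw log above can be split into per-instance event sequences once an identifier attribute is chosen. The sketch below assumes the identifier names (POrderID, MOrderID) are given; in the cited approach they are discovered from the log, which this sketch does not attempt. Only the identifier attributes of each record are kept.

```python
from collections import defaultdict

# Records from the slide, reduced to (timestamp, event type, identifier attributes).
RAW_LOG = [
    ("11-24,17:12", "ReceivePO", {"POrderID": 1}),
    ("11-24,17:13", "CreateMO", {"POrderID": 1, "MOrderID": 1}),
    ("11-24,19:56", "ReceiveMO", {"POrderID": 1, "MOrderID": 1}),
    ("11-24,19:57", "ReceiveSupplResp", {"POrderID": 1, "MOrderID": 1}),
    ("11-25,07:20", "ReceiveItems", {"POrderID": 1, "MOrderID": 1}),
    ("11-25,08:31", "Assemble", {"POrderID": 1, "MOrderID": 1}),
    ("11-25,08:53", "ReceivePO", {"POrderID": 2}),
]


def group_by_identifier(log, id_attr):
    """Collect, per value of id_attr, the event types of records carrying it."""
    groups = defaultdict(list)
    for _timestamp, event_type, data in log:
        if id_attr in data:
            groups[data[id_attr]].append(event_type)
    return dict(groups)


po_view = group_by_identifier(RAW_LOG, "POrderID")
mo_view = group_by_identifier(RAW_LOG, "MOrderID")
```

The same record can appear in several views (for example CreateMO in both), which is the n-to-m situation that makes a single case notion inadequate.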
Artifact Structure Discovery
- Does not assume any database schemas
- Log records have attribute-value pairs
- Entity precedence: creation time of entities (instances)
[Popova-Fahland-Dumas IJCIS15]
Web Services Collaboration Models (Extreme Cases)
- Orchestration: hub-and-spoke or mediated
- Choreography: peer to peer
[Figure: Store, Seller, Warehouse, and Bank interacting peer to peer, and the same four participants interacting through a Mediator]
Web Services approaches are fundamentally process centric. A data-centric approach can enable
- Explicit modeling of correlations
- Mediation at scale
Choreography of Web Services
- A choreography defines how biz processes should collaborate to achieve a business goal
- Goal: support for choreography languages:
  - Design “correctness”, auto realization, mechanisms for monitoring, ...
  - However: the notion of correlation is not explicitly modeled
[Figure: a choreography among Store, Seller, Warehouse, and Bank]
A Classification of Collaboration Models
- Distributed (choreography): WSCDL, Let’s Dance, Conv. Protocol, Choreography4Artifacts
- Centralized (orchestration): BPEL, BPMN, YAWL, Roman Services, HUB
- Data-model axis: no logical data model; logical data model with centralized data; logical data model with distributed data
Two Key Aspects of Choreography Languages
- Data: if the amount is less than $100, then a fulfillment instance can ship even if the check is not received
- Instance-level correlation: Which process instances are correlated during the runtime? Who sends messages to whom?
- Use of artifacts at each node can provide systematic support for correlations [Sun-Xu-S. ICSOC12]
[Figure: control flow and data among instances O1, P1, P2, P3, M1, F1, F2 of the Order, Purchase, Payment, and Fulfillment processes]
Correlation Diagrams
- Two artifact instances are correlated if they are involved in a common collaborative BP instance
  - Messaging only between correlated instances
- Correlations of a collaborative BP are defined in a diagram, with one artifact as the root or primary process
  - A directed edge indicates creation of BP instance(s)
  - Cardinality constraints are also defined
  - Some syntactic restrictions (acyclic, “1” on root, …)
- Correlations can also be derived
[Figure: correlation diagram over the artifacts Order, Purchase, Payment, and Fulfillment of the participants store, seller, bank, and warehouse, with edge cardinalities 1 and m]
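The syntactic restrictions named above (acyclic, a single root, “1” on the root side) are easy to check once a diagram is written down as a list of edges. The edge list below is hypothetical, since the actual edges of the figure are not recoverable from the text, and the encoding as 4-tuples is an assumption made for illustration.

```python
# Hypothetical diagram: (creator, created, cardinality at creator, cardinality at created).
EDGES = [
    ("Order", "Purchase", "1", "m"),
    ("Order", "Payment", "1", "1"),
    ("Purchase", "Fulfillment", "m", "1"),
]


def nodes_of(edges):
    """All artifacts mentioned in the diagram."""
    return {n for src, dst, _c1, _c2 in edges for n in (src, dst)}


def find_root(edges):
    """Return the unique artifact that no edge creates; fail if there is not exactly one."""
    roots = nodes_of(edges) - {dst for _src, dst, _c1, _c2 in edges}
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {sorted(roots)}")
    return roots.pop()


def is_acyclic(edges):
    """Kahn-style check: repeatedly remove nodes with no incoming edges."""
    indegree = {n: 0 for n in nodes_of(edges)}
    for _src, dst, _c1, _c2 in edges:
        indegree[dst] += 1
    ready = [n for n, deg in indegree.items() if deg == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for src, dst, _c1, _c2 in edges:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return removed == len(indegree)


def satisfies_restrictions(edges):
    """Acyclic, single root, and cardinality '1' on the root end of its edges."""
    root = find_root(edges)
    root_ok = all(c1 == "1" for src, _dst, c1, _c2 in edges if src == root)
    return is_acyclic(edges) and root_ok
```

Other restrictions hinted at by the trailing “…” on the slide are not covered by this sketch.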
Interoperation Hub Example
- Participants’ interactions with the hub:
  - Read data from the hub
  - Perform tasks for the hub that progress artifacts along their lifecycle
  - Subscribe for notifications based on conditions, transitions, etc.
- Performed using SOA/REST, but in an unconventional manner
- The hub is data-centric; it supports rather than controls
[Figure: a Hiring Interoperation Hub with the artifacts Job Application and Job Opening, connecting Candidate Pool, Travel Agencies, Hiring Departments, Evaluator Pool, Human Resources (HR), and Reimbursement. Stakeholder organizations have types such as “Hiring Dept”; the system also tracks individual people in each stakeholder organization]
[Hull et al ICSOC09, SRIIGC12]
Discussions and Further Research Challenges
- Unified holistic conceptual models (data + process)
- Design tools, reasoning, views/transformation, process quality
- Runtime issues: data consistency, wf transactions, logging
- Process mining
- Process improvement
- Provenance
- Interoperation
Reasoning and Synthesis Problems
Artifacts (data model) + Activities (services, tasks) + Process Model (e.g., GSM, with rules such as “if C enable …”) ⊨ ϕ (property)
- Verification: given all four elements
- Incremental checking
  - E.g., for process changes
- Synthesis: given artifacts, activities, and a property, construct a process model
- Functional properties and non-functional properties
Interoperation/Collaboration
- Very similar:
  Artifacts (data model) + Activities (services, tasks) + Process Models with communication (rules such as “if C enable …”) ⊨ C (choreography)
- Realization problem: construct process models given the other three
- Also interesting to verify, e.g., LTL+FO properties of collaborative BPs
Process Model Management (Data-Centric Wfs)
- Static analysis is the most studied; approximate or incremental verification is needed
- BI in the DB community, in the absence of workflow models
- Very little research in other categories
- Wf log: a “fuzzy” notion, but central to many problems

Output \ Input    | A repository of workflow schemas | (Raw) log (a sequence of records)
Workflow schemas  | Model manipulations              | Process mining
Execution traces  | Static analysis                  | Log query
Data tables etc.  | Resource planning                | Process mining, BI
Execution Management
- Data management: data mappings
  - Specification, maintenance of consistency
- Workflow transactions and db transactions
- Logging
- Combining workflow and analytics:
  - Workflow execution → Wrangling → Machine Learning → Wf change