Lost in transaction? Strategies to deal with (in)consistency in - - PowerPoint PPT Presentation



slide-1
SLIDE 1

@berndruecker

Lost in transaction?

Strategies to deal with (in)consistency in distributed systems

slide-2
SLIDE 2

Once upon a time:

try {
  tx.begin();
  doA();
  doB();
  tx.commit();
} catch (Exception e) {
  tx.rollback();
}

Or simply:

@Transactional
public void createCustomer(Customer cust) {
  // ...
}

Do A + Do B: all or nothing

slide-3
SLIDE 3

A C I D Atomicity Consistency Isolation Durability

slide-4
SLIDE 4

Distributed systems

slide-5
SLIDE 5

Distributed systems

slide-6
SLIDE 6

Distributed systems

slide-7
SLIDE 7

But there is two-phase commit (XA)!!

TX Coordinator Resource Managers Prepare Phase Commit Phase
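
The two phases can be sketched in a few lines of Java. This is a minimal in-memory illustration, not the XA API: `ResourceManager`, `TwoPhaseCoordinator` and `InMemoryResource` are hypothetical names, and a real coordinator would also have to log its decision durably and deal with timeouts and crashed participants.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant contract; real XA resources expose a richer interface.
interface ResourceManager {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2: make the prepared work permanent
    void rollback();     // phase 2: discard the prepared work
}

// Hypothetical coordinator: commits only if every participant votes yes.
class TwoPhaseCoordinator {
    boolean execute(List<ResourceManager> participants) {
        List<ResourceManager> prepared = new ArrayList<>();
        // Prepare phase: collect votes.
        for (ResourceManager rm : participants) {
            if (rm.prepare()) {
                prepared.add(rm);
            } else {
                // A single "no" vote aborts the whole transaction.
                for (ResourceManager p : prepared) {
                    p.rollback();
                }
                return false;
            }
        }
        // Commit phase: everybody voted yes.
        for (ResourceManager rm : participants) {
            rm.commit();
        }
        return true;
    }
}

// In-memory participant, used for illustration only.
class InMemoryResource implements ResourceManager {
    private final boolean voteYes;
    String state = "INITIAL";

    InMemoryResource(boolean voteYes) {
        this.voteYes = voteYes;
    }

    @Override public boolean prepare() {
        if (voteYes) {
            state = "PREPARED";
        }
        return voteYes;
    }

    @Override public void commit() { state = "COMMITTED"; }

    @Override public void rollback() { state = "ROLLED_BACK"; }
}
```

Even in this toy version, every participant has to stay reachable and hold its prepared state until the coordinator decides.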

slide-8
SLIDE 8

Pat Helland

Distributed Systems Guru Worked at Amazon, Microsoft & Salesforce

slide-9
SLIDE 9

Pat Helland

Grown-Ups Don’t Use Distributed Transactions

Distributed Systems Guru Worked at Amazon, Microsoft & Salesforce

slide-10
SLIDE 10

Starbucks does not use two phase commit

https://www.enterpriseintegrationpatterns.com/ramblings/18_starbucks.html Photo by John Ingle

slide-11
SLIDE 11

Eric Brewer

Atomicity Consistency Isolation Durability

http://pld.cs.luc.edu/courses/353/spr11/notes/brewer_keynote.pdf

slide-12
SLIDE 12

That means

Consistent → Do A (local ACID) → temporarily inconsistent → Do B (local ACID) → eventually consistent again (over time t)

Scope of local ACID: 1 (micro-)service, 1 aggregate, 1 program, 1 resource

Violates the „I“ of ACID
slide-13
SLIDE 13

You might know this from:

Consistent → Do A → temporarily inconsistent → Do B → eventually consistent again (over time t)

Photo by Gerhard51, available under Creative Commons CC0 1.0 license.

slide-14
SLIDE 14

„Building on Quicksand“ Paper

A C I D 2.0

Pat Helland

slide-15
SLIDE 15

ACID 2.0:

Associative: (a + b) + c = a + (b + c)
Commutative: a + b = b + a
Idempotent: f(x) = f(f(x))
Distributed

„Building on Quicksand“ Paper

Pat Helland
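
One way to read these properties: if applying an update is associative, commutative and idempotent, the result does not depend on the order or the number of deliveries. Set union is a simple operation with all three properties. The sketch below is only an illustration; `AcidTwoDemo` and `merge` are hypothetical names and do not come from the paper.

```java
import java.util.Set;
import java.util.TreeSet;

class AcidTwoDemo {
    // Set union is associative, commutative and idempotent.
    static Set<String> merge(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("order-1");
        Set<String> b = Set.of("order-2");
        Set<String> c = Set.of("order-3");

        // Associative: grouping does not matter.
        System.out.println(merge(merge(a, b), c).equals(merge(a, merge(b, c))));
        // Commutative: order does not matter.
        System.out.println(merge(a, b).equals(merge(b, a)));
        // Idempotent: delivering b a second time changes nothing.
        System.out.println(merge(merge(a, b), b).equals(merge(a, b)));
    }
}
```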

slide-16
SLIDE 16

Photo by pixabay, available under Creative Commons CC0 1.0 license.

slide-17
SLIDE 17

Requirement: Idempotency of services!

Photo by pixabay, available under Creative Commons CC0 1.0 license.

slide-18
SLIDE 18

Requirement: Idempotency of services!

Photo by Chr.Späth, available under Public Domain.

slide-19
SLIDE 19

Example

Credit Card Payment

charge

slide-20
SLIDE 20

Strategy: retry

Credit Card Payment

Charge Credit Card (cardNumber, amount): not idempotent
Charge Credit Card (cardNumber, amount, transactionId): idempotent

The charge has to be idempotent

charge
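
A minimal sketch of how a transactionId can make the charge safe to retry, assuming the provider remembers which transactions it has already processed. `CreditCardService` and its in-memory maps are hypothetical; a real service would keep this record in durable storage.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory credit card service, for illustration only.
class CreditCardService {
    // transactionId -> amount in cents of the charge already applied
    private final Map<String, Long> processed = new HashMap<>();
    // cardNumber -> total amount charged in cents
    private final Map<String, Long> chargedPerCard = new HashMap<>();

    // Idempotent: repeating the call with the same transactionId has no further effect.
    synchronized void charge(String cardNumber, long amountInCents, String transactionId) {
        if (processed.containsKey(transactionId)) {
            return; // duplicate delivery or client retry
        }
        processed.put(transactionId, amountInCents);
        chargedPerCard.merge(cardNumber, amountInCents, Long::sum);
    }

    synchronized long totalCharged(String cardNumber) {
        return chargedPerCard.getOrDefault(cardNumber, 0L);
    }
}
```

With this in place, a client that is unsure whether its request arrived can simply send the same request again.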

slide-21
SLIDE 21

Distributed

slide-22
SLIDE 22

It is impossible to differentiate certain failure scenarios:

Independent of communication style!

Service Provider Client

slide-23
SLIDE 23

Strategy: Cleanup

Credit Card Payment

charge
Make sure it is not charged!
Cancel charge (cardNumber, amount, transactionId)
Raise payment failed
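
The cleanup strategy could look roughly like this in code. `CreditCardApi`, `PaymentService` and `PaymentFailedException` are hypothetical names, and the sketch assumes that `cancelCharge` is idempotent and also succeeds when the charge never reached the provider.

```java
// Hypothetical remote API of the credit card service.
interface CreditCardApi {
    void charge(String cardNumber, long amountInCents, String transactionId) throws Exception;

    // Assumed to be idempotent and to succeed even if the charge never arrived.
    void cancelCharge(String transactionId) throws Exception;
}

class PaymentFailedException extends Exception {
    PaymentFailedException(String message, Throwable cause) {
        super(message, cause);
    }
}

class PaymentService {
    private final CreditCardApi api;

    PaymentService(CreditCardApi api) {
        this.api = api;
    }

    void pay(String cardNumber, long amountInCents, String transactionId)
            throws PaymentFailedException {
        try {
            api.charge(cardNumber, amountInCents, transactionId);
        } catch (Exception chargeError) {
            // The outcome is unknown: the request or the response may have been lost.
            // Make sure it is not charged before reporting the failure.
            try {
                api.cancelCharge(transactionId);
            } catch (Exception cancelError) {
                // The cleanup itself failed; this case needs a stateful retry.
                chargeError.addSuppressed(cancelError);
            }
            throw new PaymentFailedException("payment failed for " + transactionId, chargeError);
        }
    }
}
```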

slide-24
SLIDE 24

Some communication challenges require state.

slide-25
SLIDE 25

Strategy: Stateful retry

Credit Card Payment

charge

slide-26
SLIDE 26

Strategy: Stateful retry

Credit Card Payment

charge Make sure it is not charged!
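
A stateful retry keeps track of what is still pending, so a retry can happen later instead of blocking the caller. The sketch below is a simplification: the `LinkedHashMap` stands in for durable storage, `runDueRetries` would be triggered by a scheduler, and `ChargeCall` and `StatefulRetry` are hypothetical names. It assumes the retried call is idempotent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical remote call; throws when it failed or the outcome is unknown.
interface ChargeCall {
    void charge(String transactionId) throws Exception;
}

class StatefulRetry {
    enum Status { PENDING, DONE, FAILED }

    // Retry state per transaction; a real system would persist this.
    private static final class RetryState {
        Status status = Status.PENDING;
        int attempts = 0;
    }

    private final Map<String, RetryState> store = new LinkedHashMap<>();
    private final ChargeCall call;
    private final int maxAttempts;

    StatefulRetry(ChargeCall call, int maxAttempts) {
        this.call = call;
        this.maxAttempts = maxAttempts;
    }

    void submit(String transactionId) {
        store.putIfAbsent(transactionId, new RetryState());
    }

    // Makes one attempt for every pending transaction.
    void runDueRetries() {
        for (Map.Entry<String, RetryState> e : store.entrySet()) {
            RetryState state = e.getValue();
            if (state.status != Status.PENDING) {
                continue;
            }
            state.attempts++;
            try {
                call.charge(e.getKey());
                state.status = Status.DONE;
            } catch (Exception ex) {
                if (state.attempts >= maxAttempts) {
                    // Give up; this is the point where cleanup takes over.
                    state.status = Status.FAILED;
                }
            }
        }
    }

    Status statusOf(String transactionId) {
        RetryState state = store.get(transactionId);
        return state == null ? null : state.status;
    }
}
```

Once the state is stored, waiting minutes or hours between attempts becomes feasible, which is hard to do with an in-memory retry loop.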

slide-27
SLIDE 27

Warning: Contains Opinion

slide-28
SLIDE 28

Berlin, Germany

bernd.ruecker@camunda.com @berndruecker

Bernd Ruecker

Co-founder and Chief Technologist of Camunda

slide-29
SLIDE 29

Let‘s use a lightweight OSS workflow engine for this:

slide-30
SLIDE 30

Payment

Stateful retry

Credit Card

REST

slide-31
SLIDE 31

Stateful retry & cleanup

Credit Card Payment

REST

Cancel charge

slide-32
SLIDE 32

Live hacking

https://github.com/flowing/flowing-retail/tree/master/rest

slide-33
SLIDE 33

Embedded Engine Example (Java)

https://blog.bernd-ruecker.com/architecture-options-to-run-a-workflow-engine-6c2419902d91

slide-34
SLIDE 34

Remote Engine Example (Polyglot)

https://blog.bernd-ruecker.com/architecture-options-to-run-a-workflow-engine-6c2419902d91

slide-35
SLIDE 35

A relatively common pattern

Service (e.g. Go) Kafka / Rabbit RDBMS

  • 1. Receive
  • 2. Business Logic
  • 3. Send
  • 4. Send additional events

response ? ACK

slide-36
SLIDE 36

„Can this handle 15k requests per second?“

slide-37
SLIDE 37

„Yes.“

slide-38
SLIDE 38
slide-39
SLIDE 39
slide-40
SLIDE 40

https://blogs.msdn.microsoft.com/pathelland/2007/05/15/memories-guesses-and-apologies/

slide-41
SLIDE 41

Compensation – the classical example Saga

book hotel book car book flight cancel hotel cancel car

1. 2. 3. 5. 6.

In case of failure trigger compensations book trip
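
The compensation logic can be sketched as a list of steps, each paired with the action that undoes it. This is a minimal in-memory illustration; the `Saga` class is a hypothetical name, and a real implementation would persist its progress so that it can resume after a crash.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class Saga {
    // One saga step: an action plus the compensation that undoes it.
    private static final class Step {
        final Runnable action;
        final Runnable compensation;

        Step(Runnable action, Runnable compensation) {
            this.action = action;
            this.compensation = compensation;
        }
    }

    private final List<Step> steps = new ArrayList<>();

    Saga step(Runnable action, Runnable compensation) {
        steps.add(new Step(action, compensation));
        return this;
    }

    // Runs the steps in order. If one fails, the completed steps are
    // compensated in reverse order and false is returned.
    boolean execute() {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step);
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

For the trip example, the steps would be book hotel / cancel hotel, book car / cancel car and book flight / cancel flight. If booking the flight fails, the car is cancelled first and the hotel second.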

slide-42
SLIDE 42

2 alternative approaches: choreography & orchestration

slide-43
SLIDE 43

Event-driven choreography

Hotel Flight Car Trip

Trip booked Flight booked Trip requested Hotel booked Car booked Request trip

slide-44
SLIDE 44

Event-driven choreography

Hotel Flight Car Trip

Trip failed Trip requested Hotel booked Car booked Request trip Flight failed Car canceled Hotel canceled Perform undo (cancel car booking) Perform undo (cancel hotel)
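
To make the event chain concrete, here is a small sketch with a synchronous in-memory event bus standing in for a message broker. `EventBus` and `ChoreographyDemo` are hypothetical names, and the order in which the services react (hotel, then car, then flight) is an assumption. Each subscription represents one service that knows only the event it listens to.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical synchronous in-memory event bus, standing in for a broker.
class EventBus {
    private final Map<String, List<Runnable>> subscribers = new HashMap<>();
    final List<String> published = new ArrayList<>();

    void subscribe(String event, Runnable handler) {
        subscribers.computeIfAbsent(event, k -> new ArrayList<>()).add(handler);
    }

    void publish(String event) {
        published.add(event);
        for (Runnable handler : subscribers.getOrDefault(event, List.of())) {
            handler.run();
        }
    }
}

class ChoreographyDemo {
    static EventBus wire(boolean flightAvailable) {
        EventBus bus = new EventBus();
        // Happy path: each service reacts to another service's event.
        bus.subscribe("Trip requested", () -> bus.publish("Hotel booked"));   // hotel service
        bus.subscribe("Hotel booked", () -> bus.publish("Car booked"));       // car service
        bus.subscribe("Car booked",
                () -> bus.publish(flightAvailable ? "Flight booked" : "Flight failed")); // flight service
        bus.subscribe("Flight booked", () -> bus.publish("Trip booked"));     // trip service
        // Undo path: the same services also listen to failure events.
        bus.subscribe("Flight failed", () -> bus.publish("Car canceled"));    // car service undoes its booking
        bus.subscribe("Car canceled", () -> bus.publish("Hotel canceled"));   // hotel service undoes its booking
        bus.subscribe("Hotel canceled", () -> bus.publish("Trip failed"));    // trip service
        return bus;
    }

    public static void main(String[] args) {
        EventBus bus = wire(false);
        bus.publish("Trip requested");
        System.out.println(bus.published);
    }
}
```

The overall flow exists only implicitly, spread across the subscriptions; no single place in the code describes it.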

slide-45
SLIDE 45

The danger is that it's very easy to make nicely decoupled systems with event notification, without realizing that you're losing sight of that larger-scale flow, and thus set yourself up for trouble in future years.

https://martinfowler.com/articles/201701-event-driven.html

slide-46
SLIDE 46

The danger is that it's very easy to make nicely decoupled systems with event notification, without realizing that you're losing sight of that larger-scale flow, and thus set yourself up for trouble in future years.

https://martinfowler.com/articles/201701-event-driven.html

slide-47
SLIDE 47

The danger is that it's very easy to make nicely decoupled systems with event notification, without realizing that you're losing sight of that larger-scale flow, and thus set yourself up for trouble in future years.

https://martinfowler.com/articles/201701-event-driven.html

slide-48
SLIDE 48

Classical example Saga

book hotel book car book flight cancel hotel cancel car

1. 2. 3. 5. 6.

In case of failure trigger compensations book trip

slide-49
SLIDE 49

If your transaction involves 2 to 4 steps, choreography might be a very good fit. However, this approach can rapidly become confusing if you keep adding extra steps in your transaction as it is difficult to track which services listen to which events. Moreover, it also might add a cyclic dependency between services as they have to subscribe to one another’s events.

Denis Rosa Couchbase https://blog.couchbase.com/saga-pattern-implement-business-transactions-using-microservices-part/

slide-50
SLIDE 50

Microservice pioneers have become aware

slide-51
SLIDE 51

Implementing changes in the process

Hotel Flight Car Trip

Trip failed Trip requested Hotel booked Car booked Request trip Flight failed Car canceled Hotel canceled

We have a new basic agreement with the car rental agency and can cancel for free within 1 hour – do that first!

slide-52
SLIDE 52

Implementing changes in the process

Hotel Flight Car Trip

Trip failed Trip requested Hotel booked Car booked Request trip Flight failed Car canceled Hotel canceled

You have to adjust all services and redeploy at the same time!

We have a new basic agreement with the car rental agency and can cancel for free within 1 hour – do that first!

slide-53
SLIDE 53

Photo by born1945, available under Creative Commons BY 2.0 license.

slide-54
SLIDE 54

What we wanted

Photo by Lijian Zhang, available under Creative Commons SA 2.0 License and Pedobear19 / CC BY-SA 4.0

slide-55
SLIDE 55

Orchestration

Hotel Flight Car Trip

Trip booked Request trip Book hotel Hotel booked Car booked Flight booked Book car Book flight

slide-56
SLIDE 56

Orchestration

Hotel Flight Car Trip

Trip booked Request trip Book hotel Hotel booked Car booked Flight booked Book car Book flight

We have a new basic agreement with the car rental agency and can cancel for free within 1 hour – do that first!

You have to adjust one service and redeploy only this one!

slide-57
SLIDE 57

Describe orchestration with BPMN

Trip

Trip booked Request trip

slide-58
SLIDE 58

The workflow is part of the service

Trip

slide-59
SLIDE 59

The workflow is part of the service

Trip Payment

slide-60
SLIDE 60

Caitie McCaffrey | @caitie

slide-61
SLIDE 61

Graphical models?

slide-62
SLIDE 62

Clemens Vasters Architect at Microsoft http://vasters.com/archive/Sagas.html

slide-63
SLIDE 63

Clemens Vasters Architect at Microsoft http://vasters.com/archive/Sagas.html

slide-64
SLIDE 64

Clemens Vasters Architect at Microsoft http://vasters.com/archive/Sagas.html

slide-65
SLIDE 65

BPMN

Business Process Model and Notation ISO Standard

slide-66
SLIDE 66

Living documentation for long-running behaviour

slide-67
SLIDE 67

Visual HTML reports for test cases

slide-68
SLIDE 68

BizDevOps

slide-69
SLIDE 69

Saga with AWS Step Functions

https://theburningmonk.com/2017/07/applying-the-saga-pattern-with-aws-lambda-and-step-functions/

slide-70
SLIDE 70

Thoughts on the state machine | workflow engine market

slide-71
SLIDE 71

Thoughts on the state machine | workflow engine market

  • OSS Workflow or Orchestration Engines: Camunda, Zeebe, jBPM, Activiti, Mistral, …
  • Stack Vendors, Pure Play BPMS, Low Code Platforms: PEGA, IBM, SAG, …
  • Homegrown frameworks to scratch an itch: Uber, Netflix, Airbnb, ING, …
  • Integration Frameworks: Apache Camel, Ballerina, …
  • Cloud Offerings: AWS Step Functions, Azure Durable Functions, …
  • Data Pipelines: Apache Airflow, Spring Data Flow, …

slide-72
SLIDE 72

  • Does it support stateful operations?
  • Does it support the necessary flow logic?
  • Does it support BizDevOps?
  • Does it scale?

slide-73
SLIDE 73

My personal pro-tip for a shortlist ;-)

  • OSS Workflow or Orchestration Engines
  • Stack Vendors, Pure Play BPMS, Low Code Platforms
  • Homegrown frameworks to scratch an itch
  • Integration Frameworks
  • Cloud Offerings
  • Data Pipelines

Camunda & Zeebe

slide-74
SLIDE 74

Recap

  • Grown-ups don‘t use distributed transactions but eventual consistency
  • Idempotency is super important in distributed systems
  • Some consistency challenges require state
  • Know some strategies:
      • Stateful retry & cleanup
      • Saga / Compensation
      • Apologies
slide-75
SLIDE 75

Thank you!

slide-76
SLIDE 76

Contact: mail@berndruecker.io, @berndruecker
Slides: https://berndruecker.io
Blog: https://medium.com/berndruecker
Code: https://github.com/berndruecker

https://www.infoq.com/articles/events-workflow-automation
https://www.infoworld.com/article/3254777/application-development/3-common-pitfalls-of-microservices-integrationand-how-to-avoid-them.html
https://thenewstack.io/5-workflow-automation-use-cases-you-might-not-have-considered/