The dos and don’ts of task queues, EuroPython 2019, Petr Stehlík



slide-1
SLIDE 1

The dos and don’ts of task queues

EuroPython 2019 Petr Stehlík @petrstehlik

slide-2
SLIDE 2

Petr Stehlík Python developer @ Kiwi.com Finance tribe

$ whoami

slide-3
SLIDE 3
  • 1. Task queues
  • 2. The story
  • 3. Examples vs. reality
  • 4. Final setup
  • 5. How we do it in Kiwi.com
  • 6. Lessons learned
  • 7. Q&A

Outline

slide-4
SLIDE 4

Task queues

slide-5
SLIDE 5

“parallel execution of discrete tasks without blocking”

  • Not just Celery
  • Major parts
      ○ Queue
      ○ Task – unit of work
      ○ Producer
      ○ Consumer

What is a task queue

Source: DENÍK/Michal Kovář
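
The four parts named on this slide can be sketched with nothing but the Python standard library. This is a minimal illustration, not how Celery or RQ are implemented; the `send_email` task and the addresses are hypothetical.

```python
import queue
import threading


def send_email(recipient):
    """A task: one discrete unit of work (hypothetical example)."""
    return f"sent to {recipient}"


def consumer(tasks, results):
    """Consumer: takes tasks off the queue and executes them."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: the producer has no more work
            break
        func, args = item
        results.append(func(*args))


def run_demo():
    tasks = queue.Queue()  # the queue
    results = []
    worker = threading.Thread(target=consumer, args=(tasks, results))
    worker.start()

    # Producer: enqueues work and moves on without waiting for each task.
    for recipient in ("alice@example.com", "bob@example.com"):
        tasks.put((send_email, (recipient,)))
    tasks.put(None)

    worker.join()
    return results
```

In a real deployment the queue lives in a broker such as Redis, and the producer and consumer are separate processes.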

slide-6
SLIDE 6
  • Decouple a long-running task from a synchronous call
  • Perform something periodically
  • Break down software into more isolated pieces (when a microservice is too big)
  • Minimize wait time, latency and/or response time
  • Increase throughput of the system

What is a task queue for

slide-7
SLIDE 7

The story

slide-8
SLIDE 8

The story

slide-9
SLIDE 9

“New is always better.”

The story

slide-10
SLIDE 10

“Think outside the box.”

The story

slide-11
SLIDE 11

“I know everything I need.”

The story

slide-12
SLIDE 12

“I can do it better.”

The story

slide-13
SLIDE 13

Examples vs. reality

why it all happened

slide-14
SLIDE 14

Example

Celery/RQ

slide-15
SLIDE 15

Reality

RQ

slide-16
SLIDE 16

Reality

Celery

slide-17
SLIDE 17

Final setup

slide-18
SLIDE 18
  • Python + PostgreSQL
  • Flask
  • Connexion
  • Celery
  • Redis on AWS
  • Multiple deploy targets
  • Logz.io & Datadog
  • Sentry
  • PagerDuty

Final setup

slide-19
SLIDE 19

How we do it in Kiwi.com

In finance tribe

slide-20
SLIDE 20
  • Python + PostgreSQL
  • Flask/AioHttp
  • Connexion
  • Celery
  • Redis on AWS
  • Multiple deploy targets
  • Logz.io & Datadog
  • Sentry
  • PagerDuty

Kiwi.com | Finance Tribe toolset

slide-21
SLIDE 21
  • Python
      ○ New projects always 3.6+
      ○ Old projects transitioning from 2.7 to 3.6
      ○ Monolith -> microservice architecture
  • Flask/AioHttp
      ○ Our go-to framework
      ○ Boilerplates
      ○ Quick scaffolding
  • Connexion
      ○ OpenAPI 3
      ○ Token-based authentication & authorization

Kiwi.com | Finance Tribe toolset

slide-22
SLIDE 22
  • Celery
      ○ Follow the best practices (next section)
  • Redis on AWS
      ○ Reliability
      ○ Easy to deploy

Kiwi.com | Finance Tribe toolset

slide-23
SLIDE 23
  • Multiple deploy targets
      ○ HTTP API
      ○ Workers
      ○ Etc.
      ○ Internal tool for deploying from Gitlab CI
  • Logz.io & Datadog
      ○ Extensive logging
  • Sentry
      ○ When something goes wrong
  • PagerDuty
      ○ When something goes really wrong

Kiwi.com | Finance Tribe toolset

slide-24
SLIDE 24

Lessons learned

slide-25
SLIDE 25

Lessons learned

Use Redis or AMQP broker (never a database)

slide-26
SLIDE 26

Lessons learned

Pass simple objects to the tasks
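
One way to read this advice: send identifiers and primitives, then let the consumer re-fetch the current state. The sketch below assumes a JSON-serialised message format; the `Invoice` class, the `INVOICES` stand-in for a database table, and the `enqueue`/`handle` helpers are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    id: int
    amount_cents: int


# Stand-in for a database table (hypothetical data).
INVOICES = {42: Invoice(id=42, amount_cents=1999)}


def enqueue(task_name, **kwargs):
    """Producer side: serialise the message as a JSON-based broker would."""
    return json.dumps({"task": task_name, "kwargs": kwargs})


def charge_invoice(invoice_id):
    """Consumer side: look the record up by ID to get its current state."""
    invoice = INVOICES[invoice_id]
    return invoice.amount_cents


def handle(message):
    """Dispatch a serialised message to the matching task function."""
    payload = json.loads(message)
    registry = {"charge_invoice": charge_invoice}
    return registry[payload["task"]](**payload["kwargs"])


# Passing the ID works; passing the Invoice object itself would raise
# TypeError because it is not JSON serialisable.
message = enqueue("charge_invoice", invoice_id=42)
```

Passing the ID also avoids acting on a stale copy if the record changes between enqueueing and execution.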

slide-27
SLIDE 27

Lessons learned

Do not wait for tasks inside tasks
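
A task that blocks on another task's result ties up a worker and can deadlock once every worker is waiting. A common alternative is to have the first task enqueue its follow-up step. The sketch below shows the idea with a single in-process worker; the task names and the hard-coded rates are hypothetical. In Celery, canvas primitives such as `chain` serve the same purpose.

```python
from collections import deque

RESULTS = []


def fetch_rate(currency, enqueue):
    """Step 1: look up a rate (hard-coded, hypothetical values)."""
    rate = {"EUR": 1.0, "CZK": 25.0}[currency]
    # Do not block here waiting for another task; enqueue the next step.
    enqueue(convert, 10, rate)


def convert(amount_eur, rate, enqueue):
    """Step 2: runs later as its own task, with step 1's result as input."""
    RESULTS.append(amount_eur * rate)


def run_worker(first_task, *args):
    """Drain the queue with one worker; no task ever waits for another."""
    pending = deque([(first_task, args)])

    def enqueue(func, *func_args):
        pending.append((func, func_args))

    while pending:
        func, func_args = pending.popleft()
        func(*func_args, enqueue)


run_worker(fetch_rate, "CZK")
```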

slide-28
SLIDE 28

Lessons learned

Set retry limit
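
Without a limit, a task that can never succeed is retried forever and keeps occupying workers. A minimal sketch of a bounded retry loop, independent of any task queue library (`TransientError` and `run_with_retries` are hypothetical names):

```python
class TransientError(Exception):
    """Hypothetical error that is worth retrying."""


def run_with_retries(task, max_retries=3):
    """Run task(); retry on TransientError at most max_retries times."""
    attempt = 0
    while True:
        try:
            return task()
        except TransientError:
            if attempt >= max_retries:
                raise  # give up and surface the failure
            attempt += 1


calls = {"count": 0}


def always_failing():
    """A task that never succeeds, used to show that the loop terminates."""
    calls["count"] += 1
    raise TransientError("upstream unavailable")
```

With `max_retries=3` the task runs four times in total (one initial attempt plus three retries) before the error is raised to the caller.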

slide-29
SLIDE 29

Lessons learned

Use autoretry_for

slide-30
SLIDE 30

Lessons learned

Use retry_backoff=True and retry_jitter=True
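
The idea behind these two options is to wait exponentially longer between retries and to randomise the wait so that many failed tasks do not all retry at the same moment. The function below sketches that calculation; it is an illustration rather than Celery's own code, and the `factor` and `maximum` defaults are assumptions chosen for the example.

```python
import random


def backoff_delay(retries, factor=1, maximum=600, jitter=True):
    """Seconds to wait before the next retry; `retries` counts from 0."""
    # Exponential growth, capped so the delay never exceeds `maximum`.
    delay = min(maximum, factor * (2 ** retries))
    if jitter:
        # Full jitter: pick a random whole number between 0 and the delay.
        delay = random.randrange(delay + 1)
    return delay
```

Without jitter the delays for the first five retries are 1, 2, 4, 8 and 16 seconds; with jitter each delay falls somewhere between 0 and that value.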

slide-31
SLIDE 31

Lessons learned

Set hard and soft time limits

slide-32
SLIDE 32

Lessons learned

Use bind for a bit of extra oomph (logs, handling, etc.)

slide-33
SLIDE 33

Lessons learned

Use separate queues for demanding tasks (set priorities)

slide-34
SLIDE 34

Lessons learned

Prefer idempotency and atomicity

“Idempotence is the property of certain operations in mathematics and computer science, that can be applied multiple times without changing the result beyond the initial application.” (Wikipedia)

“Atomic operation appears to the rest of the system to occur instantaneously. Atomicity is a guarantee of isolation from concurrent processes.” (Wikipedia)
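
Idempotency matters because a queue may deliver the same task more than once, for example after a retry. One possible approach is to record which operations have already been applied. The sketch below uses in-memory structures; the account, the payment ID and the `apply_payment` function are hypothetical.

```python
balances = {"acc-1": 100}
processed = set()


def apply_payment(payment_id, account, amount):
    """Idempotent: a repeated delivery of the same payment_id is a no-op."""
    if payment_id in processed:
        return balances[account]
    # In a real system the check above and the two updates below would
    # need to run in a single database transaction to be atomic.
    balances[account] += amount
    processed.add(payment_id)
    return balances[account]


apply_payment("pay-7", "acc-1", 50)
apply_payment("pay-7", "acc-1", 50)  # redelivered after a retry
```

The balance ends at 150, not 200, even though the task ran twice.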
slide-35
SLIDE 35
  • Use Redis or AMQP (RabbitMQ) broker (never a database)
  • Pass simple objects to the tasks
  • Do not wait for tasks inside tasks
  • Set retry limit
  • Use autoretry_for
  • Use retry_backoff=True and retry_jitter=True
  • Set hard and soft time limits
  • Use bind for a bit of extra oomph in tasks (logging, handling, etc.)
  • Use separate queues for demanding tasks (set priorities)
  • Prefer idempotency and atomicity

Lessons learned
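
Several of the Celery-specific items in this list can be combined in a single task declaration. The following is a configuration sketch rather than the setup used at Kiwi.com: the app name, broker URL, queue name, limits, `UpstreamError` and `send_to_accounting` are all assumptions made for illustration.

```python
from celery import Celery
from celery.exceptions import SoftTimeLimitExceeded
from celery.utils.log import get_task_logger

# Hypothetical app name and Redis URL; replace with your own.
app = Celery("finance", broker="redis://localhost:6379/0")

# Route the demanding task to its own queue so it cannot starve quick tasks.
app.conf.task_routes = {"finance.export_invoice": {"queue": "heavy"}}

logger = get_task_logger(__name__)


class UpstreamError(Exception):
    """Hypothetical transient failure of a third-party API."""


def send_to_accounting(invoice_id):
    """Hypothetical stand-in for the real work."""
    return {"invoice_id": invoice_id, "status": "exported"}


@app.task(
    name="finance.export_invoice",
    bind=True,                       # the task receives itself as `self`
    autoretry_for=(UpstreamError,),  # retry only expected, transient errors
    retry_kwargs={"max_retries": 5}, # retry limit
    retry_backoff=True,              # exponentially growing delay
    retry_jitter=True,               # randomise the delay
    soft_time_limit=60,              # raises SoftTimeLimitExceeded in the task
    time_limit=90,                   # hard limit: the worker process is killed
)
def export_invoice(self, invoice_id):
    # `bind=True` makes the request context available for logging.
    logger.info(
        "task %s, attempt %d", self.request.id, self.request.retries + 1
    )
    try:
        return send_to_accounting(invoice_id)  # a plain ID, not an object
    except SoftTimeLimitExceeded:
        logger.warning("invoice %s: soft time limit hit", invoice_id)
        raise
```

A dedicated worker for the demanding queue could then be started with the worker's `-Q heavy` option, while the remaining workers consume the default queue.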

slide-36
SLIDE 36
  • Sharing codebase between producer and consumer (producer must know everything about consumer and vice versa)
  • Use Celery to its full potential -> read Celery’s docs
  • Scalability of 3rd party APIs

Things to consider

slide-37
SLIDE 37

More info @ meet.kiwi.com. Join our Wednesday party at EuroPython and win flight vouchers.

slide-38
SLIDE 38

Meet us at the booth #45

slide-39
SLIDE 39

Any questions?

You can find me at @petrstehlik & petr.stehlik@kiwi.com