getting comfortable in prod

getting comfortable in prod to improve your life in dev @cyen



  1. getting comfortable in prod to improve your life in dev @cyen @honeycombio

  2. first, some background…

  3. Christine DEV

  4. DEV WRITE → TEST → COMMIT → WRITE → TEST → COMMIT → WRITE → TEST → COMMIT → WRITE → TEST → COMMIT → WRITE → TEST → COMMIT → WRITE → TEST → COMMIT → WRITE → TEST → COMMIT → WRITE → TEST → COMMIT

  5. DEV OPS WRITE → TEST → COMMIT → RELEASE 💦 → DEBUG → FIX

  6. 💦 DEV vs. OPS: "Works on my machine" and "The only good diff is a red diff"

  7. "Observation 1: Change is the most common trigger" —Subbu Allamaraju, Expedia, Feb 2019 
 https://m.subbu.org/incidents-trends-from-the-trenches-e2f8497d52ed

  8. [Diagram: THEN vs. NOW architecture. Labels: APP, API GATEWAY, USER, BILLING, PARTNER, PAYMENT, INTERNAL, TXN, NOTIFICATION SYSTEM, MGMT, WEB UI, REST API]

  9. DEV vs. OPS: "Works on my machine" and "The only good diff is a red diff"

  10. DEV OPS
 THE FIRST WAVE: getting ops folks to code
 THE SECOND WAVE: teaching devs to own code in production

  11. observability DEV OPS it’s all about sharing SOFTWARE OWNERSHIP

  12. observability, a.k.a. understanding the behavior of a system based on knowledge of its external outputs; a.k.a. "what is my software doing, and why is it behaving that way?"

  13. monitoring vs. observability
 ▸ monitoring: The system as black box magic. Thresholds, alerts, system signals like CPU and memory. Checking and rechecking for known bad behaviors.
 ▸ observability: The system as a living, adaptable thing. A culture of instrumentation and metadata rather than strictly-defined counters. Being able to tease out previously-unknown bad behaviors and outliers.

  14. DEV OPS WRITE → TEST → COMMIT → RELEASE 💦 → DEBUG → FIX

  15. DEV OPS WRITE → TEST → COMMIT → RELEASE → OBSERVE TEST OBSERVE

  16. DEV OPS MAKE HAUNTED GRAVEYARDS LESS SCARY

  17. … why devs, again?

  18. The Software DEV / TEST Process
 ▸ Design documents
 ▸ Architecture review
 ▸ Test-driven development
 ▸ Integration tests
 ▸ Code review
 ▸ Continuous integration
 ▸ Continuous deployment
 ▸ 🎊🥃🍿🎋
 ▸ Observe our code in production

  19. --- FAIL: TestUnitTest (0.00s)
 talk_test.go:10:
 expected: 4 (type int) ← EXPECTED
 actual: 5 (type int) ← ACTUAL

  20. 💦 DEV vs. OPS: "Works on my machine" and "The only good diff is a red diff"

  21. DEV PROD still 
 observability

  22. prod, part of the dev process?

  23. when deciding… The Software DEV Process
 ▸ Design documents
 ▸ Architecture review
 ▸ Test-driven development
 ▸ Integration tests
 ▸ Code review
 ▸ Continuous integration
 ▸ Continuous deployment
 ▸ 🎊🥃🍿🎋
 ▸ (Wait for exception tracker to complain)
 Decisions along the way: WHAT to build, HOW TO build it, WHETHER it works ("test in prod")

  24. when deciding… WHAT
 ▸ Locally: log lines, printfs, debuggers attached to our IDEs
 ▸ What’s causing our code to deviate from expectations?
 ▸ Stop "pulling straws": quantify pain, and start prioritizing.

  25. when deciding… HOW TO
 ▸ Know what "normal" really is
 ▸ Events (instrumentation) can be like DEBUG statements in prod
 ▸ What and how we build should be informed by reality
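
One way to picture "events as DEBUG statements in prod" is the sketch below. It is only an illustration, not the speaker's code or any vendor's API: the `Event` type, `handleRequest`, and the field names are all hypothetical. The idea shown is that context is attached as fields at the spots where a local debug print would otherwise go, and one event is emitted per unit of work.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event is a hypothetical "wide event": one bag of fields per unit of work.
type Event map[string]any

// handleRequest is a made-up handler. Instead of scattering debug prints,
// it records what it knows as fields on a single event.
func handleRequest(customerID string, payload []byte) Event {
	ev := Event{}
	start := time.Now()

	ev["customer_id"] = customerID
	ev["payload_size"] = len(payload)

	// ... the real work would happen here ...

	ev["dur_ms"] = float64(time.Since(start).Microseconds()) / 1000.0
	return ev
}

func main() {
	ev := handleRequest("8bd3acf2", []byte("hello"))

	// Emit the event as one JSON line; a real system would ship it to
	// whatever backend the team uses.
	out, err := json.Marshal(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```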

  26. when deciding… WHETHER
 ▸ Complex systems have an infinitely long list of black swan failure scenarios
 ▸ "Test in Production" to experiment and check hypotheses
 ▸ Feature flags + observability = 💜
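
A minimal sketch of why feature flags and observability pair well, under assumed names (`render`, the `new_render_path` flag, and the `Event` map are hypothetical): the flag state is recorded on the event, so production behavior can later be compared with the flag on versus off.

```go
package main

import "fmt"

// Event is a hypothetical bag of fields describing one unit of work.
type Event map[string]any

// render picks a code path based on a flag and records the flag state on
// the event so the two paths can be compared in production data.
func render(flags map[string]bool, ev Event) string {
	useNew := flags["new_render_path"]
	ev["flag.new_render_path"] = useNew

	if useNew {
		ev["render_path"] = "new"
		return "rendered with new path"
	}
	ev["render_path"] = "old"
	return "rendered with old path"
}

func main() {
	for _, on := range []bool{false, true} {
		ev := Event{}
		out := render(map[string]bool{"new_render_path": on}, ev)
		fmt.Println(out, ev)
	}
}
```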

  27. but this is hard.

  28. make prod feel more like dev

  29. TOOLS SHOULD SPEAK MY LANGUAGE
 ▸ As a dev, traditional monitoring tools don't tie back to the concepts I deal with in my code
 Example fields: $YOUR_BIZ-relevant ID, AWS availability zone, time to render, API endpoint, CPU utilization, payload size, kafka partition, build ID, client OS, Cassandra hostname

  30. TOOLS SHOULD SPEAK MY LANGUAGE
 ▸ As a dev, traditional monitoring tools don't tie back to the concepts I deal with in my code
 [Chart: customer ID (e.g. 8bd3acf2, 7e7ea1d0, 1528afb3, 2f67a581) broken down by AWS availability zone (us-east-1, eu-west-1, us-west-2, eu-central-1, us-west-1)]
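
The breakdown on this slide (customer ID by availability zone) can be sketched as a simple group-and-count over events. This is an illustrative sketch only: the `Event` struct and `countByAZAndCustomer` are hypothetical names, and the sample rows in `main` are made up, reusing IDs and zone names that appear on the slide.

```go
package main

import "fmt"

// Event is a hypothetical record carrying two of the fields a dev cares about.
type Event struct {
	AZ         string
	CustomerID string
}

// countByAZAndCustomer groups events by availability zone, then by customer ID.
func countByAZAndCustomer(events []Event) map[string]map[string]int {
	counts := make(map[string]map[string]int)
	for _, ev := range events {
		if counts[ev.AZ] == nil {
			counts[ev.AZ] = make(map[string]int)
		}
		counts[ev.AZ][ev.CustomerID]++
	}
	return counts
}

func main() {
	// Made-up sample data for illustration.
	events := []Event{
		{"us-east-1", "7e7ea1d0"},
		{"us-east-1", "7e7ea1d0"},
		{"us-east-1", "394817e6"},
		{"eu-west-1", "2f67a581"},
	}
	counts := countByAZAndCustomer(events)
	fmt.Println(counts["us-east-1"]["7e7ea1d0"])
}
```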

  31. TOOLS SHOULD SPEAK MY LANGUAGE ▸ As a dev, traditional monitoring tools don't tie back to the concepts I deal with in my code AND LET ME ITERATE

  32. SHARE PATTERNS WHERE POSSIBLE ▸ Tracing helps production feel even more familiar: can map a trace directly to my code structure

  33. PROD SHOULD FEEL LIKE DEVELOPMENT?

  34. CHANGE CAN BE INCREMENTAL
 2019-01-25T01:30:23.743Z Enqueued task
 2019-01-25T01:30:24.120Z Task processed, returning 42 entries
 2019-01-25T01:30:24.212Z Task complete (email sent to foobar@example.com)
 2019-01-25T01:30:23.743Z Enqueued task task_id=72 type=enqueue target=email
 2019-01-25T01:30:29.953Z Task timed out after 6.01 seconds task_id=72 type=process
 Timestamp=2019-01-25T01:30:29.953Z target=email message=Task timed out after 6.01 seconds queue_dur_ms=200 task_id=72 timeout_dur_ms=6010

  35. CHANGE CAN BE INCREMENTAL
 2019-01-25T01:30:23.743Z Enqueued task task=72
 2019-01-25T01:30:24.120Z Enqueued task task=74
 2019-01-25T01:30:24.212Z Task processed, returning 42 entries task=74
 2019-01-25T01:30:26.014Z Task complete (email sent to foobar@example.com) task=74
 2019-01-25T01:30:26.214Z Enqueued task task=77
 2019-01-25T01:30:24.120Z Task errored: unknown constant ::Fixnum task=77
 2019-01-25T01:30:29.953Z Task timed out after 6.01 seconds task=72
 2019-01-25T01:30:32.762Z Enqueued task task=78
 2019-01-25T01:30:34.243Z Task processed, returning 0 entries task=78
 2019-01-25T01:30:34.243Z Task complete, (email sent to bazqux@example.com) task=78
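
Once every line carries a task identifier, interleaved output can be regrouped per task. The sketch below assumes the `task=<id>` suffix format shown on the slide; `groupByTask` is a hypothetical helper, not part of any logging library.

```go
package main

import (
	"fmt"
	"strings"
)

// groupByTask splits each line at its trailing " task=<id>" suffix and
// collects the remaining text under that id. Lines without the suffix
// are skipped.
func groupByTask(lines []string) map[string][]string {
	const marker = " task="
	grouped := make(map[string][]string)
	for _, line := range lines {
		i := strings.LastIndex(line, marker)
		if i < 0 {
			continue
		}
		id := line[i+len(marker):]
		grouped[id] = append(grouped[id], line[:i])
	}
	return grouped
}

func main() {
	lines := []string{
		"2019-01-25T01:30:23.743Z Enqueued task task=72",
		"2019-01-25T01:30:24.120Z Enqueued task task=74",
		"2019-01-25T01:30:24.212Z Task processed, returning 42 entries task=74",
		"2019-01-25T01:30:29.953Z Task timed out after 6.01 seconds task=72",
	}
	// Show the full story of task 72, which is interleaved with task 74.
	for _, l := range groupByTask(lines)["72"] {
		fmt.Println(l)
	}
}
```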

  36. at the end of all of this…

  37. 💦 DEV OPS

  38. 💜 OPS DEV

  39. DEV OPS WRITE → TEST → COMMIT → RELEASE → OBSERVE TEST OBSERVE

  40. OPS: share the great responsibility (and great power!)
 DEVS: embrace observability, bring production closer to development.

  41. thanks! ASK NEW QUESTIONS. SHIP BETTER SOFTWARE. @cyen @honeycombio CURIOUS? TRY play.honeycomb.io
