Session ID:
The Unicorn Project And The Five Ideals
@RealGeneKim
My Definition of DevOps
The architecture, technical practices, and cultural norms that enable us to increase our ability to deliver applications and services quickly and safely, which enables rapid experimentation and innovation, and the fastest delivery of value to our customers, while ensuring world-class security, reliability, and stability, so that we can win in the marketplace.
Better Value, Sooner, Safer, Happier
Jon Smart, Partner, Enterprise Agility, Deloitte (@jonsmart)
The Downward Spiral
IT Ops And Dev At War
IT Operations
CBS Photo Archive/Star Trek: The Original Series/Getty Images
The Developers
The Product Managers
Source: Flickr: birdsandanchors
Architects
The Problems That Still Remain
▪ Absence of all the invisible structures needed to enable developer productivity
▪ The orthogonal problem of getting data from where it resides to where it needs to be used
▪ Strong opposition to supporting new ways of working
▪ Ambiguity about what behaviors are needed to support a transformation
The Five Ideals
The Business Value Of DevOps Is Even Higher Than We Thought
Elite vs. Low Performers
Metric | Elite | Low | Difference
Deployment Frequency | On-demand (multiple times per day) | Monthly or quarterly | 208x
Deployment Lead Time | < 1 hour | 1 week to 1 month | 106x
Deploy Failure Rate | 0-15% | 46-60% | 7x
Mean Time to Restore | < 1 hour | 1 week to 1 month | 2,604x
Source: Google/DORA: 2019 State Of DevOps Report: https://cloud.google.com/devops/state-of-devops/
High Performers Are More Secure And Controlled
less time spent remediating security issues more time spent
High Performers Win In The Marketplace
more likely to exceed profitability, market share & productivity goals
more likely to achieve mission goals, customer satisfaction, quantity & quality goals
Source: Google/DORA: 2018 State Of DevOps Report: https://cloudplatformonline.com/2018-state-of-devops.html
High Performers Win In The Marketplace
higher employee Net Promoter Score
50% higher market capitalization growth
The Opposite Of Technical Debt Is…
When we can safely, quickly, reliably, securely achieve all the goals, dreams and aspirations of our business…
The Five Ideals
Ideal #1: Locality and Simplicity
The Birth And Death Of Etsy Sprouter
▪ A story about teams of engineers implementing changes
▪ 2008: Devs and DBAs ▪ 2009: Devs and DBAs and Sprouter team ▪ 2010: Devs
Lesson: The Organization and The Architecture Of Our Software Must Be Congruent
Lead Time = 9 months
Source: Damon Edwards (@damonedwards)
Architecture Enables Teams To…
▪ …make large-scale changes to the design of its system without the permission of someone outside the team, or depending on other teams
▪ …complete its work without fine-grained communication and coordination with people outside the team
▪ …deploy and release its product or service on demand, independently of…
▪ …do most of its testing on demand, without requiring an integrated test environment
▪ …perform deployments during normal business hours with negligible downtime
Source: Puppet/DORA: 2017 State Of DevOps Report: https://puppet.com/resources/whitepaper/state-of-devops-report
The First Ideal: A Measure
▪ Bus factor ▪ Lunch factor
How Many People Do You Need To Feed?
▪ Two pizza team ▪ Feeding everyone in the building ▪ Schedule lunch with 43 different people
The First Ideal: Code
▪ Ideal: anyone can implement what they need by looking at one file or module, and make the needed change
▪ Kubernetes sidecars ▪ Spring (http-retry, Dependency Injection) ▪ Aspect Oriented Programming
▪ Not Ideal: to make your needed change, you have to understand and change all the files and modules
The First Ideal: Code
▪ Ideal: changes can be independently implemented and tested, isolated from other components (composability) ▪ Not Ideal: in order for changes to be implemented and tested, the entire system must be present (e.g., integrated test environment)
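One way to picture the ideal above: a component receives its collaborators as arguments, so it can be exercised with a stand-in instead of an integrated test environment. This is a minimal sketch in Python, not something from the talk. The names `Catalog`, `PriceService`, `FakeCatalog`, and `quote` are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


class Catalog(Protocol):
    """Anything that can look up a unit price (hypothetical interface)."""

    def unit_price(self, sku: str) -> float: ...


@dataclass
class PriceService:
    """Hypothetical component; its only dependency is passed in."""

    catalog: Catalog

    def quote(self, sku: str, quantity: int) -> float:
        # Arithmetic on top of whatever catalog was injected.
        return round(self.catalog.unit_price(sku) * quantity, 2)


class FakeCatalog:
    """In-memory stand-in, so no database or other team's service is needed."""

    def __init__(self, prices: dict[str, float]) -> None:
        self._prices = prices

    def unit_price(self, sku: str) -> float:
        return self._prices[sku]


# The component is tested in isolation by injecting the fake.
service = PriceService(catalog=FakeCatalog({"widget": 2.50}))
total = service.quote("widget", 4)
```

Because `PriceService` never constructs its own catalog, a change to the pricing logic can be implemented and checked without the rest of the system being present.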
The First Ideal: Organization
▪ Ideal: every team has the expertise, capability and authority to satisfy customer needs ▪ Not Ideal: in order to satisfy customer needs, every team must escalate up two levels (and over two, and down two)
Team of Teams
▪ Story of the Joint Special Operations Task Force battling a smaller, nimbler adversary in Iraq in 2004
▪ Pushing decision making to the edges
The First Ideal: Data
▪ Ideal: every team has access to the data they need, on-demand, quickly, accurately, and securely ▪ Not Ideal: in order to get the data they need, teams must wait months, and hope that every report won’t break
▪ 35-50% of the people in an organization access or manipulate data as part of their daily work, which is significantly more than the software developer population!
Source: Chris Bergh, CEO, DataKitchen
Ideal #2: Focus, Flow, and Joy
Rediscovering The Joy Of Programming
▪ For decades, I self-identified as an Ops person… ▪ 2 years ago, I started to self-identify as Dev
▪ Clojure / ClojureScript ▪ LISP, functional programming, immutability ▪ 3000 lines of Objective-C -> 1500 lines of TypeScript/React -> 500 lines of ClojureScript
▪ Development is so fun, and these days, you can do miraculous things with so little effort
Why Functional Programming
▪ The famous French philosopher Claude Lévi-Strauss would say of certain tools, ‘is it good to think with?’ ▪ Core FP concepts
▪ Immutability ▪ Pure functions ▪ Composability
▪ Pioneered by LISP and ML. Popularized by OCaml, Haskell, Clojure, Erlang, Elm, Elixir, ReasonML, PureScript…
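The three core concepts listed above can be shown in a few lines. This is a minimal sketch in Python rather than Clojure, using only the standard library. The `Order` type and the function names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class Order:
    """Immutable value: fields cannot be reassigned after creation."""

    subtotal_cents: int
    discount_cents: int = 0


def apply_discount(order: Order) -> Order:
    # Pure function: returns a new Order and leaves the input untouched.
    return replace(order, discount_cents=order.subtotal_cents // 10)


def total_cents(order: Order) -> int:
    # Pure function: the result depends only on the argument.
    return order.subtotal_cents - order.discount_cents


def compose(*steps: Callable) -> Callable:
    # Composability: chain small functions, applied left to right, into one.
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


checkout = compose(apply_discount, total_cents)
original = Order(subtotal_cents=10_000)
result = checkout(original)
```

After `checkout` runs, `original` still has a zero discount. The pipeline produced a new value instead of modifying the old one.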
Interestingly, It Portends The Future Of Ops
▪ Core concepts
▪ Immutability ▪ Pure functions ▪ Composability
▪ Look at…
▪ Docker, Docker Compose ▪ Kubernetes ▪ Kubernetes sidecars ▪ Event streams: Apache Kafka ▪ Git
The Second Ideal: Focus and Flow
▪ Ideal: your energy and time are focused on solving the business problem, and you’re having fun ▪ Not Ideal: all your time is spent trying to solve problems you don’t even want to solve (e.g., YAML files, Makefile and spaces in filenames, bash)
Never Have I Valued Infrastructure More
▪ Things I detest now
▪ Everything outside of my application ▪ Connecting anything to anything ▪ Updating dependencies ▪ Secrets management ▪ Bash ▪ YAML ▪ Patching ▪ Building Kubernetes deployment files (mostly by Googling) ▪ Why my cloud costs are so high
The Value Of Platforms
▪ Enable developer productivity
▪ Self-service ▪ On-demand ▪ Immediacy and fast feedback ▪ Focus and flow ▪ Joy
▪ Monitoring, deployment, environment creation, security scans, orchestration…
There’s Never Been A Better Time for Infrastructure and Operations
Flow: Dr. Mihaly Csikszentmihalyi
“What is your lead time for changes?” “How long does it take to go from code committed to code successfully running in production?”
Product Design and Development | Product Delivery (Build, Test, Deploy)
Create new products and services that solve customer problems using hypothesis-driven delivery, modern UX, design thinking | Enable fast flow from development to production and reliable releases by standardizing work, reducing variability and batch sizes
Feature design and implementation may require work that has never been done before | Integration, test and deployment must be performed continuously, as quickly as possible
Estimates are highly uncertain | Cycle times should be well-known and predictable
Outcomes are highly variable | Outcomes should have low variability
Change Committed Into Version Control
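The lead-time question above ("from code committed to code successfully running in production") can be turned into a simple measurement once both timestamps are recorded for each change. This is a minimal sketch in Python. The sample records and the choice of the median as the summary are assumptions for illustration, not something the talk prescribes.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical records: (commit time, time the change was running in production).
changes = [
    (datetime(2019, 10, 1, 9, 0), datetime(2019, 10, 1, 9, 40)),
    (datetime(2019, 10, 1, 13, 0), datetime(2019, 10, 1, 15, 0)),
    (datetime(2019, 10, 2, 10, 0), datetime(2019, 10, 2, 10, 50)),
]


def lead_times(records):
    # Lead time for one change = production time minus commit time.
    return [deployed - committed for committed, deployed in records]


def median_lead_time(records) -> timedelta:
    # The median is used so that a single slow change does not dominate.
    seconds = median(lt.total_seconds() for lt in lead_times(records))
    return timedelta(seconds=seconds)


typical = median_lead_time(changes)
under_one_hour = typical < timedelta(hours=1)
```

The clock starts at the commit, so this number covers only the delivery half of the comparison above (build, test, deploy), where cycle times are expected to be predictable.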
What Is The One Question That Predicts Performance With Startling Accuracy?
“To what degree do we fear doing deployments?”
Source: Puppet Labs 2015 State Of DevOps: https://puppetlabs.com/2015-devops-report
The Second Ideal: Focus and Flow
▪ Ideal: when you can implement and test your feature on your Dev laptop, and learn whether it worked in seconds ▪ Not Ideal: when the only way you can determine whether your feature worked is waiting minutes, hours, or days… or weeks…
The Second Ideal: Focus and Flow
▪ Ideal: trunk based development ▪ Not Ideal: 5 days merging, 50 people in conference rooms
Ideal #3: Improvement Of Daily Work
Third Ideal: Improvement of Daily Work
▪ Not Ideal: TWWADI
▪ “The Way We’ve Always Done It”
▪ Ideal: MTBTT
▪ “Make Tomorrow Better Than Today” (Google SRE Principle #2)
Not Ideal
“In manufacturing, the absence of effective feedback often contribute to major quality and safety problems. In one well-documented case at the General Motors Fremont manufacturing plant, there were no effective procedures in place to detect problems during the assembly process, nor were there explicit procedures on what to do when problems were found.
“As a result, there were instances of engines being put in backward, cars missing steering wheels or tires, and cars even having to be towed off the assembly line because they wouldn’t start.”
Source: DevOps Handbook
Create as much feedback in our system, from as many areas in our system, sooner, faster, and cheaper, with as much clarity between cause and…
Why? Because the more assumptions we can invalidate, the more we learn, improving our ability to fix problems and innovate.
Source: DevOps Handbook
Ideal
How many times is the andon cord pulled in a typical day at a Toyota manufacturing plant?
3,500 times per day
Source: http://www.gembapantarei.com/2008/04/how_many_times_do_you_pull_the_andon_cord_each_day.html
Greatness Isn’t Free… The Need To Pay Down Technical Debt
Fast Push To Market
Chart labels: Features, Quality, Defects, Debts & Risks
Fast Push To Market — Continued
Chart labels: Features, Quality, Defects, Debts & Risks
▪ Defect fixing dominates work
▪ Site reliability tanks
▪ Slower and slower velocity
▪ Customers leave
▪ Morale plunges
▪ Devs leave because everything is hard
Who hasn’t felt this? You hire a bunch of developers, but you still can’t ship the features you promised… …and maybe you even have the feeling that things are slowing down…
Risto Siilasmaa, NOKIA
Source: The Unicorn Project (2019) / Transforming NOKIA (2019)
Near Death Experiences
2002 Microsoft Security Standdown
▪ Famously, Microsoft after SQL Slammer required every product group to freeze feature
Source: https://www.wired.com/2002/01/bill-gates-trustworthy-computing/
The Feature Freeze / Standdown
Chart labels: Debt, Features, Quality, Defects
Quote from Marty Cagan from his book Inspired
“The deal [between product owners and] engineering goes like this: Product management takes 20% of the team’s capacity right off the top and gives this to engineering to spend as they see fit. They might use it to rewrite, re-architect, or refactor problematic parts of the code base…whatever they believe is necessary to avoid ever having to come to the team and say, ‘we need to stop and rewrite [all our code].’ If you’re in really bad shape today, you might need to make this 30% or even more of the resources. However, I get nervous when I find teams that think they can get away with much less than 20%.”
Cagan notes that when organizations do not pay their “20% tax,” technical debt will increase to the point where an organization inevitably spends all of its cycles paying down technical debt. At some point, the services become so fragile that feature delivery grinds to a halt because all the engineers are working on reliability issues or working around problems.
The Third Ideal: Enabling Greatness
▪ Ideal: 3-5% of developers dedicated to improving developer productivity
▪ Google: likely 1,500+ devs ($1B+) ▪ Microsoft: likely over 3,000 devs
▪ Not ideal: assigned to summer interns and “people not good enough to be developers”
There cannot be a more important thing for an engineer, for a product team, than to work
So I would, any day of the week, trade off features for our own productivity. I want our best engineers to work on our engineering systems, so that we can later on come back and build all of the new concepts we want.
Breaking The Bottlenecks In The Flow
▪ Environment creation ▪ Code deployment ▪ Test setup and run (mention @rohansingh) ▪ Overly tight architecture ▪ Development ▪ Product management
"Automated tests transform fear into boredom."
Google Dev And Ops (2013)
▪ 15,000 engineers, working on 4,000+ projects ▪ All code is checked into one source tree (billions of files!) ▪ 5,500 code commits/day ▪ 75 million test cases are run daily
The Third Ideal: Improvement
▪ Not Ideal: No one cares if someone breaks the build, or checks in code that breaks our tests ▪ Ideal: When someone breaks our build or our tests, fixing it becomes the most important work of the moment
The Third Ideal: Improvement
▪ Not ideal: When someone needs a peer review, that person has to wait until someone else frees up ▪ Ideal: Whatever I’m working on, if someone needs a peer review, I drop whatever I’m doing to help
Ideal #4: Psychological Safety
DevOps Enterprise: Lessons Learned
▪ In 2019, we’ll hold the sixth year of the DevOps Enterprise Summit, a conference for horses, by horses ▪ Over the years, we’ve had nearly 350 leaders from:
▪ Capital One, KeyBank, Barclays, GE Capital, ING Bank, Fidelity, PNC, ADP, BofA, Western Union, BBVA ▪ Nationwide Insurance, Zurich Insurance, Allstate, Hiscox, Aviva, LV= ▪ Walmart, Nordstrom, Target, Macy’s, Marks and Spencer ▪ Nike, Adidas, Sherwin Williams ▪ Verizon, Telstra, T-Mobile, Orange, CSG ▪ Raytheon, Lockheed Martin, Northrop Grumman, CSRA, Jaguar Land Rover, Fiat/Chrysler, Cisco ▪ Disney, Ticketmaster, NBC/Universal, Comcast ▪ Kaiser Permanente ▪ US Citizenship & Immigration Services, UK HM Revenue Collection, DISA Forge.mil, NZ Ministry of Social Development, UK Welfare and Pensions, US Joint Warfare Analysis Center ▪ Amazon PrimeNow, CA, Compuware, Google Search, IBM, MicroFocus, Microsoft, SAP
One Of The Highest Predictors Of Performance
Source: Typology Of Organizational Culture (Westrum, 2004)
Google: Project Aristotle, Oxygen, re:Work
Source: https://rework.withgoogle.com/blog/five-keys-to-a-successful-google-team/
Great Practices Enabled
▪ Blameless post-mortems ▪ Chaos Monkeys
Modeling Continual Learning
▪ “When adult learners start trying to learn a new skill, they will often do it in private, because of the embarrassment associated with doing something they’re not good at.” ▪ We can help by saying “I don’t know”
Ideal #5: Customer Focus
The Fifth Ideal: Focus On The Customer
▪ Core vs. Context
▪ Enabled reallocation of $8MM back into R&D
The Fifth Ideal: Focus On The Customer
▪ Not ideal: Functional silo managers prioritize silo goals over business goals ▪ Ideal: Functional silo managers make decisions based on what the customer values, and help ensure their teams have the skills to thrive in the long term
Why Do I Think This Is Important?
“The world is changing very fast. “Big will not beat small anymore. It will be the fast beating the slow.”
Source: Rupert Murdoch
The Five Ideals
#2 WSJ Bestseller!
Want To Learn More?
To receive this presentation and the following: ▪ PDF and audio excerpts from The Unicorn Project ▪ Eight excerpts from Beyond The Phoenix Project audio series w/John Willis ▪ The 140 page excerpt of The DevOps Handbook ▪ The 140 page excerpt of The Phoenix Project ▪ Videos and slides from DevOps Enterprise 2014-2019 ▪ One hour excerpt of The Phoenix Project audiobook
Just pick up your phone, and send an email: To: realgenekim@SendYourSlides.com Subject: devops