Lessons Learned from Reviewing 150 Infrastructures

LESSONS LEARNED FROM REVIEWING 150 INFRASTRUCTURES_ JON TOPPER



  1. LESSONS LEARNED FROM REVIEWING 150 INFRASTRUCTURES_ JON TOPPER | @jtopper | he/him/his

  2. $ whoami • Founder/CEO/CTO, The Scale Factory • Working in hosting/infrastructure for 20 years • Infrastructure / AWS / DevOps @jtopper

  3. @jtopper

  4. REVIEWS RUN_ [Chart: reviews run over time; y-axis 0 to 180, x-axis Mar-2018 to Jan-2020] @jtopper

  5. TODAY’S AGENDA_ • What is Well-Architected? • What is a Well-Architected Review? • Common Review Findings @jtopper

  6. WHAT IS WELL-ARCHITECTED?_ @jtopper

  7. WELL-ARCHITECTED ORIGINS_ • Catalogue of emergent good practices • Observed by AWS Field Solutions Architects • Codified and shared • Platform agnostic* @jtopper

  8. White Papers • Review Tool @jtopper

  9. Operational Excellence • Security • Reliability • Performance Efficiency • Cost Optimisation @jtopper

  10. Lenses: Serverless Applications • High Performance Computing • IoT (Internet of Things) @jtopper

  11. USING WELL-ARCHITECTED_ • Gap analysis / planning • Teaching • Team alignment @jtopper

  12. WHAT IS A WELL-ARCHITECTED REVIEW?_ @jtopper

  13. WELL-ARCHITECTED REVIEW_ • Foundational questions • Up to 4 hours • Qualitative @jtopper

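  14. Question counts per pillar (Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimisation), by lens. Per-pillar counts are listed in extracted order, which may not match the pillar order: • Well-Architected Core: 11, 8, 9, 9, 9 (total 46) • Serverless Applications: 1, 1, 3, 2, 2 (total 9) • High Performance Computing: 2, 4, 3, 3, 4 (total 16) • IoT (Internet of Things): 10, 11, 4, 6, 4 (total 35) @jtopper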

  15. QUESTION OPS 1_ How do you determine what your priorities are? • Evaluate external customer needs • Evaluate internal customer needs • Evaluate compliance requirements • Evaluate threat landscape • Evaluate tradeoffs • Manage benefits and risks • None of these @jtopper

  16. QUESTION OPS 1_ How do you determine what your priorities are? • Evaluate external customer needs [WA] • Evaluate internal customer needs [WA] • Evaluate compliance requirements [WA] • Evaluate threat landscape [NI] • Evaluate tradeoffs [NI] • Manage benefits and risks [NI] • None of these [CI] @jtopper

  17. QUESTION OPS 1_ How do you determine what your priorities are? Result: High Risk • Evaluate external customer needs [WA] • Evaluate internal customer needs [WA] • Evaluate compliance requirements [WA] • Evaluate threat landscape [NI] • Evaluate tradeoffs [NI] • Manage benefits and risks [NI] • None of these [CI] @jtopper

  18. QUESTION OPS 1_ How do you determine what your priorities are? Result: Medium Risk • Evaluate external customer needs [WA] • Evaluate internal customer needs [WA] • Evaluate compliance requirements [WA] • Evaluate threat landscape [NI] • Evaluate tradeoffs [NI] • Manage benefits and risks [NI] • None of these [CI] @jtopper

  19. QUESTION OPS 1_ How do you determine what your priorities are? Result: Medium Risk • Evaluate external customer needs [WA] • Evaluate internal customer needs [WA] • Evaluate compliance requirements [WA] • Evaluate threat landscape [NI] • Evaluate tradeoffs [NI] • Manage benefits and risks [NI] • None of these [CI] @jtopper

  20. QUESTION OPS 1_ How do you determine what your priorities are? Result: Well Architected • Evaluate external customer needs [WA] • Evaluate internal customer needs [WA] • Evaluate compliance requirements [WA] • Evaluate threat landscape [NI] • Evaluate tradeoffs [NI] • Manage benefits and risks [NI] • None of these [CI] @jtopper
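
Slides 15 to 20 step through how the ticked choices for one question lead to a High Risk, Medium Risk, or Well Architected result. The deck does not spell out the rule, so the Python sketch below is only one plausible reading and not the review tool's actual logic. It assumes that a missing WA-tagged choice (or a ticked CI-tagged "None of these") gives High Risk, that a missing NI-tagged choice gives Medium Risk, and that ticking everything gives Well Architected. The names `Outcome`, `OPS_1`, and `classify` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    HIGH_RISK = "High Risk"
    MEDIUM_RISK = "Medium Risk"
    WELL_ARCHITECTED = "Well Architected"


# Choices and tags for OPS 1, copied from the slides.
OPS_1 = {
    "Evaluate external customer needs": "WA",
    "Evaluate internal customer needs": "WA",
    "Evaluate compliance requirements": "WA",
    "Evaluate threat landscape": "NI",
    "Evaluate tradeoffs": "NI",
    "Manage benefits and risks": "NI",
    "None of these": "CI",
}


def classify(choices, selected):
    """Map the ticked choices for one question to an outcome.

    ASSUMED rule (not stated in the deck):
      - any CI choice ticked, or any WA choice missing -> High Risk
      - all WA ticked but some NI choice missing       -> Medium Risk
      - every WA and NI choice ticked                  -> Well Architected
    """
    selected = set(selected)
    unknown = selected.difference(choices)
    if unknown:
        raise ValueError(f"unknown choices: {sorted(unknown)}")

    if any(choices[name] == "CI" for name in selected):
        return Outcome.HIGH_RISK

    # Tags of every non-CI choice that was left unticked.
    missing_tags = {
        tag for name, tag in choices.items()
        if tag != "CI" and name not in selected
    }
    if "WA" in missing_tags:
        return Outcome.HIGH_RISK
    if "NI" in missing_tags:
        return Outcome.MEDIUM_RISK
    return Outcome.WELL_ARCHITECTED


if __name__ == "__main__":
    wa_only = [name for name, tag in OPS_1.items() if tag == "WA"]
    print(classify(OPS_1, wa_only).value)
```

Under these assumptions, ticking only the three WA-tagged choices yields Medium Risk, which is consistent with the progression the slides show.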

  21. COMMON REVIEW FINDINGS_ @jtopper

  22. THE GOOD_ @jtopper

  23. QUESTION OPS 1_ How do you determine what your priorities are? Result: Well Architected 77%, WA Rank: 1 • Evaluate external customer needs [WA] 93% • Evaluate internal customer needs [WA] 87% • Evaluate compliance requirements [WA] 90% • Evaluate threat landscape [NI] 85% • Evaluate tradeoffs [NI] 89% • Manage benefits and risks [NI] 89% • None of these [CI] 0% @jtopper

  24. QUESTION PERF 3_ How do you select your storage solution? Result: Well Architected 70%, WA Rank: 2 • Understand storage characteristics and requirements [WA] 84% • Evaluate available configuration options [NI] 78% • Make decisions based on access patterns and metrics [NI] 73% • None of these [CI] 5% @jtopper

  25. QUESTION REL 5_ How do you implement change? Result: Well Architected 63%, WA Rank: 3 • Deploy changes in a planned manner [WA] 83% • Deploy changes with automation [NI] 67% • None of these [CI] 6% @jtopper

  26. THE BAD_ @jtopper

  27. QUESTION REL 9_ How do you plan for disaster recovery? Result: High Risk 79% (87%), HRI Rank: 1 • Define recovery objectives for downtime and data loss [WA] 33% • Use defined recovery strategies to meet the recovery objectives [WA] 33% • Test disaster recovery implementation to validate the implementation [WA] 25% • Manage configuration drift on all changes [NI] 39% • Automate recovery [NI] 16% • None of these [CI] 31% @jtopper

  28. QUESTION SEC 11_ How do you respond to a [security] incident? Result: High Risk 75% (93%), HRI Rank: 2 • Identify key personnel and external resources [WA] 51% • Identify tooling [WA] 27% • Develop incident response plans [WA] 39% • Automate containment capability [NI] 0% • Identify forensic capabilities [NI] 11% • Pre-provision access [NI] 27% • Pre-deploy tools [NI] 10% • Run game days [NI] 3% • None of these [CI] 35% @jtopper

  29. QUESTION SEC 8_ How do you classify your data? Result: High Risk 75% (88%), HRI Rank: 3 • Define data classification requirements [WA] 61% • Define data protection controls [WA] 39% • Implement data identification [WA] 17% • Automate identification and classification [NI] 4% • Identify the types of data [NI] 59% • None of these [CI] 23% @jtopper

  30. QUESTION COST 9_ How do you evaluate new services? Result: High Risk 71% (79%), HRI Rank: 4 • Establish a cost optimisation function [WA] 34% • Develop a workload review process [WA] 26% • Review and implement services in an unplanned way [NI] 84% • Review and analyse this workload regularly [NI] 43% • Keep up to date with new service releases [NI] 63% • None of these [CI] 1% @jtopper

  31. QUESTION REL 8_ How do you test resilience? Result: High Risk 67% (92%), HRI Rank: 5 • Use playbooks for unanticipated failures [WA] 25% • Conduct root cause analysis and share results [WA] 73% • Inject failures to test resiliency [NI] 6% • Conduct game days regularly [NI] 0% • None of these [CI] 16% @jtopper
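
The HRI ranks on slides 27 to 31 appear to follow the High Risk percentage in descending order. The Python sketch below reproduces that ordering from the figures on those slides. The tie-break is an assumption: SEC 11 and SEC 8 both show 75%, and the deck does not say how ties are resolved, so the sketch falls back to the bracketed percentage, which happens to match the slide order. The deck does not define the bracketed figure either. `Finding`, `FINDINGS`, and `rank_findings` are hypothetical names.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    question: str
    high_risk: int   # "High Risk" percentage from the slide
    bracketed: int   # bracketed percentage from the slide (meaning not stated)


# Figures copied from slides 27 to 31, deliberately listed out of order.
FINDINGS = [
    Finding("REL 8", 67, 92),
    Finding("SEC 8", 75, 88),
    Finding("REL 9", 79, 87),
    Finding("COST 9", 71, 79),
    Finding("SEC 11", 75, 93),
]


def rank_findings(findings):
    """Return (rank, question) pairs, highest High Risk share first.

    ASSUMPTION: ties on high_risk are broken by the bracketed
    percentage, descending. The deck does not confirm this.
    """
    ordered = sorted(findings, key=lambda f: (-f.high_risk, -f.bracketed))
    return [(rank, f.question) for rank, f in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for rank, question in rank_findings(FINDINGS):
        print(rank, question)
```

With this ordering the five questions come out as REL 9, SEC 11, SEC 8, COST 9, REL 8, matching HRI Ranks 1 to 5 on the slides.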

  32. THE NOTABLE_ @jtopper

  33. QUESTION OPS 3_ How do you reduce defects, ease remediation, and improve flow into production? Result: Well Architected 14%, WA Rank: 23 • Use version control [WA] 90% • Test and validate changes [WA] 87% • Use config management systems [NI] 78% • Use build/deploy systems [NI] 82% • Perform patch management [NI] 37% • Share design standards [NI] 57% • Implement practices to improve code quality [NI] 83% • Use multiple environments [NI] 81% • Make frequent, small, reversible changes [NI] 63% • Fully automate integration and deployment [NI] 52% • None of these [CI] 3% @jtopper

  34. QUESTION OPS 6_ How do you understand the health of your workload? Result: Well Architected 46%, WA Rank: 21 • Identify key performance indicators [WA] 53% • Define workload metrics [WA] 62% • Collect and analyse workload metrics [WA] 72% • Establish workload metric baselines [NI] 51% • Learn expected patterns of activity for workload [NI] 54% • Alert when workload outcomes are at risk [NI] 40% • Alert when workload anomalies are detected [NI] 34% • Validate the achievement of outcomes and the effectiveness of KPIs and metrics [NI] 37% • None of these [CI] 14% @jtopper

  35. QUESTION SEC 2_ How do you control human access? Result: High Risk 47% (88%), HRI Rank: 20 • Define human access requirements [WA] 70% • Grant least privileges [WA] 58% • Allocate unique credentials per person [WA] 90% • Manage credentials based on lifecycle [NI] 70% • Automate credential management [NI] 13% • Grant access through roles or federation [NI] 62% • None of these [CI] 3% @jtopper
