SLIDE 1 Christian Monaghan
@monaghan_a_gram Cofounder, Nava PBC
Migrating HealthCare.gov to Terraform: Lessons Learned
SLIDE 2
What is Terraform?
SLIDE 3
A tool for building, changing, and versioning infrastructure
SLIDE 4
Manage cloud providers
SLIDE 5 Infrastructure as Code
- Declarative syntax
- Source control
- Variable support
SLIDE 6 Execution plans
Plan before proceeding
SLIDE 7 Resource graph
dependency order
SLIDE 8 Resource graph
dependency order
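A minimal sketch of how dependency order can arise from the configuration itself, assuming the AWS provider and Terraform 0.12+ expression syntax. The resource names and CIDR ranges are hypothetical and not taken from the talk. Because the subnet refers to the VPC's ID, Terraform records an edge in its resource graph and creates the VPC first.

```hcl
# Hypothetical VPC; names and address ranges are illustrative only.
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

# The reference to aws_vpc.example.id is an implicit dependency:
# Terraform creates the VPC before this subnet and destroys it after.
resource "aws_subnet" "example" {
  vpc_id     = aws_vpc.example.id
  cidr_block = "10.0.1.0/24"
}
```

Running `terraform graph` on a configuration like this prints the resulting dependency graph in DOT format.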
SLIDE 9
Our project history
SLIDE 10 AWS CloudFormation
- JSON interface
- 3,000+ lines for 1 Virtual Private Cloud (VPC)
- Managing dozens of VPCs
SLIDE 11 Custom tooling to interact with CloudFormation
YAML config -> Custom script -> AWS CloudFormation
SLIDE 12
Challenges we faced with our existing tooling
SLIDE 13 Maintaining custom code :(
- Complex
- Not unit tested
- Limited documentation, quickly out of date
- Increasing bloat
- Hard to understand
- Hard to debug
SLIDE 14 Unable to incorporate manual changes
Past examples:
- Horizontally scaling NATs (Network Address Translation)
- Adding a temporary second Elastic Load Balancer
- Scaling down from 3 availability zones to 1 availability zone
- Swapping in new Elastic IPs
SLIDE 15 Uncertain client demands
- Must build atop partially provisioned VPC infrastructure
- Client frequently requests custom architecture changes
- Client might make manual changes that would be unrecoverable in CloudFormation
SLIDE 16 Proliferating use cases
- Load testing resources
- Continuous Integration clusters
- Custom monitoring
- Graphite/Grafana
- Nessus scanning clusters
SLIDE 17
We were trying to shoehorn all these new use cases into our existing tooling
SLIDE 18
Engineering goal
SLIDE 19
Manage all infrastructure with a single tool that is flexible, extensible, fast, and well-supported
SLIDE 20
Choosing the right tool
SLIDE 21
Tools we considered
SLIDE 22 Chef, Puppet, Ansible, SaltStack
- These are configuration management tools
- Install and manage software on existing machines
SLIDE 23 Why we chose Terraform
- Incorporate manual changes
- Declarative syntax, easy to read, understand, extend
- Supports multiple providers
- Separates planning and execution
- Well-supported, open-source
- Modular
SLIDE 24
Some Terraform basics
SLIDE 25 Changes required
How it knows what to provision
Desired state Actual state
SLIDE 26
Desired state looks like this
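As an illustration only (the original slide showed an image that is not reproduced here), desired state is written as resource blocks in a .tf file. The sketch below assumes the AWS provider and Terraform 0.12+ syntax; the resource name, CIDR range, and tag value are hypothetical.

```hcl
# Declares what should exist, not the steps to create it.
# Terraform compares this block with the recorded state to decide
# whether anything needs to be created, changed, or destroyed.
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name = "example-vpc" # hypothetical tag value
  }
}
```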
SLIDE 27
Actual state looks like this
SLIDE 28
Prototyping
SLIDE 29 Greenfield approach
Define Diff Apply State Updated
SLIDE 30 Reverse engineering approach
Define Diff Apply Import State
SLIDE 31 Refactor to use variables
Hardcoded -> Variables
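A hedged sketch of this kind of refactor, assuming Terraform 0.12+ syntax. The variable name, description, and default value are hypothetical and not from the talk.

```hcl
# Before: the CIDR range was hardcoded in the resource, e.g.
#   cidr_block = "10.0.0.0/16"

# After: the value is lifted into an input variable so the same
# configuration can be reused for another VPC.
variable "vpc_cidr" {
  description = "CIDR range for the VPC (hypothetical example)"
  type        = string
  default     = "10.0.0.0/16"
}

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr
}
```

A different value can then be supplied per environment, for example with `terraform apply -var 'vpc_cidr=10.1.0.0/16'` or a .tfvars file.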
SLIDE 32 Testing
1. Successfully provision a new VPC
2. Application functional
   a. Passes health checks
   b. Passes smoke testing
3. Infrastructure security scan
   a. AWS Trusted Advisor
SLIDE 33 End result
- A configuration file (.tf) that represents one complete VPC configuration
- A state file (.tfstate) that represents one existing VPC
SLIDE 34
Design
SLIDE 35 How can we design this for reuse?
AppA Test    AppA Staging    AppA Prod    AppA ...
AppB Test    AppB Staging    AppB Prod    AppB ...
...          ...             ...          ...
SLIDE 36 Existing design
- Variable inputs
- Assemble building blocks
- Building blocks
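One way such a layout might look in Terraform, offered as a sketch rather than the project's actual structure: a per-environment root configuration takes variable inputs and assembles building blocks expressed as modules. The module paths, input names, and output name below are all hypothetical.

```hcl
# Variable inputs (hypothetical names).
variable "environment" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

# Assemble building blocks: each module is a reusable piece.
module "network" {
  source      = "./modules/network" # hypothetical local module
  environment = var.environment
  vpc_cidr    = var.vpc_cidr
}

module "load_balancer" {
  source      = "./modules/load_balancer" # hypothetical local module
  environment = var.environment
  # Assumes the network module declares an output with this name.
  subnet_ids  = module.network.public_subnet_ids
}
```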
SLIDE 37
Implementation
SLIDE 38
Build new VPCs & cut over traffic
SLIDE 39
Learnings
SLIDE 40
Use shared modules sparingly
SLIDE 41 Use shared modules sparingly
Sharing modules within applications worked well
SLIDE 42 Use shared modules sparingly
Sharing modules across applications did not work well
SLIDE 43 Use shared modules sparingly
Change the Elastic Load Balancer module
SLIDE 44
Use shared modules sparingly
SLIDE 45 Use shared modules sparingly
cots
SLIDE 46 Migrating infrastructure in place
It's possible, but time-consuming
SLIDE 47 Importing existing state
- Native terraform import CLI utility
  ○ Only imports one resource at a time
  ○ Requires manually finding each resource ID relevant to a particular VPC
- Third-party open-source terraforming CLI
  ○ Imports all resources in a region
  ○ Cannot narrow scope to a specific VPC
SLIDE 48
Lock resources to a particular Terraform version
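A minimal sketch of one way to pin versions inside the configuration, assuming Terraform 0.13 or later for the `required_providers` syntax. The version numbers are hypothetical placeholders, not the versions the project used.

```hcl
terraform {
  # Refuse to run with any other Terraform CLI version
  # (the version number is a hypothetical placeholder).
  required_version = "= 1.5.7"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # hypothetical provider constraint
    }
  }
}
```

With an exact constraint in place, a developer running a newer CLI gets an error instead of silently upgrading the state file.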
SLIDE 49 Terraform needs to be managed in CI/CD
Otherwise:
- Risk losing internet connection mid-apply
- No record of who changed what when
- Developers bump versions unintentionally
SLIDE 50 Semantically version modules with git tags
Good Bad
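The original slide's examples were images and are not reproduced here. As a sketch of the general idea, assuming that pinning to a tag was the "good" case, a module sourced from Git can be fixed to a semantic version tag with the `ref` query parameter. The repository URL, module path, and tag below are hypothetical.

```hcl
# Pinned: always resolves to the commit tagged v1.2.0.
module "elb" {
  source = "git::https://example.com/infra/terraform-modules.git//elb?ref=v1.2.0"
}

# Unpinned alternative (shown as a comment only): with no ref, or with
# ref=master, the module contents can change between runs without any
# change to this file.
#   source = "git::https://example.com/infra/terraform-modules.git//elb"
```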
SLIDE 51
Terraform utilities
SLIDE 52 terraforming
Export existing AWS resources to Terraform
SLIDE 53 tfenv
Terraform version manager inspired by rbenv
SLIDE 54 terraform fmt
Before After
SLIDE 55 terraform-docs
Generate docs from Terraform modules
SLIDE 56
Thank you
@monaghan_a_gram