How to test infrastructure code: automated testing for Terraform, Docker, Packer, Kubernetes, and more



slide-1
SLIDE 1

How to test infrastructure code

Automated testing for:
✓ terraform
✓ docker
✓ packer
✓ kubernetes
✓ and more

Passed: 5. Failed: 0. Skipped: 0. Test run successful.

slide-2
SLIDE 2

The DevOps world is full of Fear

slide-3
SLIDE 3

Fear of outages

slide-4
SLIDE 4

Fear of security breaches

slide-5
SLIDE 5

Fear of data loss

slide-6
SLIDE 6

Fear of change

slide-7
SLIDE 7

“Fear leads to anger. Anger leads to hate. Hate leads to suffering.”

Scrum Master Yoda

slide-8
SLIDE 8

And you all know what suffering leads to, right?

slide-9
SLIDE 9
slide-10
SLIDE 10

Credit: Daniele Polencic

slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13

Many DevOps teams deal with this fear in two ways:

slide-14
SLIDE 14

1) Heavy drinking and smoking

slide-15
SLIDE 15

2) Deploying less frequently

slide-16
SLIDE 16

Sadly, both of these just make the problem worse!

slide-17
SLIDE 17
slide-18
SLIDE 18

There’s a better way to deal with this fear:

slide-19
SLIDE 19

Automated tests

slide-20
SLIDE 20

Automated tests give you the confidence to make changes

slide-21
SLIDE 21

Fight fear with confidence

slide-22
SLIDE 22

We know how to write automated tests for application code…

slide-23
SLIDE 23

resource "aws_lambda_function" "web_app" {
  function_name = var.name
  role          = aws_iam_role.lambda.arn
  # ...
}

resource "aws_api_gateway_integration" "proxy" {
  type = "AWS_PROXY"
  uri  = aws_lambda_function.web_app.invoke_arn
  # ...
}

But how do you test that your Terraform code deploys infrastructure that works?

slide-24
SLIDE 24

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world-app-deployment
spec:
  selector:
    matchLabels:
      app: hello-world-app
  replicas: 1
  template:
    metadata:
      labels:
        app: hello-world-app
    spec:
      containers:
        - name: hello-world-app
          image: gruntwork-io/hello-world-app:v1
          ports:
            - containerPort: 8080

How do you test that your Kubernetes code configures your services correctly?

slide-25
SLIDE 25

This talk is about how to write tests for your infrastructure code.

slide-26
SLIDE 26

I’m Yevgeniy Brikman

ybrikman.com

slide-27
SLIDE 27

Co-founder of Gruntwork

gruntwork.io

slide-28
SLIDE 28

Author

slide-29
SLIDE 29
Outline

1. Static analysis
2. Unit tests
3. Integration tests
4. End-to-end tests
5. Conclusion

slide-30
SLIDE 30
Outline

1. Static analysis
2. Unit tests
3. Integration tests
4. End-to-end tests
5. Conclusion

slide-31
SLIDE 31

Static analysis: test your code without deploying it.

slide-32
SLIDE 32

Static analysis

1. Compiler / parser / interpreter
2. Linter
3. Dry run

slide-33
SLIDE 33

Static analysis

1. Compiler / parser / interpreter
2. Linter
3. Dry run

slide-34
SLIDE 34

Statically check your code for syntactic and structural issues

slide-35
SLIDE 35

Examples:

Tool         Command
Terraform    terraform validate
Packer       packer validate <template>
Kubernetes   kubectl apply -f <file> --dry-run --validate=true
             (kubectl 1.18 and later: --dry-run=client)

slide-36
SLIDE 36

Static analysis

1. Compiler / parser / interpreter
2. Linter
3. Dry run

slide-37
SLIDE 37

Statically validate your code to catch common errors

slide-38
SLIDE 38

Examples:

Tool         Linters
Terraform    conftest, terraform_validate, tflint
Docker       dockerfile_lint, hadolint, dockerfilelint
Kubernetes   kube-score, kube-lint, yamllint

slide-39
SLIDE 39

Static analysis

1. Compiler / parser / interpreter
2. Linter
3. Dry run

slide-40
SLIDE 40

Partially execute the code and validate the “plan”, but don’t actually deploy

slide-41
SLIDE 41

Examples:

Tool         Dry run options
Terraform    terraform plan, HashiCorp Sentinel, terraform-compliance
Kubernetes   kubectl apply -f <file> --server-dry-run
             (kubectl 1.18 and later: --dry-run=server)
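
A dry run gives you a plan you can write assertions against. Here is a minimal sketch, assuming the plan has been exported as JSON (for example with terraform show -json on a saved plan file). The names destructiveChanges and samplePlan are hypothetical, and only the resource_changes and change.actions fields of the JSON are read.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan models only the part of the JSON plan this check needs.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Change  struct {
			Actions []string `json:"actions"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// destructiveChanges returns the addresses of all resources that the plan
// would delete (a replacement shows up as "delete" plus "create").
func destructiveChanges(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, err
	}
	var addresses []string
	for _, rc := range p.ResourceChanges {
		for _, action := range rc.Change.Actions {
			if action == "delete" {
				addresses = append(addresses, rc.Address)
				break
			}
		}
	}
	return addresses, nil
}

// samplePlan is a hand-written, heavily trimmed stand-in for real plan output.
const samplePlan = `{"resource_changes":[
  {"address":"aws_lambda_function.web_app","change":{"actions":["create"]}},
  {"address":"aws_iam_role.lambda","change":{"actions":["delete","create"]}}
]}`

func main() {
	addresses, err := destructiveChanges([]byte(samplePlan))
	if err != nil {
		panic(err)
	}
	fmt.Println("resources that would be destroyed:", addresses)
}
```

A check like this could run on every commit and fail the build when a change would destroy something unexpectedly, without deploying anything.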

slide-42
SLIDE 42
Outline

1. Static analysis
2. Unit tests
3. Integration tests
4. End-to-end tests
5. Conclusion

slide-43
SLIDE 43

Unit tests: test that a single “unit” works in isolation.

slide-44
SLIDE 44

Unit tests

1. Unit testing basics
2. Example: Terraform unit tests
3. Example: Docker/Kubernetes unit tests
4. Cleaning up after tests

slide-45
SLIDE 45

Unit tests

1. Unit testing basics
2. Example: Terraform unit tests
3. Example: Docker/Kubernetes unit tests
4. Cleaning up after tests

slide-46
SLIDE 46

You can’t “unit test” an entire end-to-end architecture

slide-47
SLIDE 47

Instead, break your infra code into small modules and unit test those!

[Diagram: the architecture broken up into many small modules]

slide-48
SLIDE 48

With app code, you can test units in isolation from the outside world

slide-49
SLIDE 49

resource "aws_lambda_function" "web_app" {
  function_name = var.name
  role          = aws_iam_role.lambda.arn
  # ...
}

resource "aws_api_gateway_integration" "proxy" {
  type = "AWS_PROXY"
  uri  = aws_lambda_function.web_app.invoke_arn
  # ...
}

But 99% of infrastructure code is about talking to the outside world…

slide-50
SLIDE 50

If you try to isolate a unit from the outside world, you’re left with nothing!
slide-51
SLIDE 51

So you can only test infra code by deploying to a real environment

slide-52
SLIDE 52

Key takeaway: there’s no pure unit testing for infrastructure code.

slide-53
SLIDE 53

Therefore, the test strategy is:

1. Deploy real infrastructure
2. Validate it works (e.g., via HTTP requests, API calls, SSH commands, etc.)
3. Undeploy the infrastructure

(So it’s really integration testing of a single unit!)
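
The three steps above can be sketched in plain Go. This is an illustration only: runInfraTest and simulate are hypothetical names, and the stub functions stand in for real deploy, validate, and undeploy calls. The point is that the undeploy step is deferred, so it still runs when validation fails.

```go
package main

import (
	"errors"
	"fmt"
)

// runInfraTest is a hypothetical helper showing the deploy / validate /
// undeploy shape. The deferred undeploy runs even when validation fails.
func runInfraTest(deploy, validate, undeploy func() error) (err error) {
	defer func() {
		// Always clean up; report a cleanup error only if nothing else failed.
		if cleanupErr := undeploy(); cleanupErr != nil && err == nil {
			err = cleanupErr
		}
	}()
	if err = deploy(); err != nil {
		return fmt.Errorf("deploy: %w", err)
	}
	if err = validate(); err != nil {
		return fmt.Errorf("validate: %w", err)
	}
	return nil
}

// simulate runs the three steps against stubs and records which ones ran.
func simulate(failValidation bool) ([]string, error) {
	var steps []string
	step := func(name string, fail bool) func() error {
		return func() error {
			steps = append(steps, name)
			if fail {
				return errors.New(name + " failed")
			}
			return nil
		}
	}
	err := runInfraTest(
		step("deploy", false),
		step("validate", failValidation),
		step("undeploy", false),
	)
	return steps, err
}

func main() {
	steps, err := simulate(true)
	fmt.Println("steps run:", steps)
	fmt.Println("result:", err)
}
```

Registering the cleanup before deploying also covers a deploy that fails halfway and leaves resources behind.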

slide-54
SLIDE 54

Tools that help with this strategy:

Tool                Deploy / Undeploy   Validate   Works with
Terratest           Yes                 Yes        Terraform, Kubernetes, Packer, Docker, Servers, Cloud APIs, etc.
kitchen-terraform   Yes                 Yes        Terraform
Inspec              No                  Yes        Servers, Cloud APIs
Serverspec          No                  Yes        Servers
Goss                No                  Yes        Servers

slide-55
SLIDE 55

In this talk, we’ll use Terratest.

slide-56
SLIDE 56

Unit tests

1. Unit testing basics
2. Example: Terraform unit tests
3. Example: Docker/Kubernetes unit tests
4. Cleaning up after tests

slide-57
SLIDE 57

Sample code for this talk is at:

github.com/gruntwork-io/infrastructure-as-code-testing-talk

slide-58
SLIDE 58

An example of a Terraform module you may want to test:

slide-59
SLIDE 59

infrastructure-as-code-testing-talk
└ examples
  └ hello-world-app
    └ main.tf
    └ outputs.tf
    └ variables.tf
└ modules
└ test
└ README.md

hello-world-app: deploy a “Hello, World” web service

slide-60
SLIDE 60

resource "aws_lambda_function" "web_app" {
  function_name = var.name
  role          = aws_iam_role.lambda.arn
  # ...
}

resource "aws_api_gateway_integration" "proxy" {
  type = "AWS_PROXY"
  uri  = aws_lambda_function.web_app.invoke_arn
  # ...
}

Under the hood, this example runs on top of AWS Lambda & API Gateway

slide-61
SLIDE 61

$ terraform apply
Outputs:
url = ruvvwv3sh1.execute-api.us-east-2.amazonaws.com

$ curl ruvvwv3sh1.execute-api.us-east-2.amazonaws.com
Hello, World!

When you run terraform apply, it deploys and outputs the URL

slide-62
SLIDE 62

Let’s write a unit test for hello-world-app with Terratest

slide-63
SLIDE 63

infrastructure-as-code-testing-talk
└ examples
└ modules
└ test
  └ hello_world_app_test.go
└ README.md

Create hello_world_app_test.go

slide-64
SLIDE 64

func TestHelloWorldAppUnit(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../examples/hello-world-app",
    }

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    validate(t, terraformOptions)
}

The basic test structure

slide-65
SLIDE 65

1. Tell Terratest where your Terraform code lives (TerraformDir).

slide-66
SLIDE 66

2. Run terraform init and terraform apply to deploy your module (terraform.InitAndApply).

slide-67
SLIDE 67

3. Validate the infrastructure works (validate). We’ll come back to this shortly.

slide-68
SLIDE 68

4. Run terraform destroy at the end of the test to undeploy everything (defer terraform.Destroy).

slide-69
SLIDE 69

func validate(t *testing.T, opts *terraform.Options) {
    url := terraform.Output(t, opts, "url")

    http_helper.HttpGetWithRetry(t,
        url,             // URL to test
        200,             // Expected status code
        "Hello, World!", // Expected body
        10,              // Max retries
        3*time.Second,   // Time between retries
    )
}

The validate function

slide-70
SLIDE 70

1. Run terraform output to get the web service URL (terraform.Output).

slide-71
SLIDE 71

2. Make HTTP requests to the URL (http_helper.HttpGetWithRetry).
slide-72
SLIDE 72

3. Check the response for an expected status and body.

slide-73
SLIDE 73

4. Retry the request up to 10 times, as deployment is asynchronous.

slide-74
SLIDE 74

Note: since we’re testing a web service, we use HTTP requests to validate it.
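
To see why the retries matter, here is a standard-library-only sketch of a retrying HTTP check. It is not Terratest’s implementation: getWithRetry, checkOnce, and newFlakyServer are hypothetical names, and the local test server only imitates a deployment that needs a few seconds to come up.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// getWithRetry is a hypothetical helper (not Terratest's implementation).
// It repeats a GET until status and body match or the retries run out.
func getWithRetry(url string, wantStatus int, wantBody string, maxRetries int, sleep time.Duration) error {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if attempt > 0 {
			time.Sleep(sleep) // give the deployment time to come up
		}
		if lastErr = checkOnce(url, wantStatus, wantBody); lastErr == nil {
			return nil
		}
	}
	return fmt.Errorf("gave up after %d retries: %w", maxRetries, lastErr)
}

// checkOnce performs a single GET and compares the status code and body.
func checkOnce(url string, wantStatus int, wantBody string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	if resp.StatusCode != wantStatus || string(body) != wantBody {
		return fmt.Errorf("got %d %q, want %d %q", resp.StatusCode, body, wantStatus, wantBody)
	}
	return nil
}

// newFlakyServer returns a local server that answers 503 for the first
// `failures` requests and then 200 "Hello, World!", imitating a slow deploy.
func newFlakyServer(failures int32) *httptest.Server {
	var calls int32
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) <= failures {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		fmt.Fprint(w, "Hello, World!")
	}))
}

func main() {
	srv := newFlakyServer(2)
	defer srv.Close()
	err := getWithRetry(srv.URL, 200, "Hello, World!", 10, 10*time.Millisecond)
	fmt.Println("validation error:", err)
}
```

Without the retry loop, the first 503 would fail the test even though the service becomes healthy a moment later.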

slide-75
SLIDE 75

Examples of other ways to validate:

Infrastructure   Example              Validate with…   Tool
Web service      Dockerized web app   HTTP requests    Terratest http_helper package
Server           EC2 instance         SSH commands     Terratest ssh package
Cloud service    SQS                  Cloud APIs       Terratest aws or gcp packages
Database         MySQL                SQL queries      MySQL driver for Go

slide-76
SLIDE 76

$ export AWS_ACCESS_KEY_ID=xxxx
$ export AWS_SECRET_ACCESS_KEY=xxxxx

To run the test, first authenticate to AWS

slide-77
SLIDE 77

$ go test -v -timeout 15m -run TestHelloWorldAppUnit
…
--- PASS: TestHelloWorldAppUnit (31.57s)

Then run go test. You now have a unit test you can run after every commit!

slide-78
SLIDE 78

Unit tests

1. Unit testing basics
2. Example: Terraform unit tests
3. Example: Docker/Kubernetes unit tests
4. Cleaning up after tests

slide-79
SLIDE 79

What about other tools, such as Docker + Kubernetes?

slide-80
SLIDE 80

infrastructure-as-code-testing-talk
└ examples
  └ hello-world-app
  └ docker-kubernetes
    └ Dockerfile
    └ deployment.yml
└ modules
└ test
└ README.md

docker-kubernetes: deploy a “Hello, World” web service to Kubernetes

slide-81
SLIDE 81

FROM ubuntu:18.04

EXPOSE 8080

RUN DEBIAN_FRONTEND=noninteractive apt-get update && \
    apt-get install -y busybox

RUN echo 'Hello, World!' > index.html

CMD ["busybox", "httpd", "-f", "-p", "8080"]

Dockerfile: Dockerize a simple “Hello, World!” web service

slide-82
SLIDE 82

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world-app-deployment
spec:
  selector:
    matchLabels:
      app: hello-world-app
  replicas: 1
  template:
    metadata:
      labels:
        app: hello-world-app
    spec:
      containers:
        - name: hello-world-app
          image: gruntwork-io/hello-world-app:v1
          ports:
            - containerPort: 8080

deployment.yml: define how to deploy a Docker container in Kubernetes

slide-83
SLIDE 83

$ cd examples/docker-kubernetes

$ docker build -t gruntwork-io/hello-world-app:v1 .
Successfully tagged gruntwork-io/hello-world-app:v1

$ kubectl apply -f deployment.yml
deployment.apps/hello-world-app-deployment created
service/hello-world-app-service created

$ curl localhost:8080
Hello, World!

Build the Docker image, deploy to Kubernetes, and check URL

slide-84
SLIDE 84

Let’s write a unit test for this code.

slide-85
SLIDE 85

infrastructure-as-code-testing-talk
└ examples
└ modules
└ test
  └ hello_world_app_test.go
  └ docker_kubernetes_test.go
└ README.md

Create docker_kubernetes_test.go

slide-86
SLIDE 86

func TestDockerKubernetes(t *testing.T) {
    buildDockerImage(t)

    path := "../examples/docker-kubernetes/deployment.yml"
    options := k8s.NewKubectlOptions("", "", "")

    defer k8s.KubectlDelete(t, options, path)
    k8s.KubectlApply(t, options, path)

    validate(t, options)
}

The basic test structure

slide-87
SLIDE 87

1. Build the Docker image. You’ll see the buildDockerImage method shortly.

slide-88
SLIDE 88

2. Tell Terratest where your Kubernetes deployment is defined (path).

slide-89
SLIDE 89

3. Configure kubectl options to authenticate to Kubernetes (k8s.NewKubectlOptions).

slide-90
SLIDE 90

4. Run kubectl apply to deploy the web app to Kubernetes (k8s.KubectlApply).

slide-91
SLIDE 91

5. Check the app is working. You’ll see the validate method shortly.

slide-92
SLIDE 92

6. At the end of the test, remove all Kubernetes resources you deployed (defer k8s.KubectlDelete).

slide-93
SLIDE 93

func buildDockerImage(t *testing.T) {
    options := &docker.BuildOptions{
        Tags: []string{"gruntwork-io/hello-world-app:v1"},
    }

    path := "../examples/docker-kubernetes"
    docker.Build(t, path, options)
}

The buildDockerImage method

slide-94
SLIDE 94

func validate(t *testing.T, opts *k8s.KubectlOptions) {
    k8s.WaitUntilServiceAvailable(t, opts, "hello-world-app-service", 10, 1*time.Second)

    http_helper.HttpGetWithRetry(t,
        serviceUrl(t, opts), // URL to test
        200,                 // Expected status code
        "Hello, World!",     // Expected body
        10,                  // Max retries
        3*time.Second,       // Time between retries
    )
}

The validate method

slide-95
SLIDE 95

1. Wait until the service is deployed (k8s.WaitUntilServiceAvailable).
slide-96
SLIDE 96

2. Make HTTP requests (http_helper.HttpGetWithRetry).
slide-97
SLIDE 97

3. Use the serviceUrl method to get the URL.
slide-98
SLIDE 98

func serviceUrl(t *testing.T, opts *k8s.KubectlOptions) string {
    service := k8s.GetService(t, opts, "hello-world-app-service")
    endpoint := k8s.GetServiceEndpoint(t, opts, service, 8080)
    return fmt.Sprintf("http://%s", endpoint)
}

The serviceUrl method

slide-99
SLIDE 99

$ kubectl config set-credentials …

To run the test, first authenticate to a Kubernetes cluster.

slide-100
SLIDE 100

Note: Kubernetes is now part of Docker Desktop. Test 100% locally!

slide-101
SLIDE 101

$ go test -v -timeout 15m -run TestDockerKubernetes
…
--- PASS: TestDockerKubernetes (5.69s)

Run go test. You can validate your config after every commit in seconds!

slide-102
SLIDE 102

Unit tests

1. Unit testing basics
2. Example: Terraform unit tests
3. Example: Docker/Kubernetes unit tests
4. Cleaning up after tests

slide-103
SLIDE 103

Note: tests create and destroy many resources!

slide-104
SLIDE 104

Pro tip #1: run tests in completely separate “sandbox” accounts

slide-105
SLIDE 105

Pro tip #2: run these tools in cron jobs to clean up left-over resources

Tool               Clouds              Features
cloud-nuke         AWS (GCP planned)   Delete all resources older than a certain date; in a certain region; of a certain type.
Janitor Monkey     AWS                 Configurable rules of what to delete. Notify owners of pending deletions.
aws-nuke           AWS                 Specify specific AWS accounts and resource types to target.
Azure Powershell   Azure               Includes native commands to delete Resource Groups

slide-106
SLIDE 106
Outline

1. Static analysis
2. Unit tests
3. Integration tests
4. End-to-end tests
5. Conclusion

slide-107
SLIDE 107

Integration tests: test that multiple “units” work together.

slide-108
SLIDE 108

Integration tests

1. Example: Terraform integration tests
2. Test parallelism
3. Test stages
4. Test retries

slide-109
SLIDE 109

Integration tests

1. Example: Terraform integration tests
2. Test parallelism
3. Test stages
4. Test retries

slide-110
SLIDE 110

infrastructure-as-code-testing-talk
└ examples
  └ hello-world-app
  └ docker-kubernetes
  └ proxy-app
  └ web-service
└ modules
└ test
└ README.md

Let’s say you have two Terraform modules you want to test together:

slide-111
SLIDE 111

proxy-app: an app that acts as an HTTP proxy for other web services.

slide-112
SLIDE 112

web-service: a web service that you want proxied.

slide-113
SLIDE 113

variable "url_to_proxy" {
  description = "The URL to proxy."
  type        = string
}

proxy-app takes in the URL to proxy via an input variable

slide-114
SLIDE 114
output "url" {
  value = module.web_service.url
}

web-service exposes its URL via an output variable
slide-115
SLIDE 115

infrastructure-as-code-testing-talk
└ examples
└ modules
└ test
  └ hello_world_app_test.go
  └ docker_kubernetes_test.go
  └ proxy_app_test.go
└ README.md

Create proxy_app_test.go

slide-116
SLIDE 116

func TestProxyApp(t *testing.T) {
    webServiceOpts := configWebService(t)
    defer terraform.Destroy(t, webServiceOpts)
    terraform.InitAndApply(t, webServiceOpts)

    proxyAppOpts := configProxyApp(t, webServiceOpts)
    defer terraform.Destroy(t, proxyAppOpts)
    terraform.InitAndApply(t, proxyAppOpts)

    validate(t, proxyAppOpts)
}

The basic test structure

slide-117
SLIDE 117

1. Configure options for the web service (configWebService).

slide-118
SLIDE 118

2. Deploy the web service.
slide-119
SLIDE 119

3. Configure options for the proxy app (passing it the web service options).

slide-120
SLIDE 120

4. Deploy the proxy app.
slide-121
SLIDE 121

5. Validate the proxy app works.
slide-122
SLIDE 122

6. At the end of the test, undeploy the proxy app and the web service.
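
The order of the teardown is not an accident: Go runs deferred calls last-in, first-out, so the proxy app (deferred second) is destroyed before the web service it depends on. A tiny sketch of that ordering, using a hypothetical teardownOrder function with strings in place of real terraform destroy calls:

```go
package main

import "fmt"

// teardownOrder mimics the shape of the integration test: each "deploy" is
// preceded by a deferred "destroy", and the function records the order in
// which the deferred calls actually run.
func teardownOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "destroy web-service") }()
		// ... web-service would be deployed here ...

		defer func() { order = append(order, "destroy proxy-app") }()
		// ... proxy-app would be deployed and validated here ...
	}()
	return order
}

func main() {
	for i, step := range teardownOrder() {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

Dependents go away first, which avoids destroying a module that something else still points at.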

slide-123
SLIDE 123

func configWebService(t *testing.T) *terraform.Options {
    return &terraform.Options{
        TerraformDir: "../examples/web-service",
    }
}

The configWebService method

slide-124
SLIDE 124

func configProxyApp(t *testing.T, webServiceOpts *terraform.Options) *terraform.Options {
    url := terraform.Output(t, webServiceOpts, "url")

    return &terraform.Options{
        TerraformDir: "../examples/proxy-app",
        Vars: map[string]interface{}{
            "url_to_proxy": url,
        },
    }
}

The configProxyApp method

slide-125
SLIDE 125

1. Read the url output from the web-service module.

slide-126
SLIDE 126

2. Pass it in as the url_to_proxy input to the proxy-app module.

slide-127
SLIDE 127

func validate(t *testing.T, opts *terraform.Options) {
    url := terraform.Output(t, opts, "url")

    http_helper.HttpGetWithRetry(t,
        url,                         // URL to test
        200,                         // Expected status code
        `{"text":"Hello, World!"}`,  // Expected body
        10,                          // Max retries
        3*time.Second,               // Time between retries
    )
}

The validate method

slide-128
SLIDE 128

$ go test -v -timeout 15m -run TestProxyApp
…
--- PASS: TestProxyApp (182.44s)

Run go test. You’re now testing multiple modules together!

slide-129
SLIDE 129

But integration tests can take (many) minutes to run…

slide-130
SLIDE 130

Integration tests

1. Example: Terraform integration tests
2. Test parallelism
3. Test stages
4. Test retries

slide-131
SLIDE 131

Infrastructure tests can take a long time to run

slide-132
SLIDE 132

One way to save time: run tests in parallel

slide-133
SLIDE 133

func TestProxyApp(t *testing.T) {
    t.Parallel()
    // The rest of the test code
}

func TestHelloWorldAppUnit(t *testing.T) {
    t.Parallel()
    // The rest of the test code
}

Enable test parallelism in Go by adding t.Parallel() as the 1st line of each test.

slide-134
SLIDE 134

$ go test -v -timeout 15m
=== RUN TestHelloWorldApp
=== RUN TestDockerKubernetes
=== RUN TestProxyApp

Now, if you run go test, all the tests with t.Parallel() will run in parallel

slide-135
SLIDE 135

But there’s a gotcha: resource conflicts

slide-136
SLIDE 136

resource "aws_iam_role" "role_example" {
  name = "example-iam-role"
}

resource "aws_security_group" "sg_example" {
  name = "security-group-example"
}

Example: module with hard-coded IAM Role and Security Group names

slide-137
SLIDE 137

If two tests tried to deploy this module in parallel, the names would conflict!

slide-138
SLIDE 138

Key takeaway: you must namespace all your resources

slide-139
SLIDE 139

resource "aws_iam_role" "role_example" {
  name = var.name
}

resource "aws_security_group" "sg_example" {
  name = var.name
}

Example: use variables in all resource names…

slide-140
SLIDE 140

uniqueId := random.UniqueId()

return &terraform.Options{
    TerraformDir: "../examples/proxy-app",
    Vars: map[string]interface{}{
        "name": fmt.Sprintf("text-proxy-app-%s", uniqueId),
    },
}

At test time, set the variables to a randomized value to avoid conflicts
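
If you are curious what such a randomized name looks like without Terratest, here is a rough standard-library sketch. The helpers uniqueID and namespaced are hypothetical, the six-character length and the lowercase alphabet are assumptions, and the modulo step introduces a slight bias that does not matter for test names.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// idChars is an assumed alphabet: lowercase letters and digits are accepted
// by most cloud resource naming rules.
const idChars = "0123456789abcdefghijklmnopqrstuvwxyz"

// uniqueID returns n random characters drawn from idChars.
// crypto/rand is safe to call from tests that run in parallel.
func uniqueID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	for i := range b {
		b[i] = idChars[int(b[i])%len(idChars)]
	}
	return string(b)
}

// namespaced appends a fresh random suffix to a resource name prefix.
func namespaced(prefix string) string {
	return fmt.Sprintf("%s-%s", prefix, uniqueID(6))
}

func main() {
	// Two parallel tests deploying the same module get different names.
	fmt.Println(namespaced("test-proxy-app"))
	fmt.Println(namespaced("test-proxy-app"))
}
```

Each test run passes its own name into the module, so two copies of the same module can coexist in one account.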

slide-141
SLIDE 141

Integration tests

1. Example: Terraform integration tests
2. Test parallelism
3. Test stages
4. Test retries

slide-142
SLIDE 142

Consider the structure of the proxy-app integration test:

slide-143
SLIDE 143

1. Deploy web-service
2. Deploy proxy-app
3. Validate proxy-app
4. Undeploy proxy-app
5. Undeploy web-service
slide-144
SLIDE 144

When iterating locally, you sometimes want to re-run just one of these steps.

slide-145
SLIDE 145

But as the code is written now, you have to run all steps on each test run.

slide-146
SLIDE 146

1. Deploy web-service (~3 min)
2. Deploy proxy-app (~2 min)
3. Validate proxy-app (~30 seconds)
4. Undeploy proxy-app (~1 min)
5. Undeploy web-service (~2 min)

And that can add up to a lot of overhead.

slide-147
SLIDE 147

Key takeaway: break your tests into independent test stages

slide-148
SLIDE 148

webServiceOpts := configWebService(t)
defer terraform.Destroy(t, webServiceOpts)
terraform.InitAndApply(t, webServiceOpts)

proxyAppOpts := configProxyApp(t, webServiceOpts)
defer terraform.Destroy(t, proxyAppOpts)
terraform.InitAndApply(t, proxyAppOpts)

validate(t, proxyAppOpts)

The original test structure

slide-149
SLIDE 149

stage := test_structure.RunTestStage

defer stage(t, "cleanup_web_service", cleanupWebService)
stage(t, "deploy_web_service", deployWebService)

defer stage(t, "cleanup_proxy_app", cleanupProxyApp)
stage(t, "deploy_proxy_app", deployProxyApp)

stage(t, "validate", validate)

The test structure with test stages

slide-150
SLIDE 150

1. RunTestStage is a helper function from Terratest.

slide-151
SLIDE 151

2. Wrap each stage of your test with a call to RunTestStage.

slide-152
SLIDE 152

3. Define each stage in a function (you’ll see this code shortly).

slide-153
SLIDE 153

4. Give each stage a unique name.
slide-154
SLIDE 154

Any stage foo can be skipped by setting the env var SKIP_foo=true

slide-155
SLIDE 155

$ export SKIP_cleanup_web_service=true
$ export SKIP_cleanup_proxy_app=true

Example: on the very first test run, skip the cleanup stages.

slide-156
SLIDE 156

$ go test -v -timeout 15m -run TestProxyApp
Running stage 'deploy_web_service'…
Running stage 'deploy_proxy_app'…
Running stage 'validate'…
Skipping stage 'cleanup_proxy_app'…
Skipping stage 'cleanup_web_service'…
--- PASS: TestProxyApp (105.73s)

That way, after the test finishes, the infrastructure will still be running.

slide-157
SLIDE 157

$ export SKIP_deploy_web_service=true
$ export SKIP_deploy_proxy_app=true

Now, on the next several test runs, you can skip the deploy stages too.

slide-158
SLIDE 158

$ go test -v -timeout 15m -run TestProxyApp
Skipping stage 'deploy_web_service'…
Skipping stage 'deploy_proxy_app'…
Running stage 'validate'…
Skipping stage 'cleanup_proxy_app'…
Skipping stage 'cleanup_web_service'…
--- PASS: TestProxyApp (14.22s)

This allows you to iterate on solely the validate stage…

slide-159
SLIDE 159

$ go test -v -timeout 15m -run TestProxyApp
Skipping stage 'deploy_web_service'…
Skipping stage 'deploy_proxy_app'…
Running stage 'validate'…
Skipping stage 'cleanup_proxy_app'…
Skipping stage 'cleanup_web_service'…
--- PASS: TestProxyApp (14.22s)

Which dramatically speeds up your iteration / feedback cycle!

slide-160
SLIDE 160

$ export SKIP_validate=true
$ unset SKIP_cleanup_web_service
$ unset SKIP_cleanup_proxy_app

When you’re done iterating, skip validate and re-enable cleanup

slide-161
SLIDE 161

$ go test -v -timeout 15m -run TestProxyApp
Skipping stage 'deploy_web_service'…
Skipping stage 'deploy_proxy_app'…
Skipping stage 'validate'…
Running stage 'cleanup_proxy_app'…
Running stage 'cleanup_web_service'…
--- PASS: TestProxyApp (59.61s)

This cleans up everything that was left running.

slide-162
SLIDE 162

func deployWebService(t *testing.T) {
  opts := configWebServiceOpts(t)
  test_structure.SaveTerraformOptions(t, "/tmp", opts)
  terraform.InitAndApply(t, opts)
}

func cleanupWebService(t *testing.T) {
  opts := test_structure.LoadTerraformOptions(t, "/tmp")
  terraform.Destroy(t, opts)
}

Note: each time you run test stages via go test, it’s a separate OS process.

slide-163
SLIDE 163

func deployWebService(t *testing.T) {
  opts := configWebServiceOpts(t)
  test_structure.SaveTerraformOptions(t, "/tmp", opts)
  terraform.InitAndApply(t, opts)
}

func cleanupWebService(t *testing.T) {
  opts := test_structure.LoadTerraformOptions(t, "/tmp")
  terraform.Destroy(t, opts)
}

So to pass data between stages, one stage needs to write the data to disk…

slide-164
SLIDE 164

func deployWebService(t *testing.T) {
  opts := configWebServiceOpts(t)
  test_structure.SaveTerraformOptions(t, "/tmp", opts)
  terraform.InitAndApply(t, opts)
}

func cleanupWebService(t *testing.T) {
  opts := test_structure.LoadTerraformOptions(t, "/tmp")
  terraform.Destroy(t, opts)
}

And the other stages need to read that data from disk.

slide-165
SLIDE 165

Integration tests

1. Example: Terraform integration tests
2. Test parallelism
3. Test stages
4. Test retries

slide-166
SLIDE 166

Real infrastructure can fail for intermittent reasons

(e.g., bad EC2 instance, Apt downtime, Terraform bug)

slide-167
SLIDE 167

To avoid “flaky” tests, add retries for known errors.

slide-168
SLIDE 168

&terraform.Options{
  TerraformDir: "../examples/proxy-app",
  RetryableTerraformErrors: map[string]string{
    "net/http: TLS handshake timeout": "Terraform bug",
  },
  MaxRetries: 3,
  TimeBetweenRetries: 3*time.Second,
}

Example: retry up to 3 times on a known TLS error in Terraform.

slide-169
SLIDE 169
  • 1. Static analysis
  • 2. Unit tests
  • 3. Integration tests
  • 4. End-to-end tests
  • 5. Conclusion

Outline

slide-170
SLIDE 170

End-to-end tests: test your entire infrastructure works together.

slide-171
SLIDE 171

How do you test this entire thing?

slide-172
SLIDE 172

You could use the same strategy…

1. Deploy all the infrastructure
2. Validate it works (e.g., via HTTP requests, API calls, SSH commands, etc.)
3. Undeploy all the infrastructure

slide-173
SLIDE 173

But it’s rare to write end-to-end tests this way. Here’s why:

slide-174
SLIDE 174

Test pyramid

[Diagram: pyramid with, from top to bottom: e2e tests, integration tests, unit tests, static analysis]

slide-175
SLIDE 175

e2e Tests Integration Tests Unit Tests Static analysis

Cost, brittleness, run time

slide-176
SLIDE 176

e2e tests: 60 – 240+ minutes
Integration tests: 5 – 60 minutes
Unit tests: 1 – 20 minutes
Static analysis: 1 – 60 seconds

slide-177
SLIDE 177

E2E tests are too slow to be useful

e2e tests: 60 – 240+ minutes
Integration tests: 5 – 60 minutes
Unit tests: 1 – 20 minutes
Static analysis: 1 – 60 seconds

slide-178
SLIDE 178

Another problem with E2E tests: brittleness.

slide-179
SLIDE 179

Let’s do some math:

slide-180
SLIDE 180

Assume a single resource (e.g., EC2 instance) has a 1/1000 (0.1%) chance of failure.

slide-181
SLIDE 181

Test type           # of resources   Chance of failure
Unit tests          10               1%
Integration tests   50               5%
End-to-end tests    500+             40%+

The more resources your tests deploy, the flakier they will be.

slide-182
SLIDE 182

Test type           # of resources   Chance of failure
Unit tests          10               1%
Integration tests   50               5%
End-to-end tests    500+             40%+

You can work around the failure rate for unit & integration tests with retries

slide-184
SLIDE 184

Key takeaway: E2E tests from scratch are too slow and too brittle to be useful

slide-185
SLIDE 185

Instead, you can do incremental E2E testing!

slide-186
SLIDE 186

[Diagram: an environment made up of 15 modules]

  • 1. Deploy a persistent test environment and leave it running.

slide-187
SLIDE 187

[Diagram: an environment made up of 15 modules]

  • 2. Each time you update a module, deploy & validate just that module.

slide-188
SLIDE 188

[Diagram: an environment made up of 15 modules]

  • 3. Bonus: test that your deployment process is zero-downtime too!

slide-189
SLIDE 189
  • 1. Static analysis
  • 2. Unit tests
  • 3. Integration tests
  • 4. End-to-end tests
  • 5. Conclusion

Outline

slide-190
SLIDE 190

Testing techniques compared:

slide-191
SLIDE 191

Static analysis

Strengths:
  1. Fast
  2. Stable
  3. No need to deploy real resources
  4. Easy to use
Weaknesses:
  1. Very limited in errors you can catch
  2. You don’t get much confidence in your code solely from static analysis

Unit tests

Strengths:
  1. Fast enough (1 – 10 min)
  2. Mostly stable (with retry logic)
  3. High level of confidence in individual units
Weaknesses:
  1. Need to deploy real resources
  2. Requires writing non-trivial code

Integration tests

Strengths:
  1. Mostly stable (with retry logic)
  2. High level of confidence in multiple units working together
Weaknesses:
  1. Need to deploy real resources
  2. Requires writing non-trivial code
  3. Slow (10 – 30 min)

End-to-end tests

Strengths:
  1. Build confidence in your entire architecture
Weaknesses:
  1. Need to deploy real resources
  2. Requires writing non-trivial code
  3. Very slow (60 min – 240+ min)*
  4. Can be brittle (even with retry logic)*
slide-192
SLIDE 192

So which should you use?

slide-193
SLIDE 193

All of them!

They all catch different types of bugs.

slide-194
SLIDE 194

Keep in mind the test pyramid

[Diagram: pyramid with, from top to bottom: e2e tests, integration tests, unit tests, static analysis]

slide-195
SLIDE 195

Lots of unit tests + static analysis

[Diagram: pyramid with, from top to bottom: e2e tests, integration tests, unit tests, static analysis]

slide-196
SLIDE 196

Fewer integration tests

[Diagram: pyramid with, from top to bottom: e2e tests, integration tests, unit tests, static analysis]

slide-197
SLIDE 197

A handful of high-value e2e tests

[Diagram: pyramid with, from top to bottom: e2e tests, integration tests, unit tests, static analysis]

slide-198
SLIDE 198

Infrastructure code without tests is scary

slide-199
SLIDE 199

Fight the fear & build confidence in your code with automated tests

slide-200
SLIDE 200

Questions?

info@gruntwork.io