Istio - A modern service mesh - Louis Ryan, Principal Engineer @ Google



SLIDE 1

Istio

A modern service mesh

Louis Ryan Principal Engineer @ Google @louiscryan

SLIDE 2

My Google Career

[Diagram: GData Library, API Proxy, and API Proxy v2, each with servers behind reverse proxies. Protocols shown: HTTP, HTTP2, GRPC, Stubby, GRPC (local). Callouts: Control Plane Centralization; Performance & Isolation.]

SLIDE 3

Cloud → Internal & External Convergence

[Diagram: API Proxy v2, Server, Reverse Proxy, and Control Plane. Protocols shown: HTTP2, GRPC, Stubby, GRPC (local).]

  • Network distance & bandwidth
  • Protocols
  • Isolation & Reliability
  • Security Concerns

Sidecar!

SLIDE 4

Decoupling → Velocity

  • Operators & Developers
  • Code & Networking
  • Network Topology & Security
  • Modernization & Architecture
SLIDE 5

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency
  • Traffic Control
  • Security
  • Policy Enforcement

FREE!

  • Zero code change
SLIDE 6

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency
  • Traffic Control
  • Security
  • Policy Enforcement
SLIDE 7

Weaving the mesh - Sidecars

Outbound features:
❖ Service authentication
❖ Load balancing
❖ Retry and circuit breaker
❖ Fine-grained routing
❖ Telemetry
❖ Request Tracing
❖ Fault Injection

Inbound features:
❖ Service authentication
❖ Authorization
❖ Rate limits
❖ Load shedding
❖ Telemetry
❖ Request Tracing
❖ Fault Injection

[Diagram: Service A (svcA) and Service B (svcB), each with a sidecar proxy, plus External Services on the Internet. Links carry HTTP/1.1, HTTP/2, gRPC, or TCP, with or without TLS.]

SLIDE 8

Istio - Putting it all together

[Diagram: Service A (svcA) and Service B (svcB) pods, each with an Envoy sidecar. Control Plane API: Pilot sends discovery & config data to Envoys; Mixer handles policy checks and telemetry (control flow during request processing); Istio-Auth delivers TLS certs to Envoy.]

Traffic is transparently intercepted and proxied. App is unaware of Envoy’s presence

SLIDE 9

Our sidecar of choice - Envoy

  • A C++ based L4/L7 proxy
  • Low memory footprint
  • Battle-tested @ Lyft

    ○ 100+ services
    ○ 10,000+ VMs
    ○ 2M req/s

Plus an awesome team willing to work with the community!

Goodies:
❖ API driven config updates → no reloads
❖ Zone-aware load balancing w/ failover
❖ Traffic routing and splitting
❖ Health checks, circuit breakers, timeouts, retry budgets, fault injection, …
❖ HTTP/2 & gRPC
❖ Transparent proxying
❖ Designed for observability

SLIDE 10

Modeling the Service Mesh

[Diagram: platform adapters (Kubernetes, Eureka, Consul, Custom) feed an Abstract Model inside Pilot. The Rules API layers traffic rules onto the model, and the Envoy API delivers service discovery & traffic rules to the Envoys.]

1. Environment-specific topology extraction.
2. Topology is mapped to a platform-agnostic model.
3. Additional rules are layered onto the model, e.g. retries, traffic splits, etc.
4. Configuration is pushed to Envoy and applied without restarts.
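The four steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not Pilot's actual code: the `ServiceInstance` class, the `from_kubernetes` and `build_proxy_config` functions, and the shape of the pod and rule dictionaries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    """Platform-agnostic view of one endpoint (hypothetical model)."""
    service: str
    address: str
    tags: dict = field(default_factory=dict)


def from_kubernetes(pods):
    """Steps 1-2: extract platform-specific records (made-up pod dicts here)
    and map them onto the abstract model."""
    return [
        ServiceInstance(p["service"], p["podIP"], dict(p.get("labels", {})))
        for p in pods
    ]


def build_proxy_config(instances, rules):
    """Steps 3-4: layer rules onto the model and produce the per-service
    configuration that would be pushed to the proxies."""
    config = {}
    for inst in instances:
        entry = config.setdefault(inst.service, {"endpoints": [], "rules": []})
        entry["endpoints"].append({"address": inst.address, "tags": inst.tags})
    for rule in rules:
        if rule["destination"] in config:
            config[rule["destination"]]["rules"].append(rule)
    return config


pods = [
    {"service": "serviceB", "podIP": "10.0.0.1", "labels": {"version": "v1"}},
    {"service": "serviceB", "podIP": "10.0.0.2", "labels": {"version": "v2"}},
]
rules = [{"destination": "serviceB", "retries": 3}]
config = build_proxy_config(from_kubernetes(pods), rules)
```

Supporting another platform in this sketch would only mean writing another `from_...` function; the rule layering and the generated config stay the same.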

SLIDE 11

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency
  • Traffic Control
  • Security
  • Policy Enforcement
SLIDE 12

Visibility

Monitoring & tracing should not be an afterthought in the infrastructure

Goals

  • Metrics without instrumenting apps
  • Consistent metrics across fleet
  • Trace flow of requests across services
  • Portable across metric backend providers

[Screenshots: Istio Zipkin tracing dashboard; Istio Grafana dashboard w/ Prometheus backend]

SLIDE 13

Visibility: Metrics

[Diagram: Envoy sits in front of the Service, handling requests and responses, and calls Mixer with Check(attributes) and Report([]attributes). Mixer is driven by operator-supplied config, and its adapters connect it to backends and GUIs. Labels shown include Zipkin, Prometheus, Stackdriver, Statsd, Traces, Grafana, Weave Scope, and ServiceGraph Example.]

  • Mixer collects metrics emitted by Envoys
  • Adapters in the Mixer normalize and forward to monitoring backends
  • Metrics backend can be swapped at runtime
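A minimal sketch of the adapter idea, assuming a made-up adapter interface with a single `record` method. The `Mixer`, `MetricsAdapter`, and `InMemoryAdapter` classes and the metric names are hypothetical and do not reflect Mixer's real API; the attribute names are borrowed from the attribute examples later in this deck.

```python
class MetricsAdapter:
    """Hypothetical adapter interface: receives one normalized metric."""

    def record(self, name, value, labels):
        raise NotImplementedError


class InMemoryAdapter(MetricsAdapter):
    """Stand-in for a real backend; keeps samples in a list."""

    def __init__(self):
        self.samples = []

    def record(self, name, value, labels):
        self.samples.append((name, value, labels))


class Mixer:
    def __init__(self, adapter):
        self.adapter = adapter

    def set_adapter(self, adapter):
        # Swapping the backend is a config change; proxies are not touched.
        self.adapter = adapter

    def report(self, attributes_batch):
        # Normalize raw attributes into metrics, then forward to the adapter.
        for attrs in attributes_batch:
            labels = {
                "source": attrs.get("source.name", "unknown"),
                "target": attrs.get("target.service", "unknown"),
            }
            self.adapter.record("request_count", 1, labels)
            if "request.size" in attrs:
                self.adapter.record("request_size", attrs["request.size"], labels)


first, second = InMemoryAdapter(), InMemoryAdapter()
mixer = Mixer(first)
mixer.report([{"source.name": "music-fe", "target.service": "playlist", "request.size": 345}])
mixer.set_adapter(second)  # backend swapped at runtime
mixer.report([{"source.name": "music-fe", "target.service": "playlist"}])
```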

SLIDE 14

Visibility: Tracing

[Diagram: Envoy sits in front of the Service, handling requests and responses, and calls Mixer with Check(attributes) and Report([]attributes). Mixer is driven by operator-supplied config, and its adapters connect it to backends and GUIs. Labels shown include Zipkin, Prometheus, Stackdriver, Statsd, Traces, Grafana, Weave Scope, and ServiceGraph Example.]

  • Applications do not have to deal with generating spans or correlating causality
  • Envoys generate spans
    ○ Applications need to forward context headers on outbound calls
  • Envoy sends traces to Mixer
  • Adapters at Mixer send traces to respective backends
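Forwarding context headers can be as small as copying a fixed set of headers from the inbound request onto each outbound call. The sketch below assumes the Zipkin-style B3 headers plus a request ID; the slide does not list the header names, so treat `TRACE_HEADERS` as an assumption, and `forward_trace_headers` as a hypothetical helper.

```python
# Assumed header set (Zipkin B3 propagation plus a request ID);
# the slide itself does not name the headers.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def forward_trace_headers(incoming_headers):
    """Return the subset of inbound headers to copy onto outbound calls."""
    lowered = {k.lower(): v for k, v in incoming_headers.items()}
    return {h: lowered[h] for h in TRACE_HEADERS if h in lowered}


incoming = {
    "X-Request-Id": "abc",
    "X-B3-TraceId": "463ac35c9f6413ad",
    "Accept": "text/html",
}
outbound = forward_trace_headers(incoming)
```

The application never creates or closes a span here; it only passes the context along so the sidecars can stitch the spans into one trace.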

SLIDE 15

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency
  • Traffic Control
  • Security
  • Control
SLIDE 16

Resiliency

Istio adds fault tolerance to your application without any changes to code

Resilience features
❖ Timeouts
❖ Retries with timeout budget
❖ Circuit breakers
❖ Health checks
❖ AZ-aware load balancing w/ automatic failover
❖ Control connection pool size and request load
❖ Systematic fault injection

    # Circuit breakers
    destination: serviceB.example.cluster.local
    policy:
    - tags:
        version: v1
      circuitBreaker:
        simpleCb:
          maxConnections: 100
          httpMaxRequests: 1000
          httpMaxRequestsPerConnection: 10
          httpConsecutiveErrors: 7
          sleepWindow: 15m
          httpDetectionInterval: 5m
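To make the breaker settings concrete, here is a small Python sketch of one plausible reading of two of the fields. It assumes that httpConsecutiveErrors is the number of back-to-back failures that trips the breaker and that sleepWindow is how long the destination then stays ejected. That reading, and the `ConsecutiveErrorBreaker` class itself, are assumptions for illustration, not Envoy's implementation.

```python
class ConsecutiveErrorBreaker:
    """Toy circuit breaker: opens after N consecutive failures and stays
    open for a fixed sleep window (times are plain seconds)."""

    def __init__(self, consecutive_errors=7, sleep_window_s=15 * 60):
        self.consecutive_errors = consecutive_errors
        self.sleep_window_s = sleep_window_s
        self.failures = 0
        self.open_until = None

    def allow(self, now):
        """Return True if a request may be sent at time `now`."""
        if self.open_until is not None:
            if now < self.open_until:
                return False
            # Sleep window elapsed: close the breaker and start counting afresh.
            self.open_until = None
            self.failures = 0
        return True

    def record(self, success, now):
        """Record the outcome of a request that finished at time `now`."""
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.consecutive_errors:
            self.open_until = now + self.sleep_window_s


breaker = ConsecutiveErrorBreaker(consecutive_errors=7, sleep_window_s=15 * 60)
for t in range(7):
    breaker.record(success=False, now=t)  # seventh failure lands at t=6
```

With these values the breaker opens at t=6 and rejects requests until t=906; the calling application sees fast failures instead of piling more load onto a struggling service.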

SLIDE 17

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency & Efficiency
  • Traffic Control
  • Security
  • Policy Enforcement
SLIDE 18

Traffic Splitting

[Diagram: Service A's Envoy calls http://serviceB.example. Pilot's Rules API supplies traffic routing rules for serviceB.example.cluster.local: 99% of traffic goes to Service B pods labeled version: v1.5, env: us-prod, and 1% goes to pods labeled version: v2.0-alpha, env: us-staging.]

Traffic control is decoupled from infrastructure scaling

    # A simple traffic splitting rule
    destination: serviceB.example.cluster.local
    match:
      source: serviceA.example.cluster.local
    route:
    - tags:
        version: v1.5
        env: us-prod
      weight: 99
    - tags:
        version: v2.0-alpha
        env: us-staging
      weight: 1
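The effect of the weights can be sketched as a weighted random choice per request. The `ROUTES` list mirrors the rule above; `pick_route` is a hypothetical helper, and the proxy's real selection algorithm may differ.

```python
import random

# Mirrors the route section of the rule above.
ROUTES = [
    {"tags": {"version": "v1.5", "env": "us-prod"}, "weight": 99},
    {"tags": {"version": "v2.0-alpha", "env": "us-staging"}, "weight": 1},
]


def pick_route(routes, rng):
    """Choose one route with probability proportional to its weight."""
    weights = [r["weight"] for r in routes]
    return rng.choices(routes, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so the run is repeatable
counts = {}
for _ in range(10_000):
    version = pick_route(ROUTES, rng)["tags"]["version"]
    counts[version] = counts.get(version, 0) + 1
```

The split depends only on the weights, not on how many pods back each version, which is the sense in which traffic control is decoupled from infrastructure scaling.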

SLIDE 19

Traffic Steering

[Diagram: content-based traffic steering. Service A (svcA) calls Service B (svcB). Requests with User-agent: *Android* go to the version: v1 pods (Pod 1, Pod 2, Pod 3); requests with User-agent: *iPhone* go to svcB’ (version: canary, Pod 4).]

    # Content-based traffic steering rule
    destination: serviceB.example.cluster.local
    match:
      httpHeaders:
        user-agent:
          regex: ^(.*?;)?(iPhone)(;.*)?$
    precedence: 2
    route:
    - tags:
        version: canary
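The match in the rule above can be tried out with Python's `re` module, using the regex exactly as written on the slide. The `choose_version` helper and the fallback to `v1` are assumptions for illustration; regex dialects also differ between engines, so this only approximates what the proxy evaluates.

```python
import re

# Regex copied verbatim from the rule above.
IPHONE_RULE = re.compile(r"^(.*?;)?(iPhone)(;.*)?$")


def choose_version(headers, default="v1"):
    """Return the destination version for a request's headers."""
    user_agent = headers.get("user-agent", "")
    if IPHONE_RULE.match(user_agent):
        return "canary"
    return default


assert choose_version({"user-agent": "iPhone"}) == "canary"
assert choose_version({"user-agent": "Mobile;iPhone;Safari"}) == "canary"
assert choose_version({"user-agent": "Android"}) == "v1"
```

As written, the pattern treats iPhone as a whole semicolon-delimited token, so a header where iPhone is preceded by other characters, such as an opening parenthesis, would not match in this sketch.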

SLIDE 20

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency & Efficiency
  • Traffic Control
  • Security
  • Policy Enforcement
SLIDE 21

Securing Services

  • Encryption by default
  • Verifiable identity
  • Secure naming / addressing
  • Revocation
SLIDE 22

Problem: Strong Service Security at Scale

Concerns

  • Insiders
  • Hijacked services
  • Microservice attack surface
  • Workload mobility
  • Brittle fine-grained models
  • Securing resources not just endpoints
  • Audit & Compliance

Wants

  • Workload mobility
  • Remote admin & development
  • Shared & 3rd party services
  • User & Service identity
  • Lower costs

Traditional perimeter security models are insufficient

SLIDE 23

Istio - Security at Scale

spiffe.io

SLIDE 24

What is a ‘Service Mesh’ ?

A network for services, not bytes

  • Observability
  • Resiliency & Efficiency
  • Traffic Control
  • Security
  • Policy Enforcement
SLIDE 25

Putting it all together

[Diagram: Service A (svcA) and Service B (svcB) pods, each with an Envoy sidecar. Control Plane API: Pilot sends discovery & config data to Envoys; Mixer handles policy checks and telemetry (control flow during request processing); Istio-Auth delivers TLS certs to Envoy.]
SLIDE 26

What’s Mixer For?

  • Nexus for policy evaluation and telemetry reporting
    ○ Precondition checking
    ○ Quotas & Rate Limiting

  • Primary point of extensibility
  • Enabler for platform mobility
  • Operator-focused configuration model
SLIDE 27

Attributes - The behavioral vocabulary

    target.service = "playlist.svc.cluster.local"
    request.size = 345
    request.time = 2017-04-12T12:34:56Z
    source.ip = 192.168.10.1
    source.name = "music-fe.serving.cluster.local"
    source.user = "admin@musicstore.cluster.local"
    api.operation = "GetPlaylist"
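One way to picture how attributes drive policy is a precondition check written purely against the attribute bag. The sketch below reuses the attribute values from this slide; the `check` function, the allow-list structure, and the size limit are hypothetical and are not Mixer's configuration model.

```python
from datetime import datetime, timezone

# Attribute bag using the example values from the slide.
attributes = {
    "target.service": "playlist.svc.cluster.local",
    "request.size": 345,
    "request.time": datetime(2017, 4, 12, 12, 34, 56, tzinfo=timezone.utc),
    "source.ip": "192.168.10.1",
    "source.name": "music-fe.serving.cluster.local",
    "source.user": "admin@musicstore.cluster.local",
    "api.operation": "GetPlaylist",
}


def check(attrs, allowed_callers, max_request_size):
    """Hypothetical precondition check: returns (allowed, reason)."""
    callers = allowed_callers.get(attrs["target.service"], set())
    if attrs["source.name"] not in callers:
        return False, "caller not allowed"
    if attrs["request.size"] > max_request_size:
        return False, "request too large"
    return True, "ok"


# Made-up operator policy: which callers may reach which target service.
policy = {"playlist.svc.cluster.local": {"music-fe.serving.cluster.local"}}
allowed, reason = check(attributes, policy, max_request_size=1024)
```

Because the check only reads attributes, the same policy works no matter which platform or proxy produced them.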

SLIDE 28

Roadmap

  • Production Readiness
  • Multi-Cloud & Multi-Environment
  • Networking - Extension models, UDP, QUIC, performance, ...
  • Moar integrations - ACLs, Telemetry, Audit, Policy, ....
  • Security - HSM, Cert & Key stores, federation, ...
  • API Management
SLIDE 29

Thanks! Phew