DNS Belgium in the cloud: ICANN Tech Day, 2017-03-13

SLIDE 1

DNS Belgium in the cloud

ICANN Tech Day – 2017-03-13

maarten.bosteels@dnsbelgium.be

SLIDE 2

What did we do ?

  • Migrated to Amazon Web Services
  • Re-built entire registration platform from code
  • Took down the wall between Ops & Dev
SLIDE 3

Main drivers for change

  • Configuration drift (Test vs. Prod)
  • Long lead times (e.g. patching)
  • Difficult hand-overs Dev-Ops
  • Infrequent deployments
  • Lots of fire fighting, little time for fire prevention
  • Aging hardware
SLIDE 4

Classic model ?

[Diagram: the stack. Layers: Power, Physical space, Connectivity, Networking, Server Hardware, Storage, Host OS, Hypervisor, Virtual Machines, RDBMS, Third party software, Application servers, Modules, Registration System, Security. Other labels: RAR, RANT, Customer Support. Legend: Out-sourced, In-house, Focus Ops, Focus Dev.]

SLIDE 5

Strategy

Engineering = Dev + Ops + QA

  • Multi-functional agile teams
  • Focus on upper layers of the stack
  • Infrastructure-as-code => reproducible & testable
  • Continuous Delivery => small amounts of change & early feedback
  • Dev & Ops both confronted with quality of their work
  • Design for failure: resilient, self-monitoring, self-healing

SLIDE 6

Status early 2015

  • Last hardware renewal : 2011
  • Big bang migration
  • New hardware / network design / storage solution / colo
  • Lots of vendors to manage
  • Go for another big bang ?
  • Do we really need our own hardware ?
  • Why not use the cloud ?
SLIDE 7

Extra layer

[Diagram: the same stack with an extra layer, Orchestration & Config Mgmt. Layers: Power, Physical space, Connectivity, Networking, Server Hardware, Storage, Host OS, Hypervisor, Virtual Machines, RDBMS, Third party software, Application servers, Modules, Registration System, Security. Other labels: RAR, RANT, Customer Support. Legend: Out-sourced, In-house, Focus Engineering.]

SLIDE 8

Initial assessment of AWS

  • Initial tests:
  • Get to know AWS services
  • Proof-of-concept
  • Risk assessment
  • Technically feature complete ?
  • Confidentiality, Integrity, Availability ?
  • Legal risk assessment
  • Performance tests
  • Cost assessment
  • Man days
SLIDE 9

Conclusion of assessment

  • Software-defined everything
  • Avoid configuration drift
  • Infra predictable & documented => increased security
  • Encryption of all data in transit + at rest
  • IaaS = enabler to focus on core business
  • No need for home-grown HA solutions
  • Use well-designed services with built-in redundancy
  • Underlying services keep improving ‘for free’
  • Pay for what you use
  • Dev & Test environments : business hours only
  • Easily scale up / down
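
The "business hours only" point can be made concrete with a little arithmetic. The sketch below assumes a schedule of 5 days a week and 10 hours a day, which is an illustrative figure and not one stated in the talk. It covers instance hours only; storage is typically still billed while instances are stopped.

```python
HOURS_PER_WEEK = 7 * 24  # an always-on instance runs 168 hours a week


def running_fraction(days_per_week: int, hours_per_day: int) -> float:
    """Fraction of the week an instance is running under a fixed schedule."""
    return (days_per_week * hours_per_day) / HOURS_PER_WEEK


# Assumed schedule (not from the talk): Monday to Friday, 10 hours a day.
fraction = running_fraction(days_per_week=5, hours_per_day=10)
print(f"Running {fraction:.0%} of the week")  # prints "Running 30% of the week"
```

Under that assumption a Dev or Test instance accrues roughly 30% of the instance hours of an always-on one.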
SLIDE 10

Infra-as-code: building blocks

[Diagram: building blocks and the tool that manages each]

  • CloudFormation: Network layout, RDBMS, Access rules, VMs, Disk volumes, Load-balancers
  • Pulp (rpm repo): In-house software, Third-party software
  • Puppet: Configuration + Monitoring + …

Git repos:

  • Puppet modules + config
  • In-house software
  • CloudFormation templates
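
To illustrate what "network layout as code" can look like, here is a minimal sketch that generates a CloudFormation template from Python. It is not DNS Belgium's actual template: the function name, the CIDR ranges and the choice to generate JSON from Python are all assumptions made for the example.

```python
import json


def build_network_template(vpc_cidr: str, subnet_cidrs: list[str]) -> dict:
    """Return a CloudFormation template with one VPC and one subnet per CIDR.

    Subnet i is placed in the i-th availability zone of the region the
    stack is created in.
    """
    resources = {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": vpc_cidr}},
    }
    for index, cidr in enumerate(subnet_cidrs):
        resources[f"Subnet{index + 1}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "CidrBlock": cidr,
                "AvailabilityZone": {"Fn::Select": [index, {"Fn::GetAZs": ""}]},
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


# Hypothetical address ranges, chosen only for the example.
template = build_network_template("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"])
print(json.dumps(template, indent=2))
```

Because the template is plain data kept in git, the same definition can be used to create Test and Prod, which is one way to avoid the configuration drift mentioned earlier.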
SLIDE 11

Overview environments

SLIDE 12

High availability

  • All components:
      • distributed over 2 availability zones within one AWS Region
      • active-active
      • behind Elastic Load Balancers
  • Intelligent health checks
  • Share content via RDBMS or via EFS (= NFS like)
  • All RDBMS instances in multi-AZ mode
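
The talk does not say how the "intelligent" health checks are built. One common reading is a health endpoint that tests the instance's own dependencies rather than just answering. A minimal sketch of that idea follows; the check names and the 200/503 convention are assumptions.

```python
from typing import Callable


def health_status(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict[str, str]]:
    """Run every dependency check and return (HTTP status, per-check results).

    Any failing or crashing check yields 503, so a load balancer that
    expects 200 would stop routing traffic to this instance.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as exc:  # a crashing check counts as a failure
            results[name] = f"error: {exc}"
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results


def broken_share() -> bool:
    """Hypothetical check that simulates an unavailable shared file system."""
    raise OSError("mount not available")


status, results = health_status({"database": lambda: True, "shared_files": broken_share})
print(status, results)
```

In an active-active setup the remaining healthy instances keep serving while the failing one is taken out of rotation.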
SLIDE 13

Oracle – multi-AZ RDS

[Diagram labels: Datacenter - Diegem, Datacenter - Nossegem, Disaster Recovery, Oracle RAC (node 1, node 2), Stand-by, applications, HQ (Leuven), Ireland, Availability Zone 1, Availability Zone 2, Primary RDS, Stand-by RDS, DMS]

  • On-prem
      • Both RAC nodes in same DC
      • Manual fail-over to stand-by instance
  • AWS: multi-AZ RDS
      • Synchronous Replication
      • Automatic & Transparent Fail-Over
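
As a rough illustration of how little is needed to request a multi-AZ database, the sketch below builds a CloudFormation resource for an RDS instance. The Oracle edition, licence model, instance class, storage size and parameter names are assumptions for the example; the talk does not state them.

```python
import json


def multi_az_db_resource(instance_class: str, storage_gib: int) -> dict:
    """Return a CloudFormation resource for a Multi-AZ, encrypted RDS instance."""
    return {
        "Type": "AWS::RDS::DBInstance",
        "Properties": {
            "Engine": "oracle-se2",              # assumed edition
            "LicenseModel": "license-included",  # assumed licence model
            "DBInstanceClass": instance_class,
            "AllocatedStorage": str(storage_gib),
            "MultiAZ": True,           # synchronous stand-by in a second AZ
            "StorageEncrypted": True,  # encryption at rest
            "MasterUsername": {"Ref": "DbUser"},
            "MasterUserPassword": {"Ref": "DbPassword"},
        },
    }


# Credentials are passed in as stack parameters instead of being stored in git.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "DbUser": {"Type": "String"},
        "DbPassword": {"Type": "String", "NoEcho": True},
    },
    "Resources": {"Database": multi_az_db_resource("db.m4.large", 100)},
}
print(json.dumps(template, indent=2))
```

The single `MultiAZ` flag replaces the home-grown stand-by and manual fail-over procedure of the on-prem setup.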

SLIDE 14

RDS & Database migration

  • Amazon RDS = enormous time saver!
  • No OS-level access on Amazon RDS => Data Guard etc. not an option
  • AWS Database Migration Service (DMS) too immature for the migration
  • Used complex Oracle Data Pump export / import sequence instead
  • Temporarily up-scaled Oracle instance
  • Final export / transfer / import / verify: 2.5h
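
The talk does not describe how the "verify" step was done. One simple approach, sketched here as an assumption, is to compare per-table row counts between the source and the target database. The table names and counts are hypothetical.

```python
def compare_row_counts(source: dict[str, int], target: dict[str, int]) -> list[str]:
    """Return human-readable mismatches between two table -> row count maps."""
    problems = []
    for table in sorted(source.keys() | target.keys()):
        before, after = source.get(table), target.get(table)
        if before is None:
            problems.append(f"{table}: only in target")
        elif after is None:
            problems.append(f"{table}: missing in target")
        elif before != after:
            problems.append(f"{table}: {before} rows exported, {after} imported")
    return problems


# Hypothetical counts, e.g. gathered with SELECT COUNT(*) on both sides.
source_counts = {"DOMAINS": 1500, "CONTACTS": 900, "REGISTRARS": 40}
target_counts = {"DOMAINS": 1500, "CONTACTS": 899}
for problem in compare_row_counts(source_counts, target_counts):
    print(problem)
```

Row counts only catch missing data; a stricter check would also compare checksums of the table contents.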
SLIDE 15

Experience so far

  • RARs dealt well with change of IP addresses
  • Overall satisfied with quality of service & docs
  • No performance issues
  • Not impacted by S3 outage in US
SLIDE 16

Next step: Full DR site in another region

[Diagram labels: HQ (Leuven); Ireland with Availability Zone 1, Availability Zone 2, Primary RDS, Stand-by RDS, DMS; Disaster Recovery in Frankfurt with Availability Zone 1, Availability Zone 2, Primary RDS, Stand-by RDS, DMS]

  • Keep DBs + git in sync with main site
  • In case of region failure
      • Create resources from code
      • Switch entry points via DNS
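
Switching entry points via DNS amounts to repointing a small number of records at the DR site. The sketch below builds such a change in the change-batch format used by Amazon Route 53. The talk does not say which DNS service would be used, and the host names are placeholders under `example`.

```python
import json


def dns_switch_change(record_name: str, new_target: str, ttl: int = 60) -> dict:
    """Return a change batch that points a CNAME record at a new target."""
    return {
        "Comment": f"Fail over {record_name} to {new_target}",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": new_target}],
                },
            }
        ],
    }


# Hypothetical entry point and DR target, for illustration only.
change = dns_switch_change("epp.registry.example.", "epp-dr.registry.example.")
print(json.dumps(change, indent=2))
```

The short TTL has to be in place before a failure happens; otherwise resolvers keep serving the old target for as long as the previous TTL allows.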

SLIDE 17

Next steps

  • Disaster Recovery site in another region
  • Fully automate Continuous Delivery Pipeline
  • Blue / Green deployments
  • Nameservers in the cloud ?
  • Multi-cloud ?
  • Serverless architecture ?
SLIDE 18

The team

SLIDE 19

Questions ?