 
Reactive Microservices Using the Netflix Stack on AWS
Diego Pacheco, Principal Software Architect at ilegra.com, @diego_pacheco
www.ilegra.com
Netflix OSS Stack
Why Netflix?
Netflix
• Billions Requests Per Day
• 1/3 US internet bandwidth
• ~10k EC2 Instances
• Multi-Region
• 100s Microservices
• Innovation + Solid Service
• SOA, Microservices and DevOps Benchmark
My Problem
• Social Product
• Social Network
• Video
• Docs
• Apps
• Chat
• Scalability
• Distributed Teams
• Could reach some Web Scale
AWS
Cloud Native
Principles
• Stateless Services
• Ephemeral Instances
• Everything fails all the time
• Auto Scaling / Down Scaling
• Multi AZ and multi Region
• No SPOF
• Design for Failure (expected)
• SOA
• Microservices
• No Central Database
• NoSQL
• Lightweight Serializable Objects
• Latency tolerant protocols
• DevOps Enabler
• Immutable Infrastructure
• Anti-Fragility
Right Set of Assumptions
Microservices
Reactive
Java Drivers X REST X
Simple View of the Architecture
[Diagram: UI, Zuul, Microservice, Cassandra Cluster]
Stack
OSS
Zuul
Karyon: Microbiology - Nucleus
RxNetty
• Reactive Extensions + Netty Server
• Lower Latency under Heavy Load
• Fewer Locks, Fewer Thread Migrations
• Consumes Less CPU
• Lower Object Allocation Rate
Karyon: CODE
Karyon: Reactive
Eureka and Service Discovery http://microservices.io/patterns/server-side-discovery.html
Eureka
• AWS Service Registry for Mid-tier
• Load balancing and Failover
• REST based
• Karyon and Ribbon Integration
Availability
Hystrix
Ribbon
• IPC Library
• Client Side Load Balancing
• Multi-Protocol (HTTP, TCP, UDP)
• Caching*
• Batching
• Reactive
Ribbon CODE
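The original code for this slide is not in the text. As a stand-in, here is a minimal sketch of the client-side load balancing idea listed for Ribbon: the caller holds the server list and picks the next instance itself. It uses plain Java only and is not Ribbon's API; the class name `RoundRobinBalancer`, the method `choose`, and the addresses are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of client-side round-robin selection (not Ribbon's API).
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("server list must not be empty");
        }
        this.servers = List.copyOf(servers);
    }

    // Returns the next server in rotation; floorMod keeps the index valid
    // even after the counter overflows.
    String choose() {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        // Addresses are made up for illustration.
        RoundRobinBalancer lb =
                new RoundRobinBalancer(List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.choose());
        }
    }
}
```

In a real deployment the server list would presumably come from a registry such as Eureka rather than a hard-coded list.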
RxJava
• Reactive Extensions for the JVM
• Async/Event based programming
• Observer Pattern
• Less than 1 MB
• Heavy usage by the Netflix OSS Stack
Archaius
• Configuration Management Solution
• Dynamic and Typed Properties
• High Throughput and Thread Safety
• Callbacks: Notifications of config changes
• JMX Beans
• Dynamic Config Sources: File, DB, DynamoDB, ZooKeeper
• Based on Apache Commons Configuration
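To make "dynamic and typed properties" with "callbacks" concrete, here is a minimal sketch of the idea in plain Java. It does not use Archaius; `DynamicProperty`, `update`, and `addCallback` are hypothetical names, and `update` stands in for whatever component polls the configuration source.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch of a typed, thread-safe property that notifies
// listeners when its value changes (not the Archaius API).
class DynamicProperty<T> {
    private final AtomicReference<T> value;
    private final List<Consumer<T>> callbacks = new CopyOnWriteArrayList<>();

    DynamicProperty(T defaultValue) {
        this.value = new AtomicReference<>(defaultValue);
    }

    T get() {
        return value.get();
    }

    void addCallback(Consumer<T> callback) {
        callbacks.add(callback);
    }

    // Assumed to be called by a poller watching a file, DB, or similar source.
    void update(T newValue) {
        T old = value.getAndSet(newValue);
        if (!Objects.equals(old, newValue)) {
            callbacks.forEach(cb -> cb.accept(newValue));
        }
    }

    public static void main(String[] args) {
        DynamicProperty<Integer> timeoutMs = new DynamicProperty<>(1000);
        timeoutMs.addCallback(v -> System.out.println("timeout changed to " + v));
        timeoutMs.update(250);   // value changes, callback fires
        timeoutMs.update(250);   // same value, callback does not fire
        System.out.println(timeoutMs.get());
    }
}
```

Readers call `get()` on every use instead of caching the value, which is what lets a change take effect without a restart.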
Archaius + Git
[Diagram: Central Property Internal Git Files; three Microservice Slaves, each with a Microservice, a Side Car, and a File System]
Asgard
Deploys
[Diagram: Bake/Provision Job, Create, Packer, Launch]
Dynomite: Distributed Cache
https://github.com/Netflix/dynomite
Dynomite
• Implements Amazon Dynamo
• Similar to Cassandra, Riak and DynamoDB
• Strong Consistency: quorum-like, no data loss
• Pluggable
• Scalable
• Redis / Memcached
• Multi-Clients with Dyno
• Can use most Redis commands
• Integrated with Eureka via Prana
Dynomite: Distributed Cache
• Isolate Failure: avoid cascading
• Redundancy: no SPOF
• Auto-Scaling
• Fault Tolerance and Isolation
• Recovery
• Fallbacks and Degraded Experience
• Protect Customer from failures: don't throw Failures -> Failures vs. Errors
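The "fallbacks and degraded experience" point can be sketched as follows: a call that fails or takes too long returns a fallback value instead of throwing at the customer. This is a plain-Java illustration under my own assumptions, not Hystrix code; `FallbackCall` and `callWithFallback` are hypothetical names, and it needs Java 9+ for `completeOnTimeout`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: run a call with a timeout and degrade to a fallback.
class FallbackCall {
    static <T> T callWithFallback(Supplier<T> primary, long timeoutMs, T fallback) {
        return CompletableFuture.supplyAsync(primary)
                // Too slow: complete with the fallback instead of waiting.
                .completeOnTimeout(fallback, timeoutMs, TimeUnit.MILLISECONDS)
                // Failed: swallow the error and serve the fallback.
                .exceptionally(error -> fallback)
                .join();
    }

    private static String slowCall() {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "live recommendations";
    }

    public static void main(String[] args) {
        // Healthy call returns the live value.
        System.out.println(callWithFallback(() -> "live recommendations", 1000, "cached list"));
        // Failing call degrades to the fallback.
        System.out.println(FallbackCall.<String>callWithFallback(
                () -> { throw new IllegalStateException("backend down"); }, 1000, "cached list"));
        // Slow call degrades to the fallback after 50 ms.
        System.out.println(callWithFallback(FallbackCall::slowCall, 50, "cached list"));
    }
}
```

A production circuit breaker would also track failure rates and stop calling an unhealthy backend; this sketch covers only the fallback path.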
Dynomite: Internals
Multi-Region Cluster
[Diagram: Eureka Servers; Dynomite nodes D1 and D2 in Oregon and D3 in N. California, each with Prana]
Dynomite: CODE
Dynomite Contributions
https://github.com/Netflix/dynomite/pull/200
https://github.com/Netflix/dynomite/pull/207
https://github.com/Netflix/dynomite
Chaos Engineering
Gatling
• Stress Testing Tool
• Scala DSL
• Runs on top of Akka
• Simple to use
Chaos Arch
[Diagram: ELB, Eureka, two Zuul instances, Microservice N1, Microservice N2, Cassandra Cluster]
Running…
Chaos Results and Learnings
• Retry configuration and Timeouts in Ribbon
• Use the right retry handler class in Zuul 1.x (the default retries only on SocketException)
• RequestSpecificRetryHandler (HttpClient Exceptions)
• zuul.client.ribbon.MaxAutoRetries=1
• zuul.client.ribbon.MaxAutoRetriesNextServer=1
• zuul.client.ribbon.OkToRetryOnAllOperations=true
• Eureka Timeouts
• It Works
• Everything needs to have redundancy
• ASG is your friend :-)
• Stateless Service FTW
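A rough sketch of what the two retry settings above are understood to mean: `MaxAutoRetries` is the number of extra attempts on the same server, and `MaxAutoRetriesNextServer` is the number of additional servers to move on to. This is a plain-Java simulation, not Ribbon or Zuul code; `RetryNextServer` and `execute` are hypothetical names, and the server order is fixed here whereas a real load balancer would choose the next server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical simulation of "retry same server, then next server" semantics.
class RetryNextServer {
    static <T> T execute(List<String> servers, int maxAutoRetries,
                         int maxAutoRetriesNextServer, Function<String, T> call) {
        RuntimeException last = null;
        // First server plus at most maxAutoRetriesNextServer others.
        int serversToTry = Math.min(servers.size(), 1 + maxAutoRetriesNextServer);
        for (int s = 0; s < serversToTry; s++) {
            // First attempt plus at most maxAutoRetries retries on this server.
            for (int attempt = 0; attempt <= maxAutoRetries; attempt++) {
                try {
                    return call.apply(servers.get(s));
                } catch (RuntimeException e) {
                    last = e; // assumed retriable for this sketch
                }
            }
        }
        throw new IllegalStateException("all attempts failed", last);
    }

    public static void main(String[] args) {
        List<String> tried = new ArrayList<>();
        // "n1" is simulated as down; "n2" answers.
        String result = RetryNextServer.<String>execute(List.of("n1", "n2"), 1, 1, server -> {
            tried.add(server);
            if (server.equals("n1")) {
                throw new RuntimeException("connection refused");
            }
            return "response from " + server;
        });
        System.out.println(result);
        System.out.println("attempts: " + tried);
    }
}
```

With both values set to 1, a request can make up to four attempts in total, which is why retry settings and timeouts need to be tuned together.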
Kafka / Storm :: Event System
[Diagram: Microservice, Producer]
Chaos Results and Learnings
• Before:
  • Data was not in Elasticsearch
  • Producers were losing data
• After:
  • No Data Loss
  • It Works
• Changes:
  • No logging on Microservice :( (Log was added)
  • Event-publishing code wrapped in a try-catch
  • Kafka producer retry config raised from 0 to 5
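The three changes listed above might look roughly like the sketch below. It builds only the producer configuration with `java.util.Properties` and does not depend on the Kafka client; `retries` and `bootstrap.servers` are standard Kafka producer keys, while the class name, helper names, and broker address are hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch of the producer-side changes: retries raised to 5,
// publishing wrapped in try-catch, and failures logged.
class ProducerRetryConfig {
    static Properties producerProps(String bootstrapServers, int retries) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("retries", Integer.toString(retries));
        return props;
    }

    // Runs the send action; logs and reports failure instead of propagating it.
    static boolean publishSafely(Runnable send) {
        try {
            send.run();
            return true;
        } catch (RuntimeException e) {
            System.err.println("event publish failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Broker address is made up for illustration.
        Properties props = producerProps("kafka-1:9092", 5);
        System.out.println("retries=" + props.getProperty("retries"));
        boolean sent = publishSafely(() -> {
            throw new RuntimeException("broker unavailable");
        });
        System.out.println("sent=" + sent);
    }
}
```

In a real service the `Properties` object would be passed to the Kafka producer constructor, and `send` would wrap the actual producer call.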
Main Challenges
Hacker Mindset
Next Steps
• IPC
• Spinnaker
• Containers
• Client side Aggregation
• DevOps 2.0 -> Remediation / Skynet
PoCs
https://github.com/diegopacheco/netflixoss-pocs
http://diego-pacheco.blogspot.com.br/search/label/netflix?max-results=30
Thank you!