Real-Time Analytics Meets Kubernetes
Tal Doron, Director, Technology Innovation

ABOUT ME
Tal Doron, Director, Technology Innovation
@taldoron
taldoron84
tald@gigaspaces.com

About GigaSpaces
We provide one of the leading in-memory computing platforms for real-time insight to action and extreme transactional processing. With GigaSpaces, enterprises can operationalize machine learning and transactional processing to gain real-time insights on their data and act upon them in the moment.
• 300+ direct customers
• 50+ / 500+ Fortune / Organizations
• 5,000+ large installations in production (OEM)
• 25+ ISVs
• InsightEdge is an in-memory real-time analytics platform for instant insights to action; analyzing data as it's born, enriching it with historical context, for smarter, faster decisions
• In-Memory Computing Platform for microsecond-scale transactional processing, data scalability, and powerful event-driven workflows

Why * Intro pictures from Wikipedia

Dinosaurs

We’ve looked up to the stars

Not without first passing through the clouds

It’s the smallest of opponents that are gamechangers

We needed to find a way to ship man there… The first flight of an airplane, the Wright Flyer on December 17, 1903

How do we become cloud native?
• Manage Large Deployments
  • Cloud-ready, ZooKeeper based for large-scale and federated deployments
• REST API Management
  • Standards-based, utilizing…
• Containerization and Orchestration
  • Docker, Kubernetes, OpenShift etc.
• Application-driven Deployment
  • Serverless-like user experience
• Pluggable Elastic Resource Balancing
  • Scheduling for dynamic re-partitioning and resource allocation
• Telemetry and Cluster Intelligence
  • Predictive maintenance / fault-tolerance over large-scale deployments

Who’s using K8s?

OVERVIEW
• An overview of Kubernetes and the value it brings for automating deployment, scaling, and management of containerized applications
• How organizations can simplify management and container deployment on Cloud, Hybrid or On-premises environments with GigaSpaces InsightEdge
• 3 top open-source tools for production: Helm, Istio, and Prometheus
• A Kubernetes services comparison between cloud providers: AWS vs. Azure vs. GCP

How Can You Gain the Most Value from Your Data?
• Near real-time data is highly valuable if you act on it on time
• Historical + near real-time data is more valuable if you have the means to combine them
[Chart: value of data (y-axis) against time (x-axis: real-time, seconds, minutes, hours, days, months), ranging from time-critical decisions to traditional “batch” business intelligence; regions labelled Preventive/Predictive, Actionable, Reactive and Historical]

InsightEdge: Real-time Analytics for Instant Insights To Action
• No ETL, reduced complexity
• Built-in integration with external Hadoop/Data Lakes, S3-like
• Fast access to historical data
• Automated life-cycle management
[Diagram: various application data sources feed unified real-time analytics, AI & transactional processing, which drives real-time insight to action (dashboards). Underneath sits a distributed in-memory multi-model store: a real-time layer with RAM (hot data), storage-class memory and SSD storage (warm data), and a batch layer (cold data). Deploy anywhere: cloud/on-premise.]

Kubernetes
• At least 54% of the Fortune 500 were hiring for Kubernetes skills in 2017
• Around 51% growth for Kubernetes share in the market in 2018

Kubernetes is the Winner
• #1 discussed project on GitHub
• Top 2 in number of contributors
• ~400K users on Slack

Business Landscape
• The leading orchestration tool vs. Docker Swarm, Mesos, OpenShift and Cloud Foundry, and the most used CNCF project
• All cloud vendors have a managed Kubernetes service (EKS, AKS and GKE)
• Apache Spark 2.3 has native Kubernetes support

Why Kubernetes?
Key building blocks for a “cloud like” platform as a service
• Auto deployment of data services, functions and frameworks (Spark ML, SQL, Zeppelin, etc.)
• Orchestration automation with cloud native solutions (auto scale, self healing)
[Diagram labels: Desired State, Scheduler, HA Architecture, Cooperative Multi-Tenancy, Service Account Authentication, RBAC Authorization]

Kubernetes – Management POD
• Lookup Service (LUS) - The Lookup Service provides a mechanism for services to discover each other. For example, querying the LUS to find active GSCs.
• Apache ZooKeeper - ZooKeeper is a centralized service used for space leader election
• REST Manager - RESTful API for managing the environment remotely from any platform
[Diagram: a node running a Management POD that contains the Lookup Service, Apache ZooKeeper, the REST Manager and a GSA]

Kubernetes – Data POD
• Data Grid Instance - This is the fundamental unit of deployment in the data grid. A Processing Unit instance is the actual runtime entity.
• Each Data POD contains a single instance to provide cloud native support using Kubernetes built-in controllers (auto scale, self healing)
[Diagram: a node running Data PODs #1 to #N, each containing one Data Grid Instance and a GSA]

Kubernetes – Spark POD
• Driver Pod – The Spark driver runs within a POD. The driver creates executors, connects to them, and executes the application code.
• Executor Pod – When the application completes, the executor pods terminate and are cleaned up, but the driver pod persists logs and remains in “completed” state
[Diagram: a client runs spark-submit against a Driver POD (Spark driver and GSA), which launches four Executor PODs, each with a Spark executor, across Node A and Node B]
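The driver/executor flow above starts with a spark-submit call in cluster mode. The following is a minimal sketch that assembles such a call using stock Apache Spark options (`--master k8s://...`, `--deploy-mode cluster`, `spark.executor.instances`, `spark.kubernetes.container.image`). The API server URL, image name, main class and jar path are placeholders invented for illustration, and the GigaSpaces tooling may wrap these options differently.

```python
import shlex


def spark_submit_args(api_server, image, main_class, app_jar,
                      executors=4, name="demo-job"):
    """Assemble a spark-submit argument list for Kubernetes cluster mode."""
    return [
        "spark-submit",
        "--master", f"k8s://{api_server}",   # Kubernetes API server as the master
        "--deploy-mode", "cluster",          # the driver itself runs in a pod
        "--name", name,
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


# All values below are hypothetical placeholders.
args = spark_submit_args(
    api_server="https://kubernetes.example.com:6443",
    image="example/spark:2.3.0",
    main_class="com.example.Job",
    app_jar="local:///opt/app/job.jar",
)
print(" ".join(shlex.quote(a) for a in args))
```

Building the call as a list rather than a string keeps each option a separate argument, so it can be passed to `subprocess.run` without shell quoting issues.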

XAP High Level Overview
[Diagram, labelled 3,1: clients issue REST and SELECT requests to three Management PODs (#1 to #3) and six Data PODs (primaries A, B, C and backups A’, B’, C’) spread across Node 1, Node 2 and Node 3]

InsightEdge High Level Overview
[Diagram, labelled 3,1: clients issue spark-submit and SELECT requests to a cluster of three Management PODs (#1 to #3), a Zeppelin POD, a Spark Driver POD, three Spark Executor PODs and six Data PODs (primaries A, B, C and backups A’, B’, C’) spread across Node 1, Node 2 and Node 3]

Kubernetes Dashboard View

“Under the Hood” Guidelines
• Apply a POD Anti-Affinity using label selectors for both Data and Management PODs
  • For example: spread the primary and backup data pods from this service across zones
• Each POD has a persistent identifier that is maintained across any rescheduling using StatefulSets
  • For example: automated rolling updates/scale up data pod one-by-one
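The two guidelines above can be sketched as a single StatefulSet manifest, built here as a plain Python dict and printed as JSON (which is also valid YAML). This is an illustrative sketch, not the manifest the GigaSpaces charts generate: the label names, the container image and the `demo-space` name are assumptions. Only the Kubernetes field names and the `topology.kubernetes.io/zone` label are standard.

```python
import json


def data_pod_statefulset(name, replicas, zone_key="topology.kubernetes.io/zone"):
    """Build a StatefulSet manifest whose pods must land in different zones."""
    labels = {"app": name, "role": "data-pod"}  # hypothetical label names
    anti_affinity = {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                # Never co-locate two pods carrying these labels in one zone.
                "labelSelector": {"matchLabels": labels},
                "topologyKey": zone_key,
            }
        ]
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Pods are replaced one at a time, highest ordinal first.
            "updateStrategy": {"type": "RollingUpdate"},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": {"podAntiAffinity": anti_affinity},
                    "containers": [
                        {"name": "data", "image": "example/data-pod:latest"}  # placeholder image
                    ],
                },
            },
        },
    }


def pod_names(manifest):
    """StatefulSet pods get stable names of the form <name>-<ordinal>."""
    base = manifest["metadata"]["name"]
    return [f"{base}-{i}" for i in range(manifest["spec"]["replicas"])]


# One partition as a primary plus a backup (two replicas), an assumed layout.
manifest = data_pod_statefulset("demo-space", replicas=2)
print(json.dumps(manifest, indent=2))
print(pod_names(manifest))
```

With a required (hard) anti-affinity rule, a pod stays pending when no eligible zone is left, so the replica count should not exceed the number of zones; `preferredDuringSchedulingIgnoredDuringExecution` is the softer alternative.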

Installation
• Helm – The package manager for Kubernetes
• Helm charts help you define, install and upgrade both XAP and InsightEdge
# helm install gigaspaces/insightedge --version=14.0 --name demo

Installation – Define Capacity
• The following Helm command deploys a cluster with 3 partitions, with 512MiB allocated for each partition:
# helm install gigaspaces/insightedge --version=14.0 --name demo --set pu.partitions=3,pu.resources.limits.memory=512Mi

Installation – Define High Availability
• The following Helm command deploys a cluster in a high availability topology, with anti-affinity enabled:
# helm install gigaspaces/insightedge --version=14.0 --name demo --set pu.ha=true,pu.antiAffinity.enabled=true

Testing for Liveness
• Use liveness probes to notify Kubernetes that your application’s processes are unhealthy and that it should restart them
• The probe calls a bash script
    livenessProbe:
      exec:
        command:
        - sh
        - -c
        - "data-pod-liveness 3181"
      initialDelaySeconds: 15
      timeoutSeconds: 5

Testing for Readiness
• Use readiness probes to notify Kubernetes that your application’s processes are able to process input. For example, while data is loading, the pod is not yet ready.
• The probe calls a bash script
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - "data-pod-ready 2251"
      initialDelaySeconds: 15
      timeoutSeconds: 5
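Liveness and readiness probes share the same shape and differ only in what Kubernetes does on failure: a failed liveness probe restarts the container, while a failed readiness probe only takes the pod out of service endpoints. The sketch below generates both probe stanzas from one helper. The script names and arguments come from the slides; the container name and the helper itself are assumptions for illustration.

```python
import json


def exec_probe(script, argument, initial_delay=15, timeout=5):
    """Build an exec probe that runs `<script> <argument>` through sh -c."""
    return {
        "exec": {"command": ["sh", "-c", f"{script} {argument}"]},
        "initialDelaySeconds": initial_delay,  # wait before the first check
        "timeoutSeconds": timeout,             # a slower check counts as failed
    }


container = {
    "name": "data-pod",  # hypothetical container name
    "livenessProbe": exec_probe("data-pod-liveness", 3181),
    "readinessProbe": exec_probe("data-pod-ready", 2251),
}

# JSON is valid YAML, so this fragment can be pasted into a pod spec.
print(json.dumps(container, indent=2))
```

An exec probe succeeds when the command exits with status 0, so the two scripts only need to signal health through their exit codes.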

WAN Gateway – Real-time IMDG Data Replication
[Diagram: WAN Gateways replicating data between IMDG deployments; labels: Any Cloud, Lang, API]

WAN Gateway POD
[Diagram: Cluster A and Cluster B, each with a Web UI, three Management PODs and eight Data PODs (primaries A, B, C, D and backups A’, B’, C’, D’) spread across Node 1, Node 2 and Node 3. Each cluster runs a WAN Gateway POD containing a delegator and a sink, and the gateways connect over a public IP.]
