CO-ORDINATION WITH ZOOKEEPER
PRESENTED BY: 1. PRATAP CHANDRA DAS 2. SHORAJ TOMER 3. SOUGATA BHATTACHARYA
CONTENT
Distributed Computing: A Brief Introduction
Problems of Manageability in Distributed Computing
Solution: Apache ZooKeeper
DISTRIBUTED COMPUTING: INTRODUCTION
A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages. The word 'distributed' means that data is spread out over more than one computer in a network.
SINGLE MACHINE
Simple architecture.
Takes hours to complete a huge, complex task.
All processing stops if the system crashes.

DISTRIBUTED APPLICATION
Complex architecture.
Huge tasks can be completed within minutes.
If one system crashes, the other systems keep running and can take over its work.
[Diagram: client software and server software spread across the nodes of a cluster.]

Cluster: a group of systems.
Node: each system in a cluster.
ADVANTAGES OF DISTRIBUTED COMPUTING
Scalability: the system can easily be expanded by adding more machines as needed.
Redundancy: several machines can provide the same services, so if one is unavailable, work does not stop.

DISADVANTAGES OF DISTRIBUTED COMPUTING
Complexity: more complex than centralized systems; development, maintenance, and the coordination of autonomous actions all take more effort.
Network reliance: messages can be lost in the communication network.
Security: more susceptible to external attacks.
Multiple points of failure: much more prone to error due to the huge number of machines.
Manageability: more effort is required for system management.
PROBLEMS OF COORDINATION
Race condition: two or more operations are performed at the same time.
Deadlock: two or more machines try to access the same shared resources at the same time.
Partial failure of a process: leads to inconsistency of data.

HOW ZOOKEEPER ADDRESSES THEM
Race condition: handled by ZooKeeper's serialization property.
Deadlock: handled by ZooKeeper's synchronization property.
Partial failure of a process: handled through atomicity.
APACHE ZOOKEEPER
Apache ZooKeeper is an open-source coordination service for distributed applications. It is also called the 'King of Coordination' for distributed applications.
FORMAL DEFINITION: ZooKeeper is a distributed, open-source configuration and synchronization service, along with a naming registry, for distributed applications. It is used to manage and coordinate large clusters of machines.
Applications store small amounts of coordination data in ZooKeeper and read that data out of it to achieve their synchronization, serialization, and coordination goals.
In the good old past, each application was a single program running on a single computer with a single CPU. Today, things have changed: in the Big Data world, applications are made up of many independent programs running on an ever-changing set of computers. Such applications are known as distributed applications. A distributed application can run on multiple systems in a network simultaneously, with the systems coordinating among themselves to complete a particular task in a fast and efficient manner.
Coordinating the actions of the independent programs in a distributed system is far more difficult than writing a single program to run on a single computer. It is easy for developers to get mired in coordination logic and lack the time to write their application logic properly, or, conversely, to spend little time on the coordination logic and simply write a quick-and-dirty master coordinator that is fragile and becomes an unreliable single point of failure. ZooKeeper is an important part of Hadoop that takes care of these small but important issues so that developers can focus on the functionality of the application.
WHAT ZOOKEEPER PROVIDES
NAME SERVICE: ZooKeeper exposes a simple interface for a naming service, which identifies the nodes in a cluster by name, similar to DNS.
LOCKING: ZooKeeper provides an easy way for you to implement distributed mutexes that allow serialized access to a shared resource in your distributed system (see the sketch after this list).
CONFIGURATION MANAGEMENT: You can use ZooKeeper to centrally store and manage the configuration of your distributed system. Any new nodes joining will pick up the up-to-date centralized configuration from ZooKeeper as soon as they join the system. This also allows you to centrally change the state of your distributed system by changing the centralized configuration through one of the ZooKeeper clients.
LEADER ELECTION: ZooKeeper provides off-the-shelf support for leader election, which deals with the problem of nodes going down.
SYNCHRONIZATION: Hand in hand with distributed mutexes is the need for synchronizing access to shared resources. Whether implementing a producer-consumer queue or a barrier, ZooKeeper provides a simple interface for that.
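As a concrete illustration of the LOCKING recipe above, here is a minimal sketch against the ZooKeeper Java client API. It is our own example, not code from the slides: it assumes an already-open ZooKeeper session and a pre-created persistent /lock parent znode, and the guid- child prefix is arbitrary. Each contender creates an ephemeral sequential znode; the contender with the lowest sequence number owns the lock, and every other contender watches the znode immediately ahead of its own.

    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class LockSketch {
        // Returns the path of our lock znode once the lock is held.
        static String acquire(ZooKeeper zk) throws Exception {
            // Ephemeral: the lock is released automatically if we crash.
            // Sequential: the server appends a unique, increasing counter.
            String me = zk.create("/lock/guid-", new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
            while (true) {
                List<String> children = zk.getChildren("/lock", false);
                Collections.sort(children);
                if (me.equals("/lock/" + children.get(0))) {
                    return me;                  // lowest number: we hold the lock
                }
                // Wait for the contender just ahead of us to go away.
                int idx = children.indexOf(me.substring("/lock/".length()));
                String predecessor = "/lock/" + children.get(idx - 1);
                final CountDownLatch gone = new CountDownLatch(1);
                Watcher watcher = new Watcher() {
                    public void process(WatchedEvent e) { gone.countDown(); }
                };
                if (zk.exists(predecessor, watcher) != null) {
                    gone.await();               // woken when the predecessor is deleted
                }
            }
        }

        static void release(ZooKeeper zk, String me) throws Exception {
            zk.delete(me, -1);                  // version -1 matches any version
        }
    }

Watching only the immediate predecessor, rather than the whole /lock directory, avoids a herd effect in which every waiting client wakes up on every release.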
Before ZooKeeper, developers implemented their own distributed lock managers or used distributed databases for coordination. Writing such coordination logic from scratch is extra work, and it is difficult to debug the resulting problems, race conditions, or deadlocks. Developers should not have to build naming services or leader election services from scratch every time they need them. Moreover, you could hack together a very simple group membership service relatively easily, but it would require much more work to write it to provide reliability, replication, and scalability. This led to the development and open-sourcing of Apache ZooKeeper, an out-of-the-box reliable, scalable, and high-performance coordination service for distributed systems.
ZooKeeper does not expose a lock interface or a general-purpose interface for storing data, however. Its design is specialized and very focused on coordination tasks, so that developers can spend their time on application logic rather than on arcane distributed systems concepts.
The Origin of the Name “Zoo-Keeper”
Zoo-Keeper was developed at Yahoo! Research. Yahoo had been working on Zoo- Keeper for a while and pitching it to other groups. At the time the Zoo-Keeper group had been working with the Hadoop team and had started a variety of projects with the names of animals, Apache Pig being the most well known. As the group started talking about different possible names, one of the group members mentioned that they should avoid another animal name because it started to sound like a zoo. That is when it clicked: distributed systems are a zoo. They are chaotic and hard to manage, and Zoo-Keeper is meant to keep them under control.
It is designed to be easy to program to, and uses a data model styled after the familiar directory tree structure of file systems. It runs in Java and has bindings for both Java and C.
Zoo-Keeper, while being a coordination service for distributed systems, is a distributed application on its own.
Clients are the nodes (machines) that make use of the service, and servers are the nodes that provide the service. The ZooKeeper client library is responsible for the interaction with ZooKeeper servers: each client imports the client library and can then communicate with any ZooKeeper node.
STANDALONE AND QUORUM MODES
Standalone mode is pretty much what the term says: there is a single server, and ZooKeeper state is not replicated. In quorum mode, a group of ZooKeeper servers, which we call a ZooKeeper ensemble, replicates the state, and together they serve client requests.
A client connects to exactly one ZooKeeper server, while each ZooKeeper server can handle a large number of client connections at the same time. Each client periodically sends pings to the ZooKeeper server it is connected to, to let it know that it is alive and connected. The ZooKeeper server responds with an acknowledgment of the ping, indicating that the server is alive as well. If the client does not receive an acknowledgment within the specified time, the client connects to another server in the ensemble, and the client session is transparently transferred over to the new ZooKeeper server.
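A minimal sketch of opening such a session with the Java client library (our example; the host names and the 15-second timeout are illustrative values, not anything the slides prescribe). The watcher passed to the constructor is used here only to learn when the session reaches the CONNECTED state:

    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class ConnectDemo {
        public static void main(String[] args) throws Exception {
            final CountDownLatch connected = new CountDownLatch(1);
            // The connect string lists servers of the ensemble; the client
            // picks one and fails over transparently if it becomes unreachable.
            ZooKeeper zk = new ZooKeeper("host1:2181,host2:2181,host3:2181", 15000,
                    new Watcher() {
                        public void process(WatchedEvent event) {
                            if (event.getState() == Event.KeeperState.SyncConnected) {
                                connected.countDown();   // session established
                            }
                        }
                    });
            connected.await();
            System.out.println("session id: 0x" + Long.toHexString(zk.getSessionId()));
            zk.close();   // ends the session; its ephemeral znodes disappear
        }
    }

Because failover between servers is automatic, applications normally keep one long-lived ZooKeeper handle rather than reconnecting for every request.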
A coordination service could expose calls to create instances of each primitive (locks, queues, and so on) and manipulate those instances directly. ZooKeeper, however, does not expose primitives directly. Instead, it exposes a tree of data nodes that are like files in a traditional UNIX-like system, except that they can have child nodes. Another way to look at them is as directories that can have data associated with themselves. Each of these nodes is called a znode.
Every node in ZooKeeper's namespace is identified by a path. Unlike in standard file systems, each node in a ZooKeeper namespace can have data associated with it as well as children. It is like having a file system that allows a file to also be a directory. ZooKeeper was designed to store coordination data: status information, configuration, location information, etc., so the data stored at each node is usually small, in the byte-to-kilobyte range. Reads and writes are atomic: a read returns all the data associated with a znode, and a write replaces all of it. Each node has an Access Control List (ACL) that restricts who can do what. Each znode also maintains a stat structure with version numbers for data changes and ACL changes, plus timestamps, to allow cache validations and coordinated updates. Each time a znode's data changes, the version number increases. For instance, whenever a client retrieves data, it also receives the version of the data.
ZooKeeper keeps its entire data tree in memory, which allows for scalable and quick responses to reads from the clients. Durability comes from a transaction log on disk to which every server appends write requests. This transaction log is also the most performance-critical part of ZooKeeper, because a ZooKeeper server must sync transactions to disk before it returns a successful response. Because the data tree lives in memory, ZooKeeper shouldn't be used as a general-purpose file system. Instead, it should only be used as a storage mechanism for the small amount of data required for providing reliability, availability, and coordination to your distributed application.
The absence of data often conveys important information about a znode. In a master-worker configuration, for example, a /workers znode has one child znode for each worker available in the system. If a worker becomes unavailable, its znode should be removed from /workers. Under /tasks, clients create child znodes to represent new tasks and wait for znodes representing the status of the task. Under /assign, there is one child znode per assignment of a task to a worker. When a master assigns a task to a worker, it adds a child znode to /assign.
When creating a new znode, you also need to specify a mode. The different modes determine how the znode behaves.
PERSISTENT AND EPHEMERAL ZNODES
We have the following four modes of znodes (see the sketch after this list):
Persistent: deleted only through an explicit delete call. A persistent znode is useful when it stores data on behalf of an application and this data needs to be preserved even after its creator is no longer part of the system.
Ephemeral: deleted automatically when the client that created it crashes or simply closes its connection to ZooKeeper. Ephemeral znodes contain information about some aspect of the application that must exist only while the session of its creator is valid.
Persistent sequential: a unique, monotonically increasing sequence number is appended to the path used to create the znode, providing an easy way to create znodes with unique names.
Ephemeral sequential: combines both behaviors; the znode gets a unique sequence-numbered name and disappears when its creator's session ends.
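A minimal sketch of the four modes with the Java API (our example: the paths /app-config, /workers, /tasks, and /lock are illustrative, and their parent znodes are assumed to exist; OPEN_ACL_UNSAFE simply means no access restrictions):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ModeDemo {
        static void demo(ZooKeeper zk) throws Exception {
            byte[] data = "v1".getBytes();

            // Persistent: survives the creator's session; removed only by delete().
            zk.create("/app-config", data,
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

            // Ephemeral: removed automatically when this session ends.
            zk.create("/workers/worker-1", data,
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

            // Persistent sequential: the server appends a counter, e.g.
            // /tasks/task-0000000007, and returns the actual path created.
            String task = zk.create("/tasks/task-", data,
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
            System.out.println("created " + task);

            // Ephemeral sequential: unique name AND tied to the session;
            // this combination underlies the lock sketch shown earlier.
            zk.create("/lock/guid-", data,
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        }
    }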
VERSIONS, WATCHES, AND NOTIFICATIONS
Each znode carries a version number that increases every time its data changes. Update operations such as setData and delete take a version as a parameter, and the operation succeeds only if the version passed by the client matches the current version on the server. Versions prevent conflicting updates when multiple clients perform operations over the same znode.

Clients could poll ZooKeeper to learn of changes, but polling can be very expensive. Instead, a client registers a watch and receives a one-time notification upon a change in the ZooKeeper ensemble. Watches are one-shot operations: to keep receiving notifications, the client must set a new watch after receiving each notification. And if a client had to poll and wait before continuing, the delays might be unacceptable. A sketch of both mechanisms follows.
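A minimal sketch combining both mechanisms in Java (our example; the /app-config path is illustrative). The one-shot watch is registered by the same getData call that reads the data, and the update is conditional on the version just read:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class WatchDemo {
        static void readAndUpdate(ZooKeeper zk) throws Exception {
            Stat stat = new Stat();
            // Read the data and register a one-shot watch in one call.
            byte[] data = zk.getData("/app-config", new Watcher() {
                public void process(WatchedEvent event) {
                    // One-shot: to keep being notified, we would have to
                    // re-register by calling getData()/exists() again here.
                    System.out.println("znode changed: " + event.getPath());
                }
            }, stat);
            System.out.println("read " + data.length + " bytes, version " + stat.getVersion());

            try {
                // Conditional update: succeeds only if nobody has changed
                // the znode since we read version stat.getVersion().
                zk.setData("/app-config", "v2".getBytes(), stat.getVersion());
            } catch (KeeperException.BadVersionException e) {
                System.out.println("lost the race: someone updated first");
            }
        }
    }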
THE ZOOKEEPER ENSEMBLE
A majority of the servers must be running and available in order for ZooKeeper to work, and a quorum of servers must have stored a write before ZooKeeper tells the client the data is safely stored. To achieve high availability of the service, we deploy ZooKeeper in a cluster known as an ensemble; as long as a majority of the ensemble is up, the service will be available. It is recommended to run an odd number of servers in the ensemble: adding a server to make the count even does not raise the number of tolerable failures, but it does add one more machine that can fail, which actually makes the system more fragile. For example, five servers tolerate two failures (majority of three), while six servers still tolerate only two failures (majority of four). So it is better to have an odd number of servers.
SESSIONS
Before executing any request against a ZooKeeper ensemble, a client must establish a session with the service, and every operation the client submits is associated with that session. When a session ends for any reason, the ephemeral znodes created during that session disappear. The client initially connects to any server in the ensemble, and only to that single server. The session may be moved to a different server if the client has not heard from its current server for some time; this move is handled transparently by the ZooKeeper client library.
Sessions offer order guarantees: requests in a session are executed in FIFO (first in, first out) order. Once a client connects to a server, the session is established and a session id is assigned to the client. The client sends heartbeats to keep the session valid. If the ZooKeeper ensemble does not receive heartbeats from a client for more than the period agreed upon at the creation of the session, ZooKeeper decides that the client is dead. When this happens, the session is ended, followed by the deletion of all ephemeral znodes created during that session.
The lifetime of a session is the period between its creation and its end, whether it is closed gracefully or expires because of a timeout. The possible states of a session are: CONNECTING, CONNECTED, CLOSED, and NOT_CONNECTED.
A session starts in the NOT_CONNECTED state and transitions to CONNECTING with the initialization of the ZooKeeper client. When the connection to a ZooKeeper server succeeds, the session transitions to CONNECTED. When the client loses its connection to the ZooKeeper server or doesn't hear from the server, it transitions back to CONNECTING and tries to find another ZooKeeper server. If it is able to find another server or to reconnect to the original server, it transitions back to CONNECTED once the server confirms that the session is still valid. Otherwise, it declares the session expired and transitions to CLOSED.
When creating a session, the client sets a session timeout t, which is the amount of time the ZooKeeper service allows a session before declaring it expired. If the service does not see messages associated with a given session during a period of t, it declares the session expired. On the client side, if the client has heard nothing from the server for 1/3 of t, it sends a heartbeat message to the server. At 2/3 of t, the ZooKeeper client starts looking for a different server, and it has the remaining 1/3 of t to find one. For example, with t = 6 seconds, the client heartbeats after 2 silent seconds, starts searching for a new server after 4, and has 2 seconds left to find one.
When reconnecting, the client can only move to a server that has caught up: it requires the ZooKeeper state of this server to be at least as fresh as the last ZooKeeper state the client has observed.
READS, WRITES, AND TRANSACTIONS
ZooKeeper servers process read requests (exists, getData, and getChildren) locally. When a server receives, say, a getData request from a client, it reads its state and returns it to the client. Because it serves read requests locally, ZooKeeper is pretty fast at serving read-dominated workloads.

Requests that change the state of ZooKeeper (create, delete, setData) are forwarded to the leader, which transforms each one into a transaction. A transaction describes precisely the modifications to apply to the ZooKeeper state to reflect the execution of the request. Transactions are applied atomically: each transaction is applied in full, and there is no interference from other transactions. Transactions are also idempotent: we can apply the same transaction twice and we will get the same result.
Each transaction is assigned an identifier that we call a ZooKeeper transaction ID (zxid). Zxids identify transactions so that they are applied to the state of servers in the order established by the leader. Servers also exchange zxids when electing a new leader, so they can determine which nonfaulty server has received more transactions and can synchronize their states. A zxid is a long (64-bit) integer split into two parts: the epoch and the counter. Each part has 32 bits, as the sketch below illustrates.
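A small illustration of that layout (the zxid value here is made up): the epoch occupies the high 32 bits and the counter the low 32 bits.

    public class ZxidDemo {
        public static void main(String[] args) {
            long zxid = 0x500000003L;            // epoch 5, counter 3 (illustrative)
            long epoch = zxid >>> 32;            // high 32 bits
            long counter = zxid & 0xFFFFFFFFL;   // low 32 bits
            System.out.printf("zxid=0x%x epoch=%d counter=%d%n", zxid, epoch, counter);
        }
    }

A new leader starts a new epoch and resets the counter, so transactions from a later leader always compare greater than those from an earlier one.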
Upon receiving a write request, a follower forwards it to the leader. The leader executes the request and broadcasts the result of the execution as a state update, in the form of a transaction. The broadcast uses a protocol called Zab, the ZooKeeper Atomic Broadcast protocol. Assuming that there is an active leader and it has a quorum of followers supporting its leadership, the protocol to commit a transaction is very simple, resembling a two-phase commit:
1. The leader sends a PROPOSAL message containing the transaction to all followers.
2. Upon receiving the proposal, a follower responds to the leader with an ACK, informing the leader that it has accepted the proposal.
3. Upon receiving acknowledgments from a quorum (the quorum includes the leader itself), the leader sends a message informing the followers to COMMIT it.
Before acknowledging a proposal, the follower needs to perform a couple of additional checks. The follower needs to check that the proposal is from the leader it is currently following, and that it is acknowledging proposals and committing transactions in the same order that the leader broadcasts them in.
The first property guarantees that transactions are delivered in the same order across servers, whereas the second property guarantees that servers do not skip transactions. Given that the transactions are state updates and each state update depends upon the previous state update, skipping transactions could create inconsistencies. The two phase commit guarantees the ordering of transactions.
ZAB GUARANTEES A COUPLE OF IMPORTANT PROPERTIES:
If the leader broadcasts T and Tʹ in that order, each server must commit T before committing Tʹ.
If any server commits transactions T and Tʹ in that order, all other servers must also commit T before Tʹ.
FUZZY SNAPSHOTS
Each server takes periodic snapshots of its state by serializing the whole data tree and writing it to a file. The servers do not need to coordinate to take snapshots, nor do they have to stop processing requests. Because servers keep executing requests while taking a snapshot, the data tree changes as the snapshot is taken; such snapshots are called fuzzy, because they do not necessarily reflect the exact state of the data tree at any particular point in time.

For example, say that a data tree has only two znodes, /z and /z', and the value of both /z and /z' is the integer 1. Now consider the following sequence of steps:
1. Start a snapshot.
2. Serialize and write /z = 1 to the snapshot.
3. Set the data of /z to 2 (transaction T).
4. Set the data of /z' to 2 (transaction Tʹ).
5. Serialize and write /z' = 2 to the snapshot.

This snapshot contains /z = 1 and /z' = 2. However, there has never been a point in time at which the values of both znodes were like that. This is not a problem, because each snapshot is tagged with the last transaction that had been committed when the snapshot started; call it TS. If the server eventually loads the snapshot, it replays all transactions in the transaction log that come after TS. In this case, they are T and Tʹ. After replaying T and Tʹ on top of the snapshot, the server obtains /z = 2 and /z' = 2, which is a valid state. There is no problem with applying Tʹ again because transactions are idempotent; as long as we apply the same transactions in the same order, we will get the same result even if some of them have already been applied to the snapshot.
LEADERS, FOLLOWERS, AND ELECTIONS
A leader remains the leader only as long as it continues to have the support of a quorum of the ensemble. The leader is the server that orders updates to the ZooKeeper state: create, setData, and delete. The leader transforms each request into a transaction and proposes to the followers that the ensemble accept and apply the transactions in the order issued by the leader. A process in the LOOKING state tries to elect a new leader or become a leader. If the process finds an elected leader, it moves to the FOLLOWING state and begins to follow the leader. Processes in the FOLLOWING state are followers.
Each server starts in the LOOKING state, where it must either elect a new leader or find the existing one. If a leader already exists, other servers inform the new one which server is the leader. At this point, the new server connects to the leader and makes sure that its own state is consistent with the state of the leader. If an ensemble of servers, however, are all in the LOOKING state, they must communicate to elect a leader. They exchange messages to converge on a common choice for the leader. The server that wins this election enters the LEADING state, while the other servers in the ensemble enter the FOLLOWING state. The leader election messages are called leader election notifications, or simply notifications.
When a server enters the LOOKING state, it sends a batch of notification messages, one to each of the other servers in the ensemble. The message contains its current vote, which consists of the server's identifier (sid) and the zxid of the most recent transaction it executed. Upon receiving a vote, a server changes its vote according to the following rules. Let voteZxid and voteSid be the values in the received vote, whereas myZxid and mySid are the corresponding values of the receiver itself:
1. If voteZxid > myZxid, or voteZxid = myZxid and voteSid > mySid, the receiver adopts the received vote.
2. Otherwise, the receiver keeps its current vote.

In short, the servers elect the one that has seen the most recent zxid, which simplifies the process of restarting a quorum when a leader dies. If multiple servers have the most recent zxid, the one with the highest sid wins. Once a server receives the same vote from a quorum of servers, the server declares the leader elected. If the elected leader is the server itself, it starts executing the leader role; otherwise, it becomes a follower and tries to connect to the elected leader. (The comparison rule is spelled out in the sketch below.)
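The adoption rule reduces to a one-line predicate; here it is as a sketch (our code, not ZooKeeper's actual implementation):

    public class VoteRule {
        // Should the receiver adopt the received vote (voteZxid, voteSid)
        // over its current vote (myZxid, mySid)?
        static boolean adoptReceivedVote(long voteZxid, long voteSid,
                                         long myZxid, long mySid) {
            // Prefer the vote that has seen more transactions; break
            // ties with the larger server identifier.
            return voteZxid > myZxid || (voteZxid == myZxid && voteSid > mySid);
        }

        public static void main(String[] args) {
            // Receiver currently votes (zxid=10, sid=1) and receives a vote
            // (zxid=10, sid=3): equal zxid, higher sid, so it adopts it.
            System.out.println(adoptReceivedVote(10, 3, 10, 1));   // prints true
        }
    }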
Once connected, the leader and its followers exchange periodic heartbeats; messages sent over a broken connection are discarded, and the leader considers the corresponding follower down. If the leader does not receive heartbeats from a quorum of followers within the timeout interval, it renounces leadership of the epoch, transitions to the ELECTION state, and the whole process starts all over again. A newly elected leader proposes a new epoch before it starts sending proposals to followers, and a follower accepts the new leader only if the new epoch proposed is later than its own. Conversely, a follower keeps following its leader as long as it receives heartbeats within the timeout interval; otherwise it abandons the leader, transitions to ELECTION, and proceeds to the election phase to start all over again.
LIVENESS: To sustain leadership, a leader process needs to be able to send messages to and receive messages from its followers. In fact, the leader requires that a quorum of followers be up and recognize it as their leader in order to maintain its leadership.
When a follower connects to a new leader, it synchronizes its state with the leader's: it truncates from its transaction log the transactions that the leader says to delete, applies the transactions the leader sends that it is missing, and then sends an ack to the leader. Only once a quorum of followers has synchronized does the leader begin broadcasting new transactions.
WHO USES ZOOKEEPER?
Hadoop MapReduce: the next generation of Hadoop MapReduce (called YARN) uses ZooKeeper.
HBase: HBase is the Hadoop database, an open-source, distributed, column-oriented store modeled after the Google Bigtable paper. HBase uses ZooKeeper for master election, server lease management, bootstrapping, and coordination between servers.
Kafka: Kafka is a distributed publish/subscribe messaging system. Kafka queue consumers use ZooKeeper to store information on what has been consumed from the queue.
Storm: Storm uses ZooKeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
INSTALLING AND RUNNING ZOOKEEPER
Download and install the JDK, if not already installed. This is required because ZooKeeper runs on the JVM.
Download the ZooKeeper 3.4.5 tar.gz tarball from the path below and un-tar it to an appropriate location.
Path: https://archive.apache.org/dist/zookeeper/zookeeper-3.4.5/
Path we used to un-tar the tar file: D:\ZookeeperTutorial\zookeeper-3.4.5
Create a directory for storing some state associated with the Zookeeper server.
Setting up the configuration file
Create a zoo.cfg file inside the conf folder with contents along the lines below (ZooKeeper reads conf/zoo.cfg by default).
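The slide's screenshot of the file is not reproduced in this text; a minimal standalone-mode zoo.cfg for 3.4.5 typically looks like the following (the values are illustrative; dataDir should point at the state directory created above, and forward slashes avoid backslash-escaping issues in the properties file):

    # the basic time unit, in milliseconds
    tickTime=2000
    # directory where snapshots and the myid file are stored
    dataDir=D:/ZookeeperTutorial/zookeeper-3.4.5/data
    # the port on which clients connect
    clientPort=2181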
Add the following environment variables (the name ZOOKEEPER_HOME is the conventional choice; only the two paths appear on the slide):
ZOOKEEPER_HOME = D:\ZookeeperTutorial\zookeeper-3.4.5
PATH: append D:\ZookeeperTutorial\zookeeper-3.4.5\bin
Create a myid file in D:\ZookeeperTutorial\zookeeper-3.4.5\data. The file holds nothing but this server's numeric id (for example, 1).
Start the ZooKeeper server, then start a CLI client from one of the machines on which you are running the ZooKeeper server.
Create a znode at /mynode. Verify and retrieve the data at /mynode.
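The command screenshots are not carried over into this text; with the zkCli shell bundled with 3.4.5, the steps above look roughly as follows (the znode name is the tutorial's, the data value is ours):

    bin\zkServer.cmd                        (start the server; zkServer.sh start on Linux)
    bin\zkCli.cmd -server 127.0.0.1:2181    (start the CLI client)

Then, at the CLI prompt:

    create /mynode "mydata"                 (create the znode with some data)
    get /mynode                             (print the data plus the znode's stat structure)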
Remove the znode, then create another znode and verify and retrieve the data at /mysecondnode. This time an optional third parameter (1) is supplied to the get command, which registers a watch on the znode. Change the value of the data associated with /mysecondnode and observe the watch notification arrive. Finally, create a subnode. The commands below sketch this sequence.
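Again reconstructed from the described steps rather than copied from the slides (the data values are ours):

    delete /mynode                          (remove the znode)
    create /mysecondnode "data2"            (create another znode)
    get /mysecondnode 1                     (read it; the extra argument registers a one-shot watch)
    set /mysecondnode "data3"               (update it; the watch fires with a NodeDataChanged event)
    create /mysecondnode/sub "subdata"      (create a subnode)
    get /mysecondnode/sub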