Y - PowerPoint PPT Presentation

Infrastructures for Giant-Scale Web Services (Giant-scale infrastructures)
The slides are based on material by Dr. Marios Dikaiakos.

• What is the architecture of a Web-services data center?
• Which metrics do we use to measure the availability of Web services?
• What are database partitioning and replication, and what do we aim to achieve with them?
• How are yield and DQ affected by faults?

Examples
• Web portals (Yahoo, CNN, …)
• E-commerce (eBay, Amazon, Alibaba, …)
• Search engines (Google, Bing, …)
• Messaging and communication (WhatsApp, ICQ, Slack, …)
• Geoservices (Waze, Google Maps, …)
• Social networks (Facebook, Twitter, …)
EPL344

A server room in Council Bluffs, Iowa. Photo: Google/Connie Zhou
Clusters at Facebook

Clusters (Greek: συστοιχίες Η/Υ)
• Collections of commodity servers that work together on a single problem. Their main advantages are absolute scalability, cost and performance, and independent components.

Why clusters?
• Absolute scalability (Greek: επεκτασιμότητα). A successful network service must scale to support a substantial fraction of the world's population.
• Cost and performance.
  - No alternative to clusters can match the required scale.
  - Hardware cost is typically dwarfed by bandwidth and operational costs.
• Independent components. Users expect 24-hour service from systems that consist of thousands of hardware and software components. Transient hardware failures and software faults due to rapid system evolution are inevitable, but clusters simplify the problem by providing (largely) independent faults.

Basic assumptions of giant-scale services
• The service provider has limited control over the clients and the IP network.
• Queries drive the service (e.g., HTTP GET).
• Read-only queries greatly outnumber updates (queries that affect the persistent data store).

Architectural Model
Source: E. Brewer, IEEE IC 2001

Advantages of the Model
• Access anywhere, anytime. A ubiquitous infrastructure facilitates access from home, work, airport, and so on.
• Availability via multiple devices. The infrastructure handles most of the processing, so users can access services from "thin clients", which can offer far more functionality for a given cost and battery life.
• Groupware support. Centralizing data from many users allows service providers to offer group-based applications (calendars, teleconferencing systems, group-management systems).
• Lower overall cost. Infrastructure services have a fundamental cost advantage over designs based on stand-alone devices:
  - infrastructure resources can be multiplexed across active users;
  - end-user devices have very low utilization (less than 4 percent), while infrastructure resources often reach 80 percent utilization;
  - centralizing the administrative burden and simplifying end devices also reduce overall cost.
• Simplified service updates. The most powerful long-term advantage is the ability to upgrade existing services or offer new services without the physical distribution required by traditional applications and devices.

Basic Building Blocks of the Model
• Clients (Greek: πελάτες), such as Web browsers, standalone email readers, or even programs that use XML and SOAP (Simple Object Access Protocol), initiate the queries to the services.
• The best-effort IP network, whether the public Internet or a private network such as an intranet, provides access to the service.
• The load manager (Greek: εξισορροπητής φορτίου) provides a level of indirection between the service's external name and the servers' physical names (IP addresses) to preserve the external name's availability in the presence of server faults. The load manager balances load among active servers. Traffic might flow through proxies or firewalls before the load manager.
• Servers (Greek: εξυπηρετητές / διακομιστές / διαθέτες) are the system's workers, combining CPU, memory, and disks into an easy-to-replicate unit.
• The persistent data store (Greek: βάση δεδομένων) is a replicated or partitioned "database" that is spread across the servers' disks. It might also include network-attached storage such as external DBMSs or systems that use RAID storage.
• Many services also use a backplane. This optional system-area network handles inter-server traffic such as redirecting client queries to the correct server.

Load balancing
• Goal: balanced distribution of the incoming load across the available servers.
• Approaches:
  - Have DNS distribute different IP addresses for a single domain name among clients in a rotating fashion ("round-robin DNS").
  - A combination of:
    · custom "layer-4" switches that understand TCP and port numbers and can make decisions based on this information;
    · "front-end" nodes that act as service-specific "layer-7" (application-layer) switches, understand HTTP requests, and parse URLs at wire speed.
  - Include clients in the load-management process (clients know about alternative servers and can switch to them if the primary server disappears).
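The rotating assignment behind round-robin DNS can be sketched in a few lines. This is only an illustration of the rotation idea, not a DNS implementation: the class name `RoundRobinBalancer`, its `pick` method, and the IP addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out server addresses in rotating order (hypothetical helper)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("need at least one server")
        # cycle() repeats the list forever: s1, s2, ..., sn, s1, ...
        self._rotation = cycle(servers)

    def pick(self):
        """Return the address the next client should be given."""
        return next(self._rotation)


# Made-up addresses standing in for the servers behind one domain name.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assignments = [balancer.pick() for _ in range(6)]
```

Each successive client receives the next address in the list, so six clients are spread evenly over the three servers. The rotation ignores server health and actual load, which is one reason the slide lists switch-based and client-side approaches as well.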

Handling Failure (Greek: διαχείριση σφαλμάτων)
• Load-balancing switches:
  - Support hot failover to avoid the obvious single point of failure. Hot failover is the ability of one switch to take over for another automatically.
  - Can handle very high throughput.
  - Detect down nodes automatically, usually by monitoring open TCP connections, and thus dynamically isolate down nodes from clients quite well.
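A minimal sketch of how a load manager might isolate down nodes from clients. The detection step itself (monitoring TCP connections) is not modelled; it is assumed that some external monitor calls the hypothetical `mark_down` and `mark_up` methods. All names here are illustrative.

```python
class HealthAwareBalancer:
    """Round-robin over servers, skipping those marked as down (sketch)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._down = set()
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        # Assumed to be called by a monitor that noticed a failure.
        self._down.add(server)

    def mark_up(self, server):
        # Assumed to be called once the server has recovered.
        self._down.discard(server)

    def pick(self):
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server not in self._down:
                return server
        raise RuntimeError("no healthy servers available")


lb = HealthAwareBalancer(["a", "b", "c"])
lb.mark_down("b")
picks = [lb.pick() for _ in range(4)]  # "b" is never handed out
```

Clients keep using the same external name while the failed node is silently skipped. Hot failover of the switch itself would need a second balancer instance sharing this state, which is outside the scope of the sketch.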

Source: E. Brewer, IEEE IC 2001

High Availability (Greek: υψηλή διαθεσιμότητα)
• The major driving requirement behind giant-scale system design, in the presence of component failures, natural disasters, constantly evolving features, and unpredictable growth.
• Availability metrics (Greek: μετρικές):
  - uptime (Greek: λειτουργικός χρόνος) = (MTBF - MTTR)/MTBF
    · The fraction of time a site is handling traffic.
    · MTBF: mean time between failures.
    · MTTR: mean time to recover.
    · Typically measured in nines; traditional infrastructure systems aim for 4 to 5 nines (0.9999 to 0.99999).
  - yield (Greek: απόδοση) = queries completed / queries offered
    · The fraction of queries that are completed successfully.
  - harvest (Greek: συγκομιδή) = data available / complete data
    · In systems based on queries, we can measure query completeness: how much of the database is reflected in the answer.
    · This can be extended to the features supported by a service.
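The three metrics above translate directly into small functions. The sketch below uses the slide's formulas as written; the function names, the `nines` helper, and the sample numbers are illustrative assumptions, not measurements.

```python
import math


def uptime(mtbf, mttr):
    """uptime = (MTBF - MTTR) / MTBF, with both values in the same time unit."""
    return (mtbf - mttr) / mtbf


def query_yield(completed, offered):
    """yield = queries completed / queries offered."""
    return completed / offered


def harvest(data_available, complete_data):
    """harvest = data available / complete data."""
    return data_available / complete_data


def nines(fraction):
    """Express an availability fraction below 1 as a count of nines."""
    return -math.log10(1.0 - fraction)


# Made-up example: a failure every 1000 hours on average, 1 hour to recover.
u = uptime(mtbf=1000, mttr=1)
# Made-up example: 9,990 of 10,000 offered queries completed.
y = query_yield(completed=9_990, offered=10_000)
# Made-up example: 90 of 100 equally sized partitions reachable.
h = harvest(data_available=90, complete_data=100)
```

Uptime weighs every second equally, whereas yield weighs every query equally, so a one-hour outage at peak traffic hurts yield far more than the same outage at night.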

DQ (data per query) Principle
Data per query × queries per second → constant
• This is a principle rather than a literal truth: the system's overall capacity tends to have a particular physical bottleneck (Greek: στενωπός), such as total I/O bandwidth or total seeks per second.
• The DQ value is the total amount of data that has to be moved per second on average.
  - It is thus bounded by the underlying physical limitation.
  - At the high utilization level typical of giant-scale systems, the DQ value approaches this limitation.
• The DQ value is measurable and tunable.
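Treating the DQ principle as an exact equality, which the slide itself warns is only an approximation, gives a simple trade-off: for a fixed DQ, less data per query means more queries per second. The function name, units, and numbers below are hypothetical.

```python
def max_queries_per_second(dq_capacity, data_per_query):
    """Queries per second sustainable if D x Q is held at dq_capacity.

    dq_capacity and data_per_query must use the same data unit
    (for example, KB per second and KB per query).
    """
    if data_per_query <= 0:
        raise ValueError("data_per_query must be positive")
    return dq_capacity / data_per_query


# Made-up capacity: the bottleneck can move 1,000,000 KB per second.
full_answers = max_queries_per_second(1_000_000, data_per_query=500)
# Halving the data touched per query doubles the sustainable query rate.
half_answers = max_queries_per_second(1_000_000, data_per_query=250)
```

With these numbers the system sustains 2000 queries per second at 500 KB per query and 4000 at 250 KB per query.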

Measuring and Tuning DQ
• How do we measure the DQ of an infrastructure?
  - Define a target workload (Greek: φορτίο).
  - Use a load generator to measure a given combination of hardware, software, and database size against this workload.
  - Given the metric and the load generator, it is easy to measure the relative impact of faults.
• How do we improve DQ?
  - DQ scales linearly with the number of nodes.
  - We can translate future traffic predictions into future DQ requirements, and these into hardware and software targets; that is, convert traffic predictions into capacity-planning decisions.
http://www.seleniumhq.org/
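Because DQ is taken to scale linearly with the number of nodes, a traffic forecast can be turned into a node count with simple arithmetic. The sketch below assumes a per-node DQ value obtained from a load-generator measurement; the `headroom` parameter and all numbers are illustrative assumptions that do not come from the slides.

```python
import math


def required_dq(peak_queries_per_second, data_per_query):
    """Future DQ requirement implied by a traffic prediction."""
    return peak_queries_per_second * data_per_query


def nodes_needed(target_dq, dq_per_node, headroom=0.0):
    """Nodes required under linear DQ scaling.

    headroom is an assumed safety margin, e.g. 0.25 for 25% spare capacity.
    """
    if dq_per_node <= 0:
        raise ValueError("dq_per_node must be positive")
    return math.ceil(target_dq * (1.0 + headroom) / dq_per_node)


# Made-up forecast: 5000 queries/s at 200 KB per query.
target = required_dq(peak_queries_per_second=5000, data_per_query=200)
# Made-up measurement: each node delivers a DQ of 60,000 KB/s.
exact_fit = nodes_needed(target, dq_per_node=60_000)
with_margin = nodes_needed(target, dq_per_node=60_000, headroom=0.25)
```

Rounding up matters: a fractional node cannot be bought, and rounding down would leave the cluster below its target DQ.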

Partitioning (Greek: κατάτμηση, διαμελισμός)
• Persistent data is partitioned across the servers, which increases aggregate capacity.
[Figure: DATASET]
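One common way to partition persistent data across servers is to hash each record's key. The slides do not say which scheme is used, so hash partitioning is an assumption here, and the function names and keys are hypothetical.

```python
import hashlib


def partition_for(key, num_servers):
    """Map a record key to a server index in the range [0, num_servers)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; the hash is stable across runs.
    return int.from_bytes(digest[:8], "big") % num_servers


def partition_dataset(keys, num_servers):
    """Split keys into one list (shard) per server."""
    shards = [[] for _ in range(num_servers)]
    for key in keys:
        shards[partition_for(key, num_servers)].append(key)
    return shards


# Made-up dataset of 100 keys spread over 4 servers.
keys = [f"user{i}" for i in range(100)]
shards = partition_dataset(keys, num_servers=4)
```

Every key lands on exactly one server, so each server stores and serves only part of the dataset. The flip side is that changing `num_servers` moves most keys, which real systems usually mitigate with schemes such as consistent hashing.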

Partitioning and Faults
• What is the effect of failure on:
  - Yield? (Greek: απόδοση)
  - Harvest? (Greek: συγκομιδή)

Replication (Greek: αντιγραφή, αναπαραγωγή)
• Used to increase performance and availability and to improve fault tolerance: it provides multiple consistent copies of data in processes running on different computers.
• The traditional view of replication silently assumes that there is enough excess capacity to prevent faults from affecting yield.
[Figure: DATASET]
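The effect of node failures on yield and harvest can be sketched with an idealized model in the spirit of the Brewer article the slides cite. The assumptions are strong and are not stated on the slides: the system runs at full utilization with no excess capacity, and data and load are spread evenly over the nodes. The function name and the scheme labels are hypothetical.

```python
def after_faults(nodes, failed, scheme):
    """Idealized harvest, yield, and DQ after `failed` of `nodes` go down.

    Assumes no excess capacity and perfectly even data and load spread.
    """
    if not 0 <= failed <= nodes or nodes <= 0:
        raise ValueError("need 0 <= failed <= nodes and nodes > 0")
    surviving = (nodes - failed) / nodes
    if scheme == "replicated":
        # Every node holds all data: answers stay complete, but fewer
        # queries can be served, so yield drops.
        return {"harvest": 1.0, "yield": surviving, "dq": surviving}
    if scheme == "partitioned":
        # Each node holds a slice: queries are still answered, but part
        # of the data is missing from the answers, so harvest drops.
        return {"harvest": surviving, "yield": 1.0, "dq": surviving}
    raise ValueError("scheme must be 'replicated' or 'partitioned'")


replicated = after_faults(nodes=4, failed=1, scheme="replicated")
partitioned = after_faults(nodes=4, failed=1, scheme="partitioned")
```

In this model both schemes lose the same share of DQ; they differ only in whether the loss appears as dropped queries or as incomplete answers. That is the point behind the remark on excess capacity: replication preserves yield under faults only if spare capacity was provisioned in advance.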
