Distributed Systems: Principles and Paradigms (Maarten van Steen, VU)

Distributed Systems: Principles and Paradigms
Maarten van Steen, VU Amsterdam, Dept. Computer Science, steen@cs.vu.nl
Chapter 12: Distributed Web-Based Systems
Version: December 10, 2012

Distributed Web-Based Systems
12.1 Architecture: Distributed Web-based systems
Essence: The WWW is a huge client-server system with millions of servers, each hosting thousands of hyperlinked documents.
  Documents are often represented in text (plain text, HTML, XML).
  Alternative types: images, audio, video, applications (PDF, PS).
  Documents may contain scripts, executed by client-side software.
Figure: (1) the browser on the client machine sends a get-document request (HTTP) to the Web server on the server machine; (2) the server fetches the document from a local file; (3) the server returns the response.

12.1 Architecture: Multi-tiered architectures
Observation: Very early on, Web sites were organized into three tiers.
Figure: a Web server (HTTP request handler), a CGI process (CGI program), and a database server. Labeled steps: 1. Get request; 3. Start process to fetch document; 4. Database interaction; 5. HTML document created; 6. Return result.

12.1 Architecture: Web services
Observation: At a certain point, people started recognizing that the Web was about more than just user ↔ site interaction: sites could offer services to other sites ⇒ standardization is then badly needed.
Figure: the server application publishes a service description (WSDL) in a directory service (UDDI), where the client application looks up a service. On both the client machine and the server machine, a stub is generated from the WSDL description; the stubs exchange SOAP messages through each machine's communication subsystem.

12.2 Processes: Apache Web server
Observation: More than 52% of all 185 million Web sites run Apache. The server is internally organized more or less according to the steps needed to process an HTTP request.
Figure: modules provide functions, and each function is linked to a hook in the Apache core. The core calls the functions registered per hook while a request is processed into a response.

12.2 Processes: Server clusters
Essence: To improve performance and availability, WWW servers are often clustered in a way that is transparent to clients.
Figure: several Web servers on a LAN behind a single front end, which handles all incoming requests and outgoing responses.

12.2 Processes: Server clusters
Problem: The front end may easily get overloaded, so special measures need to be taken.
  Transport-layer switching: the front end simply passes the TCP request to one of the servers, taking some performance metric into account.
  Content-aware distribution: the front end reads the content of the HTTP request and then selects the best server.
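The difference between the two strategies can be made concrete with a small sketch. It is illustrative only: the `Server` class, the use of active connections as the "performance metric", and the `cached_paths` field are assumptions that do not come from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical back-end server as seen by the front end."""
    name: str
    active_connections: int = 0          # assumed performance metric
    cached_paths: set[str] = field(default_factory=set)


def transport_layer_pick(servers: list[Server]) -> Server:
    """Choose a server without inspecting the HTTP request (least connections)."""
    return min(servers, key=lambda s: s.active_connections)


def content_aware_pick(servers: list[Server], path: str) -> Server:
    """Prefer a server that already holds the requested document.

    Falls back to the transport-layer choice when no server holds it.
    """
    holders = [s for s in servers if path in s.cached_paths]
    return transport_layer_pick(holders or servers)


servers = [
    Server("web1", active_connections=3, cached_paths={"/index.html"}),
    Server("web2", active_connections=1, cached_paths={"/video.mp4"}),
]

print(transport_layer_pick(servers).name)
print(content_aware_pick(servers, "/index.html").name)
```

In this sketch the transport-layer choice ignores which server holds the document, while the content-aware choice accepts a busier server in exchange for a likely cache hit.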

12.2 Processes: Server clusters
Question: Why can content-aware distribution be so much better?
Figure: a client, a switch, a dispatcher, distributors, and Web servers. Labeled steps: 1. The switch passes the setup request to a distributor; 2. The dispatcher selects a server; 3. The TCP connection is handed off; 4. The switch is informed; 5. Other messages are forwarded; 6. The server responds.

12.6 Consistency and Replication: Web proxy caching
Basic idea: Sites install a separate proxy server that handles all outgoing requests. Proxies subsequently cache incoming documents.
Cache-consistency protocols:
  Always verify validity by contacting the server.
  Age-based consistency: $T_{\text{expire}} = \alpha \cdot (T_{\text{cached}} - T_{\text{last modified}}) + T_{\text{cached}}$
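As a worked example of the age-based rule, the sketch below computes the expiration time directly from the formula. The function names and the value α = 0.25 are assumptions chosen for illustration; the slides do not prescribe a value for α.

```python
def expiration_time(t_cached: float, t_last_modified: float, alpha: float) -> float:
    """T_expire = alpha * (T_cached - T_last_modified) + T_cached."""
    return alpha * (t_cached - t_last_modified) + t_cached


def is_fresh(now: float, t_cached: float, t_last_modified: float, alpha: float) -> bool:
    """A cached copy is served without revalidation until T_expire."""
    return now < expiration_time(t_cached, t_last_modified, alpha)


# Document last modified at t=200 s, cached at t=1000 s, alpha assumed to be 0.25:
# T_expire = 0.25 * (1000 - 200) + 1000 = 1200
print(expiration_time(t_cached=1000, t_last_modified=200, alpha=0.25))
```

The older a document was when it was cached, the longer the proxy trusts its copy: a document that has not changed for a long time is assumed unlikely to change soon.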

12.6 Consistency and Replication: Web proxy caching
Basic idea (cont'd): Cooperative caching, by which you first check your neighbors on a cache miss.
Figure: clients send HTTP Get requests to their Web proxy, which has its own cache. The proxy will (1) look in its local cache, (2) ask neighboring proxy caches, and (3) forward the request to the Web server.
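The three lookup steps can be sketched as follows. This is a minimal in-memory model, not a real proxy: the `ProxyCache` class, its `neighbors` list, and the `fetch_from_origin` callback are hypothetical names, and neighbor caches are inspected directly rather than queried over a network protocol.

```python
from __future__ import annotations

from typing import Callable


class ProxyCache:
    """Hypothetical cooperative Web proxy cache."""

    def __init__(self, name: str, fetch_from_origin: Callable[[str], str]) -> None:
        self.name = name
        self.store: dict[str, str] = {}
        self.neighbors: list[ProxyCache] = []
        self.fetch_from_origin = fetch_from_origin

    def get(self, url: str) -> tuple[str, str]:
        """Return (document, source), where source says who supplied it."""
        # 1. Look in the local cache.
        if url in self.store:
            return self.store[url], "local"
        # 2. Ask neighboring proxy caches.
        for neighbor in self.neighbors:
            if url in neighbor.store:
                self.store[url] = neighbor.store[url]
                return self.store[url], f"neighbor:{neighbor.name}"
        # 3. Forward the request to the Web server.
        self.store[url] = self.fetch_from_origin(url)
        return self.store[url], "origin"
```

A proxy that misses locally only reaches the origin server when none of its neighbors holds the document, and it keeps a copy of whatever it obtains either way.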

12.6 Consistency and Replication: Replication in Web hosting systems
Observation: By and large, Web hosting systems are adopting replication to increase performance. Much research is being done to improve their organization, which follows the lines of self-managing systems.
Figure: a feedback control loop. The Web hosting system starts from an initial configuration and is subject to uncontrollable parameters (disturbance / noise). Metric estimation turns the observed output into a measured output, which analysis compares against the reference input. The resulting adjustment triggers drive three correction mechanisms (+/-): replica placement, consistency enforcement, and request routing.

12.6 Consistency and Replication: Handling flash crowds
Observation: We need dynamic adjustment to balance resource usage. Flash crowds introduce a serious problem.
Figure: four plots, panels (a) through (d), each covering a period of between 2 and 6 days.

12.6 Consistency and Replication: Server replication
Content Delivery Network: CDNs act as Web hosting services that replicate documents across the Internet, giving their customers guarantees on high availability and performance (example: Akamai).
Figure: a client, an origin server, the regular DNS system, a CDN DNS server, and a CDN cache server. Labeled steps: 1. Get base document; 2. Document with refs to embedded documents; 3. DNS lookups; 4. Return IP address of client-best server; 5. Get embedded documents; 6. Get embedded documents (if not already cached); 7. Embedded documents.

12.6 Consistency and Replication: Replication of Web applications
Observation: Replication becomes more difficult when dealing with databases and the like. There is no single best solution.
Assumption: Updates are carried out at the origin server and propagated to edge servers.

12.6 Consistency and Replication: Replication of Web applications: normal
Figure: an edge-server side and an origin-server side, each with a Web server, application logic, and a schema. The client sends a query to the edge side and receives a response. The edge side can hold a content-blind cache, a database copy (full/partial data replication), or a content-aware cache (full schema replication / query templates); the origin side holds the authoritative database.

12.6 Consistency and Replication: Replication of Web applications
Alternative solutions:
  Full replication: high read/write ratio, often in combination with complex queries.
  Partial replication: high read/write ratio, but in combination with simple queries.
  Content-aware caching: check for queries at the local database, and subscribe for invalidations at the server. Works well with range queries and complex queries.
  Content-blind caching: simply cache the result of previous queries. Works great with simple queries that address unique results (e.g., no range queries).
Question: What can be said about replication vs. performance?
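To see why content-blind caching favors simple queries with unique results, consider the sketch below. It is illustrative only: the `ContentBlindCache` class, the `run_at_origin` callback, and the SQL strings are assumptions, and invalidation is left out entirely.

```python
from typing import Any, Callable, Dict, Tuple


class ContentBlindCache:
    """Hypothetical edge-server cache keyed on the literal query text."""

    def __init__(self, run_at_origin: Callable[[str, tuple], Any]) -> None:
        self.run_at_origin = run_at_origin
        self.results: Dict[Tuple[str, tuple], Any] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql: str, params: tuple = ()) -> Any:
        # The cache does not interpret the query; it only compares keys.
        key = (sql, tuple(params))
        if key in self.results:
            self.hits += 1
            return self.results[key]
        self.misses += 1
        result = self.run_at_origin(sql, params)
        self.results[key] = result
        return result
```

Repeating `SELECT * FROM items WHERE id = ?` with the same id hits the cache. Two range queries such as `price < 10` and `price < 20` produce different keys, so both go to the origin even though the first result is contained in the second; exploiting that overlap requires a cache that understands the query, which is what content-aware caching provides.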

12.6 Consistency and Replication: Replication of Web applications: full/partial replication
Figure: the edge-server / origin-server diagram of the "normal" slide, shown for the case where the edge server holds a database copy (full/partial data replication) of the authoritative database at the origin.

12.6 Consistency and Replication: Replication of Web applications: content-aware caching
Figure: the edge-server / origin-server diagram of the "normal" slide, shown for the case where the edge server holds a content-aware cache (full schema replication / query templates).

12.6 Consistency and Replication: Replication of Web applications: content-blind caching
Figure: the edge-server / origin-server diagram of the "normal" slide, shown for the case where the edge server holds a content-blind cache of query results.
