The Web and Content Networks: the Big Picture

Jeff Chase

Services

Client: “Do A for me.”
Server: “OK, here’s your answer.”
Client: “Now do B.”
Server: “OK, here.”

request/response paradigm ==> client/server roles
- Remote Procedure Call (RPC)
- object invocation, e.g., Remote Method Invocation (RMI)
- HTTP (the Web)
- device protocols (e.g., SCSI)

How does the Web work?

The canonical example in your Web browser: “Click here”

“here” is a Uniform Resource Locator (URL): http://www-cse.ucsd.edu
It names the location of an object (document) on a server.

[courtesy of Geoff Voelker, voelker@cs.ucsd.edu]

In Action…

[Figure: Client and Server connected by HTTP, for http://www-cse.ucsd.edu]

• Client uses DNS to resolve the name of the server (www-cse.ucsd.edu)
• Establishes an HTTP connection with the server over TCP/IP
• Sends the server the name of the object (null)
• Server returns the object

[Voelker]
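The four steps above can be sketched with plain sockets. This is a minimal illustration, not how a real browser is built: the function name `fetch`, its parameters, and the choice of HTTP/1.0 (so that the server closes the connection when the object has been sent) are assumptions made for the example.

```python
import socket


def fetch(host, path="/", port=80, timeout=5.0):
    """Fetch one object over HTTP/1.0 and return the raw response bytes.

    Hypothetical helper for illustration; it does no error recovery.
    """
    # Step 1: DNS. getaddrinfo resolves the server name to an address.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]

    # Step 2: establish a TCP connection to the server.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(addr)

        # Step 3: send the name of the object. An empty name in the URL
        # corresponds to the path "/".
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))

        # Step 4: the server returns the object. With HTTP/1.0 and no
        # keep-alive, end of stream marks the end of the response.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)

    return b"".join(chunks)
```

Calling `fetch("www-cse.ucsd.edu")` would walk through the same steps as the slide, provided that host still answers on port 80.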

HTTP in a Nutshell

Client to Server: GET /path/to/file/index.html HTTP/1.0
• object name, + content for POST
• optional query string
• optional request headers

Server to Client: Content-type: MIME/html, Content-Length: 5000,...
• optional cookies
• optional response headers

HTTP supports request/response message exchanges of arbitrary length.

Small number of request types: basically GET and POST, with supplements.

Responses are self-typed objects (documents) with attributes and tags.
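To make the message layout concrete, here is a small sketch that builds a request and splits a response into status code, headers, and body. The helper names `build_request` and `parse_response` are invented for this example, and the sample uses the registered media type `text/html` where the slide writes the shorthand `MIME/html`.

```python
def build_request(method, path, headers=None, body=b""):
    """Serialize an HTTP/1.0 request (hypothetical helper)."""
    headers = dict(headers or {})
    if body:
        # Content for POST travels after the headers; its length is declared.
        headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line separates the header block from the optional body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw):
    """Split raw response bytes into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize to lower case.
        headers[name.strip().lower()] = value.strip()
    return code, headers, body
```

The Content-Type header is what makes a response self-typed: the client decides how to interpret the body from the header, not from the object name.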

The Dynamic Web

Client to Server: GET program-name?arg1=x&arg2=y
Server: execute program
Server to Client: Content-type: MIME/html, Content-Length: 5000,...

HTTP began as a souped-up FTP that supports hypertext URLs.

Service builders rapidly began using it for dynamically-generated content.

Web servers morphed into Web Application Servers.
• Common Gateway Interface (CGI)
• Java Servlets and JavaServer Pages (JSP)
• Microsoft Active Server Pages (ASP)
• “Web Services”
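The pattern "GET program-name?arg1=x&arg2=y, execute program, return a document" can be sketched as a dispatch table keyed by program name. Everything here (the `PROGRAMS` registry, the `add` program, the `handle_get` function) is hypothetical and only illustrates the idea; CGI, servlets, and ASP each have their own real interfaces.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical registry mapping a program name to the code that runs it.
PROGRAMS = {}


def program(name):
    """Decorator that registers a function under a program name."""
    def register(func):
        PROGRAMS[name] = func
        return func
    return register


@program("add")
def add(args):
    # The program computes its output from the query-string arguments.
    total = int(args["arg1"]) + int(args["arg2"])
    return f"<html><body>{total}</body></html>"


def handle_get(target):
    """Map a request target to (status, content type, generated document)."""
    parts = urlsplit(target)
    name = parts.path.lstrip("/")
    # parse_qs returns a list per key; keep the first value of each argument.
    args = {key: values[0] for key, values in parse_qs(parts.query).items()}
    handler = PROGRAMS.get(name)
    if handler is None:
        return 404, "text/html", "<html><body>not found</body></html>"
    return 200, "text/html", handler(args)
```

For instance, `handle_get("/add?arg1=2&arg2=3")` runs the `add` program and returns a freshly generated HTML document rather than a stored file.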

Multi-tier Services

[Figure: multi-tier service architecture. Labels recovered from the diagram:]
• Clients: HTML+forms, applets, JavaScript, etc.
• HTTP
• Web application server
• RPC, RMI, IIOP, DCOM, EJB, CORBA, etc.
• middle tiers, e.g., component “middleware”, transaction monitors
• JNDI, JDBC, SQL
• relational databases, file servers

Web Protocols

What kind of transport protocol should the Web use?

HTTP 1.0
• One TCP connection per request
• Complaints: inefficient, slow, burdensome…

HTTP 1.1
• One TCP connection/many requests (persistent connections)
• Solves all problems, right?
• Huge amount of complexity: clients, proxies, servers

How do they compare?
• Protocol differences [Krishnamurthy99], performance comparison [Nielsen97], effects on servers [Manley97], overhead of TCP connections [Caceres98]

HTTPS: HTTP with authentication and encryption

[Voelker]

Persistent Connections

There are three key performance reasons for persistent connections:
• connection setup overhead
• TCP slow start: just do it and get it over with
• pipelining as an alternative to multiple connections

And some new complexities resulting from their use, e.g.:
• request/response framing and pairing
• unexpected connection breakage
  Just ask anyone from Akamai...
• large numbers of active connections
  How long to keep connections around?

These motivations and issues manifest in HTTP, but they are fundamental for request/response messaging over TCP.
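Framing and pairing can be illustrated with a byte stream that carries several responses back to back on one connection. The sketch below assumes every response declares a Content-Length (it ignores chunked encoding) and that responses arrive in request order, as HTTP/1.1 pipelining requires. The function names `split_responses` and `pair` are hypothetical.

```python
def split_responses(stream):
    """Split a byte stream into (status line, body) tuples.

    Framing: the Content-Length header says where one response ends and
    the next begins, since the connection no longer closes between them.
    """
    responses = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("connection broke inside a header block")
        lines = head.split(b"\r\n")
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if len(rest) < length:
            # Unexpected connection breakage: the body is shorter than promised.
            raise ValueError("connection broke inside a response body")
        responses.append((lines[0], rest[:length]))
        stream = rest[length:]
    return responses


def pair(requests, stream):
    """Pair each request with its response by position in the stream."""
    return list(zip(requests, split_responses(stream)))
```

A truncated stream raises `ValueError` here; a real client would then have to decide which of its outstanding requests are safe to retry on a new connection.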

Web Service Scaling

[Figure: client requests crossing the Internet to a server]

How to handle all those client requests raining on your server?

Scaling Server Sites: Clustering

[Figure: Clients reach a server array through a smart switch that presents virtual IP addresses (VIPs). Switch layers: L4: TCP; L7: HTTP, SSL, etc.]

Goals
• server load balancing
• failure detection
• access control filtering
• priorities/QoS
• request locality
• transparent caching

What to switch/filter on?
• L3 source IP and/or VIP
• L4 (TCP) ports etc.
• L7 URLs and/or cookies
• L7 SSL session IDs
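The choice of what to switch on determines which requests land on the same server. The sketch below contrasts an L3 policy (key on source IP) with an L7 policy (key on URL). The server names are made up, and hashing the key is only one possible policy chosen for the example; a real switch may instead use round-robin, least connections, or other schemes.

```python
import hashlib

# Hypothetical server array behind one virtual IP address.
SERVERS = ["server-0", "server-1", "server-2"]


def pick(key, servers=SERVERS):
    """Map a switching key to a server with a stable hash (an assumed policy)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def l3_switch(source_ip):
    # L3: all traffic from one client address goes to the same server.
    return pick(source_ip)


def l7_switch(url_path):
    # L7: all requests for one URL go to the same server, whoever asks.
    # This gives request locality: each server sees a subset of the objects.
    return pick(url_path)
```

Keying on the URL needs the switch to read the HTTP request before it picks a server, which is more work per connection than keying on addresses or ports.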

Scaling Services: Replication

[Figure: a Client on the Internet choosing between Site A and Site B]

Distribute service load across multiple sites.

How to select a server site for each client or request?

Is it scalable?

Scaling with Peer-to-Peer

[Figure: Peers connected through the Internet]

Is (e.g.) Napster a service?

Is the peer-to-peer approach fundamentally more scalable? More robust?

What does it assume about the clients?

Caching for a Better Web

Performance is a major concern in the Web.

Proxy caching is the most widely used method to improve Web performance
• Duplicate requests to the same document served from cache
• Hits reduce latency, bandwidth demand, server load
• Misses increase latency (extra hops)

[Figure: Clients send requests to a Proxy Cache; hits are served from the cache, misses cross the Internet to the Servers]

[Source: Geoff Voelker]
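The hit/miss behavior can be sketched as a small cache that sits in front of an origin fetch function. The class name `ProxyCache`, the `fetch_from_server` callback, and the least-recently-used eviction policy are assumptions for illustration; the slide does not specify a replacement policy, and the sketch ignores consistency entirely.

```python
from collections import OrderedDict


class ProxyCache:
    """Toy proxy cache with LRU eviction (an assumed policy)."""

    def __init__(self, fetch_from_server, capacity=100):
        self.fetch_from_server = fetch_from_server  # called only on a miss
        self.capacity = capacity
        self.store = OrderedDict()  # url -> document, oldest use first
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            # Hit: served locally, no extra latency or server load.
            self.hits += 1
            self.store.move_to_end(url)
            return self.store[url]
        # Miss: pay the extra hop to the origin server, then remember the result.
        self.misses += 1
        document = self.fetch_from_server(url)
        self.store[url] = document
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used
        return document
```

The ratio `hits / (hits + misses)` over a request trace is the hit rate, which determines how much of the latency, bandwidth, and server load the proxy actually saves.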

Proxy Caching

How should we build caching systems for the Web?
• Seminal paper [Chankhunthod96]
• Proxy caches [Duska97]
• Akamai DNS interposition [Karger99]
• Cooperative caching [Tewari99, Fan98, Wolman99]
• Popularity distributions [Breslau99]
• Proxy filtering and transcoding [Fox et al]
• Consistency [Tewari, Cao et al]
• Replica placement for CDNs [et al]

[Voelker]

Issues for Web Caching

• Binding clients to proxies, handling failover
  Manual configuration, router-based “transparent caching”, WPAD (Web Proxy Auto-Discovery)
• Proxy may confuse/obscure interactions between server and client.
• Consistency management
  At first approximation the Web is a wide-area read-only file service...but it is much more than that.
  caching responses vs. caching documents
  deltas [Mogul+Bala/Douglis/Misha/others@research.att.com]
• Prefetching, request routing, scale, performance
  Web caching vs. content distribution (CDNs, e.g., Akamai)

End-to-End Content Delivery

[Figure: end-to-end request path, from upstream to downstream. Labels recovered from the diagram: request stream, Internet, CDN hosting network, request distributor, surrogate caches, proxies, servers, server array + storage]

Proxy Deployment and Use

Where to put it?

How to direct user Web traffic through the proxy?
Request redirection
• Much more to come on this topic…

Must the server consent?
• Protected content
• Client identity

“Transparent” caching and the end-to-end principle
• Must the client consent?

Interception Switches

[Figure: ISP cache array]

The client doesn’t know.
The server doesn’t know.
Neither side told HTTP to disable it.

Is it legal?
Good thing? Bad thing?

Shouldn’t This Be Illegal?

[Figure: end, middle, end]

RFC 1122: The Internet Architecture (IPv4) specifies that each packet has a unique destination “host” address.

Problems
• middle boxes may be subversive
• IPsec and SSL
• dynamic routing
