Application Protocols and HTTP (14-740: Fundamentals of Computer Networks)

SLIDE 1

Material from Computer Networking: A Top Down Approach, 6th edition. J.F. Kurose and K.W. Ross

Application Protocols and HTTP

14-740: Fundamentals of Computer Networks Bill Nace

SLIDE 2

Administrivia

  • Lab #0 due in a week
  • Next time: Paper Review of Mockapetris88
  • Quiz #1 approaches (29 Sep, 2 weeks away)
  • Designed for ~45 minutes
  • Multiple choice / Short answer questions
  • Covers everything up until 25 Sep lecture
  • Layered architecture, Design Principles, ISPs & Peering, Web & HTTP, DNS, P2P, Queuing Theory

  • Note: TAs hold office hours ➙ go talk to them

SLIDE 3

Last Lecture

  • ISPs, Backbones, Peering
  • Motivations to peer
  • Tier-1
  • Tier-2
  • Content / Enterprise Companies
  • Interconnections
  • Private vs Public Peering

SLIDE 4

traceroute

  • Application Layer
  • Web and HTTP
  • Message format
  • Persistent connections
  • Caching

SLIDE 5

In the app layer

[Figure: application protocols HTTP, SMTP, DNS (queries) and VoIP layered over the transport protocols TCP and UDP]

  • Abstract transport
  • Use transport services: TCP or UDP or ...
  • Think about transport as a channel for data from client to server and back

  • TCP requires setup and teardown
  • UDP has no such requirement
SLIDE 6

Setup Overhead

  • Client initiates transport connection to server
  • API: Creates socket
  • Server accepts connection
  • Application-layer protocol messages exchanged between browser and web server
  • Transport connection closed

[Figure: client/server timeline. Client initiates transport connection; server accepts connection; client requests file; server sends file; client receives file and closes; connection closed.]
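The setup, exchange and teardown sequence above can be sketched with Python's standard socket API. This is a minimal illustration, not part of the original slides: the loopback address, the message contents and the helper names `serve_once`, `fetch` and `demo` are all assumptions chosen for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, answer one message, then close."""
    conn, _addr = listener.accept()            # server accepts connection
    with conn:
        request = conn.recv(1024)              # application-layer message arrives
        conn.sendall(b"reply to: " + request)  # send the response
    # leaving the with-block closes the server side of the connection


def fetch(port: int, message: bytes) -> bytes:
    """Open a TCP connection, send one message, read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as sock:  # setup
        sock.sendall(message)
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:                       # empty read: peer closed
                break
            chunks.append(data)
    return b"".join(chunks)                    # socket closed on exit (teardown)


def demo() -> bytes:
    """Run a one-shot server on a free loopback port and fetch from it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))            # port 0: OS picks a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return fetch(port, b"request file")
    finally:
        worker.join()
        listener.close()
```

Calling `demo()` walks through every arrow in the timeline once: connect, accept, request, response, close.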

SLIDE 7

Operations

  • Mission (i.e. fundamental reason it exists)
  • Addressing
  • Network data type

SLIDE 8

traceroute

  • Application Layer
  • Web and HTTP
  • Message format
  • Persistent connections
  • Caching

SLIDE 9

HTTP Overview

  • HTTP: hypertext transfer protocol
  • Web’s application layer protocol
  • Client / server model
  • client: browser that requests, receives, renders web objects
  • server: stores objects, sends in response to requests
  • Many implementations in various operating systems communicate using HTTP

[Figure: a PC running IE and a Mac running Safari each send an HTTP Request to a Linux machine running Apache (web server), which returns an HTTP Response to each.]

SLIDE 10

History Lesson

  • HTTP 0.9, circa 1990
  • Original release, first described in W3 mailing list
  • “HTTP as implemented in WWW”, by Tim Berners-Lee

http://lists.w3.org/Archives/Public/www-talk/1992JanFeb/0000.html

  • HTTP 1.0, started 1993
  • RFCs for HTML and URI published the same year
  • Informational RFC-1945, 1996
  • Not a standards document, merely common usages
  • A number of problems
  • Caching control
  • TCP overhead for short responses

SLIDE 11

History (2)

  • HTTP/1.1
  • Backward compatibility big issue
  • RFC 2068, proposed standard, 1997
  • RFC 2616, draft standard, 1999
  • Some web server products claimed compliance with HTTP/1.1 even before it became a standard!
  • RFC 2616 had to be “backward compatible” with RFC 2068
  • Pressures from vendors, technologists, etc
  • This lecture focuses on HTTP/1.1

SLIDE 12

History (3)

  • HTTP/2 approved May 2015 (RFC 7540)
  • Somewhat slow adoption rate
  • Changes how data is transferred
  • Avoid head-of-line blocking
  • Compress headers
  • Allow server push
  • Violates layered architecture principles

SLIDE 13

traceroute

  • Application Layer
  • Web and HTTP
  • Message format
  • Persistent connections
  • Caching

SLIDE 14

HTTP Message

  • 2 types of messages
  • Requests from client to server, and
  • Responses from server to client
  • RFCs use Augmented Backus-Naur Form (ABNF) to formally specify formats (RFC 5234)

HTTP-message = Request | Response

  • “|” means “or”
SLIDE 15

Request / Response

  • Both request and response consist of:
  • Start line, followed by ...
  • Zero or more headers, followed by ...
  • An empty line, followed by ...
  • Message body (optional)

generic-message = start-line
                  *( message-header CRLF )    ; zero or more times
                  CRLF
                  [ message-body ]            ; optional
start-line      = Request-Line | Status-Line
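The generic message grammar above maps directly onto a small parser. The following is a minimal sketch under simplifying assumptions: it ignores folded (multi-line) headers and repeated header names, and the function name `parse_message` is made up for this example.

```python
from typing import Dict, Tuple


def parse_message(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split an HTTP message into (start-line, headers, message-body)."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _sep, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers: Dict[str, str] = {}
    for line in lines[1:]:                      # zero or more message-headers
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body            # body is b"" when absent
```

For example, `parse_message(b"GET /index.html HTTP/1.1\r\nHost: www.cmu.edu\r\n\r\n")` yields the request line, a one-entry header dictionary and an empty body.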
SLIDE 16

Request Format

Request = Request-Line
          *(( general-header
            | request-header
            | entity-header ) CRLF)
          CRLF
          [ message-body ]

A Request-Line, then zero or more general, request or entity headers (each followed by CRLF), then an empty line (CRLF), then an optional message body

SLIDE 17

What are those headers?

  • Headers provide metadata about the request or response

  • Dates/times
  • Application or Server information
  • Caching control
  • 46 defined headers
  • Host: is required on requests

SLIDE 18

Request Format (2)

Request-Line = Method SP Request-URI SP HTTP-Version CRLF
Method       = “OPTIONS” | “GET” | “HEAD” | “POST” | “PUT” | “DELETE”

  • You get the idea ....

SLIDE 19

Example: Request

  • Note: ASCII (human-readable format)

GET /images/logos.html HTTP/1.1
Host: www.cmu.edu
User-agent: mozilla/5.0...
Connection: close
Accept-language: en-US ¶
¶ (extra carriage return, line feed)

  • Line 1: Request-line (GET, POST, ... commands)
  • Lines 2-5: message-header (x4)
  • CRLF: Carriage return, line feed
  • 2nd CRLF indicates no message-body, thus end of message
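To make the CRLF placement in the example request concrete, here is a small sketch that assembles the same bytes. The helper name `build_request` is hypothetical, and the sketch only covers requests without a message body.

```python
from typing import Dict


def build_request(method: str, path: str, headers: Dict[str, str]) -> bytes:
    """Assemble a body-less HTTP/1.1 request as ASCII bytes."""
    lines = [f"{method} {path} HTTP/1.1"]                  # Request-line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # One CRLF ends each line; the extra CRLF is the empty line
    # that marks the end of the headers (and here, of the message).
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "GET",
    "/images/logos.html",
    {
        "Host": "www.cmu.edu",
        "Connection": "close",
        "Accept-language": "en-US",
    },
)
```

Because the format is plain ASCII, printing `request.decode()` shows the same human-readable text as the slide.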

SLIDE 20

Request Methods

  • GET: Retrieve an object
  • “Conditional GET” if header includes If-Modified-Since, If-Match, etc
  • “Partial GET” if header includes a Range field
  • Essential for restartable transfers such as scrubbing and buffering a media stream
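A partial GET can be illustrated by showing how a server might select bytes for a `Range: bytes=...` header. This is a simplified sketch: it handles a single byte range only, does no bounds validation, and the function name `apply_range` is an assumption for this example.

```python
def apply_range(body: bytes, range_header: str) -> bytes:
    """Return the part of body selected by a single 'bytes=' range."""
    unit, _eq, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _dash, last = spec.strip().partition("-")
    if first == "":                        # suffix form "-N": the last N bytes
        count = int(last)
        return body[-count:] if count else b""
    start = int(first)
    end = int(last) if last else len(body) - 1   # range end is inclusive
    return body[start:end + 1]


# Restartable transfer: fetch the first part, then resume from byte 4.
body = b"0123456789"
first_part = apply_range(body, "bytes=0-3")
rest = apply_range(body, "bytes=4-")
```

Concatenating `first_part` and `rest` reproduces the full object, which is what lets a client resume an interrupted download or seek within a media stream.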

SLIDE 21

Request Methods

  • HEAD: Retrieve metadata about an object (validity, modification time, etc)
  • Same as GET but MUST NOT return a message body

SLIDE 22

Request Methods

  • OPTIONS: Request info about the capabilities of server (or a resource) without requesting the resource
  • POST: Upload data to server
  • E.g. posting a message to mailing list, submitting a form, etc

SLIDE 23

Example: Response

HTTP/1.1 200 OK
Connection: close
Date: Wed, 01 Sep 2018 12:16:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 2016 ...
Content-Length: 6821
Content-Type: text/html

data data data data data ...

  • Line 1: status line (contains the status code)
  • Lines 2-7: header lines
  • After the empty line: data, e.g. requested HTML file

Recall: start-line = Request-line | status-line

Defined as: status-line = HTTP-version SP Status-Code SP Reason-Phrase CRLF

SLIDE 24

Status Code

  • In first line of response from server to client
  • 3-digit integer result code
  • 1xx: Informational – Request received, continuing process
  • 2xx: Success – Action successful
  • 3xx: Redirection – Further action needed to complete request
  • 4xx: Client Error – Request has bad syntax or cannot be fulfilled

  • 5xx: Server Error – Server failed to fulfill a valid request
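The status line and the five status classes above can be handled with a few lines of code. This is an illustrative sketch; the names `parse_status_line` and `status_class` are invented for the example.

```python
from typing import Tuple

# Class names taken from the list above, keyed by the first digit.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split 'HTTP-version SP Status-Code SP Reason-Phrase CRLF'."""
    # maxsplit=2 keeps spaces inside the reason phrase ("Not Found").
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


def status_class(code: int) -> str:
    """Map a 3-digit status code to its class by the leading digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")
```

A client typically branches on the class first (for example, retrying on 5xx) and only then on the specific code.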

SLIDE 25

Sample Status Codes

  • 200 OK
  • request succeeded, requested object included in this message
  • 301 Moved Permanently
  • requested object moved, new location specified later in this message (Location:)

  • 404 Not Found
  • requested document not found on this server
  • 505 HTTP Version Not Supported

SLIDE 26

Google.com/index.html

  • HTML file (index.html) describes layout, links, scripts, etc
  • Includes a reference to the logo image file (logo.gif)

SLIDE 27

Question

  • Does the following request retrieve the logo file as well?

GET /index.html HTTP/1.1 <CR>
Host: www.google.com <CR>
<CR>

SLIDE 28

HTTP Request

  • Each HTTP Request retrieves a single object per message
  • An object (e.g. HTML file) can contain links to other objects (e.g. images, HTML files)
  • Client must send separate request to retrieve each additional object
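To see why one request is not enough, consider how a client discovers the extra objects it must fetch. The sketch below uses Python's standard `html.parser`; the class name `ObjectFinder`, the choice of tags and the sample HTML are assumptions for illustration, and a real browser looks at many more tags and attributes.

```python
from html.parser import HTMLParser
from typing import List


class ObjectFinder(HTMLParser):
    """Collect URLs of embedded objects referenced by a page."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            self.urls.append(attributes["src"])
        elif tag == "link" and attributes.get("href"):
            self.urls.append(attributes["href"])


def referenced_objects(html: str) -> List[str]:
    """Return the object URLs that need their own HTTP requests."""
    finder = ObjectFinder()
    finder.feed(html)
    return finder.urls


page = '<html><body><img src="logo.gif"><script src="app.js"></script></body></html>'
extra_requests = referenced_objects(page)   # one GET per entry
```

The GET for the HTML file returns only the HTML; each URL in `extra_requests` then costs the client a further request.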

SLIDE 29

traceroute

  • Application Layer
  • Web and HTTP
  • Message format
  • Persistent connections
  • Caching

SLIDE 30

Connection Management

  • HTTP uses TCP as its transport protocol
  • TCP not optimized for short-lived connections typical of HTTP message exchange
  • Often simple pages, which result in short messages

SLIDE 31

Nonpersistent HTTP

Suppose user wants cmu.edu/index.html

  • 1. Client initiates TCP connection to cmu.edu on port 80
  • 2. Web server at cmu.edu waiting for connection on port 80 “accepts” connection, responds to sender
  • 3. Client sends HTTP request message (containing URL /index.html) into connection socket
  • 4. Server receives request message, fetches object, formats response message, sends message into connection

SLIDE 32

Nonpersistent HTTP (2)

  • 5. Server closes connection
  • 6. HTTP client receives response message. Parses HTML file, discovering 10 referenced image files
  • 7. Repeat steps 1-5 for each of 10 image objects
SLIDE 33

Response time modeling

  • Round Trip Time (RTT): time to send a small packet from client to server and back
  • Calculation for HTTP response time
  • One RTT to initiate TCP connection
  • One RTT for HTTP request and first byte of response
  • file transmission time
  • response time = 2RTT + transmit time

[Figure: client/server timeline. Client initiates transport connection and server accepts (one RTT); client requests file and server sends file (one RTT plus transmit time); client receives file and closes; connection closed.]
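The response time formula above is easy to check numerically. In this sketch the function names and the sample numbers (100 ms RTT, a 1 Mbit file, a 10 Mbit/s link) are assumptions chosen for illustration; the model ignores TCP slow-start and server processing time.

```python
def transmit_time(size_bits: float, rate_bps: float) -> float:
    """Time to push the whole file onto the link, in seconds."""
    return size_bits / rate_bps


def response_time(rtt: float, size_bits: float, rate_bps: float) -> float:
    """One RTT for TCP setup, one RTT for the request, plus transmission."""
    return 2 * rtt + transmit_time(size_bits, rate_bps)


# 2 * 0.1 s of round trips plus 0.1 s of transmission.
example = response_time(rtt=0.1, size_bits=1_000_000, rate_bps=10_000_000)
```

With these numbers two thirds of the response time is round-trip overhead, which is why short transfers are dominated by RTT rather than bandwidth.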

SLIDE 34

Problems

  • A separate transport connection is established to fetch each object
  • Requires at least 2 RTTs per object
  • High overhead in terms of packets in the network

  • Long user-perceived latency

SLIDE 35

Problems (2)

  • Transport protocol (TCP) is optimized for large data transfers. Pays extra startup time to avoid congestion (slow-start, windowing, etc)
  • HTTP request for small objects never gets past initial phase
  • Connection closed before window size can be increased significantly

  • Available bandwidth never fully used
  • Details in Transport lecture

SLIDE 36

Parallel connections?

  • Browser opens several connections in parallel, and downloads embedded images separately but simultaneously

  • Early Netscape browser, circa 1994
  • Pros
  • User feels webpage is loading faster
  • Cons
  • Do not solve the TCP overhead and slow-start problems
  • Impose considerable load on network – congestion
  • Server juggles more TCP connections
  • Actually reduces effective throughput

SLIDE 37

User behavior: Aborted requests

  • A page is not what we wanted (or we are just bored), so we click “Back” button
  • Similar to TV channel surfing
  • If browser is using parallel connections, ...
  • Already started to download all embedded objects!
  • Connections must be aborted
  • But, the cost of establishing them has already been paid, and thus is wasted

SLIDE 38

Persistent HTTP

1. Reuse existing transport connection

  • Server leaves connection open after sending response
  • Subsequent HTTP messages between same client/server sent over open connection

2. Pipelining at application protocol level

SLIDE 39

To Pipeline or not ...

  • Persistent without pipelining:
  • client issues new request only upon receipt of previous response
  • one RTT for each referenced object ...
  • plus one setup/close overhead

[Figure: client/server timeline. Client initiates transport connection; server accepts connection; client requests object; server sends object; client receives object and requests next object; server sends object.]

SLIDE 40

To Pipeline or not ...

  • Persistent with pipelining:
  • client issues new request as soon as it encounters a referenced object
  • default in HTTP/1.1
  • server sends objects in order
  • as little as one RTT for all referenced objects

[Figure: client/server timeline. Client initiates transport connection; server accepts connection; client requests object, server sends object, client receives object; client then requests two objects back to back, server sends both, client receives both.]
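The connection strategies can be compared with a rough page-load model. Everything here is a simplifying assumption rather than something stated in the slides: all objects have the same transmit time `t_obj`, the base HTML page is fetched first over a fresh connection, parallel connections share the link (so total transmit time is unchanged), and TCP slow-start and teardown are ignored. The function names are hypothetical.

```python
import math


def nonpersistent(rtt: float, t_obj: float, n_objects: int, parallel: int = 1) -> float:
    """Base page, then objects in batches of `parallel` fresh connections."""
    base_page = 2 * rtt + t_obj
    rounds = math.ceil(n_objects / parallel)     # each batch pays 2 RTTs
    return base_page + rounds * 2 * rtt + n_objects * t_obj


def persistent(rtt: float, t_obj: float, n_objects: int, pipelined: bool = False) -> float:
    """One connection for everything; optionally pipeline the object requests."""
    base_page = 2 * rtt + t_obj                  # setup RTT + request RTT
    if pipelined:
        # All requests go out together: one RTT, then back-to-back objects.
        return base_page + rtt + n_objects * t_obj
    # One RTT per object, each request waiting for the previous response.
    return base_page + n_objects * (rtt + t_obj)


# Page with 10 images, RTT = 100 ms, 10 ms transmit time per object.
serial = nonpersistent(0.1, 0.01, 10)                 # about 2.31 s
five_wide = nonpersistent(0.1, 0.01, 10, parallel=5)  # about 0.71 s
keep_alive = persistent(0.1, 0.01, 10)                # about 1.31 s
pipelined = persistent(0.1, 0.01, 10, pipelined=True) # about 0.41 s
```

Under these assumptions the pipelined persistent connection wins because it pays the round-trip cost once for all ten images instead of once or twice per image.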

SLIDE 41

Persistent HTTP: Advantages

  • Reduce transport-layer connection costs
  • Fewer setups and teardowns
  • CPU time saved in routers and hosts
  • Hosts save memory for transport state (buf, counts, ...)
  • Reduce latency by avoiding multiple TCP slow-starts
  • Do opening handshake once to establish connection
  • Do slow-start once to get to ideal sending rate
  • Avoid bandwidth wastage and reduce overall congestion
  • Fewer packets sent

SLIDE 42

traceroute

  • Application Layer
  • Web and HTTP
  • Message format
  • Persistent connections
  • Caching

SLIDE 43

Web Proxy Caching

  • Goal: satisfy client request without involving origin server; reduce latency and bandwidth requirements
  • client sends requests to cache
  • cache responds if it has a copy, otherwise uses HTTP to request a copy from the origin server

[Figure: clients send requests to a proxy server and receive responses; the proxy server exchanges HTTP requests and responses with origin servers.]
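The hit-or-forward behaviour of a proxy cache can be sketched in a few lines. This is a toy model only: the class name `ProxyCache`, the `fetch_from_origin` callable and the counter are assumptions for the example, and the sketch deliberately leaves out expiration, validation and eviction.

```python
from typing import Callable, Dict


class ProxyCache:
    """Answer from the local store when possible, else ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # stands in for an HTTP request
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0             # how often the origin was contacted

    def get(self, url: str) -> bytes:
        if url in self._store:               # cache hit: origin not involved
            return self._store[url]
        self.origin_requests += 1            # cache miss: forward to origin
        body = self._fetch(url)
        self._store[url] = body
        return body


cache = ProxyCache(lambda url: ("contents of " + url).encode())
first = cache.get("/index.html")             # miss, goes to origin
second = cache.get("/index.html")            # hit, served locally
```

The second request never reaches the origin server, which is where the latency and bandwidth savings come from.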

SLIDE 44

Consistency

  • HTTP ensures correctness of caching
  • Eliminate need to send requests to origin server
  • Specifies cacheability of responses, e.g. Can I cache this object?
  • Specifies expiration mechanisms, e.g. When does this object become stale?
  • Eliminate need to send full responses from origin server
  • Specifies validation mechanisms, e.g. Is this object fresh or stale?

SLIDE 45

Protocol is not Policy

  • Web cache policy is separate from protocol
  • Sample policy questions:
  • If cache is full, which object to evict?
  • Should we replace a stale object that is very popular with a fresh object that might not be requested often?

SLIDE 46

Expiration Model

  • Server-specified expiration
  • Uses Expires header or max-age directive in Cache-Control header

  • Recommended
  • Heuristic expiration
  • Server does not specify explicit expiration times
  • Up to the web cache implementation
  • Freshness calculation
  • Is a cache entry fresh?
  • Age and Expiration calculations
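A freshness check based on server-specified expiration might look like the sketch below. It is simplified on purpose: age is approximated as time since the cache stored the response (the real age calculation also accounts for the Age header and transit delays), there is no heuristic expiration, and the function names and lower-cased header dictionary are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str]) -> Optional[timedelta]:
    """max-age takes priority; otherwise fall back to Expires minus Date."""
    for directive in headers.get("cache-control", "").split(","):
        name, _eq, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    if "expires" in headers and "date" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        return expires - parsedate_to_datetime(headers["date"])
    return None                              # server gave no explicit expiration


def is_fresh(headers: Dict[str, str], stored_at: datetime, now: datetime) -> bool:
    """Fresh while the (simplified) age is below the freshness lifetime."""
    lifetime = freshness_lifetime(headers)
    if lifetime is None:
        return False                         # assumption: treat as stale, so it gets validated
    return (now - stored_at) < lifetime


stored = datetime(2018, 9, 3, 12, 0, 0, tzinfo=timezone.utc)
entry = {"cache-control": "public, max-age=60"}
fresh_after_30s = is_fresh(entry, stored, stored + timedelta(seconds=30))
fresh_after_90s = is_fresh(entry, stored, stored + timedelta(seconds=90))
```

An entry that fails this check is not discarded; it becomes a candidate for validation with the origin server.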

SLIDE 47

Validation Model

  • If response is not fresh, need to validate with server
  • don’t send object if cache still has up-to-date cached version
  • cache: specify date of cached copy in HTTP request
  • If-modified-since: <date>
  • server: response contains no object if cached copy is up-to-date:
  • HTTP/1.0 304 Not Modified

[Figure: cache/server timeline. Object not modified: the cache sends an HTTP request msg with If-modified-since: <date>, and the server answers with HTTP response HTTP/1.0 304 Not Modified. Object modified: the cache sends an HTTP request msg with If-modified-since: <date>, and the server answers with HTTP response HTTP/1.0 200 OK <data>.]
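The server side of the validation exchange can be sketched as follows. The function name `respond`, the lower-cased header dictionary and the sample dates are assumptions for illustration; only the If-Modified-Since validator is handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Tuple


def respond(request_headers: Dict[str, str],
            last_modified: datetime,
            body: bytes) -> Tuple[int, bytes]:
    """Return (status code, message body) for a possibly conditional GET."""
    since = request_headers.get("if-modified-since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, b""                      # cached copy is up to date: no object
    return 200, body                         # modified (or unconditional): send object


modified_at = datetime(2016, 6, 22, 0, 0, 0, tzinfo=timezone.utc)
unchanged = respond({"if-modified-since": "Wed, 22 Jun 2016 00:00:00 GMT"},
                    modified_at, b"<html>...</html>")
changed = respond({"if-modified-since": "Tue, 21 Jun 2016 00:00:00 GMT"},
                  modified_at, b"<html>...</html>")
```

The 304 response carries headers only, so validation costs a round trip but saves the transmit time of the object.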

SLIDE 48

Question

  • Is a web proxy (cache) better for performance than the browser cache?

SLIDE 49

Lesson Objectives

  • Now, you should be able to:
  • describe the mission, scope, addressing mechanism and data types of the Application Layer
  • explain the HTTP protocol, including message format, interaction model and connection management
  • calculate response time for an HTTP request over nonpersistent, parallel or persistent connections, including the pipelined variant
  • describe how web proxies work to cache HTTP responses, including how they ensure consistency
