Embellishments; Forms and CGI programs - PDF document

  1. (World Wide) Web
     • a way to connect computers that provide information (servers) with computers that ask for it (clients like you and me)
       – uses the Internet, but it's not the same as the Internet
     • URL (uniform resource locator, e.g., http://www.amazon.com)
       – a way to specify what information to find, and where
     • HTTP (hypertext transfer protocol)
       – a way to request specific information from a server and get it back
     • HTML (hypertext markup language)
       – a language for describing information for display
     • browser (Firefox, Safari, Internet Explorer, Opera, Chrome, …)
       – a program for making requests, and displaying results
     • embellishments
       – pictures, sounds, movies, ...
       – loadable software
     • the set of everything this provides

     Web history
     • 1989: Tim Berners-Lee at CERN
       – a way to make physics literature and research results accessible on the Internet
     • 1991: first software distributions
     • Feb 1993: Mosaic browser
       – Marc Andreessen at NCSA (Univ of Illinois)
     • Mar 1994: Netscape
       – first commercial browser
     • technical evolution managed by World Wide Web Consortium
       – non-profit organization at MIT, Berners-Lee is director
       – official definition of HTML and other web specifications
       – see www.w3.org

     HTTP: Hypertext transfer protocol
     • What happens when you click on a URL?
     • client opens TCP/IP connection to host, sends request
         GET /filename HTTP/1.0
       – client → server: GET url
       – server → client: HTML
     • server returns
       – header info
       – HTML
     • since server returns the text, it can be created as needed
       – can contain encoded material of many different types (MIME)
     • URL format
         service://hostname/filename?other_stuff
     • filename?other_stuff part can encode
       – data values from client (forms)
       – request to run a program on server (cgi-bin)
       – anything else
       e.g. http://www.google.com/search?q=mime&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

     Some detail on the HTTP protocol
     • Request:
       – Request line: method object protocol
       – Headers: many options, most optional
       – empty line
       – message body (optional)
     • Example methods
       – GET: retrieval
       – POST: submitting data to be processed (in body)
     • Mandatory header (HTTP/1.1)
       – Host: the host the request is being sent to

     Example from Wikipedia entry for HTTP:
     • Request:
         GET /index.html HTTP/1.1
         Host: www.example.com
     • Response:
         HTTP/1.1 200 OK
         Date: Mon, 23 May 2005 22:38:34 GMT
         Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
         Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
         Etag: "3f80f-1b6-3e1cb03b"
         Accept-Ranges: bytes
         Content-Length: 438
         Connection: close
         Content-Type: text/html; charset=UTF-8

         text of page

     HTTP protocol: continuing some details
     • Response:
       – status line: protocol status
       – Date:
       – Server: software information
       – Last-Modified:
       – Etag: used to determine whether the cached version and the current version are identical
       – Accept-Ranges:
       – Content-Length:
       – Connection: close
       – Content-Type: Internet media type
       – text of requested object
     (A sample of header fields shown in blue)
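
The URL format described above (service://hostname/filename?other_stuff) can be taken apart with Python's standard `urllib.parse` module. This is only a small sketch: it uses the Google example URL from the slide with the stray spaces removed, and the variable names `service`, `hostname`, `filename` and `other_stuff` are chosen here to match the slide's wording rather than any official terminology.

```python
from urllib.parse import urlsplit, parse_qs

# The example URL from the slide, with the stray spaces removed.
url = ("http://www.google.com/search?q=mime&ie=utf-8&oe=utf-8&aq=t"
       "&rls=org.mozilla:en-US:official&client=firefox-a")

parts = urlsplit(url)
service = parts.scheme               # the "service" part, here http
hostname = parts.netloc              # which server to contact
filename = parts.path                # what to ask that server for
other_stuff = parse_qs(parts.query)  # dict: field name -> list of values

print(service, hostname, filename)
for name, values in other_stuff.items():
    print(f"{name} = {values}")
```

Each field maps to a list because the same name may appear more than once in the part after the question mark.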

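The request and response layout on the slides (a first line, header lines, an empty line, then an optional body) can be sketched in a few lines of Python without touching the network. The helper names `build_request` and `parse_response` and the shortened sample response are assumptions made for this illustration; a real client would normally use a library such as `http.client` instead of handling the text by hand.

```python
def build_request(method, path, host, version="HTTP/1.1"):
    """Return the text of a body-less request: request line, Host header, empty line."""
    lines = [f"{method} {path} {version}", f"Host: {host}", "", ""]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")      # the empty line ends the headers
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


request = build_request("GET", "/index.html", "www.example.com")

# Hypothetical, shortened response in the same shape as the Wikipedia example.
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 23 May 2005 22:38:34 GMT\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Length: 13\r\n"
    "Connection: close\r\n"
    "\r\n"
    "<p>hello</p>\n"
)

status, reason, headers, body = parse_response(sample_response)
print(repr(request))
print(status, reason, headers["content-type"])
```

Header names are lower-cased in the dictionary because HTTP header names are case-insensitive.
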
  2. Embellishments
     • original design of HTTP just returns text to be displayed
     • now includes pictures, sound, video, ...
       – need helpers or plug-ins to display non-text content, e.g., GIF, JPEG graphics; sound; movies
     • forms filled in by user
       – need a program on the server to interpret the information (cgi-bin)
     • HTTP is stateless
       – server doesn't remember anything from one request to next
       – need a way to remember information on the client: cookies
     • active content: download code to run on the client
       – Javascript and other interpreters
       – Java applets
       – plug-ins
       – ActiveX

     Forms and CGI programs
     • "common gateway interface"
       – standard way to request the server to run a program
       – using information provided by the client via a form
     • if the target file on server is an executable program
       – e.g., in /cgi-bin directory
     • or if it has the right kind of name
       – e.g., something.cgi
     • run it on the server to produce HTML to send back to client
       – using the contents of the form as input
       – output depends on client request: created on the fly, not just a file
     • CGI programs can be written in any programming language
       – often Perl, PHP, Java

     Example CGI program in Perl (mailform.cgi modified)
         #!/usr/local/bin/perl -w
         use CGI;
         my $query = new CGI;
         print $query->header;                        # Content-Type header plus empty line
         print $query->start_html(-title=>'Form results');
         print "<h1> Form results </h1>\n";
         my $urcomp = $query->remote_host();          # client host name
         my $urIP = $query->remote_addr();            # client IP address
         print "<P> Your computer is $urcomp\n";
         print "<P> Your IP address is $urIP\n";
         print "<P>\n";
         foreach my $name ($query->param) {           # every field in the form
           print "<br> $name:";
           foreach my $value ($query->param($name)) { print " $value"; }
           print "\n";
         }
         print $query->end_html;                      # close the HTML document

     Web pages: Information passed and actions initiated
     • HTTP requests identify host and address:
       – my $urcomp = $query->remote_host();
       – my $urIP = $query->remote_addr();
     • Initiate actions with Javascript
       – onmouseover etc
     • Links with "extra"
       – Google ads

     Cookies
     • HTTP is stateless: doesn't remember from one request to next
     • cookies intended to deal with stateless nature of HTTP
       – remember preferences, manage "shopping cart", etc.
     • cookie: one line of text sent by server to be stored on client
       – stored in browser while it is running (transient)
       – stored in client file system when browser terminates (persistent)
     • when client reconnects to same domain, browser sends the cookie back to the server
       – sent back verbatim; nothing added
       – sent back only to the same domain that sent it originally
       – contains no information that didn't originate with the server
     • in principle, pretty benign
     • but heavily used to monitor browsing habits, for commercial purposes

     Cookie crumbs
     • get a page from xyz.com
       – it contains <img src=http://doubleclick.com/advt.gif>
       – this causes an image to be fetched from DoubleClick.com
       – which now knows your IP address and what page you were looking at
     • DoubleClick sends back a suitable advertisement
       – with a cookie that identifies "you" at DoubleClick
     • next time you get any page that contains a doubleclick.com image
       – the last DoubleClick cookie is sent back to DoubleClick
       – the set of sites and images that you are viewing is used to
         - update the record of where you have been and what you have looked at
         - send back targeted advertising (and a new cookie)
     • this does not necessarily identify you personally so far
     • but if you ever provide personal identification, it can be (and will be) attached
     • defenses:
       – turn off all cookies; turn off "third-party" cookies
       – don't reveal information
       – clean up cookies regularly
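
The CGI idea above (run a program on the server, feed it the contents of the form, send back HTML created on the fly) can be sketched in Python as well. This is not a translation of mailform.cgi: the function name `form_results` and the sample values are invented for the illustration. A real CGI program would read the form data and client address from the `QUERY_STRING` and `REMOTE_ADDR` environment variables that the web server sets, whereas the sketch passes them in as arguments so it can run anywhere.

```python
import html
from urllib.parse import parse_qs


def form_results(query_string, remote_addr):
    """Build CGI-style output: a header, an empty line, then HTML."""
    fields = parse_qs(query_string, keep_blank_values=True)
    out = ["Content-Type: text/html", ""]     # the empty line ends the headers
    out.append("<h1>Form results</h1>")
    out.append(f"<p>Your IP address is {html.escape(remote_addr)}</p>")
    for name, values in fields.items():
        # Escape everything the client sent before echoing it back as HTML.
        joined = " ".join(html.escape(v) for v in values)
        out.append(f"<br>{html.escape(name)}: {joined}")
    return "\n".join(out)


# Hypothetical form submission and client address, for illustration only.
page = form_results("name=Ada&lang=perl&lang=php", "192.0.2.1")
print(page)
```

Escaping the submitted values matters because the output depends entirely on what the client chose to send.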

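The cookie rules on the slides (one line of text stored on the client, sent back verbatim, and only to the domain that set it) and the DoubleClick scenario can be imitated with a toy model. Everything here is a simplified assumption: `TinyBrowser`, `fetch_ad`, the id `abc123` and the page names are made up, and a real browser applies many more rules (paths, expiry, third-party blocking) than this sketch does. Only `http.cookies.SimpleCookie` comes from the Python standard library.

```python
from http.cookies import SimpleCookie


class TinyBrowser:
    """Toy cookie store: one dict of cookies per domain."""

    def __init__(self):
        self.jar = {}

    def receive(self, domain, set_cookie_header):
        """Store the cookie a server at `domain` sent us."""
        cookie = SimpleCookie()
        cookie.load(set_cookie_header)
        store = self.jar.setdefault(domain, {})
        for name, morsel in cookie.items():
            store[name] = morsel.value

    def cookie_header_for(self, domain):
        """Return what is sent back: verbatim, and only for this domain."""
        store = self.jar.get(domain, {})
        return "; ".join(f"{n}={v}" for n, v in store.items())


tracker_log = {}   # cookie text -> pages on which the ad image was requested


def fetch_ad(browser, page):
    """Simulate the image request that `page` triggers to doubleclick.com."""
    sent = browser.cookie_header_for("doubleclick.com")
    if not sent:   # first contact: the tracker hands out an id (hypothetical value)
        browser.receive("doubleclick.com", "id=abc123; Path=/")
        sent = browser.cookie_header_for("doubleclick.com")
    tracker_log.setdefault(sent, []).append(page)


browser = TinyBrowser()
fetch_ad(browser, "xyz.com/index.html")
fetch_ad(browser, "other.example/news.html")
print(tracker_log)
print(repr(browser.cookie_header_for("xyz.com")))
```

The second print shows an empty string: xyz.com never set a cookie, so nothing is sent to it, while the tracker sees the same id from both pages.
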