Internet and Web Based Technology
http://144.16.192.60/~isg/IWT/
Internet & Web Based Technology 2
About the Course
- I will be covering half the course (2 hours / week)
– Tuesday 9:30 AM – 11:25 AM
- Topics to be covered
– How Internet works, HTML, HTTP, CGI scripts, PERL, etc. – Basic concepts of cryptology – Network security protocols, firewall, NAT, etc.
- Details would be available on the web site.
- What to expect?
– Self-study materials will be prescribed throughout the course, and questions will be set from them.
– Assignments:
- In groups of two, students will be assigned a Term Paper and a Programming Assignment.
- Attendance is mandatory
– If the cumulative attendance of a student falls below 75%, it will lead to immediate deregistration. – Proxy attendance, if detected, will lead to deregistration and subsequent disciplinary actions.
- Other requirements
– Satisfactory completion of the Term Paper and Assignment is essential, failing which a student will get an “F” grade.
– Term Paper:
- A topic will be assigned to a group of two students. A comprehensive study report of 25-30 pages (11 point, 1.5 spacing) will have to be submitted.
– Programming Assignment:
- A non-trivial programming problem will be given to a group. It will have to be implemented and demonstrated.
- Typical example:
– “Design and implement a web based email client that supports attachments”.
Introduction
Internetworking: Basic Concepts
- Computer Network
– a communication system for connecting end-systems (hosts)
- Local Area Network (LAN)
– connects hosts within a relatively small geographical area – room, building, campus
- Wide Area Network (WAN)
– hosts may be widely dispersed – across buildings, cities, countries
What is Internet?
- The network formed by the co-operative interconnection of a large number of computer networks.
– Network of Networks.
– No one owns the Internet.
- Every person who makes a connection owns a slice of the Internet.
– There is no central administration to the Internet.
[Figure: several networks interconnected through a backbone]
What is it actually?
- A community of people who use and develop the
networks.
- A collection of resources that can be reached from
those networks.
- A setup to facilitate collaboration among members of the research and educational communities, world-wide.
- The connected networks use the TCP/IP protocol.
Growth of Internet
How Data Flows?
- Packet Switching
– Internet uses TCP/IP protocol. – TCP/IP uses packet switching.
- A message is broken down into smaller packets.
– A packet is a self-contained bundle of data sent over the network.
- Generally less than 1500 bytes long.
– Each packet contains
- Address of origin
- Address of destination
Packet
[Figure: a message split into packets, each consisting of a header and data]
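The splitting of a message into packets can be sketched in Python as follows. This is only an illustration, not how TCP/IP is actually implemented: the `Packet` class, the `packetize` and `reassemble` helpers, and the `sequence` field used to restore the order are assumptions introduced here, and the 1500-byte limit is taken from the figure quoted above.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 1500  # illustrative upper bound on the data bytes in one packet


@dataclass
class Packet:
    source: str       # address of origin (part of the header)
    destination: str  # address of destination (part of the header)
    sequence: int     # position within the message (assumption, used for reassembly)
    data: bytes       # the slice of the message carried by this packet


def packetize(message: bytes, source: str, destination: str,
              max_payload: int = MAX_PAYLOAD) -> list[Packet]:
    """Break a message into packets carrying at most max_payload bytes each."""
    return [
        Packet(source, destination, seq, message[start:start + max_payload])
        for seq, start in enumerate(range(0, len(message), max_payload))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the message, even if the packets arrived out of order."""
    return b"".join(p.data for p in sorted(packets, key=lambda p: p.sequence))
```

Because every packet carries its own addresses, each one can be routed independently of the others.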
World Wide Web (WWW)
- WWW is an Internet “organizer”.
– Developed by Tim Berners-Lee at CERN, beginning in 1989.
– Internet browsers (Mosaic, Netscape, Internet Explorer, etc.) were developed to make use of the WWW easier.
- Based on client-server technology.
– The server is a computer (hardware and software) providing access to the data.
– The client is the software that allows users to access the data.
The Inside Story
- Interconnected web of documents.
– Billions of them around.
- Where do the documents reside?
– On web (http) servers.
– http stands for Hyper Text Transfer Protocol.
- They are written in
– html, typically.
– html stands for Hyper Text Markup Language.
- Documents get formatted/displayed using
– Web browsers (Netscape, Mosaic, Explorer).
– WWW clients.
Illustration
[Figure: a web client sends http requests to web servers and receives http responses]
Topics for Self-study
- Hyper Text Markup Language
– http://www.w3schools.com/html/default.asp
- Hyper Text Transfer Protocol
– http://www.comptechdoc.org/independent/web/http/reference/
– http://www.jmarshall.com/easy/http/
HTML Forms
Introduction
- Provides two-way communication between web
servers and browsers.
– Needed by most of the emerging applications.
– Provides dynamic content.
[Figure: two-way communication between the browser and the web server]
What is a HTML FORM?
- A form basically contains boxes and buttons.
– Real-life examples:
- Search engines
- On-line purchase of items
- Registration
– The form allows a user to fill up the blank entries and send it back to the owner of the page.
- Called SUBMITTING the form.
FORM Example
FORM Tags and Attributes
- Several tags are used in connection with forms:
<form> …… </form>
<input>
<textarea> …… </textarea>
<select> …… </select>
<FORM> …… </FORM>
- This tag is used to bracket a HTML form.
– Includes attributes which specify where and how to deliver filled-up information to the web server.
- Two main attributes:
– METHOD – ACTION
- METHOD:
– Indicates how the information in the form will be sent to the web server when the form is submitted.
– Two possible values:
- POST: sends the form’s field names and values in the body of the HTTP request.
- GET: concatenates all field names and values into a single string, which is appended to the URL.
– POST is the preferred method, because most systems limit the length of the URL string.
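To make the difference between the two METHOD values concrete, the Python sketch below shows where the encoded (name, value) pairs end up in each case. The helper names `build_get_url` and `build_post_body` and the sample field values are illustrative assumptions; `urllib.parse.urlencode` is the standard-library call that performs the encoding.

```python
from urllib.parse import urlencode


def build_get_url(action: str, fields: dict) -> str:
    """GET: the encoded fields become a query string appended to the URL."""
    return action + "?" + urlencode(fields)


def build_post_body(fields: dict) -> bytes:
    """POST: the same encoded string travels in the request body instead."""
    return urlencode(fields).encode("ascii")


# Hypothetical field values, as a browser might collect them from a form.
fields = {"name": "A B", "rollno": "7312"}
print(build_get_url("/cgi-bin/myprog.pl", fields))
print(build_post_body(fields))
```

Either way the receiving program gets the same name=value pairs joined by "&"; only the place where they travel differs.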
- ACTION:
– Specifies the URL of a program on the origin server that will receive the form’s inputs.
– Traditionally, this is a Common Gateway Interface (CGI) program.
- Details of CGI to be discussed later.
– The specified program is executed on the server, when the form is submitted.
- Output sent back to the browser.
- Typical usage:
<FORM METHOD="POST" ACTION="cgi-bin/myprog.pl">
  ……..
  ……..
</FORM>
<INPUT>
- This tag defines a basic form element.
- Several attributes are possible:
– TYPE – NAME – SIZE – MAXLENGTH – VALUE – SRC – ALIGN
- TYPE:
– Defines the kind of element that is to be displayed in the form.
- “TEXT” – defines a text box, which provides a single
line area for entering text.
- “RADIO” – radio button, used when a choice must be
made among several alternatives (clicking on one of the buttons turns off all others in the same group).
- “CHECKBOX” – similar to the radio buttons, but each box here can be selected independently of the others.
- “PASSWORD” – similar to text box, but characters are not
shown as they are typed.
- “HIDDEN” – not displayed in the form and cannot be modified by the user (mainly used to carry choices that have already been made earlier).
- “IMAGE” – used for active maps. When the user clicks on
the image, the (x,y) co-ordinates are stored in variables, and are returned for further processing.
- “SUBMIT” – creates a box labeled Submit; if clicked, the
form data are passed on to the designated CGI script.
- “RESET” – creates a box labeled Reset; if clicked, clears a
form’s contents.
- NAME:
– Specifies a name for the input field. – The input-handling program (CGI) in reality receives a number of (name,value) pairs.
- SIZE:
– Defines the number of characters that can be displayed in a TEXT box without scrolling.
- MAXLENGTH:
– Defines the maximum number of characters a TEXT box can contain.
- VALUE:
– Used to supply a default value for a TEXT or HIDDEN field.
– Can also be used for specifying the label of a button (renaming “Submit”, for example).
- SRC:
– Provides a pointer to an image file. – Used for clickable maps.
- ALIGN:
– Used for aligning image types. ALIGN = TOP | MIDDLE | BOTTOM
<TEXTAREA> … </TEXTAREA>
- Can be used to accommodate multiple text lines in a
box.
- Attributes are:
– NAME: name of the field. – ROWS: number of lines of text that can fit into the box. – COLS: width of the text area on the screen.
<SELECT> …. </SELECT>
- Used along with the tag <OPTION>.
- Used to define a selectable list of elements.
– The list appears as a scrollable menu or a pop-up menu (depends on browser).
- Attributes are:
– NAME: name of the field.
– SIZE: specifies the number of option elements that will be displayed at a time on the menu. (If the actual number exceeds SIZE, a scrollbar will appear.)
– MULTIPLE: specifies that multiple selections from the list can be made.
<FORM ………….>
  ……..
  Languages known:
  <SELECT NAME="lang" SIZE=3 MULTIPLE>
    <OPTION> English
    <OPTION> Hindi
    <OPTION> French
    <OPTION> Hebrew
  </SELECT>
</FORM>
Example 1
<HTML>
<HEAD>
  <TITLE> Using HTML Forms </TITLE>
</HEAD>
<BODY TEXT="#FFFFFF" BGCOLOR="#0000FF" LINK="#FF9900" VLINK="#FF9900" ALINK="#FF9900">
  <CENTER><H3> Student Registration Form </H3></CENTER>
  Please fill up the following form about the courses you will register for this Semester.
  <FORM METHOD="POST" ACTION="/cgi/feedback">
    <P> Name: <INPUT NAME="name" TYPE="TEXT" SIZE="30" MAXLENGTH="50">
    <P> Roll Number: <INPUT NAME="rollno" TYPE="TEXT" SIZE="7">
    <P> Course Numbers:
      <INPUT NAME="course1" TYPE="TEXT" SIZE="6">
      <INPUT NAME="course2" TYPE="TEXT" SIZE="6">
      <INPUT NAME="course3" TYPE="TEXT" SIZE="6">
    <P> <P> Press SUBMIT when done.
    <P> <INPUT TYPE="SUBMIT"> <INPUT TYPE="RESET">
  </FORM>
</BODY>
</HTML>
Example 2
<HTML>
<HEAD>
  <TITLE> Using HTML Forms </TITLE>
</HEAD>
<BODY TEXT="#FFFFFF" BGCOLOR="#0000FF" LINK="#FF9900" VLINK="#FF9900" ALINK="#FF9900">
  <CENTER> <H3> Student Registration Form </H3> </CENTER>
  Please fill up the form below and press DONE when done.
  <FORM METHOD="POST" ACTION="/cgi/feedback">
    <P> Name: <INPUT NAME="name" TYPE="TEXT" SIZE="30" MAXLENGTH="50">
    <P> Roll Number: <INPUT NAME="rollno" TYPE="TEXT" SIZE="7">
    <P> Course Numbers:
      <INPUT NAME="course1" TYPE="TEXT" SIZE="6">
      <INPUT NAME="course2" TYPE="TEXT" SIZE="6">
      <INPUT NAME="course3" TYPE="TEXT" SIZE="6">
    <!-- Each radio button needs its own VALUE; otherwise every choice is submitted as "on". -->
    <P> Category:
      SC <INPUT NAME="cat" TYPE="RADIO" VALUE="SC">
      ST <INPUT NAME="cat" TYPE="RADIO" VALUE="ST">
      GE <INPUT NAME="cat" TYPE="RADIO" VALUE="GE">
    <P> Mother tongue:
      <SELECT NAME="mtongue" SIZE="3">
        <OPTION> Hindi
        <OPTION> Bengali
        <OPTION> Gujarati
        <OPTION> Tamil
        <OPTION> Oriya
        <OPTION> Assamese
      </SELECT>
    <P> <P> Thanks for the information.
    <P> <INPUT TYPE="SUBMIT" VALUE="DONE"> <INPUT TYPE="RESET" VALUE="CLEAR FORM">
  </FORM>
</BODY>
</HTML>
Example 3
<HTML>
<HEAD>
  <TITLE> Using HTML Forms </TITLE>
</HEAD>
<BODY TEXT="#FFFFFF" BGCOLOR="#0000FF" LINK="#FF9900" VLINK="#FF9900" ALINK="#FF9900">
  <CENTER> <H3> Student Feedback Form </H3> </CENTER>
  Please fill up the following form and press DONE when finished.
  <FORM METHOD="POST" ACTION="/cgi/feedback">
    <P> Name: <INPUT NAME="name" TYPE="TEXT" SIZE="30" MAXLENGTH="50">
    <P> Roll Number: <INPUT NAME="rollno" TYPE="TEXT" SIZE="7">
    <P> Password: <INPUT NAME="code" TYPE="PASSWORD" SIZE="10">
    <P> Course Numbers:
      <INPUT NAME="course1" TYPE="TEXT" SIZE="6">
      <INPUT NAME="course2" TYPE="TEXT" SIZE="6">
      <INPUT NAME="course3" TYPE="TEXT" SIZE="6">
    <!-- Radio buttons and same-named checkboxes need a VALUE to be told apart on the server. -->
    <P> Category:
      SC <INPUT NAME="cat" TYPE="RADIO" VALUE="SC">
      ST <INPUT NAME="cat" TYPE="RADIO" VALUE="ST">
      GE <INPUT NAME="cat" TYPE="RADIO" VALUE="GE">
    <P> Mother tongue:
      <SELECT NAME="mtongue" SIZE="3">
        <OPTION> Hindi
        <OPTION> Bengali
        <OPTION> Gujarati
        <OPTION> Tamil
        <OPTION> Assamese
        <OPTION> Oriya
      </SELECT>
    <P> Languages known:
      English <INPUT NAME="lang" TYPE="CHECKBOX" VALUE="English">
      Hindi <INPUT NAME="lang" TYPE="CHECKBOX" VALUE="Hindi">
    <P> Scholarship holder (select for yes): <INPUT NAME="schol" TYPE="CHECKBOX">
    <P> General feedback: <TEXTAREA NAME="feed" ROWS=3 COLS=20> </TEXTAREA>
    <P> <P> Thanks for the information.
    <P> <INPUT TYPE="SUBMIT" VALUE="DONE"> <INPUT TYPE="RESET" VALUE="CLEAR FORM">
  </FORM>
</BODY>
</HTML>
How to Submit a Form?
- Three different ways:
– Clicking on the Submit button.
– Clicking on an active map.
– Pressing <ENTER> in a TEXT box. (In a TEXTAREA, <ENTER> only starts a new line.)
The Basic Mechanism
[Figure: the browser submits the form from the original page to the cgi program on the server, which returns a new html page]
- Web page including form
– Resides on the web server in the regular folder where html files and other documents are kept.
- CGI script program handling form data
– Resides under a special folder on the web server (usually, “cgi-bin”).
– May be written in Perl, C, shell script, etc.
- Web page linked to the cgi script.
<FORM METHOD="POST" ACTION="cgi-bin/myprog.pl">
  ……..
  ……..
</FORM>
How to Write the CGI Program?
- Must know …
– How to access the form data.
- Mechanism depends on METHOD (GET or POST).
– How to return processed output back to the browser.
- HTML file created on the fly (typically).
- Details to be discussed later.
– Good idea to have a look at a typical Perl script.
Image Maps
Introduction
- An image map allows us to create links to different
URLs depending upon where we click on the image.
– Useful for creating links on maps, diagrams, fancy buttons, etc.
- There are two parts to an image map.
– The image. – The map file.
- The map file defines the areas of the image and the
URLs that correlate to different areas.
So basically …
- An image map is a single image that contains hot
spots.
– When we click on a hot spot, we go to a new location (URL). – Requires loading of only one image from the server.
- Thus requires fewer server calls.
- Is generally better looking.
Types of Image Maps
- Depending on the way they are configured and the
location where the processing is carried out, image maps can be classified as two types.
– Server side
- Traditional
– Client side
- More efficient; supported by all recent browsers.
Server Side Image Maps
Basic Functioning
- Three ingredients are required to incorporate an
image map into a HTML document.
a) Creating the image map with well-defined boundaries.
b) Creating an image map configuration file.
- Contains relative pixel co-ordinates marking the boundaries of the different clickable regions.
- Allowable geometries: circle, poly, point, rect.
c) Establishing appropriate HTML information in the page to link
- the map image,
- the map configuration file, and
- an (optional) CGI script which decodes the map co-ordinates and selects the corresponding URL.
Typical Usage
<HTML>
<BODY>
  ……..
  ……..
  <A HREF="cgi-bin/map/menu.map">
    <IMG SRC="IMAGES/imagemap.gif" ISMAP>
  </A>
  ……..
  ……..
</BODY>
</HTML>
- The URL that is sent to the image map program or web server when a user clicks the map resembles the following:
http://myserver.com/menu.map?x,y
where x and y are integers denoting the pixel co-ordinates of the point of click.
Image Map Configuration File
- There are several different formats, all similar, and
varying slightly in syntax.
a) NCSA httpd server
b) APACHE httpd server
c) CERN httpd server
d) W3C httpd server
Example: APACHE server
- A sample configuration file looks like:
# An example
default http://www.myserver.edu
base_url http://www.iitkgp.ac.in/demo
circle circle.html 45,45,80,45
rect rectangle.html 20,10,178,70
point point.html 100,50
poly polygon.html 200,60,295,60,275,10
- Defining the default
– Typically, the first line in the map file is a default line.
– Defines the URL to which users will be taken if they click on an undefined area of the image.
- Defining circles
– A circle is defined by two co-ordinates.
– The first co-ordinate is the centre point.
– The second co-ordinate is any point on the circumference.
- Defining rectangles
– A rectangle is defined by two co-ordinates.
– The first co-ordinate refers to the upper left corner.
– The second co-ordinate refers to the bottom right corner.
- Defining points
– Defined by a single co-ordinate.
– Clicks closest to that point on the image map will take the user to the specified URL.
- Defining polygons
– A polygon is defined by a series of co-ordinates that outline the area to be defined.
– We can start from any vertex of the polygon.
– Maximum number of vertices is 100.
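How a server-side map program might turn a click at (x, y) into a URL can be sketched in Python using the shape definitions above. This is a simplified illustration, not the code of any real httpd: the function names are made up for this sketch, regions are assumed to be tried in file order with the first match winning, and "point" regions are left out.

```python
import math


def in_circle(x, y, cx, cy, px, py):
    """Circle given by its centre (cx, cy) and any point (px, py) on the circumference."""
    return math.hypot(x - cx, y - cy) <= math.hypot(px - cx, py - cy)


def in_rect(x, y, left, top, right, bottom):
    """Rectangle given by its upper-left and bottom-right corners."""
    return left <= x <= right and top <= y <= bottom


def in_poly(x, y, coords):
    """Ray casting: inside if a ray going right crosses an odd number of edges."""
    vertices = list(zip(coords[0::2], coords[1::2]))
    inside = False
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        if (y1 > y) != (y2 > y):  # this edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def resolve_click(x, y, regions, default):
    """Return the URL of the first region containing (x, y), else the default URL."""
    for shape, url, coords in regions:
        if shape == "circle" and in_circle(x, y, *coords):
            return url
        if shape == "rect" and in_rect(x, y, *coords):
            return url
        if shape == "poly" and in_poly(x, y, coords):
            return url
    return default


# Regions taken from the sample APACHE configuration file.
REGIONS = [
    ("circle", "circle.html", (45, 45, 80, 45)),
    ("rect", "rectangle.html", (20, 10, 178, 70)),
    ("poly", "polygon.html", (200, 60, 295, 60, 275, 10)),
]
```

A click that falls in no region returns the default URL, which is why server-side maps need a default line.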
Illustrative Example
An Important Point
- For each of the specified URLs, it is required to
specify the entire path.
- However, a common URL prefix can be specified by the base_url command.
base_url http://www.iitkgp.ac.in
circle circle.html 45,45,80,45
rect rectangle.html 20,10,178,70
Client Side Image Maps
Introduction
- In client-side image maps, the map information is
contained in the HTML document itself.
- Consists of three components:
– An ordinary image file (gif, jpeg, png) – A map delimited by <MAP> tags containing the co-ordinate and URL information for each region. – The USEMAP attribute within the <IMG> tag that indicates which map to reference.
Advantages
- They are self-contained within the HTML document.
- No dependence on the server to handle every
client’s request for image mapping.
- Faster processing; improves response time.
- No longer required to specify a default URL.
– Clicking outside hyperlinked area will take a user nowhere.
- Complete URL information displays in the status bar
when the mouse moves over the hot spots.
– In contrast, server-side image maps show only co-ordinates.
Disadvantage
- The only disadvantage is that they are not
universally supported.
– Netscape Navigator 1.0 and Internet Explorer 2.0 do not support client-side image maps.
Sample Client-side Image Map
<MAP NAME="demo_map">
  <AREA SHAPE=CIRCLE COORDS="45,45,20" HREF="circle.html" ALT="Circle">
  <AREA SHAPE=RECT COORDS="20,20,80,80" HREF="rectangle.html" ALT="Rectangle">
  <AREA SHAPE=POLY COORDS="10,10,50,50,70,100" HREF="polygon.html" ALT="Triangle">
</MAP>
- Some points:
– POINT is not supported.
– CIRCLE is specified by the centre co-ordinates, followed by its radius.
– Comments can be included as in HTML, using <!-- ……….. -->
Linking to an Image
- This can be done using the USEMAP attribute of the <IMG> tag.
<IMG SRC="mymap.gif" USEMAP="#demo_map">
– References the image “mymap.gif”.
– Searches for the <MAP> element with the NAME attribute of “demo_map”.
A Complete Example
<HTML>
<HEAD><TITLE> Client Side Image map </TITLE></HEAD>
<BODY>
  <MAP NAME="demo_map">
    <AREA SHAPE=CIRCLE COORDS="45,45,20" HREF="circle.html" ALT="Circle">
    <AREA SHAPE=RECT COORDS="20,20,80,80" HREF="rectangle.html" ALT="Rectangle">
    <AREA SHAPE=POLY COORDS="10,10,50,50,70,100" HREF="polygon.html" ALT="Triangle">
  </MAP>
  <IMG SRC="mymap.gif" USEMAP="#demo_map">
</BODY>
</HTML>
Combining the Two
- Motivation for combining client and server side
image map processing:
– Browsers ignore tags they do not understand. – Newer browsers will use client-side map. – Older browsers will use the server-side map.
- How to do this?
<A HREF="http://myserver.edu/cgi-bin/map/demo_map">
  <IMG SRC="mymap.gif" USEMAP="#demo_map" ISMAP>
</A>
- USEMAP will be ignored by older browsers.
- ISMAP will be considered redundant by browsers
supporting client-side map.
Creating Image Maps
Available Tools
- There are several tools using which we can create
an image map.
- Some of the tools are:
– MapEdit – Macromedia Dreamweaver – Adobe GoLive
- Irrespective of the tool used, the steps required for
creation are more or less the same.
Creating the Map
- Typical steps:
– Open the image in the imagemap editor. – Define areas within the image that will be clickable: rectangle, circle or polygon. – Highlight an area, and enter the URL for that area. – Repeat the above steps for all the clickable areas of the image. – For server-side image maps, we also need to define a default URL. – Select the type (client or server side).
Hyper Text Transfer Protocol (HTTP)
What is HTTP?
- Hyper Text Transfer Protocol
– A protocol using which web clients (browsers) interact with web servers.
- It is a stateless protocol.
– The server retains no information about a client between requests.
– In HTTP/1.0, a fresh connection is made for every item to be downloaded.
- Transfers hypertext across the Internet.
– Hypertext is text with links to other text documents.
HTTP Protocol
- Web clients (browsers) and web servers communicate
via HTTP protocol.
- Basic steps:
– Client opens socket connection to the HTTP server.
- Typically over port 80.
– Client sends HTTP requests to server. – Server sends back response. – Server closes connection.
- HTTP is a stateless protocol.
Illustration
[Figure: a web client sends http requests to web servers and receives http responses]
HTTP Request Format
- A client request to a server consists of:
– Request method
– Path portion of the HTTP URL
– Version number of the HTTP protocol
– Optional request header information
– Blank line
– POST or PUT data, if present
HTTP Request Methods
- GET
– Most common HTTP method.
– Returns the contents of the specified document.
– Places any parameters in the request URL.
– Can also be used to submit forms:
- The form data is URL-encoded and appended to the GET command URL.
GET /cgi-bin/myscript.cgi?Roll=1234&Sex=M HTTP/1.0
Illustration of GET
– A very simple HTTP connection to a server.
telnet www.facweb.iitkgp.ac.in http
– Client sends request for a file:
GET /test.html HTTP/1.0
– The server sends back the response:
HTTP/1.1 200 OK
Date: Sun, 22 May 2005 09:51:42 GMT
Server: Apache/1.3.33 (Win32)
Last-Modified: Sun, 22 May 2005 09:51:10 GMT
Accept-Ranges: bytes
Content-Length: 119
Connection: close
Content-Type: text/html

<html>
<head> <title> A test page </title> </head>
<body> This is the body of the test page. </body>
</html>
HTTP Request Methods (contd.)
- HEAD
– Returns only the header information of the specified document. – Used by clients to determine the file size, modification date, server version, etc.
Illustration of HEAD
- Client sends
HEAD /index.html HTTP/1.0
- Server responds back with:
HTTP/1.1 200 OK
Date: Sun, 22 May 2005 10:08:37 GMT
Server: Apache/1.3.33 (Win32)
Last-Modified: Thu, 03 May 2001 11:30:38 GMT
Accept-Ranges: bytes
Content-Length: 1494
Connection: close
Content-Type: text/html
HTTP Request Methods (contd.)
- POST
– Used to send data to the server to be processed in some way, as in a CGI script. – Basic difference from GET:
- A block of data is sent along with the request.
- Extra headers like Content-Type and Content-Length
are used for this purpose.
- The requested object is not a resource to retrieve.
Rather, it is a script that can handle the data being sent.
- The server response is not a static file; but is generated
dynamically as the program output.
Illustration of POST
– A typical form submission using POST is illustrated below:
POST /cgi-bin/myscript.cgi HTTP/1.0
From: isg@hotmail.com
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 22

Roll=1234&Sex=M&Age=20
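The Content-Length header must equal the number of bytes in the data that follows the blank line. The Python sketch below builds a form submission and computes that length rather than writing it by hand; `build_form_post` is a hypothetical helper name, and the From and User-Agent headers of the illustration are omitted for brevity.

```python
from urllib.parse import urlencode


def build_form_post(path: str, fields: dict, version: str = "HTTP/1.0") -> bytes:
    """Build a POST request whose Content-Length matches the encoded form data."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} {version}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"   # counted from the actual body
        "\r\n"                               # blank line before the data
    )
    return head.encode("ascii") + body


request = build_form_post("/cgi-bin/myscript.cgi",
                          {"Roll": "1234", "Sex": "M", "Age": "20"})
print(request.decode("ascii"))
```

For the data Roll=1234&Sex=M&Age=20 the computed length is 22 bytes.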
HTTP Request Methods (contd.)
- PUT
– Replaces the contents of the specified document with data supplied along with the command. – Not used widely.
- DELETE:
– Deletes the specified document from the server. – Not used widely.
HTTP Request Headers
- After a HTTP request line, a client can send any
number of header fields.
– Usually optional – used to convey some information. – Some commonly used fields:
- Accept: MIME types client accepts, in order of
preference.
- Connection: connection options, close or Keep-Alive.
- Content-Length: number of bytes of data to follow.
- Content-Type: MIME type and subtype of the data that
follows.
- Pragma: “no-cache” option directs the server/proxy to
return a fresh document even though a cached copy may exist.
HTTP Request Data
- To be given if the request type is either PUT or
POST.
– Send the data immediately after the HTTP request header, and a blank line.
HTTP Response
- An initial response line.
– Also called the status line. – Consists of three parts separated by spaces
- The HTTP version
- A 3-digit response status code
- An English phrase describing the status code.
HTTP/1.0 200 OK
HTTP/1.0 404 Not Found
HTTP Response (contd.)
- Header information, followed by a blank line, and
then the data.
HTTP/1.1 200 OK
Date: Sun, 22 May 2005 09:51:42 GMT
Server: Apache/1.3.33 (Win32)
Last-Modified: Sun, 22 May 2005 09:51:10 GMT
Content-Length: 119
Connection: close
Content-Type: text/html

<html>
<head> <title> A test page </title> </head>
<body> This is the body of the test page. </body>
</html>
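Seen from the client side, a response is taken apart in the reverse order: status line, header fields, blank line, data. The Python sketch below is a minimal illustration of that parsing; `parse_response` is a made-up name, header names are lower-cased as a convenience, and details such as repeated or folded header lines are ignored.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into (version, code, reason, headers, data)."""
    head, _, data = raw.partition(b"\r\n\r\n")          # blank line ends the header
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)   # three parts of the status line
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")            # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, data
```

Splitting at the first colon only matters for fields such as Date, whose value itself contains colons.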
3-digit Status Code
- 1xx
– Indicates informational messages only.
- 2xx
– Indicates successful transaction.
- 3xx
– Redirects the client to another URL.
- 4xx
– Indicates client error, such as unauthorized request.
- 5xx
– Indicates a server error, such as an internal failure while handling the request.
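Since the class of a status code is given by its first digit, a client can classify any code with an integer division. A minimal Python sketch follows; the function name and the wording of the labels are illustrative choices.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a 3-digit status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")
```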
Common Status Codes
- 200 OK
- 301 Moved Permanently
- 302 Moved Temporarily (named “Found” in HTTP/1.1)
- 401 Unauthorized
- 403 Forbidden
- 404 Not Found
- 500 Internal Server Error
HTTP Response Headers
- Common response headers include:
– Content-Length
- Size of the data in bytes.
– Content-Type
- MIME type and subtype of data being sent.
– Date
- Current date.
– Expires
- Date at which document expires.
– Last-Modified
– Set-Cookie
- Name/value pair to be stored as cookie.
HTTP Response Data
- A blank line follows the response header, and the
data follows next.
– No upper limit on data size.
- HTTP/1.0
– Server typically closes connection after completing a transaction.
- HTTP/1.1
– Server keeps the connection open by default, across transactions.
HTTP version 1.1
- Current standard and widely used.
– Specified in RFC 2616 (June 1999), an IETF draft standard.
- Improvements over HTTP 1.0:
– Requires host identification.
- Allows multi-homed servers.
- More than one domain living on same server.
GET /index.html HTTP/1.1
Host: www.facweb.iitkgp.ac.in
<blank line>
HTTP version 1.1 (contd.)
– Default support for persistent connections.
- Multiple transactions over a single connection.
– Support for content negotiation.
- Decides on the best among the available
representations.
- Server-driven or browser-driven.
– Browsers can request part of document.
- Specify the bytes using Range header.
- Browser can ask for more than one range.
- Continue interrupted downloads.
Range: bytes=1200-3500
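What a server does with such a header can be sketched in Python. The sketch handles only the single "first-last" form shown above (not multiple ranges or open-ended ranges), and both function names are assumptions made for this illustration. Note that both ends of the range are inclusive.

```python
def parse_byte_range(header_value: str) -> tuple[int, int]:
    """Parse a value such as 'bytes=1200-3500' into (first, last)."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    first, _, last = spec.partition("-")
    return int(first), int(last)


def slice_for_range(document: bytes, header_value: str) -> bytes:
    """Return the requested part of the document; both ends are inclusive."""
    first, last = parse_byte_range(header_value)
    return document[first:last + 1]
```

A client resuming an interrupted download would ask for the range starting at the number of bytes it already holds.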
HTTP version 1.1 (contd.)
– Efficient caching support
- A document caching model that allows both the server and the client to control the level of cacheability, and the conditions and requirements for updates.
- HTTP 1.1 requires several extra things from both
clients and servers.
– Mandatory to know these if one is trying to write a HTTP client or server.
HTTP 1.1 Client Requirements
- The clients must do the following:
– Include the Host: header with each request.
– Either support persistent connections, or include the Connection: close header with each request.
– Handle the 100 Continue response.
– Accept responses with chunked data.
HTTP 1.1 Server Requirements
- The servers must do the following:
– Require the Host: header from HTTP 1.1 clients.
– Accept absolute URLs in a request.
– Accept requests with chunked data.
– Include the Date: header in each response.
– Support at least the GET and HEAD methods.
– Support HTTP 1.0 requests.
– Either support persistent connections, or include the Connection: close header with each response.
HTTP Proxy servers
- What is a HTTP Proxy server?
– A program that acts as an interface between a client and a server. – It receives requests from the clients, and forwards them to the server(s). – The responses are sent back in the same way. – A proxy thus acts both as a HTTP client and a server.
- Request from a client to a proxy server differs from
normal server requests in one way.
– The complete URL of the resource being requested must be specified.
– Required by the proxy to know where to forward the request to.
GET http://www.xyz.com/docs/abc.txt HTTP/1.0
Uniform Resource Locators (URL)
What is a URL?
- They are the mechanism by which documents are
addressed in the WWW.
- A URL contains the following information:
– Name of the site containing the resource. – The type of service to be used to access the resource (ftp, http, etc.). – The port number of the service.
- Default assumed, if omitted.
– Location of the resource (path name) in the server.
- URLs specify Internet addresses.
- General format for URL:
scheme://address:port/path/filename
- Examples:
http://www.rediff.com/news/ab1.html
http://www.xyz.edu:2345/home/rose.jpg
mailto:skdas@yahoo.co.in
news:alt.rec.flowers
ftp://kumar:km123@www.abc.com/docs/paper/x1.pdf
ftp://www.ftpsite.com/docs/paper1.ps
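The parts of the general format can be pulled out of a URL with Python's standard `urllib.parse.urlsplit`. In the sketch below, `describe_url` and the small `DEFAULT_PORTS` table are assumptions added for illustration; the table shows how a default port is assumed when the URL omits one.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def describe_url(url: str) -> dict:
    """Break a URL into scheme, host, port, path, and optional user name."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "user": parts.username,
    }


print(describe_url("http://www.xyz.edu:2345/home/rose.jpg"))
print(describe_url("ftp://kumar:km123@www.abc.com/docs/paper/x1.pdf"))
```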
Sending a Query String
- The mechanism can also be used to send a query
string to a specified URL.
– Used for CGI scripts.
– Place a question mark at the end of the URL, followed by the query string.
http://www.xyz.com/cgi-bin/xyz.pl?Roll=1234&Sex=M
CGI Scripts
Introduction
- CGI stands for Common Gateway Interface.
– Allows interactive web pages to be written.
- Page created dynamically, based on user request.
– CGI programs are called “scripts” because the first CGI programs were written using UNIX shell scripts, and PERL.
- Can be written in almost any language.
– Usually resides in a special directory in the web server (typically, “cgi-bin”).
- Apache Directory Structure: a case study
– cgi-bin
- Here most of the interactive programs will reside. These will be written in Perl, Java, or any other programming language.
– conf
- This will contain the configuration files.
– htdocs
- This will contain the actual HTML documents, and will typically have many subdirectories. This directory is known as the DocumentRoot.
– icons
- This contains the icons that Apache will use when displaying information or error messages.
– images
- This will contain the image files that will be used in the web site.
– logs
- This will contain the log files: the access_log and error_log.
Structure of CGI Script
- When a CGI script is invoked by the server, the
server passes information to the script in one of two ways:
a) GET
b) POST
- The request method used is passed to the script
via the environment variable REQUEST_METHOD.
“GET” Request Method
- The GET method sends request information as
parameters appended at the end of the URL.
http://myserver.edu/cgi-bin/myprog.pl?name=niloy&rollno=7312&age=24
- The parameters are passed to the CGI program via
the environment variable QUERY_STRING.
– For the above example, QUERY_STRING will contain “name=niloy&rollno=7312&age=24”
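A minimal sketch of how a CGI program might pick up GET parameters, written in Python for illustration. The function name get_parameters is an assumption; parse_qs is the standard-library parser, and the sketch keeps only the first value if a name is repeated.

```python
import os
from urllib.parse import parse_qs


def get_parameters(environ=os.environ) -> dict:
    """Return the GET parameters found in QUERY_STRING as a dictionary."""
    query = environ.get("QUERY_STRING", "")
    # parse_qs returns a list of values per name; keep the first one only.
    return {name: values[0] for name, values in parse_qs(query).items()}


# Simulate the environment the web server would set up for the example URL.
print(get_parameters({"QUERY_STRING": "name=niloy&rollno=7312&age=24"}))
```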
“POST” Request Method
- The data gets passed from the server to the CGI
script through STDIN.
- The environment variable CONTENT_LENGTH
indicates the size in bytes of the incoming data.
- The format of the POST-ed data is:
var1=value1&var2=value2&……
- The REQUEST_METHOD environment variable must
be examined to know whether or not to read from STDIN.
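The POST case can be sketched as follows, again in Python. The function name read_post_data is an assumption, and the sketch assumes the body is plain ASCII form data. The point to notice is that exactly CONTENT_LENGTH bytes are read, rather than reading until end of file.

```python
import io
import os
import sys


def read_post_data(environ=os.environ, stdin=None) -> str:
    """Read the POST-ed data from standard input, or return '' for non-POST."""
    if environ.get("REQUEST_METHOD", "").upper() != "POST":
        return ""  # nothing to read from STDIN for a GET request
    if stdin is None:
        stdin = sys.stdin.buffer  # raw bytes, as delivered by the server
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return stdin.read(length).decode("ascii")


# Simulated request: 13 bytes of form data followed by unrelated bytes.
demo_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "13"}
print(read_post_data(demo_env, io.BytesIO(b"var1=a&var2=bEXTRA")))
```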
To Summarize
- For GET
– Data are read from QUERY_STRING environment variable.
- For POST
– Data are read from STDIN. – Number of bytes to be read is obtained from CONTENT_LENGTH.
- In both cases, the data are available in the same format:
var1=value1&var2=value2&……
name=niloy&rollno=7312&age=24
URL Encoding
- For platform independence, all data passed to the
server are URL-encoded.
– Variables are separated by ‘&’.
– Special characters (including ‘&’) are escaped as ‘%’ followed by a 2-digit hex number, e.g.
%25 ‘%’
%20 ‘ ’
– A ‘+’ sign is interpreted as a space character.
- The process of decoding back:
– Separate out the variables.
– Replace all ‘+’ signs by spaces.
– Replace all %## with the corresponding ASCII character.
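The three decoding steps can be written out by hand as a short Python sketch. The function names url_decode and decode_form_data are assumptions made for this example, and each %## escape is mapped to a single character, as the slide describes. In practice the standard-library function urllib.parse.unquote_plus does the same job.

```python
import re


def url_decode(text: str) -> str:
    """Undo URL encoding for one variable name or value."""
    # Step 2: '+' stands for a space. Do this before handling %##,
    # so that an escaped plus sign (%2B) survives as a real '+'.
    text = text.replace("+", " ")
    # Step 3: replace each %## by the character with that hex code.
    return re.sub(r"%([0-9A-Fa-f]{2})",
                  lambda match: chr(int(match.group(1), 16)), text)


def decode_form_data(data: str) -> dict:
    """Step 1: separate out the variables, then decode names and values."""
    fields = {}
    for pair in data.split("&"):
        if pair:
            name, _, value = pair.partition("=")
            fields[url_decode(name)] = url_decode(value)
    return fields


print(decode_form_data("name=niloy+g&note=50%25+off%26more"))
```

Splitting on ‘&’ first matters: an escaped ‘&’ (%26) inside a value is only turned back into ‘&’ after the variables have been separated.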
- Which characters are encoded?
– Control characters: 0x00 through 0x1F, and 0x7F.
– 8-bit characters: 0x80 through 0xFF.
– Characters given special importance within URLs: ; / ? : @ & = + $ ,
– Characters often used to delimit URLs: < > # % "
– Characters considered unsafe as they may have special meaning for other protocols: { } | \ ^ [ ] `
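The character classes above translate fairly directly into an encoder. The following Python sketch is illustrative only: the names needs_escape and url_encode are assumptions, non-ASCII text is assumed to be sent as UTF-8 bytes, and spaces are written as ‘+’ as in form data.

```python
RESERVED = set(";/?:@&=+$,")   # special importance within URLs
DELIMITERS = set('<>#%"')      # often used to delimit URLs
UNSAFE = set("{}|\\^[]`")      # may have special meaning for other protocols


def needs_escape(byte: int) -> bool:
    """Return True if this byte value falls in one of the listed classes."""
    if byte <= 0x1F or byte == 0x7F or byte >= 0x80:
        return True  # control characters and 8-bit characters
    return chr(byte) in RESERVED | DELIMITERS | UNSAFE


def url_encode(text: str) -> str:
    """Encode a variable name or value for use in a query string."""
    encoded = []
    for byte in text.encode("utf-8"):
        if byte == 0x20:
            encoded.append("+")              # space becomes '+'
        elif needs_escape(byte):
            encoded.append(f"%{byte:02X}")   # '%' plus 2-digit hex
        else:
            encoded.append(chr(byte))
    return "".join(encoded)


print(url_encode("a&b=c d%"))
```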
- A point to note:
– When the server passes data using the POST method, the script checks the environment variable CONTENT_TYPE.
– If the value of CONTENT_TYPE is application/x-www-form-urlencoded, the data need to be decoded before use.
Basic Structure of CGI Script
- Step 1: Initialization
– Check REQUEST_METHOD.
– Parse string and extract variables depending on “GET” or “POST”.
– Check CONTENT_TYPE, to find out if the string is URL-encoded.
- Step 2: Processing
– Process the input data.
– Output the results (MIME-type header, and the contents).
- Step 3: Termination
– Release the system resources.
– Terminate the program.
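The three steps might fit together roughly as in the Python skeleton below. It is a sketch under assumptions, not a definitive implementation: the names read_input, build_response and main are made up for the example, the "processing" simply echoes the variables back, and POST data that is not URL-encoded form data is ignored.

```python
import io
import os
import sys
from urllib.parse import parse_qs

FORM_TYPE = "application/x-www-form-urlencoded"


def read_input(environ, stdin) -> dict:
    """Step 1: check REQUEST_METHOD, then fetch and parse the variables."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        if not environ.get("CONTENT_TYPE", "").startswith(FORM_TYPE):
            return {}  # not URL-encoded form data; ignored in this sketch
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        raw = environ.get("QUERY_STRING", "")
    return {name: values[0] for name, values in parse_qs(raw).items()}


def build_response(form: dict) -> str:
    """Step 2: process the input; emit the header, a blank line, the body."""
    body = "".join(f"{name} = {value}\n" for name, value in sorted(form.items()))
    return "Content-Type: text/plain\n\n" + body


def main(environ=os.environ, stdin=None, stdout=sys.stdout) -> int:
    if stdin is None:
        stdin = sys.stdin.buffer
    stdout.write(build_response(read_input(environ, stdin)))
    stdout.flush()  # Step 3: flush output before terminating
    return 0        # exit status reported back to the server


# Simulated POST request, using in-memory streams instead of a real server.
demo_body = b"rollno=7312&age=24"
demo_env = {"REQUEST_METHOD": "POST", "CONTENT_TYPE": FORM_TYPE,
            "CONTENT_LENGTH": str(len(demo_body))}
main(demo_env, io.BytesIO(demo_body))
```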
Environment Variables Used
- CONTENT_LENGTH
– Length of URL-encoded data in bytes.
- CONTENT_TYPE
– Specifies the type of data as a MIME header.
- QUERY_STRING
– Information at the end of the URL after ‘?’.
- REMOTE_ADDR
– IP address of the client making the request.
- REMOTE_HOST
– Resolved host name of the client.
- REQUEST_METHOD
– “GET” or “POST”.
- SERVER_NAME
– Web server’s host name, or IP address.
- SERVER_PROTOCOL
– Name and version of the protocol used for the request, e.g. HTTP/1.0.
- SERVER_PORT
– Port number on server that received the HTTP request.
- SCRIPT_NAME
– Name of the CGI script being run.
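A quick way to see these variables is a script that simply echoes them back. The Python sketch below is illustrative; the function name environment_report and the "(not set)" placeholder are assumptions made for the example.

```python
import os

# The variables described above, in the same order.
CGI_VARIABLES = [
    "CONTENT_LENGTH", "CONTENT_TYPE", "QUERY_STRING", "REMOTE_ADDR",
    "REMOTE_HOST", "REQUEST_METHOD", "SERVER_NAME", "SERVER_PROTOCOL",
    "SERVER_PORT", "SCRIPT_NAME",
]


def environment_report(environ=os.environ) -> str:
    """Return a plain-text response listing the CGI environment variables."""
    lines = [f"{name}={environ.get(name, '(not set)')}" for name in CGI_VARIABLES]
    # The header, then a blank line, then one NAME=value pair per line.
    return "Content-Type: text/plain\n\n" + "\n".join(lines) + "\n"


print(environment_report({"REQUEST_METHOD": "GET", "SERVER_PORT": "80"}))
```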
Response Header
- The most common response header is Content-Type, which is based on MIME types.
- Typical values are:
Content-Type: text/plain
Content-Type: text/html
Content-Type: image/gif
Content-Type: video/avi
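A small sketch of how a script might choose this header from a file name. The MIME_TYPES table only covers the values listed above, and both the function name response_header and the application/octet-stream fallback for unknown extensions are assumptions made for the example.

```python
import os

# Extension-to-MIME-type table (illustrative; covers only the types above).
MIME_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".gif": "image/gif",
    ".avi": "video/avi",
}


def response_header(filename: str) -> str:
    """Return the Content-Type header, followed by the mandatory blank line."""
    extension = os.path.splitext(filename)[1].lower()
    mime_type = MIME_TYPES.get(extension, "application/octet-stream")
    return f"Content-Type: {mime_type}\n\n"


print(response_header("index.html"), end="")
```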
- A complete MIME header looks like this:
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Description: Postscript
CGI Real-life Examples
- Search Engine
- Page-hit Counter
- Student Registration
- On-line Booking of Tickets
- On-line Purchase of Items
- E-mail Gateways
- Feedback Scripts
- Web-based Games
Security Issues with CGI Scripts
- A CGI script is a program that anyone in the world
can run on your machine.
- Do not trust the user input.
– In particular, do not put user data in a shell command without verifying the data carefully.
– An example follows.
- An example
– Suppose that you have a CGI script that lets users run the “finger” command on your host.
– In Perl, there can be a line:
system "finger $username"
– A malicious user may enter
isg; rm -r /
as the username.
– As a result, the shell also runs “rm -r /”, which can delete every file the web server process is allowed to remove.
[Figure: a web form with an “Enter UserId” field in which the user has typed: isg; rm -r /]
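One common defence is to validate the input against a strict pattern and to avoid the shell altogether. The Python sketch below illustrates the idea; the function name finger_command and the particular pattern of allowed characters are assumptions made for the example, not a universal rule for user names.

```python
import re

# Assumed policy: letters, digits, '_', '.', '-'; must start with a letter or digit.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]{0,31}")


def finger_command(username: str) -> list:
    """Return an argument list for 'finger', or raise ValueError on bad input."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError("invalid username")
    # An argument list (rather than one command string) can be run without a
    # shell, e.g. subprocess.run(finger_command(name)), so characters such as
    # ';' would never be interpreted as command separators.
    return ["finger", username]


print(finger_command("isg"))
```

With this check in place, the input “isg; rm -r /” is rejected before any command is built.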
An Example CGI Program
- Using bash shell script:
#!/bin/bash
# Send the file named by the first argument back to the browser, lines sorted.
CAT=/bin/cat
echo "Content-type: text/plain"
echo ""
if [[ -x "$CAT" ]]
then
    "$CAT" "$1" | sort
else
    echo "Cannot find command on this system."
fi
- What does this program do?
– Sends the contents of a file residing on the server back to the browser, with its lines sorted.
- How to invoke?
<A HREF="/cgi-bin/test1.sh?/home/user1/public_html/text-file.txt">Click here to activate</A>
– The path after the “?” reaches the script as $1.
Another Example
#!/bin/sh
# Wrap the contents of a fixed file in an HTML page.
echo "Content-type: text/html"
echo ""
/bin/cat << EOM
<HTML>
<HEAD>
<TITLE>File Output: /home/user1/public_html/text-file.txt </TITLE>
</HEAD>
<BODY bgcolor="#cccccc" text="#000000">
<HR SIZE=5>
<H1>File Output: /home/user1/public_html/text-file.txt </H1>
<HR SIZE=5>
<P>
<SMALL>
<PRE>
EOM
/bin/cat /home/user1/public_html/text-file.txt
/bin/cat << EOM
</PRE>
</SMALL>
<P>
</BODY>
</HTML>
EOM
- What does this program do?
– Outputs the contents of the file “text-file.txt” as an HTML file.
- How to invoke?
– Through a dummy HTML form.
– Through the following link:
<A HREF="/cgi-bin/test2.sh">Click here</A>
E-mail Gateways: an Example
- E-mail gateways are very popular on the web.
- They allow users to send and receive mail without having to worry about managing a mail server.
- Can be designed using CGI scripts, or any other
similar technologies.
- Popular e-mail gateways:
– Yahoo, Rediffmail, Hotmail, Gmail, etc.
[Figure: block diagram showing Browser, Email Gateway, and Mail Server]
Writing CGI Scripts using Perl
- Will be discussed later.
– After discussing the syntax and semantics of Perl.
– We will see how the form data can be extracted and processed.
- Requires string manipulation.