Internet and Web Based Technology http://144.16.192.60/~isg/IWT/ - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Internet and Web Based Technology

http://144.16.192.60/~isg/IWT/

slide-2
SLIDE 2

Internet & Web Based Technology 2

About the Course

  • I will be covering half the course (2 hours / week)
      – Tuesday 9:30 AM – 11:25 AM
  • Topics to be covered
      – How the Internet works, HTML, HTTP, CGI scripts, Perl, etc.
      – Basic concepts of cryptology
      – Network security protocols, firewalls, NAT, etc.
  • Details will be available on the web site.
  • What to expect?
      – Self-study materials will be prescribed throughout the course, and questions will be set from them.
      – Assignments:
          • In groups of two, students will be assigned a Term Paper and a Programming Assignment.

slide-3
SLIDE 3

Internet & Web Based Technology 3

  • Attendance is mandatory
      – If a student’s cumulative attendance falls below 75%, it will lead to immediate deregistration.
      – Proxy attendance, if detected, will lead to deregistration and subsequent disciplinary action.
  • Other requirements
      – Satisfactory completion of the Term Paper and the Assignment is essential; a student who fails to complete them will get an “F” grade.
      – Term Paper:
          • A topic will be assigned to a group of two students. A comprehensive study report of 25-30 pages (11 point, 1.5 spacing) will have to be submitted.

slide-4
SLIDE 4

Internet & Web Based Technology 4

      – Programming Assignment:
          • A non-trivial programming problem will be given to a group. It will have to be implemented and demonstrated.
          • Typical example: “Design and implement a web based email client that supports attachments”.

slide-5
SLIDE 5

Introduction

slide-6
SLIDE 6

Internet & Web Based Technology 6

Internetworking: Basic Concepts

  • Computer Network

– a communication system for connecting end-systems (hosts)

  • Local Area Network (LAN)

– connects hosts within a relatively small geographical area – room, building, campus

  • Wide Area Network (WAN)

– hosts may be widely dispersed – across buildings, cities, countries

slide-7
SLIDE 7

Internet & Web Based Technology 7

What is Internet?

  • The network formed by the co-operative interconnection of a large number of computer networks.
      – Network of networks.
      – No one owns the Internet.
          • Every person who makes a connection owns a slice of the Internet.
      – There is no central administration of the Internet.

slide-8
SLIDE 8

Internet & Web Based Technology 8

[Figure: several networks interconnected through a BACKBONE]

slide-9
SLIDE 9

Internet & Web Based Technology 9

What is it actually?

  • A community of people who use and develop the networks.
  • A collection of resources that can be reached from those networks.
  • A setup to facilitate collaboration among members of the research and educational communities, world-wide.
  • The connected networks use the TCP/IP protocol.
slide-10
SLIDE 10

Internet & Web Based Technology 10

Growth of Internet

slide-14
SLIDE 14

Internet & Web Based Technology 14

How Does Data Flow?

  • Packet Switching
      – The Internet uses the TCP/IP protocol suite.
      – TCP/IP uses packet switching.
  • A message is broken down into smaller packets.
      – A packet is a self-contained bundle of data sent over the network.
          • Generally less than 1500 bytes long.
      – Each packet contains
          • Address of origin
          • Address of destination
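
A minimal sketch of packetization, assuming a fixed 1500-byte payload limit. The `Packet` class, the `sequence` field and the two addresses are illustrative assumptions; they are not part of the slides and do not reproduce the real IP header layout.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 1500  # assumed upper bound on packet payload, in bytes


@dataclass
class Packet:
    source: str        # address of origin
    destination: str   # address of destination
    sequence: int      # position of this packet within the message (assumption)
    data: bytes        # the slice of the message carried by this packet


def packetize(message, source, destination, size=MAX_PAYLOAD):
    """Split a message into packets of at most `size` payload bytes."""
    return [
        Packet(source, destination, seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Rebuild the message, even if the packets arrived out of order."""
    return b"".join(p.data for p in sorted(packets, key=lambda p: p.sequence))


message = b"x" * 4000
packets = packetize(message, "10.0.0.1", "10.0.0.2")
print(len(packets))                          # 3 (1500 + 1500 + 1000 bytes)
print(reassemble(packets[::-1]) == message)  # True
```

Because every packet carries both addresses, each one can be routed independently of the others; the sequence number is what lets the receiver restore the original order.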
slide-15
SLIDE 15

Internet & Web Based Technology 15

Packet

[Figure: a MESSAGE split into packets, each made up of a HEADER and DATA]

slide-16
SLIDE 16

Internet & Web Based Technology 16

World Wide Web (WWW)

  • WWW is an Internet “organizer”.
      – Developed by Tim Berners-Lee at CERN, starting in 1989.
      – Internet browsers (Mosaic, Netscape, Internet Explorer, etc.) were developed to make the WWW easier to use.
  • Based on client-server technology.
      – The server is a computer (hardware and software) providing access to the data.
      – The client is the software that allows users to access the data.

slide-17
SLIDE 17

Internet & Web Based Technology 17

The Inside Story

  • Interconnected web of documents.
      – Billions of them around.
  • Where do the documents reside?
      – On web (http) servers.
      – http stands for Hyper Text Transfer Protocol.
  • They are written in
      – html, typically.
      – html stands for Hyper Text Markup Language.
  • Documents get formatted/displayed using
      – Web browsers (Netscape, Mosaic, Explorer).
      – Also called WWW clients.

slide-18
SLIDE 18

Internet & Web Based Technology 18

Illustration

[Figure: a web client sends http requests to web servers and receives http responses]

slide-19
SLIDE 19

Internet & Web Based Technology 19

Topics for Self-study

  • Hyper Text Markup Language
      – http://www.w3schools.com/html/default.asp
  • Hyper Text Transfer Protocol
      – http://www.comptechdoc.org/independent/web/http/reference/
      – http://www.jmarshall.com/easy/http/

slide-20
SLIDE 20

HTML Forms

slide-21
SLIDE 21

Internet & Web Based Technology 21

Introduction

  • Provides two-way communication between web servers and browsers.
      – In demand for most emerging applications.
      – Provides dynamic content.

[Figure: BROWSER and WEB SERVER]

slide-22
SLIDE 22

Internet & Web Based Technology 22

What is an HTML Form?

  • A form basically contains boxes and buttons.
      – Real-life examples:
          • Search engines
          • On-line purchase of items
          • Registration
      – The form allows a user to fill in the blank entries and send them back to the owner of the page.
          • This is called SUBMITTING the form.
slide-23
SLIDE 23

Internet & Web Based Technology 23

FORM Example

slide-24
SLIDE 24

Internet & Web Based Technology 24

FORM Tags and Attributes

  • Several tags are used in connection with forms:

<form> …… </form>
<input>
<textarea> …… </textarea>
<select> …… </select>

slide-25
SLIDE 25

Internet & Web Based Technology 25

<FORM> …… </FORM>

  • This tag is used to bracket an HTML form.
      – Includes attributes which specify where and how to deliver the filled-in information to the web server.
  • Two main attributes:
      – METHOD
      – ACTION

slide-26
SLIDE 26

Internet & Web Based Technology 26

  • METHOD:
      – Indicates how the information in the form will be sent to the web server when the form is submitted.
      – Two possible values:
          • GET: concatenates all field names and values into a single URL-encoded string, which is appended to the URL given in ACTION.
          • POST: sends the same name/value pairs in the body of the request rather than in the URL.
      – POST is the preferred method for forms because of URL length limitations in most systems.
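
A short sketch of the difference between the two methods, using Python's standard `urllib.parse.urlencode` to play the role of the browser. The field values are made up for illustration; the field names and the `/cgi/feedback` path are borrowed from the examples in this deck.

```python
from urllib.parse import urlencode

# Hypothetical values typed into a form by a user
fields = {"name": "A Student", "rollno": "1234", "course1": "CS101"}

# Both methods start from the same URL-encoded name=value string
encoded = urlencode(fields)
print(encoded)  # name=A+Student&rollno=1234&course1=CS101

# GET: the encoded string travels in the URL, after a "?"
get_request_line = f"GET /cgi/feedback?{encoded} HTTP/1.0"

# POST: the same string travels in the request body instead
post_request_line = "POST /cgi/feedback HTTP/1.0"
post_body = encoded.encode("ascii")

print(get_request_line)
print(post_request_line, len(post_body))
```

With GET the data is visible in the URL and bounded by whatever URL length the browser and server accept; with POST the URL stays short and the body length is announced separately.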

slide-27
SLIDE 27

Internet & Web Based Technology 27

  • ACTION:
      – Specifies the URL of a program on the origin server that will receive the form’s inputs.
      – Such programs are traditionally invoked through the Common Gateway Interface (CGI).
          • Details of CGI to be discussed later.
      – The specified program is executed on the server when the form is submitted.
          • Its output is sent back to the browser.
slide-28
SLIDE 28

Internet & Web Based Technology 28

  • Typical usage:

<FORM METHOD="POST" ACTION="cgi-bin/myprog.pl">
……..
……..
</FORM>

slide-29
SLIDE 29

Internet & Web Based Technology 29

<INPUT>

  • This tag defines a basic form element.
  • Several attributes are possible:

      – TYPE
      – NAME
      – SIZE
      – MAXLENGTH
      – VALUE
      – SRC
      – ALIGN

slide-30
SLIDE 30

Internet & Web Based Technology 30

  • TYPE:
      – Defines the kind of element that is to be displayed in the form.
          • “TEXT” – defines a text box, which provides a single-line area for entering text.
          • “RADIO” – radio button, used when a choice must be made among several alternatives (clicking on one of the buttons turns off all others in the same group).
          • “CHECKBOX” – similar to the radio buttons, but each box here can be selected independently of the others.
slide-31
SLIDE 31

Internet & Web Based Technology 31

          • “PASSWORD” – similar to a text box, but characters are not shown as they are typed.
          • “HIDDEN” – not displayed and cannot be modified by the user; its value is sent along with the form (mainly used to carry forward choices that have already been made earlier).
          • “IMAGE” – used for active maps. When the user clicks on the image, the (x,y) co-ordinates are stored in variables, and are returned for further processing.
          • “SUBMIT” – creates a button labeled Submit; if clicked, the form data are passed on to the designated CGI script.
          • “RESET” – creates a button labeled Reset; if clicked, resets the form’s contents to their initial values.

slide-32
SLIDE 32

Internet & Web Based Technology 32

  • NAME:
      – Specifies a name for the input field.
      – The input-handling program (CGI) actually receives a number of (name,value) pairs.
  • SIZE:
      – Defines the number of characters that can be displayed in a TEXT box without scrolling.
  • MAXLENGTH:
      – Defines the maximum number of characters a TEXT box can contain.

slide-33
SLIDE 33

Internet & Web Based Technology 33

  • VALUE:
      – Used to supply a default value for a TEXT or HIDDEN field.
      – Can also be used for specifying the label of a button (renaming “Submit”, for example).
  • SRC:
      – Provides a pointer to an image file.
      – Used for clickable maps.
  • ALIGN:
      – Used for aligning image types.
        ALIGN = TOP | MIDDLE | BOTTOM

slide-34
SLIDE 34

Internet & Web Based Technology 34

<TEXTAREA> … </TEXTAREA>

  • Can be used to accommodate multiple text lines in a box.
  • Attributes are:
      – NAME: name of the field.
      – ROWS: number of lines of text that can fit into the box.
      – COLS: width of the text area on the screen.

slide-35
SLIDE 35

Internet & Web Based Technology 35

<SELECT> …. </SELECT>

  • Used along with the tag <OPTION>.
  • Used to define a selectable list of elements.
      – The list appears as a scrollable menu or a pop-up menu (depends on browser).
  • Attributes are:
      – NAME: name of the field.
      – SIZE: specifies the number of option elements that will be displayed at a time on the menu. (If the actual number exceeds SIZE, a scrollbar will appear.)
      – MULTIPLE: specifies that multiple selections from the list can be made.

slide-36
SLIDE 36

Internet & Web Based Technology 36

<FORM ………….>
……..
Languages known:
<SELECT NAME="lang" SIZE=3 MULTIPLE>
<OPTION> English
<OPTION> Hindi
<OPTION> French
<OPTION> Hebrew
</SELECT>
</FORM>

slide-37
SLIDE 37

Internet & Web Based Technology 37

Example 1

<HTML>
<HEAD> <TITLE> Using HTML Forms </TITLE> </HEAD>
<BODY TEXT="#FFFFFF" BGCOLOR="#0000FF" LINK="#FF9900" VLINK="#FF9900" ALINK="#FF9900">
<CENTER> <H3> Student Registration Form </H3> </CENTER>
Please fill up the following form about the courses you will register for this Semester.
<FORM METHOD="POST" ACTION="/cgi/feedback">
<P> Name: <INPUT NAME="name" TYPE="TEXT" SIZE="30" MAXLENGTH="50">
<P> Roll Number: <INPUT NAME="rollno" TYPE="TEXT" SIZE="7">
<P> Course Numbers:
<INPUT NAME="course1" TYPE="TEXT" SIZE="6">
<INPUT NAME="course2" TYPE="TEXT" SIZE="6">
<INPUT NAME="course3" TYPE="TEXT" SIZE="6">
<P> <P> Press SUBMIT when done.
<P> <INPUT TYPE="SUBMIT"> <INPUT TYPE="RESET">
</FORM>
</BODY>
</HTML>

slide-40
SLIDE 40

Internet & Web Based Technology 40

Example 2

<HTML>
<HEAD> <TITLE> Using HTML Forms </TITLE> </HEAD>
<BODY TEXT="#FFFFFF" BGCOLOR="#0000FF" LINK="#FF9900" VLINK="#FF9900" ALINK="#FF9900">
<CENTER> <H3> Student Registration Form </H3> </CENTER>
Please fill up the form below and press DONE when done.
<FORM METHOD="POST" ACTION="/cgi/feedback">
<P> Name: <INPUT NAME="name" TYPE="TEXT" SIZE="30" MAXLENGTH="50">
<P> Roll Number: <INPUT NAME="rollno" TYPE="TEXT" SIZE="7">
<P> Course Numbers:
<INPUT NAME="course1" TYPE="TEXT" SIZE="6">
<INPUT NAME="course2" TYPE="TEXT" SIZE="6">
<INPUT NAME="course3" TYPE="TEXT" SIZE="6">
<P> Category:
SC <INPUT NAME="cat" TYPE=RADIO VALUE="SC">
ST <INPUT NAME="cat" TYPE=RADIO VALUE="ST">
GE <INPUT NAME="cat" TYPE=RADIO VALUE="GE">
<P> Mother tongue:
<SELECT NAME="mtongue" SIZE="3">
<OPTION> Hindi
<OPTION> Bengali
<OPTION> Gujarati
<OPTION> Tamil
<OPTION> Oriya
<OPTION> Assamese
</SELECT>
<P> <P> Thanks for the information.
<P> <INPUT TYPE="SUBMIT" VALUE="DONE"> <INPUT TYPE="RESET" VALUE="CLEAR FORM">
</FORM>
</BODY>
</HTML>

slide-44
SLIDE 44

Internet & Web Based Technology 44

Example 3

<HTML>
<HEAD> <TITLE> Using HTML Forms </TITLE> </HEAD>
<BODY TEXT="#FFFFFF" BGCOLOR="#0000FF" LINK="#FF9900" VLINK="#FF9900" ALINK="#FF9900">
<CENTER> <H3> Student Feedback Form </H3> </CENTER>
Please fill up the following form and press DONE when finished.
<FORM METHOD="POST" ACTION="/cgi/feedback">
<P> Name: <INPUT NAME="name" TYPE="TEXT" SIZE="30" MAXLENGTH="50">
<P> Roll Number: <INPUT NAME="rollno" TYPE="TEXT" SIZE="7">
<P> Password: <INPUT NAME="code" TYPE=PASSWORD SIZE="10">
<P> Course Numbers:
<INPUT NAME="course1" TYPE="TEXT" SIZE="6">
<INPUT NAME="course2" TYPE="TEXT" SIZE="6">
<INPUT NAME="course3" TYPE="TEXT" SIZE="6">
<P> Category:
SC <INPUT NAME="cat" TYPE=RADIO VALUE="SC">
ST <INPUT NAME="cat" TYPE=RADIO VALUE="ST">
GE <INPUT NAME="cat" TYPE=RADIO VALUE="GE">
<P> Mother tongue:
<SELECT NAME="mtongue" SIZE="3">
<OPTION> Hindi
<OPTION> Bengali
<OPTION> Gujarati
<OPTION> Tamil
<OPTION> Assamese
<OPTION> Oriya
</SELECT>
<P> Languages known:
English <INPUT NAME="lang" TYPE=CHECKBOX VALUE="English">
Hindi <INPUT NAME="lang" TYPE=CHECKBOX VALUE="Hindi">
<P> Scholarship holder (select for yes): <INPUT NAME="schol" TYPE=CHECKBOX>
<P> General feedback:
<TEXTAREA NAME="feed" ROWS=3 COLS=20> </TEXTAREA>
<P> <P> Thanks for the information.
<P> <INPUT TYPE="SUBMIT" VALUE="DONE"> <INPUT TYPE="RESET" VALUE="CLEAR FORM">
</FORM>
</BODY>
</HTML>

slide-49
SLIDE 49

Internet & Web Based Technology 49

How to Submit a Form?

  • Three different ways:
      – Clicking on the Submit button.
      – Clicking on an active map.
      – Pressing <ENTER> in a TEXT box (in a TEXTAREA, <ENTER> only starts a new line).

slide-50
SLIDE 50

Internet & Web Based Technology 50

The Basic Mechanism

[Figure: the browser submits the form on the original page to the cgi program, which sends back a new html page]
slide-51
SLIDE 51

Internet & Web Based Technology 51

  • Web page including the form
      – Resides on the web server in the regular folder where html files and other documents are kept.
  • CGI script (program) handling the form data
      – Resides under a special folder on the web server (usually “cgi-bin”).
      – May be written in Perl, C, shell script, etc.
  • The web page is linked to the cgi script.
slide-52
SLIDE 52

Internet & Web Based Technology 52

<FORM METHOD="POST" ACTION="cgi-bin/myprog.pl">
……..
……..
</FORM>

slide-53
SLIDE 53

Internet & Web Based Technology 53

How to Write the CGI Program?

  • Must know …
      – How to access the form data.
          • The mechanism depends on METHOD (GET or POST).
      – How to return processed output back to the browser.
          • An HTML file is created on the fly (typically).
          • Details to be discussed later.
      – It is a good idea to have a look at a typical Perl script.
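
The two things a CGI program must know can be sketched as follows. The course uses Perl; this sketch is in Python and is an illustration only. It assumes the usual CGI conventions (the `REQUEST_METHOD`, `QUERY_STRING` and `CONTENT_LENGTH` environment variables, POST data on standard input); the function names `read_form`, `render` and `main` are hypothetical.

```python
import io
import os
import sys
from html import escape
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return the submitted form as a dict mapping name -> list of values."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the encoded form data arrives on standard input
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the encoded form data arrives in the query string
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


def render(form):
    """Build the CGI output: a header, a blank line, then HTML made on the fly."""
    rows = "".join(
        f"<li>{escape(name)} = {escape(', '.join(values))}</li>"
        for name, values in form.items()
    )
    body = f"<html><body><ul>{rows}</ul></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


def main():
    """Entry point a deployed script would call at the bottom of the file."""
    sys.stdout.write(render(read_form(os.environ, sys.stdin)))


# Simulated POST submission, so the sketch can be tried without a web server
demo_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "22"}
demo_form = read_form(demo_env, io.StringIO("Roll=1234&Sex=M&Age=20"))
print(demo_form)  # {'Roll': ['1234'], 'Sex': ['M'], 'Age': ['20']}
```

Each field maps to a list because a name can occur more than once, for example several checked boxes sharing the name `lang`.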

slide-54
SLIDE 54

Image Maps

slide-55
SLIDE 55

Internet & Web Based Technology 55

Introduction

  • An image map allows us to create links to different URLs depending upon where we click on the image.
      – Useful for creating links on maps, diagrams, fancy buttons, etc.
  • There are two parts to an image map.
      – The image.
      – The map file.
  • The map file defines the areas of the image and the URLs that correspond to the different areas.

slide-56
SLIDE 56

Internet & Web Based Technology 56

So basically …

  • An image map is a single image that contains hot spots.
      – When we click on a hot spot, we go to a new location (URL).
      – Requires loading of only one image from the server.
          • Thus requires fewer server calls.
          • Is generally better looking.
slide-57
SLIDE 57

Internet & Web Based Technology 57

Types of Image Maps

  • Depending on the way they are configured and the location where the processing is carried out, image maps can be classified into two types.
      – Server side
          • Traditional
      – Client side
          • More efficient; supported by all recent browsers.
slide-58
SLIDE 58

Server Side Image Maps

slide-59
SLIDE 59

Internet & Web Based Technology 59

Basic Functioning

  • Three ingredients are required to incorporate an image map into an HTML document.
      a) Creating the image map with well-defined boundaries.
      b) Creating an image map configuration file.
          • Contains relative pixel co-ordinates marking the boundaries of the different clickable regions.
          • Allowable geometries: circle, poly, point, rect.
      c) Establishing appropriate HTML information in the page to link
          • the map image,
          • the map configuration file, and
          • an (optional) CGI script which decodes the map co-ordinates and selects the corresponding URL.
slide-60
SLIDE 60

Internet & Web Based Technology 60

Typical Usage

<HTML>
<BODY>
……..
……..
<A HREF="cgi-bin/map/menu.map">
<IMG SRC="IMAGES/imagemap.gif" ISMAP>
</A>
……..
……..
</BODY>
</HTML>

slide-61
SLIDE 61

Internet & Web Based Technology 61

  • The URL that is sent to the image map program or web server when a user clicks the map resembles the following:
        http://myserver.com/menu.map?x,y
    where x and y are integers denoting the pixel co-ordinates of the point clicked.

slide-62
SLIDE 62

Internet & Web Based Technology 62

Image Map Configuration File

  • There are several different formats, all similar, and varying slightly in syntax.
      a) NCSA httpd server
      b) APACHE httpd server
      c) CERN httpd server
      d) W3C httpd server

slide-63
SLIDE 63

Internet & Web Based Technology 63

Example: APACHE server

  • A sample configuration file looks like:

# An example
default http://www.myserver.edu
base_url http://www.iitkgp.ac.in/demo
circle circle.html 45,45,80,45
rect rectangle.html 20,10,178,70
point point.html 100,50
poly polygon.html 200,60,295,60,275,10

slide-64
SLIDE 64

Internet & Web Based Technology 64

  • Defining the default
      – Typically, the first line in the map file is a default line.
      – Defines the URL to which users will be taken if they click on an undefined area of the image.
  • Defining circles
      – A circle is defined by two co-ordinates.
      – The first co-ordinate is the centre point.
      – The second co-ordinate is any point on the circumference.

slide-65
SLIDE 65

Internet & Web Based Technology 65

  • Defining rectangles
      – A rectangle is defined by two co-ordinates.
      – The first co-ordinate refers to the upper left corner.
      – The second co-ordinate refers to the bottom right corner.
  • Defining points
      – Defined by a single co-ordinate.
      – Clicks closest to that point on the image map will take the user to the specified URL.
  • Defining polygons
      – A polygon is defined by a series of co-ordinates that outline the area to be defined.
      – We can start from any vertex of the polygon.
      – Maximum number of vertices is 100.
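
The decoding step on the server can be sketched as below, using the region definitions described above and the co-ordinates from the sample map file. This is not the code of any real httpd; the function names and the matching order (regions first, then the nearest point, then the default) are assumptions made for illustration.

```python
import math

# (shape, URL, co-ordinates) entries taken from the sample map file
REGIONS = [
    ("circle", "circle.html", [(45, 45), (80, 45)]),       # centre, point on circumference
    ("rect", "rectangle.html", [(20, 10), (178, 70)]),     # upper left, bottom right
    ("poly", "polygon.html", [(200, 60), (295, 60), (275, 10)]),
]
POINTS = [("point.html", (100, 50))]
DEFAULT_URL = "http://www.myserver.edu"


def in_polygon(x, y, vertices):
    """Ray casting: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):                    # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def resolve(x, y, regions, points=(), default=None):
    """Map a click at pixel (x, y) to a URL."""
    for shape, url, coords in regions:
        if shape == "circle":
            (cx, cy), (px, py) = coords
            if math.hypot(x - cx, y - cy) <= math.hypot(px - cx, py - cy):
                return url
        elif shape == "rect":
            (left, top), (right, bottom) = coords
            if left <= x <= right and top <= y <= bottom:
                return url
        elif shape == "poly" and in_polygon(x, y, coords):
            return url
    if points:  # no region matched: take the closest point, if any are defined
        url, _ = min(points, key=lambda p: math.hypot(x - p[1][0], y - p[1][1]))
        return url
    return default


print(resolve(50, 50, REGIONS, POINTS, DEFAULT_URL))        # circle.html
print(resolve(150, 30, REGIONS, POINTS, DEFAULT_URL))       # rectangle.html
print(resolve(260, 50, REGIONS, POINTS, DEFAULT_URL))       # polygon.html
print(resolve(300, 200, REGIONS, POINTS, DEFAULT_URL))      # point.html
print(resolve(300, 200, REGIONS, default=DEFAULT_URL))      # http://www.myserver.edu
```

Regions are tested in file order, so where the circle and the rectangle overlap the circle wins. Note that once a point is defined, a click outside every region goes to the nearest point, so the default is reached only when the map has no points.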

slide-66
SLIDE 66

Internet & Web Based Technology 66

Illustrative Example

slide-67
SLIDE 67

Internet & Web Based Technology 67

An Important Point

  • For each of the specified URLs, the entire path must be given.
  • However, a common URL prefix can be specified with the base_url command.

base_url http://www.iitkgp.ac.in
circle circle.html 45,45,80,45
rect rectangle.html 20,10,178,70

slide-68
SLIDE 68

Client Side Image Maps

slide-69
SLIDE 69

Internet & Web Based Technology 69

Introduction

  • In client-side image maps, the map information is contained in the HTML document itself.
  • Consists of three components:
      – An ordinary image file (gif, jpeg, png).
      – A map delimited by <MAP> tags containing the co-ordinate and URL information for each region.
      – The USEMAP attribute within the <IMG> tag that indicates which map to reference.

slide-70
SLIDE 70

Internet & Web Based Technology 70

Advantages

  • They are self-contained within the HTML document.
  • No dependence on the server to handle every client’s request for image mapping.
  • Faster processing; improves response time.
  • No longer required to specify a default URL.
      – Clicking outside a hyperlinked area will take the user nowhere.
  • Complete URL information is displayed in the status bar when the mouse moves over the hot spots.
      – In contrast, server-side image maps show only co-ordinates.
slide-71
SLIDE 71

Internet & Web Based Technology 71

Disadvantage

  • The only disadvantage is that they are not universally supported.
      – Netscape Navigator 1.0 and Internet Explorer 2.0 do not support client-side image maps.

slide-72
SLIDE 72

Internet & Web Based Technology 72

Sample Client-side Image Map

<MAP NAME="demo_map">
<AREA SHAPE=CIRCLE COORDS="45,45,20" HREF="circle.html" ALT="Circle">
<AREA SHAPE=RECT COORDS="20,20,80,80" HREF="rectangle.html" ALT="Rectangle">
<AREA SHAPE=POLY COORDS="10,10,50,50,70,100" HREF="polygon.html" ALT="Triangle">
</MAP>

slide-73
SLIDE 73

Internet & Web Based Technology 73

  • Some points:
      – POINT is not supported.
      – CIRCLE is specified by the centre co-ordinates, followed by its radius.
      – Comments can be included as in HTML, using <!-- ……….. -->

slide-74
SLIDE 74

Internet & Web Based Technology 74

Linking to an Image

  • This can be done with the USEMAP attribute of the <IMG> tag.

<IMG SRC="mymap.gif" USEMAP="#demo_map">

      – References the image “mymap.gif”.
      – Searches for the <MAP> element with the NAME attribute of “demo_map”.
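
What a browser does with USEMAP can be approximated with Python's standard `html.parser` module. The sketch below is an illustration, not browser code: the class `MapCollector` and the function `hit` are hypothetical names, and only the circle and rectangle areas of the sample map are handled.

```python
from html.parser import HTMLParser

PAGE = """
<MAP NAME="demo_map">
  <AREA SHAPE=CIRCLE COORDS="45,45,20" HREF="circle.html" ALT="Circle">
  <AREA SHAPE=RECT COORDS="20,20,80,80" HREF="rectangle.html" ALT="Rectangle">
</MAP>
<IMG SRC="mymap.gif" USEMAP="#demo_map">
"""


class MapCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.maps = {}        # map name -> list of (shape, coords, href)
        self.images = []      # (src, usemap) pairs
        self._current = None  # name of the <MAP> being read, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag == "map":
            self._current = a.get("name")
            self.maps[self._current] = []
        elif tag == "area" and self._current is not None:
            coords = [int(n) for n in a["coords"].split(",")]
            self.maps[self._current].append((a["shape"].lower(), coords, a["href"]))
        elif tag == "img" and "usemap" in a:
            self.images.append((a["src"], a["usemap"]))

    def handle_endtag(self, tag):
        if tag == "map":
            self._current = None


def hit(areas, x, y):
    """Return the HREF of the first area containing (x, y), or None."""
    for shape, c, href in areas:
        # circle: centre (c[0], c[1]) followed by the radius c[2]
        if shape == "circle" and (x - c[0]) ** 2 + (y - c[1]) ** 2 <= c[2] ** 2:
            return href
        if shape == "rect" and c[0] <= x <= c[2] and c[1] <= y <= c[3]:
            return href
    return None  # no default URL: a click outside every area goes nowhere


collector = MapCollector()
collector.feed(PAGE)
src, usemap = collector.images[0]
areas = collector.maps[usemap.lstrip("#")]   # "#demo_map" -> MAP named "demo_map"
print(src, len(areas))        # mymap.gif 2
print(hit(areas, 45, 45))     # circle.html
print(hit(areas, 75, 75))     # rectangle.html
print(hit(areas, 200, 200))   # None
```

Everything needed to resolve a click is in the page itself, which is why no request reaches the server until the chosen link is followed.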

slide-75
SLIDE 75

Internet & Web Based Technology 75

A Complete Example

<HTML>
<HEAD><TITLE> Client Side Image map </TITLE></HEAD>
<BODY>
<MAP NAME="demo_map">
<AREA SHAPE=CIRCLE COORDS="45,45,20" HREF="circle.html" ALT="Circle">
<AREA SHAPE=RECT COORDS="20,20,80,80" HREF="rectangle.html" ALT="Rectangle">
<AREA SHAPE=POLY COORDS="10,10,50,50,70,100" HREF="polygon.html" ALT="Triangle">
</MAP>
<IMG SRC="mymap.gif" USEMAP="#demo_map">
</BODY>
</HTML>

slide-76
SLIDE 76

Internet & Web Based Technology 76

Combining the Two

  • Motivation for combining client and server side image map processing:
      – Browsers ignore tags they do not understand.
      – Newer browsers will use the client-side map.
      – Older browsers will use the server-side map.
  • How to do this?
slide-77
SLIDE 77

Internet & Web Based Technology 77

<A HREF="http://myserver.edu/cgi-bin/map/demo_map">
<IMG SRC="mymap.gif" USEMAP="#demo_map" ISMAP>
</A>

  • USEMAP will be ignored by older browsers.
  • ISMAP will be considered redundant by browsers supporting client-side maps.

slide-78
SLIDE 78

Creating Image Maps

slide-79
SLIDE 79

Internet & Web Based Technology 79

Available Tools

  • There are several tools with which we can create an image map.
  • Some of the tools are:
      – MapEdit
      – Macromedia Dreamweaver
      – Adobe GoLive
  • Irrespective of the tool used, the steps required for creation are more or less the same.

slide-80
SLIDE 80

Internet & Web Based Technology 80

Creating the Map

  • Typical steps:
      – Open the image in the imagemap editor.
      – Define areas within the image that will be clickable: rectangle, circle or polygon.
      – Highlight an area, and enter the URL for that area.
      – Repeat the above steps for all the clickable areas of the image.
      – For server-side image maps, we also need to define a default URL.
      – Select the type (client or server side).

slide-81
SLIDE 81

Hyper Text Transfer Protocol (HTTP)

slide-82
SLIDE 82

Internet & Web Based Technology 82

What is HTTP?

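  • Hyper Text Transfer Protocol
      – The protocol through which web clients (browsers) interact with web servers.
  • It is a stateless protocol.
      – The server retains no information from one request to the next; each request is handled independently.
      – In HTTP/1.0, a fresh connection is typically made for every item to be downloaded.
  • Transfers hypertext across the Internet.
      – Hypertext: text with links to other text documents.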

slide-83
SLIDE 83

Internet & Web Based Technology 83

HTTP Protocol

  • Web clients (browsers) and web servers communicate via the HTTP protocol.
  • Basic steps:
      – Client opens a socket connection to the HTTP server.
          • Typically over port 80.
      – Client sends HTTP requests to the server.
      – Server sends back a response.
      – Server closes the connection.
  • HTTP is a stateless protocol.
slide-84
SLIDE 84

Internet & Web Based Technology 84

Illustration

[Figure: a web client sends http requests to web servers and receives http responses]

slide-85
SLIDE 85

Internet & Web Based Technology 85

HTTP Request Format

  • A client request to a server consists of:
      – Request method
      – Path portion of the HTTP URL
      – Version number of the HTTP protocol
      – Optional request header information
      – Blank line
      – POST or PUT data, if present.
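
The parts listed above can be assembled into the exact bytes sent over the socket. A minimal sketch follows; `build_request` is a hypothetical helper, and computing Content-Length automatically is a convenience of the sketch rather than something the slides prescribe.

```python
def build_request(method, path, headers=None, body=b"", version="HTTP/1.0"):
    """Assemble request line, headers, blank line and optional data."""
    headers = dict(headers or {})
    if body:
        # Tell the server how many bytes of data follow the blank line
        headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} {version}"]                  # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                  # ends with the blank line
    return head.encode("ascii") + body


print(build_request("GET", "/test.html"))
# b'GET /test.html HTTP/1.0\r\n\r\n'

post = build_request(
    "POST",
    "/cgi-bin/myscript.cgi",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"Roll=1234&Sex=M&Age=20",
)
print(post.decode("ascii"))
```

Every line ends with CR LF, and the empty line is what tells the server that the headers are over.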

slide-86
SLIDE 86

Internet & Web Based Technology 86

HTTP Request Methods

  • GET
      – Most common HTTP method.
      – Returns the contents of the specified document.
      – Places any parameters in the request URL, as a query string after “?”.
      – Can also be used to submit forms:
          • The form data is URL-encoded and appended to the GET command URL.
            GET /cgi-bin/myscript.cgi?Roll=1234&Sex=M HTTP/1.0

slide-87
SLIDE 87

Internet & Web Based Technology 87

Illustration of GET

– A very simple HTTP connection to a server:

telnet www.facweb.iitkgp.ac.in http

– Client sends a request for a file:

GET /test.html HTTP/1.0

– The server sends back the response:

HTTP/1.1 200 OK
Date: Sun, 22 May 2005 09:51:42 GMT
Server: Apache/1.3.33 (Win32)
Last-Modified: Sun, 22 May 2005 09:51:10 GMT
Accept-Ranges: bytes
Content-Length: 119
Connection: close
Content-Type: text/html

<html>
<head> <title> A test page </title> </head>
<body> This is the body of the test page. </body>
</html>

slide-89
SLIDE 89

Internet & Web Based Technology 89

HTTP Request Methods (contd.)

  • HEAD
      – Returns only the header information of the specified document.
      – Used by clients to determine the file size, modification date, server version, etc.

slide-90
SLIDE 90

Internet & Web Based Technology 90

Illustration of HEAD

  • Client sends:

HEAD /index.html HTTP/1.0

  • Server responds with:

HTTP/1.1 200 OK
Date: Sun, 22 May 2005 10:08:37 GMT
Server: Apache/1.3.33 (Win32)
Last-Modified: Thu, 03 May 2001 11:30:38 GMT
Accept-Ranges: bytes
Content-Length: 1494
Connection: close
Content-Type: text/html

slide-91
SLIDE 91

Internet & Web Based Technology 91

HTTP Request Methods (contd.)

  • POST
      – Used to send data to the server to be processed in some way, as by a CGI script.
      – Basic differences from GET:
          • A block of data is sent along with the request.
          • Extra headers like Content-Type and Content-Length are used to describe this data.
          • The requested object is not a resource to retrieve. Rather, it is a script that can handle the data being sent.
          • The server response is not a static file, but is generated dynamically as the program output.

slide-92
SLIDE 92

Internet & Web Based Technology 92

Illustration of POST

– A typical form submission using POST is illustrated below:

POST /cgi-bin/myscript.cgi HTTP/1.0
From: isg@hotmail.com
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 22

Roll=1234&Sex=M&Age=20

slide-93
SLIDE 93

Internet & Web Based Technology 93

HTTP Request Methods (contd.)

  • PUT
      – Replaces the contents of the specified document with data supplied along with the command.
      – Not used widely.
  • DELETE
      – Deletes the specified document from the server.
      – Not used widely.

slide-94
SLIDE 94

Internet & Web Based Technology 94

HTTP Request Headers

  • After an HTTP request line, a client can send any number of header fields.
      – Usually optional; used to convey some information.
      – Some commonly used fields:
          • Accept: MIME types the client accepts, in order of preference.
          • Connection: connection options, close or Keep-Alive.
          • Content-Length: number of bytes of data to follow.
          • Content-Type: MIME type and subtype of the data that follows.
          • Pragma: the “no-cache” option directs the server/proxy to return a fresh document even though a cached copy may exist.

slide-95
SLIDE 95

Internet & Web Based Technology 95

HTTP Request Data

  • To be given if the request type is either PUT or POST.
      – Send the data immediately after the HTTP request headers and a blank line.

slide-96
SLIDE 96

Internet & Web Based Technology 96

HTTP Response

  • An initial response line.
      – Also called the status line.
      – Consists of three parts separated by spaces:
          • The HTTP version
          • A 3-digit response status code
          • An English phrase describing the status code.

HTTP/1.0 200 OK
HTTP/1.0 404 Not Found

slide-97
SLIDE 97

Internet & Web Based Technology 97

HTTP Response (contd.)

  • Header information, followed by a blank line, and then the data.

HTTP/1.1 200 OK
Date: Sun, 22 May 2005 09:51:42 GMT
Server: Apache/1.3.33 (Win32)
Last-Modified: Sun, 22 May 2005 09:51:10 GMT
Content-Length: 119
Connection: close
Content-Type: text/html

<html>
<head> <title> A test page </title> </head>
<body> This is the body of the test page. </body>
</html>
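
A minimal sketch of how a client could take such a response apart: status line first, then headers up to the blank line, then the data. The function `parse_response` and the shortened sample in `RAW` are illustrative assumptions; real clients also have to cope with details such as repeated headers, which this sketch ignores.

```python
RAW = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"<html><body>This is the body of the test page.</body></html>"
)


def parse_response(raw):
    """Split a raw HTTP response into status line parts, headers and data."""
    head, _, body = raw.partition(b"\r\n\r\n")          # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)   # three space-separated parts
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return version, int(code), phrase, headers, body


version, code, phrase, headers, body = parse_response(RAW)
print(version, code, phrase)        # HTTP/1.1 200 OK
print(headers["content-type"])      # text/html
print(len(body))
```

Splitting the status line at most twice keeps multi-word phrases such as "Not Found" in one piece.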

slide-98
SLIDE 98

Internet & Web Based Technology 98

3-digit Status Code

  • 1xx

– Indicates informational messages only.

  • 2xx

– Indicates successful transaction.

  • 3xx

– Redirects the client to another URL.

  • 4xx

– Indicates client error, such as unauthorized request.

  • 5xx

– Indicates a server error, such as an internal failure while handling the request.

slide-99
SLIDE 99

Internet & Web Based Technology 99

Common Status Codes

  • 200 OK
  • 301 Moved Permanently
  • 302 Moved Temporarily
  • 401 Unauthorized
  • 403 Forbidden
  • 404 Not Found
  • 500 Internal Server Error
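
The first digit of the status code is enough to classify a response. The sketch below pairs a small lookup table (`STATUS_CLASSES` and `status_class` are hypothetical names) with Python's standard `http.HTTPStatus`, which knows the reason phrases. Note that `HTTPStatus` uses the HTTP/1.1 phrase "Found" for 302, which HTTP/1.0 called "Moved Temporarily".

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Classify a 3-digit status code by its first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


for code in (200, 301, 302, 401, 403, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```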
slide-100
SLIDE 100

Internet & Web Based Technology 100

HTTP Response Headers

  • Common response headers include:
      – Content-Length
          • Size of the data in bytes.
      – Content-Type
          • MIME type and subtype of data being sent.
      – Date
          • Current date.
      – Expires
          • Date at which the document expires.
      – Last-Modified
          • Date on which the document was last modified.
      – Set-Cookie
          • Name/value pair to be stored as a cookie.
slide-101
SLIDE 101

Internet & Web Based Technology 101

HTTP Response Data

  • A blank line follows the response header, and the data follows next.
      – No upper limit on data size.
  • HTTP/1.0
      – Server typically closes the connection after completing a transaction.
  • HTTP/1.1
      – Server keeps the connection open by default, across transactions.

slide-102
SLIDE 102

Internet & Web Based Technology 102

HTTP version 1.1

  • Current standard and widely used.
      – Became an IETF draft standard in 1999 (RFC 2616).
  • Improvements over HTTP 1.0:
      – Requires host identification.
          • Allows multi-homed servers.
          • More than one domain living on the same server.

GET /index.html HTTP/1.1
Host: www.facweb.iitkgp.ac.in
<blank line>

slide-103
SLIDE 103

Internet & Web Based Technology 103

HTTP version 1.1 (contd.)

      – Default support for persistent connections.
          • Multiple transactions over a single connection.
      – Support for content negotiation.
          • Decides on the best among the available representations.
          • Server-driven or browser-driven.
      – Browsers can request part of a document.
          • Specify the bytes using the Range header.
          • Browser can ask for more than one range.
          • Continue interrupted downloads.

Range: bytes=1200-3500
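
A sketch of how a server might interpret the Range header value. Byte positions are inclusive, so `bytes=1200-3500` asks for 2301 bytes. The open-ended forms (`500-` and `-500`) come from the HTTP/1.1 specification rather than from the slide, and `parse_range` is a hypothetical helper that skips the validation a real server would perform.

```python
def parse_range(header_value, total_length):
    """Turn a Range header value into a list of inclusive (start, end) pairs."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    ranges = []
    for part in spec.split(","):                 # more than one range may be requested
        first, _, last = part.strip().partition("-")
        if first == "":                          # "-500": the final 500 bytes
            start = max(total_length - int(last), 0)
            end = total_length - 1
        else:
            start = int(first)
            end = int(last) if last else total_length - 1   # "500-": up to the end
            end = min(end, total_length - 1)
        ranges.append((start, end))
    return ranges


print(parse_range("bytes=1200-3500", 10000))     # [(1200, 3500)]
print(parse_range("bytes=0-99,500-", 1000))      # [(0, 99), (500, 999)]
print(parse_range("bytes=-500", 1000))           # [(500, 999)]

start, end = parse_range("bytes=1200-3500", 10000)[0]
print(end - start + 1)                           # 2301
```

An interrupted download can be resumed by asking for the range that starts at the number of bytes already received.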

slide-104
SLIDE 104

Internet & Web Based Technology 104

HTTP version 1.1 (contd.)

– Efficient caching support

  • A document caching model that allows both the server

and the client to control the level of cachability and update conditions and requirements.

  • HTTP 1.1 requires several extra things from both

clients and servers.

– Mandatory to know these if one is trying to write a HTTP client or server.

slide-105
SLIDE 105

Internet & Web Based Technology 105

HTTP 1.1 Client Requirements

  • The clients must do the following:

– Include the Host: header with each request.
– Either support persistent connections, or include the Connection: close header with each request.
– Handle the 100 Continue response.
– Accept responses with chunked data.
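
The last requirement, accepting chunked data, is sketched below in Python. In a chunked body (sent with Transfer-Encoding: chunked) each chunk is preceded by its size in hexadecimal, and a chunk of size 0 ends the body. The function name `decode_chunked` and the sample bytes are illustrative only; trailer headers after the last chunk are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a body that was sent as a sequence of chunks."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The size line may carry ';extension' parts, which are ignored here.
        size_field = body[pos:line_end].split(b";")[0]
        size = int(size_field.decode("ascii"), 16)
        if size == 0:                      # zero-sized chunk marks the end
            break
        start = line_end + 2
        out += body[start:start + size]
        pos = start + size + 2             # skip the data and its trailing CRLF
    return bytes(out)


wire = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
decoded = decode_chunked(wire)
print(decoded)
```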

slide-106
SLIDE 106


HTTP 1.1 Server Requirements

  • The servers must do the following:

– Require the Host: header from HTTP 1.1 clients.
– Accept absolute URLs in a request.
– Accept requests with chunked data.
– Include the Date: header in each response.
– Support at least the GET and HEAD methods.
– Support HTTP 1.0 requests.
– Either support persistent connections, or include the Connection: close header with each response.
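
The first server requirement can be sketched as a small check in Python. The function name `check_host_header` is hypothetical; the sketch assumes the request line and headers have already been parsed, and it uses 400 Bad Request, the status the HTTP/1.1 specification prescribes for a 1.1 request without Host.

```python
def check_host_header(request_line: str, headers: dict) -> int:
    """Return 400 for an HTTP/1.1 request without Host, otherwise 200 (proceed)."""
    version = request_line.rsplit(" ", 1)[-1]
    has_host = any(name.lower() == "host" for name in headers)
    if version == "HTTP/1.1" and not has_host:
        return 400
    # HTTP/1.0 clients are not required to send Host, so they pass.
    return 200


print(check_host_header("GET /index.html HTTP/1.1", {}))
print(check_host_header("GET /index.html HTTP/1.1", {"Host": "www.xyz.com"}))
print(check_host_header("GET /index.html HTTP/1.0", {}))
```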

slide-107
SLIDE 107


HTTP Proxy servers

  • What is an HTTP proxy server?

– A program that acts as an intermediary between a client and a server.
– It receives requests from the clients, and forwards them to the server(s).
– The responses are sent back in the same way.
– A proxy thus acts both as an HTTP client and a server.

slide-108
SLIDE 108


  • A request from a client to a proxy server differs from normal server requests in one way.

– The complete URL of the resource being requested must be specified.
– Required by the proxy to know where to forward the request to.

GET http://www.xyz.com/docs/abc.txt HTTP/1.0
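
A proxy could use the complete URL roughly as in the Python sketch below: it extracts the host and port to connect to, and rewrites the request line for the origin server. The function name `proxy_target` is an assumption, and port 80 is assumed whenever the http URL does not give one.

```python
from urllib.parse import urlsplit


def proxy_target(request_line: str):
    """Return (host, port, request line to forward) for a proxy-style request."""
    method, url, version = request_line.split()
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80               # assumed default for http
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return host, port, f"{method} {path} {version}"


target = proxy_target("GET http://www.xyz.com/docs/abc.txt HTTP/1.0")
print(target)
```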

slide-109
SLIDE 109

Uniform Resource Locators (URL)

slide-110
SLIDE 110


What is a URL?

  • They are the mechanism by which documents are addressed in the WWW.
  • A URL contains the following information:

– Name of the site containing the resource.
– The type of service to be used to access the resource (ftp, http, etc.).
– The port number of the service.

  • Default assumed, if omitted.

– Location of the resource (path name) in the server.

slide-111
SLIDE 111


  • URLs specify Internet addresses.
  • General format for URL:

scheme://address:port/path/filename

  • Examples:

http://www.rediff.com/news/ab1.html
http://www.xyz.edu:2345/home/rose.jpg
mailto:skdas@yahoo.co.in
news:alt.rec.flowers
ftp://kumar:km123@www.abc.com/docs/paper/x1.pdf
ftp://www.ftpsite.com/docs/paper1.ps
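
The parts of the general format can be pulled out of a URL with Python's standard `urllib.parse.urlsplit`, as in this sketch. The helper `describe` and the `DEFAULT_PORTS` table are illustrative assumptions; the table only lists the two schemes used here.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URL does not give one.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def describe(url: str):
    """Return (scheme, host, port, path) for a URL of the general format."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path


web = describe("http://www.xyz.edu:2345/home/rose.jpg")
ftp = describe("ftp://kumar:km123@www.abc.com/docs/paper/x1.pdf")
login = urlsplit("ftp://kumar:km123@www.abc.com/docs/paper/x1.pdf")
print(web)
print(ftp, login.username, login.password)
```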

slide-112
SLIDE 112


Sending a Query String

  • The URL mechanism can also be used to send a query string to a specified URL.

– Used for CGI scripts.
– Place a question mark at the end of the URL, followed by the query string.

http://www.xyz.com/cgi-bin/xyz.pl?Roll=1234&Sex=M
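
As an illustration, the query string of the URL above can be separated and split into name/value pairs with Python's standard library. This is only a sketch of what the receiving side sees; it is not the Perl script the URL refers to.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.xyz.com/cgi-bin/xyz.pl?Roll=1234&Sex=M"

query = urlsplit(url).query   # everything after the '?'
fields = parse_qs(query)      # each name maps to a list of values
print(query)
print(fields)
```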

slide-113
SLIDE 113

CGI Scripts

slide-114
SLIDE 114


Introduction

  • CGI stands for Common Gateway Interface.

– Allows interactive web pages to be written.

  • Page created dynamically, based on user request.

– CGI programs are called “scripts” because the first CGI programs were written using UNIX shell scripts, and PERL.

  • Can be written in almost any language.

– CGI programs usually reside in a special directory in the web server (typically, “cgi-bin”).

slide-115
SLIDE 115


  • Apache Directory Structure: a case study

– cgi-bin

  • Here most of the interactive programs will reside. These will be written in Perl, Java, or any other programming language.

– conf

  • This will contain the configuration files.

– htdocs

  • This will contain the actual HTML documents, and will typically have many subdirectories. This directory is known as the DocumentRoot.

slide-116
SLIDE 116


– icons

  • This contains the icons that Apache will use when displaying information or error messages.

– images

  • This will contain the image files that will be used in the web site.

– logs

  • This will contain the log files: the access_log and error_log.

slide-117
SLIDE 117


Structure of CGI Script

  • When a CGI script is invoked by the server, the server passes information to the script in one of two ways:

a) GET
b) POST

  • The request method used is passed to the script via the environment variable REQUEST_METHOD.

slide-118
SLIDE 118


“GET” Request Method

  • The GET method sends request information as parameters appended at the end of the URL.

http://myserver.edu/cgi-bin/myprog.pl?name=niloy&rollno=7312&age=24

  • The parameters are passed to the CGI program via the environment variable QUERY_STRING.

– For the above example, QUERY_STRING will contain “name=niloy&rollno=7312&age=24”
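
A CGI program written in Python could pick up these parameters roughly as follows. The function name `get_query_fields` is an assumption, and a plain dictionary stands in for the real environment so that the sketch runs outside a web server. No URL decoding is done at this stage.

```python
import os


def get_query_fields(environ=os.environ):
    """Split QUERY_STRING into a name -> value dictionary (no decoding)."""
    query = environ.get("QUERY_STRING", "")
    fields = {}
    for pair in query.split("&"):
        if pair:
            name, _, value = pair.partition("=")
            fields[name] = value
    return fields


# Stand-in for the environment the web server would set up.
fake_env = {
    "REQUEST_METHOD": "GET",
    "QUERY_STRING": "name=niloy&rollno=7312&age=24",
}
fields = get_query_fields(fake_env)
print(fields)
```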

slide-119
SLIDE 119


“POST” Request Method

  • The data gets passed from the server to the CGI script through STDIN.
  • The environment variable CONTENT_LENGTH indicates the size in bytes of the incoming data.
  • The format of the POST-ed data is:

var1=value1&var2=value2&……

  • The REQUEST_METHOD environment variable must be examined to know whether or not to read from STDIN.
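
The POST case can be sketched in Python as below. The function name `read_post_data` is hypothetical; an in-memory stream stands in for STDIN (in a real CGI program it would be `sys.stdin.buffer`), and extra bytes are appended to show that only CONTENT_LENGTH bytes are consumed.

```python
import io


def read_post_data(environ, stdin) -> bytes:
    """Read exactly CONTENT_LENGTH bytes from stdin, but only for POST."""
    if environ.get("REQUEST_METHOD") != "POST":
        return b""                         # for GET there is nothing to read
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return stdin.read(length)


body = b"var1=value1&var2=value2"
fake_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
data = read_post_data(fake_env, io.BytesIO(body + b"EXTRA"))
print(data)
```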

slide-120
SLIDE 120


To Summarize

  • For GET

– Data are read from QUERY_STRING environment variable.

  • For POST

– Data are read from STDIN.
– Number of bytes to be read is obtained from CONTENT_LENGTH.

  • Both data available in same format:

var1=value1&var2=value2&……
name=niloy&rollno=7312&age=24

slide-121
SLIDE 121


URL Encoding

  • For platform independence, all data passed to the server are URL-encoded.

– Variables are separated by ‘&’.
– Special characters (including ‘&’) are escaped as ‘%’ followed by a 2-digit hex number, e.g.,

%25 ‘%’
%20 ‘ ’

– ‘+’ sign is interpreted as a space character.

slide-122
SLIDE 122


  • The process of decoding back:

– Separate out the variables.
– Replace all ‘+’ signs by spaces.
– Replace all %## with the corresponding ASCII character.
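
The three decoding steps can be written out by hand in Python, as in the sketch below. The function names and the sample data are made up for illustration. Each %## is mapped to a single character, so multi-byte encodings are not handled; in practice `urllib.parse.parse_qs` from the standard library does this work.

```python
import re


def url_decode(text: str) -> str:
    """Undo URL encoding for one name or value."""
    text = text.replace("+", " ")                      # '+' stands for a space
    return re.sub(r"%([0-9A-Fa-f]{2})",                # %## -> one character
                  lambda m: chr(int(m.group(1), 16)),
                  text)


def decode_form(data: str) -> dict:
    """Separate the variables first, then decode each name and value."""
    fields = {}
    for pair in data.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        fields[url_decode(name)] = url_decode(value)
    return fields


fields = decode_form("name=Niloy+Das&note=50%25+off%26more")
print(fields)
```

Splitting on ‘&’ has to come before decoding, otherwise an escaped %26 inside a value would be mistaken for a separator.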

slide-123
SLIDE 123


  • Which characters are encoded?

– Control characters: 0x00 through 0x1F, and 0x7F.
– 8-bit characters: 0x80 through 0xFF.
– Characters given special importance within URLs: ; / ? : @ & = + $ ,
– Characters often used to delimit URLs: < > # % "
– Characters considered unsafe as they may have special meaning for other protocols: { } | \ ^ [ ] `
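
For comparison, Python's standard library performs this encoding with `quote_plus` and `urlencode`. The sample strings below are invented; the sketch only shows which characters end up escaped.

```python
from urllib.parse import quote_plus, urlencode

# Space becomes '+', while '%' and '&' are escaped as hex.
encoded_value = quote_plus("50% off & more")

# urlencode applies the same rule to every name and value
# and joins the pairs with '&'.
encoded_form = urlencode({"name": "Niloy Das", "note": "a/b?c"})

print(encoded_value)
print(encoded_form)
```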

slide-124
SLIDE 124


  • A point to note:

– When the server passes data using the POST method, the script checks the environment variable CONTENT_TYPE.
– If the value of CONTENT_TYPE is application/x-www-form-urlencoded, the data needs to be decoded before use.

slide-125
SLIDE 125


Basic Structure of CGI Script

  • Step 1: Initialization

– Check REQUEST_METHOD.
– Parse string and extract variables depending on “GET” or “POST”.
– Check CONTENT_TYPE, to find out if the string is URL-encoded.

  • Step 2: Processing

– Process the input data.
– Output the results (MIME-type header, and the contents).

  • Step 3: Termination

– Release the system resources.
– Terminate the program.
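
The three steps can be put together in a minimal Python sketch. Everything here is illustrative: `run_cgi` is a hypothetical name, the "name" form field is invented, and the environment and streams are passed in as arguments so that the sketch can run without a web server. A real script would pass `os.environ`, `sys.stdin.buffer` and `sys.stdout`.

```python
import io
from urllib.parse import parse_qs

FORM_TYPE = "application/x-www-form-urlencoded"


def run_cgi(environ, stdin, stdout) -> int:
    # Step 1: initialization. Pick the input source from REQUEST_METHOD.
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
        is_form = environ.get("CONTENT_TYPE", "").startswith(FORM_TYPE)
    else:
        raw = environ.get("QUERY_STRING", "")
        is_form = True
    fields = parse_qs(raw) if is_form else {}    # parse_qs also URL-decodes

    # Step 2: processing. Header first, then a blank line, then the contents.
    name = fields.get("name", ["stranger"])[0]
    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write(f"Hello, {name}\n")

    # Step 3: termination. Nothing to release here; report success.
    return 0


body = b"name=Niloy+Das"
env = {"REQUEST_METHOD": "POST", "CONTENT_TYPE": FORM_TYPE,
       "CONTENT_LENGTH": str(len(body))}
out = io.StringIO()
status = run_cgi(env, io.BytesIO(body), out)
print(out.getvalue())
```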

slide-126
SLIDE 126


Environment Variables Used

  • CONTENT_LENGTH

– Length of URL-encoded data in bytes.

  • CONTENT_TYPE

– Specifies the MIME type of the data.

  • QUERY_STRING

– Information at the end of the URL after ‘?’.

  • REMOTE_ADDR

– IP address of the client making the request.

  • REMOTE_HOST

– Resolved host name of the client.

slide-127
SLIDE 127


  • REQUEST_METHOD

– “GET” or “POST”.

  • SERVER_NAME

– Web server’s host name, or IP address.

  • SERVER_PROTOCOL

– Say, HTTP/1.0

  • SERVER_PORT

– Port number on server that received the HTTP request.

  • SCRIPT_NAME

– Name of the CGI script being run.

slide-128
SLIDE 128


Response Header

  • The most common response header is Content-Type, which is based on MIME types.
  • Typical values are:

Content-Type: text/plain
Content-Type: text/html
Content-Type: image/gif
Content-Type: video/avi

slide-129
SLIDE 129


  • A complete MIME header looks like this:

Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Description: Postscript
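
As a small illustration, the header above can be taken apart with Python's standard `email` package, which understands MIME headers. This is only a sketch for inspecting the fields; a CGI script normally just prints such lines.

```python
from email import message_from_string

raw_header = (
    "Content-Type: text/plain; charset=US-ASCII\n"
    "Content-Transfer-Encoding: 7bit\n"
    "Content-Description: Postscript\n"
    "\n"                      # blank line: end of the header
)

msg = message_from_string(raw_header)
content_type = msg.get_content_type()        # type/subtype without parameters
charset = msg.get_param("charset")           # parameter of Content-Type
encoding = msg["Content-Transfer-Encoding"]
print(content_type, charset, encoding)
```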

slide-130
SLIDE 130


CGI Real-life Examples

  • Search Engine
  • Page-hit Counter
  • Student Registration
  • On-line Booking of Tickets
  • On-line Purchase of Items
  • E-mail Gateways
  • Feedback Scripts
  • Web-based Games
slide-131
SLIDE 131


Security Issues with CGI Scripts

  • A CGI script is a program that anyone in the world can run on your machine.
  • Do not trust the user input.

– In particular, do not put user data in a shell command without verifying the data carefully.
– An example follows on the next slide.

slide-132
SLIDE 132


  • An example

– Suppose that you have a CGI script that lets users run the “finger” command on your host.
– In Perl, there can be a line:

system "finger $username"

– A malicious user may enter isg; rm -r / as the username.
– As a result, the shell runs rm -r / after finger, and files will get deleted.
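
One common defence is sketched below in Python: validate the input against a strict pattern, and build the command as an argument list so that no shell ever interprets the user data. The name `finger_argv` and the allowed character set are assumptions made for this sketch, not a complete policy.

```python
import re

# Assumed policy: short names made of letters, digits, '_', '.' and '-'.
SAFE_USERNAME = re.compile(r"[A-Za-z0-9_.-]{1,32}")


def finger_argv(username: str) -> list:
    """Return the argument list for finger, or raise if the name looks unsafe."""
    if not SAFE_USERNAME.fullmatch(username) or username.startswith("-"):
        raise ValueError("rejected username")
    # A list of arguments is handed to the program directly, without a shell,
    # so characters such as ';' would have no special meaning anyway.
    return ["finger", username]


print(finger_argv("isg"))
try:
    finger_argv("isg; rm -r /")
except ValueError as err:
    print("refused:", err)
```

The returned list could then be run with `subprocess.run(finger_argv(name))`, which does not involve a shell unless `shell=True` is requested.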

slide-133
SLIDE 133


[Figure: an “Enter UserId” form field containing the input isg; rm -r /]

slide-134
SLIDE 134


An Example CGI Program

  • Using bash shell script:

#!/bin/bash
CAT=/bin/cat
echo Content-type: text/plain
echo ""
# Run cat only if it exists and is executable.
if [[ -x $CAT ]]
then
    $CAT "$1" | sort
else
    echo Cannot find command on this system.
fi

slide-135
SLIDE 135


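  • What does this program do?

– Sends the contents of a file residing on the server back to the browser, with its lines sorted.

  • How to invoke?

<A HREF="/cgi-bin/test1.sh?/home/user1/public_html/text-file.txt">Click here to activate</A>

– The text after ‘?’ (/home/user1/public_html/text-file.txt) reaches the script as $1.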

slide-136
SLIDE 136


Another Example

#!/bin/sh
echo Content-type: text/html
echo ""
/bin/cat << EOM
<HTML>
<HEAD>
<TITLE>File Output: /home/user1/public_html/text-file.txt</TITLE>
</HEAD>
<BODY bgcolor="#cccccc" text="#000000">
<HR SIZE=5>
<H1>File Output: /home/user1/public_html/text-file.txt</H1>
<HR SIZE=5>
<P>

slide-137
SLIDE 137


<SMALL>
<PRE>
EOM
/bin/cat /home/user1/public_html/text-file.txt
/bin/cat << EOM
</PRE>
</SMALL>
<P>
</BODY>
</HTML>
EOM

slide-138
SLIDE 138


  • What does this program do?

– Outputs the contents of the file “text-file.txt” as an HTML file.

  • How to invoke?

– Through a dummy HTML form.
– Through the following link:

<A HREF="/cgi-bin/test2.sh">Click here</A>

slide-139
SLIDE 139


E-mail Gateways: an Example

  • E-mail gateways are very popular on the web.
  • They allow users to send and receive mails, without having to worry about managing a mail server.
  • Can be designed using CGI scripts, or any other similar technologies.

  • Popular e-mail gateways:

– yahoo, rediffmail, hotmail, gmail, etc.

slide-140
SLIDE 140


slide-141
SLIDE 141


[Figure: block diagram with three components: Browser, Email Gateway, Mail Server]

slide-142
SLIDE 142


Writing CGI Scripts using Perl

  • Will be discussed later.

– After discussing the syntax and semantics of Perl.
– We will see how the form data can be extracted and processed.

  • Requires string manipulation.