CIS 330: Applied Database Systems
Lecture 7: Technologies at the Three Tiers
Alan Demers
ademers@cs.cornell.edu
Overview
- Internet concepts
- URIs
- The HTTP Protocol
- The presentation layer
- HTML, HTML Forms
- Cookies
- JavaScript
- Style Sheets
- The middle tier
- Application servers
- Servlets and JSP
- Maintaining state: Session tracking
Internet Concepts
- URIs
- The HTTP Protocol
- HTTP Overview
- Example HTTP Session
- HTTP 1.0 v. 1.1
- Live Demo via HTTP Tracer Plus
- Structure of Client Requests/Server
Responses
Uniform Resource Identifiers
- Uniform naming scheme to identify resources on
the Internet
- A resource can be anything:
- Index.html
- mysong.mp3
- picture.jpg
- Example URIs:
- http://www.cs.wisc.edu/~dbbook/index.html
- mailto:webmaster@bookstore.com
Structure of URIs
http://www.cs.wisc.edu/~dbbook/index.html
- URI has three parts:
- Naming scheme (http)
- Name of the host computer (www.cs.wisc.edu)
- Name of the resource (~dbbook/index.html)
- URLs are a subset of URIs
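The three-part structure can be checked with a few lines of Python. This is only a sketch using the standard library's `urllib.parse.urlsplit`; the helper name `uri_parts` is made up for illustration and is not part of the lecture.

```python
from urllib.parse import urlsplit


def uri_parts(uri):
    """Split a URI into (naming scheme, host computer, resource name)."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


scheme, host, resource = uri_parts("http://www.cs.wisc.edu/~dbbook/index.html")
print(scheme)    # the naming scheme
print(host)      # the host computer
print(resource)  # the resource on that host
```

Note that a `mailto:` URI has a scheme and a resource but no host part, so the middle element comes back empty.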
HTTP Overview
- HTTP: HyperText Transfer Protocol
- Developed by Tim Berners-Lee, 1990
- Client/Server Architecture:
- Client requests a document
- Example clients: IE, Netscape, etc.
- Server returns the document
- Example servers: Apache, IIS
Watch HTTP
- Telnet:
- telnet www.yahoo.com 80
- GET /
- See your requests:
- http://www.schroepl.net/cgi-bin/http_trace.pl
- Trace your HTTP traffic:
- http://www.sstinc.com/
Example HTTP Session
§ Client sends request → Server sends response
§ Client requests the following URL: http://www.cs.cornell.edu:80/
§ Anatomy of the Request:
- http:// HyperText Transfer Protocol; other options: ftp, mailto.
- www.cs.cornell.edu: host name
- :80: port number. 80 is the default port for HTTP. Ports can range from 1 to 65,535.
- /: root document
The Client Request
§ Actual Browser Request:
GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
Host: www.cs.cornell.edu
Connection: Keep-Alive
Anatomy of the Client Request
- GET / HTTP/1.1
- Requests the root / document.
- Specifies HTTP version 1.1.
- HTTP Versions: 1.0 and 1.1 (more on this later…)
- Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
- Indicates what type of media the browser will accept.
- Accept-Language: en-us
- Browser’s preferred language
- Accept-Encoding: gzip, deflate
- Accepts compressed data (speeds download times.)
Anatomy of the Client Request
- User-Agent: Mozilla/4.0 (compatible; MSIE 5.01;
Windows NT)
- Indicates the browser type.
§ Host: www.cs.cornell.edu
- Required for HTTP 1.1
- Optional for HTTP 1.0
- A Server may host multiple hostnames. Hence, the
browser indicates the host name here.
§ Connection: Keep-Alive
- Enables “persistent connections”. Faster
performance (more later…)
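To make the anatomy concrete, here is a minimal Python sketch that splits a raw request into its request line and headers. The function name `parse_request` and the shortened header list are assumptions for illustration; a real server would use a full HTTP parser.

```python
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Accept-Language: en-us\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Host: www.cs.cornell.edu\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Return (method, path, version, headers) for a raw HTTP request."""
    # A blank line separates the header section from the (optional) body.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


method, path, version, headers = parse_request(RAW_REQUEST)
print(method, path, version)
print(headers["host"])
```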
Server Response
HTTP/1.1 200 OK
Date: Mon, 24 Sep 2001 20:54:26 GMT
Server: Apache/1.3.6 (Unix)
Last-Modified: Mon, 24 Sep 2001 14:06:11 GMT
Content-length: 327
Connection: close
Content-type: text/html

<title>Sample Homepage</title>
<img src="/images/oreilly_mast.gif">
<h1>Welcome</h1>This is the webpage of ...
Anatomy of Server Response
§ HTTP/1.1 200 OK
- Server Status Code
- Code 200: Document was found
- We will examine other status codes shortly.
§ Date: Mon, 24 Sep 2001 20:54:26 GMT
- Date on the server.
- GMT (Greenwich Mean Time)
§ Last-Modified: Mon, 24 Sep 2001 14:06:11 GMT
- Indicates the time when the document was last modified.
- Very useful for browser caching.
- If a browser already has the page in its cache, it may not need
to request the whole document again (more later…)
Anatomy of Server Response
§ Content-length: 327
- Number of bytes in the document response.
§ Connection: close
- Indicates that the server will close the connection.
- If the client wants to send another request, it will
need to open another connection to the server.
§ Content-type: text/html
- Indicates the MIME Type of the return document.
- Multipurpose Internet Mail Extensions
- Enables web servers to return binary or text files.
- Other MIME Categories:
§ audio, video, images, xml
Anatomy of Server Response
§ The actual HTML document:
<title>Sample Homepage</title>
<img src="/images/oreilly_mast.gif">
<h1>Welcome</h1>This is the web page of ...
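The response has the same overall shape as the request: a first line, headers, a blank line, then the body. A hedged Python sketch follows (the name `parse_response` and the abbreviated sample response are illustrative assumptions, not part of the lecture):

```python
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/1.3.6 (Unix)\r\n"
    "Connection: close\r\n"
    "Content-type: text/html\r\n"
    "\r\n"
    "<title>Sample Homepage</title>"
)


def parse_response(raw):
    """Return (version, status code, status message, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status message may contain spaces, so split at most twice.
    version, code, message = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), message, headers, body


version, code, message, headers, body = parse_response(RAW_RESPONSE)
print(code, message)
print(headers["content-type"])
print(body)
```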
HTTP 1.0 v. 1.1: Getting Objects
§ Once a browser receives an HTML page, it makes separate requests to retrieve the different objects within the page.
- Client (Web Browser): Give me /index.html
- Web Server: Here you go...
- Client (Web Browser): Now, give me logo.gif
- Web Server: Here you go...
HTTP 1.0 v. 1.1
§ HTTP 1.0:
- For each request, you must open a new
connection with the server.
§ HTTP 1.1
- For each request, the default action is to
maintain an open connection with the server.
- Faster, Persistent Connections
- Supported by most browsers and servers.
Example: HTTP 1.0 v. 1.1
§ HTTP 1.0: Get HTML Page plus Images
- Open Connection: GET /index.html
- Open Connection: GET /logo.gif
- Open Connection: GET /button.gif
§ HTTP 1.1: Get HTML Page plus Images
- Open Persistent Connection: GET /index.html
- GET /logo.gif
- GET /button.gif
Client Requests
§ Every client request includes three parts:
- Request line: Specifies the method (type of request), the
name of the requested document, and the HTTP version.
- Header Information: Used to specify browser
version, language, etc.
- Entity Body: Used to specify form data for
POST requests.
Client Methods
- GET and POST: We will see them later when we
discuss HTML forms.
- HEAD:
- Similar to GET, except that the method requests only
the header information.
- Server will return headers such as Last-Modified, but will not
return the data portion of the requested document.
- Useful for browser caching.
- For example:
- If the browser contains a cached version of a page, it issues a
HEAD request.
- If the document has not been modified since it was cached, use
the cached version.
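The caching decision can be sketched in Python as a comparison of two HTTP dates. The function name `cached_copy_is_fresh` and the idea of storing the time at which the page was cached are assumptions for illustration; real browsers use more elaborate rules.

```python
from email.utils import parsedate_to_datetime


def cached_copy_is_fresh(last_modified_header, cached_at_header):
    """True if the document has not changed since the browser cached it.

    Both arguments are HTTP date strings: the Last-Modified value returned
    by a HEAD request, and the time at which the cached copy was stored.
    """
    last_modified = parsedate_to_datetime(last_modified_header)
    cached_at = parsedate_to_datetime(cached_at_header)
    return last_modified <= cached_at


# Page last changed at 14:06, browser cached it at 20:54 the same day.
print(cached_copy_is_fresh("Mon, 24 Sep 2001 14:06:11 GMT",
                           "Mon, 24 Sep 2001 20:54:26 GMT"))
```

If the function returns False, the browser would follow up with a normal GET to fetch the new version.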
Server Responses
§ Every server response includes three parts:
- Response line: HTTP version number, three
digit status code, and status message.
- Header: Information about the server
- Entity Body: The actual data.
Server Status Codes
§ 100-199 Informational
§ 200-299 Client Request Successful
§ 300-399 Client Request Redirected
§ 400-499 Client Error (request incomplete or invalid)
§ 500-599 Server Errors
Some Important Status Codes
§ 200: OK
- Request was successful.
§ 301: Moved Permanently
- Server redirects client to a new URL.
§ 404: Not Found
- Document does not exist.
§ 500: Internal Server Error
- Error within the Web Server.
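Since the first digit of a status code determines its category, a client can classify any response with integer division. A small Python sketch using the standard `http.HTTPStatus` enum; the category labels and the helper name `describe_status` are paraphrases chosen for this example.

```python
from http import HTTPStatus

# Category labels keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Client request successful",
    3: "Client request redirected",
    4: "Client error",
    5: "Server error",
}


def describe_status(code):
    """Return 'code phrase (category)' for a standard HTTP status code."""
    category = STATUS_CLASSES[code // 100]
    return f"{code} {HTTPStatus(code).phrase} ({category})"


for code in (200, 301, 404, 500):
    print(describe_status(code))
```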
HTTP Is Stateless
- What does this mean:
- No “sessions”
- Every message is completely self-contained
- No previous interaction is “remembered” by the protocol
- Tradeoff between ease of implementation and ease of
application development: Other functionality has to be built on top
- Implications for applications:
- Any state information (shopping carts, user login information)
needs to be encoded in every HTTP request and response!
- Popular methods to maintain state:
- Cookies (later this lecture)
- Dynamically generated unique URLs at the server level (later this
lecture)
Overview
- Internet concepts
- The presentation tier
- HTML, HTML Forms
- Cookies
- JavaScript
- Style Sheets
- The middle tier
Web Data Formats
- HTML
- The presentation language for the Internet
- XML
- A self-describing, hierarchical data model
- We will cover XML and associated query
and transformation languages (XPath, XSLT) later.
HTML: An Example
<HTML>
<HEAD></HEAD>
<BODY>
<h1>Barns and Nobble Internet Bookstore</h1>
Our inventory:
<h3>Science</h3>
<b>The Character of Physical Law</b>
<UL>
<LI>Author: Richard Feynman</LI>
<LI>Published 1980</LI>
<LI>Hardcover</LI>
</UL>
<h3>Fiction</h3>
<b>Waiting for the Mahatma</b>
<UL>
<LI>Author: R.K. Narayan</LI>
<LI>Published 1981</LI>
</UL>
<b>The English Teacher</b>
<UL>
<LI>Author: R.K. Narayan</LI>
<LI>Published 1980</LI>
<LI>Paperback</LI>
</UL>
</BODY>
</HTML>
HTML: A Short Introduction
- HTML is a markup language
- Commands are tags:
- Start tag and end tag
- Examples:
- <HTML> … </HTML>
- <UL> … </UL>
- Many editors automatically generate HTML
directly from your document (e.g., Microsoft Word has a “Save as HTML” facility)
HTML: Sample Commands
- <HTML>: encloses the entire document
- <UL>: unordered list
- <LI>: list entry
- <h1>: largest heading
- <h2>: second-level heading, <h3>, <h4>
analogous
- <B>Title</B>: Bold
Overview
- Internet concepts
- The presentation tier
- HTML, HTML Forms
- Cookies
- JavaScript
- Style Sheets
- The middle tier
Forms Are Everywhere
§ Forms provide a simple mechanism for collecting user data and submitting it to a web server.
§ Nearly every web application uses forms.
§ Example:
- User is prompted to enter first name, last
name and password.
- Data is submitted to the middle tier
<HTML>
<HEAD>
<TITLE>Form Example 1.0</TITLE>
</HEAD>
<BODY>
<CENTER>
...
<!-- Start of form tag -->
<FORM ACTION="http://www.ecerami.com/servlet/FormServlet" METHOD="POST">
First Name: <INPUT TYPE=TEXT NAME=first SIZE=20 MAXLENGTH=20><BR>
Last Name: <INPUT TYPE=TEXT NAME=last SIZE=20 MAXLENGTH=20><BR>
Password: <INPUT TYPE=PASSWORD NAME=password SIZE=20 MAXLENGTH=20>
<BR><INPUT TYPE=SUBMIT VALUE="Submit">
</FORM>
<!-- End of form tag -->
</CENTER>
</BODY>
</HTML>
Example 1: Overview
§ Every form must have a start <FORM> tag and an end </FORM> tag: <FORM> … </FORM>
§ Note that the form tag also has two attributes:
- Method
- Action
Form Tag Attributes
- Action (Required):
- indicates where to submit user data.
- Usually indicates a CGI program or a Java Servlet.
- In Example 1, data is submitted to a Java Servlet:
- http://www.ecerami.com/servlet/FormServlet
- Method: indicates the way in which user data is
submitted via the HTTP protocol.
- GET: browser sends user data as part of URL.
- http://www.ecerami.com/servlet/FormServlet?first=Ethan&last=Cerami&password=blue
- POST: browser sends user data as part of the HTTP
Body.
More on the Form Method
§ Historical meaning:
- POST: Used to “post” new messages.
- GET: Given an ID, go “get” the new message.
§ This is now confusing, since you can use either one. So, why use one over the other? Let’s look at another example….
Example 2.0
§ What it does:
- This code does the exact same thing as
example 1.0.
- The only difference is that Example 1.0 sends
data via POST, whereas Example 2.0 sends data via GET.
- Regardless of the Method, the same servlet
still echoes the same data.
- Let’s take a look at the code…
<HTML>
<HEAD>
<TITLE>Form Example 2.0</TITLE>
</HEAD>
<BODY>
<CENTER>
...
<FORM ACTION="http://www.ecerami.com/servlet/FormServlet" METHOD="GET">
First Name: <INPUT TYPE=TEXT NAME=first SIZE=20 MAXLENGTH=20><BR>
Last Name: <INPUT TYPE=TEXT NAME=last SIZE=20 MAXLENGTH=20><BR>
Password: <INPUT TYPE=PASSWORD NAME=password SIZE=20 MAXLENGTH=20>
<BR>
<INPUT TYPE=SUBMIT VALUE="Submit">
</FORM>
…
Example 2.0
§ When submitting data for Example 2.0, note that the user data is now appended to the URL:
§ http://www.ecerami.com/servlet/FormServlet?first=Ethan&last=Cerami&password=blue
§ URL Encoding:
- The path and the first piece of user data are separated with a ? character.
- After that, each piece of user data is separated with an & character.
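Python's standard `urllib.parse` module implements this encoding, so a short sketch can show both directions. The second example, with a made-up field `q`, illustrates why encoding is needed at all: characters such as `&` and spaces inside a value must be escaped so they are not mistaken for separators.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

fields = {"first": "Ethan", "last": "Cerami", "password": "blue"}

# Encode: name=value pairs joined with "&", appended after "?".
url = "http://www.ecerami.com/servlet/FormServlet?" + urlencode(fields)
print(url)

# Decode: what the server-side program recovers from the URL.
# parse_qs returns a list of values per name.
decoded = parse_qs(urlsplit(url).query)
print(decoded["first"], decoded["last"], decoded["password"])

# Special characters inside a value are escaped.
print(urlencode({"q": "R&B music"}))  # q=R%26B+music
```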
URL Encoding
§ Because the GET Method passes user data within the URL, you can easily create links that act like forms.
§ For example:
§ http://www.ecerami.com/servlet/FormServlet?first=Bill&last=Gates&password=microsoft
GET v. POST
§ If GET and POST end up with the same result, why choose one over the other?
§ Some servers limit the length of GET requests to 240 characters.
§ Security: with GET, the arguments are visible in the URL; with POST, they travel in the request body.
§ If you want to create links that act like forms, use GET.
Basic Form Controls
§ There are half a dozen basic form controls:
- Text boxes, password boxes
- Pull-Down menus
- Radio Buttons
- Checkboxes
- Text Areas
- Hidden Fields
§ We will now examine each of these in detail.
Basic Input Syntax
§ Most form controls are specified with the <INPUT> tag (text areas and select lists use their own tags).
§ Unlike most tags, <INPUT> has no end tag; it is written as a single start tag.
§ Every <INPUT> tag whose data should be submitted must also have a name attribute.
§ The name enables your server side program to extract the relevant data.
Text Controls
§ Text boxes enable you to capture small amounts of text, such as a person’s name or an email address.
<INPUT TYPE=TEXT NAME=first SIZE=20 MAXLENGTH=20>
§ size: size of the text box
§ maxlength: maximum length of string user can enter
Password Controls
§ Same as Text Box Controls. The only difference is that data is masked, so that someone cannot easily peer over your shoulder.
<INPUT TYPE=PASSWORD NAME=password SIZE=20 MAXLENGTH=20>
§ Note: Data is not encrypted
Submit/Reset Buttons
§ Submit Button: Submits the form data
<INPUT TYPE=SUBMIT VALUE="Click Here!">
§ Reset Button: Clears the form
<INPUT TYPE=RESET VALUE="Clear">
§ Value: text you want to appear on the button.
Check Boxes
§ Check boxes enable users to select one or more options.
<input type=checkbox name="options" value="images"> Image
<input type=checkbox name="options" value="video"> Video
<input type=checkbox name="options" value="mp3"> MP3
§ value: the text submitted to server.
§ Let’s take a look at Example 3.0
Example 3.0
§ This example simulates a search system.
§ Users can enter a search keyword.
§ Users can also select one or more search criteria: images, videos or mp3s.
<HTML>
<HEAD><TITLE>Form Example 3.0</TITLE></HEAD>
<BODY><CENTER>
...
<FORM ACTION="http://www.ecerami.com/servlet/FormServlet" METHOD="GET">
Search for: <INPUT TYPE=TEXT NAME=target SIZE=10 MAXLENGTH=20>
<BR>Pages must include:
<BR><input type=checkbox name="options" value="images"> Image
<BR><input type=checkbox name="options" value="video"> Video
<BR><input type=checkbox name="options" value="mp3"> MP3
<BR>
<INPUT TYPE=SUBMIT VALUE="Search">
</FORM></CENTER></BODY></HTML>
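Because all three check boxes share the name `options`, a submission with several boxes ticked repeats that name in the query string. A brief Python sketch of what the browser sends and what the server sees; the search keyword "jazz" and the ticked boxes are invented for the example.

```python
from urllib.parse import parse_qs, urlencode

# The user typed "jazz" and ticked Image and MP3.
submitted = {"target": "jazz", "options": ["images", "mp3"]}

# doseq=True emits one name=value pair per list element.
query = urlencode(submitted, doseq=True)
print(query)

# On the server side, the repeated name decodes to a list of values.
print(parse_qs(query)["options"])
```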
Radio Buttons
§ As opposed to check boxes, you can select only one radio button.
<input type=radio name="options" value="images"> Image
<input type=radio name="options" value="video"> Video
<input type=radio name="options" value="mp3"> MP3
§ value: the text submitted to server.
§ Let’s take a look at Example 4.0
<HTML>
<HEAD><TITLE>Form Example 4.0</TITLE></HEAD>
<BODY><CENTER>
...
<FORM ACTION="http://www.ecerami.com/servlet/FormServlet" METHOD="POST">
Search for: <INPUT TYPE=TEXT NAME=target SIZE=10 MAXLENGTH=20>
<BR>Pages must include:
<BR><input type=radio name="options" value="images"> Image
<BR><input type=radio name="options" value="video"> Video
<BR><input type=radio name="options" value="mp3"> MP3
<BR><INPUT TYPE=SUBMIT VALUE="Search">
</FORM></CENTER></BODY>
</HTML>
Hidden Fields
§ Sometimes it is useful to send hidden data fields.
§ These fields are not displayed to the user, but are submitted to the server just like any other piece of data.
- <INPUT TYPE=HIDDEN NAME="username" VALUE="cs330">
§ These are actually very useful for all sorts of applications: shopping carts, store checkouts, etc.
§ Let’s take a look at Example 5.0
<HTML>
<HEAD><TITLE>Form Example 5.0</TITLE></HEAD>
<BODY><CENTER>
...
<FORM ACTION="http://www.ecerami.com/servlet/FormServlet" METHOD="POST">
<INPUT TYPE=HIDDEN NAME="username" VALUE="cs330">
Enter Password: <INPUT TYPE=TEXT NAME="password" SIZE=20 MAXLENGTH=20>
<BR>
<INPUT TYPE=SUBMIT VALUE="Login">
</FORM>
</CENTER>
</BODY>
</HTML>
Text Areas
§ Used to submit large blocks of text.
<TEXTAREA NAME="bio" COLS=40 ROWS=10>
Write a short bio here
</TEXTAREA>
§ COLS: Width of the text area
§ ROWS: Height of the text area
§ Text: The text between the tags is the default text that will automatically appear.
§ Let’s take a look at Example 6.0
<HTML>
<HEAD><TITLE>Form Example 6.0</TITLE></HEAD>
<BODY>
<CENTER>
...
<FORM ACTION="http://www.ecerami.com/servlet/FormServlet" METHOD="POST">
Tell us about yourself:
<BR>
<TEXTAREA NAME="bio" COLS=40 ROWS=10>
Write a short bio here</TEXTAREA>
<BR><INPUT TYPE=SUBMIT VALUE="Submit My Bio">
</FORM>
</CENTER></BODY>
</HTML>
Select Options
§ Creates pull-down and selection lists.
<SELECT NAME="major">
<OPTION VALUE="cs">Computer Science</OPTION>
<OPTION VALUE="is">Information Systems</OPTION>
<OPTION VALUE="psych">Psychology</OPTION>
<OPTION VALUE="hist">History</OPTION>
<OPTION VALUE="bio">Biology</OPTION>
</SELECT>
§ There are several variations to the select control. Example 7.0 examines three different options.
Form Validation
§ Every computer program must be prepared to deal with human errors and unexpected conditions.
§ Hence, while it is easy to create forms, most of the hard work goes into validating the user input.
§ For example:
- Did the user enter all the required fields?
- Did the user enter a search keyword (try yahoo.com
without a keyword)
- Did the user enter a valid zip code?
- Did the user enter a valid email address?
Form Validation Options
§ There are two broad options for form validation:
- Option 1: Client Side Validation
§ Performed via the JavaScript language.
§ Good for checking that required fields are filled in.
§ Immediately prompts the user with the error.
- Option 2: Server Side Validation
§ Performed via a JSP or Java Servlet
§ Upon submission, web server checks form data, and then returns a refreshed page with errors denoted.
§ An example: Registering for Yahoo.com
§ This is the option we will explore for Java Servlets
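The logic of server-side validation is independent of the servlet API, so it can be sketched in plain Python. Everything here is an assumption made for illustration: the field names `zip` and `email`, the US-style zip code pattern, and the deliberately loose email pattern. The function returns a map of field name to error message, which the server could use to redisplay the form with errors marked.

```python
import re

# Five digits, optionally followed by a four-digit extension.
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")
# Loose check: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_form(form, required=("first", "last", "zip", "email")):
    """Return {field name: error message}; empty dict means valid."""
    errors = {}
    for name in required:
        if not form.get(name, "").strip():
            errors[name] = "This field is required."
    zip_code = form.get("zip", "").strip()
    if zip_code and not ZIP_RE.match(zip_code):
        errors["zip"] = "Please enter a valid zip code."
    email = form.get("email", "").strip()
    if email and not EMAIL_RE.match(email):
        errors["email"] = "Please enter a valid email address."
    return errors


print(validate_form({"first": "", "last": "Cerami",
                     "zip": "1485", "email": "not-an-email"}))
```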
Overview
- Internet concepts
- The presentation tier
- HTML, HTML Forms
- Cookies
- JavaScript
- Style Sheets
- The middle tier
Sites that know you...
§ Just a few common examples:
- my.yahoo.com
- www.amazon.com
§ Each time I return to these sites, they remember who I am.
- Yahoo remembers my news, bookmarks, etc.
- Amazon.com remembers what books I have browsed and makes recommendations.
§ How do they do that?
What is a Cookie?
§ Small piece of data generated by a web server, stored on the client’s hard drive.
§ Serves as an add-on to the HTTP specification (remember, HTTP by itself is stateless.)
§ Controversial, as it enables web sites to track web users and their habits (more later…)
Example Cookie Use
§ Web Site Acme.com wants to track the number of unique visitors who access its site.
§ If Acme.com checks the HTTP Server logs, it can determine the number of “hits”, but cannot determine the number of unique visitors.*
§ That’s because HTTP is stateless. It retains no memory regarding individual users.
§ Cookies provide a mechanism to solve this problem.
* Actually, you could check the log files for IP addresses, but you would still have the problem of Internet proxies.
Tracking Unique Visitors
§ Step 1: Person A requests home page for acme.com
§ Step 2: Acme.com Web Server generates a new unique ID.
§ Step 3: Server returns home page plus a cookie set to the unique ID.
§ Step 4: Each time Person A returns to acme.com, the browser automatically sends the cookie along with the GET request.
Cookie Conversation
- Browser: Give me the home page!
- Server: Here’s the home page plus a cookie.
- Browser: Now, give me the news page (cookie is sent automatically)
- Server: I’ve seen you before… Here’s the news page.
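The conversation can be simulated in Python with the standard `http.cookies` module. This is a sketch under stated assumptions: the cookie name `visitor_id`, the in-memory set of seen visitors, and the function `handle_request` are all invented for the example, and a real server would not trust an ID simply because the client sent it.

```python
import uuid
from http.cookies import SimpleCookie

seen_visitors = set()  # unique IDs handed out so far


def handle_request(cookie_header):
    """Return (visitor ID, Set-Cookie header line or None)."""
    cookie = SimpleCookie(cookie_header or "")
    if "visitor_id" in cookie:
        # Returning visitor: the browser sent the cookie back.
        visitor_id = cookie["visitor_id"].value
        seen_visitors.add(visitor_id)
        return visitor_id, None
    # New visitor: generate a unique ID and ask the browser to store it.
    visitor_id = uuid.uuid4().hex
    seen_visitors.add(visitor_id)
    reply = SimpleCookie()
    reply["visitor_id"] = visitor_id
    return visitor_id, reply.output(header="Set-Cookie:")


# First request: no cookie yet, so the server sets one.
first_id, set_cookie = handle_request(None)
print(set_cookie)

# Later request: the browser returns the cookie automatically.
second_id, nothing_new = handle_request("visitor_id=" + first_id)
print(second_id == first_id, len(seen_visitors))
```

Two hits from the same browser count as one unique visitor, which the server logs alone could not tell.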
Cookie Notes
§ Created in 1994 for Netscape 1.1
§ Cookies cannot be larger than 4K
§ No domain (netscape.com, microsoft.com) can have more than 20 cookies.
§ Cookies stay on your machine until:
- they automatically expire
- they are explicitly deleted
§ Cookies work the same on all browsers. No cross-browser problems here!
Magic Cookies
§ The term cookie comes from an old programming hack, called Magic Cookies.
§ If a programmer needed to make two programs communicate, he would create a “magic cookie”, a small text file containing data to transfer between program parts.
Cookie Standards
§ Version 0 (Netscape):
- The original cookie specification
- Implemented by all browsers and servers
- We will focus on this Version
§ Version 1
- A proposed Internet Engineering Task Force (IETF)
standard - RFC 2109
- Not very widely used (we will stick to Version 0.)
Why use Cookies?
§ Tracking unique visitors
§ Creating personalized web sites
§ Shopping Carts
§ Tracking users across your site:
- e.g. do users that visit your sports news page
also visit your sports store?
Cookie Anatomy
§ Version 0 specifies six cookie parts:
- Name
- Value
- Domain
- Path
- Expires
- Secure
Cookie Parts: Name/Value
§ Name
- Name of your cookie (Required)
- Cannot contain whitespace, semicolons or commas.
§ Value
- Value of your cookie (Required)
- Cannot contain whitespace, semicolons or commas.
Cookie Parts: Domain
§ Only pages from the domain which created a cookie are allowed to read the cookie.
§ For example, amazon.com cannot read yahoo.com’s cookies (imagine the security flaws if this were otherwise!)
§ By default, the domain is set to the full domain of the web server that served the web page.
- For example, myserver.mydomain.com would
automatically set the domain to .myserver.mydomain.com
Cookie Parts: Domain
§ Note that domains are always prepended with a dot.
- This is a security precaution: all domains must have at least two periods.
§ You can, however, set a higher level domain
- For example, myserver.mydomain.com can set the domain to .mydomain.com. This way hisserver.mydomain.com and herserver.mydomain.com can all access the same cookies.
§ No matter what, you cannot set a domain other than your own.
Cookie Parts: Path
§ Restricts cookie usage within the site.
§ By default, the path is set to the path of the page that created the cookie.
§ Example: user requests page from mymall.com/storea. By default, cookie will only be returned to pages for or under /storea.
§ If you specify the path to / the cookie will be returned to all pages (a common practice.)
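The domain and path rules together decide whether the browser attaches a cookie to a given request. A minimal Python sketch of that decision follows, assuming the cookie domain is stored with its leading dot and using simple suffix and prefix matching; the function names are hypothetical and real browsers apply additional checks.

```python
def domain_matches(cookie_domain, host):
    """True if host is the cookie's domain or a server beneath it.

    cookie_domain is assumed to start with a dot, e.g. ".mydomain.com".
    Prepending a dot to the host prevents "evilmydomain.com" from
    matching ".mydomain.com".
    """
    return ("." + host.lower()).endswith(cookie_domain.lower())


def path_matches(cookie_path, request_path):
    """True if the requested path starts with the cookie's path."""
    return request_path.startswith(cookie_path)


def cookie_is_sent(cookie_domain, cookie_path, host, request_path):
    """Decide whether the browser returns the cookie with this request."""
    return (domain_matches(cookie_domain, host)
            and path_matches(cookie_path, request_path))


# Cookie set for the whole domain but only for /storea.
print(cookie_is_sent(".mydomain.com", "/storea",
                     "hisserver.mydomain.com", "/storea/shoes.html"))
print(cookie_is_sent(".mydomain.com", "/storea",
                     "hisserver.mydomain.com", "/storeb/index.html"))
```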
Cookie Parts: Expires
§ Specifies when the cookie will expire.
§ Specified in Greenwich Mean Time (GMT):
- Wdy, DD-Mon-YYYY HH:MM:SS GMT
§ If you leave this value blank, browser will delete the cookie when the user exits the browser.
- This is known as a session cookie, as opposed to a persistent cookie.
Cookie Parts: Secure
§ The secure flag is designed to protect cookies while in transit. It does not encrypt the cookie itself.
§ A secure cookie will only be sent over a secure connection (such as SSL.)
§ In other words, if a cookie is set to secure, and you connect using a non-secure connection, the cookie will not be sent.
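Putting the six parts together, a server could assemble the cookie header roughly as follows. This Python sketch uses the standard `http.cookies` module; the function `make_set_cookie`, its parameters, and the sample values are assumptions for illustration, and the order in which the library prints the attributes is an implementation detail.

```python
import time
from http.cookies import SimpleCookie


def make_set_cookie(name, value, domain, path, lifetime_seconds,
                    secure, now=None):
    """Build the value of a Set-Cookie header from the six cookie parts."""
    now = time.time() if now is None else now
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["domain"] = domain
    morsel["path"] = path
    # Expiry date in GMT, in the Wdy, DD-Mon-YYYY HH:MM:SS GMT layout.
    morsel["expires"] = time.strftime(
        "%a, %d-%b-%Y %H:%M:%S GMT", time.gmtime(now + lifetime_seconds))
    if secure:
        # Flag only: tells the browser to require a secure connection.
        morsel["secure"] = True
    return morsel.OutputString()


# A cookie valid for one day, for every page on every mydomain.com server.
print(make_set_cookie("visitor_id", "abc123", ".mydomain.com", "/",
                      24 * 60 * 60, secure=True))
```

Leaving out the expires attribute would produce a session cookie instead.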
Cookie Block Software
§ Cookie Central has pointers to lots of cookie blocking software.
- Cookie Pal
- Cookie Crusher
- Cookie Cruncher
- etc.