
Prof. Dr. Dr. h.c. mult. Gerhard Krüger, Albrecht Schmidt: Web Engineering, WS00/01 page 1

Web Engineering

  • Prof. Dr. Dr. h.c. mult. Gerhard Krüger, Albrecht Schmidt

Universität Karlsruhe, Fakultät für Informatik, Institut für Telematik, winter semester 2000/2001

Web Engineering

Chapter 4: Architecture and Platform

Availability

  • e-commerce (simplified)
      • server down = sales down
      • availability and reliability = money
  • large global enterprises, e.g. www.amazone.com
      • the shop is populated by customers 24 h a day
      • customers are geographically distributed
      • at specific times large numbers of customers access the shop
  • availability in %
      • 98.3% ~ the server is unavailable for about 6.2 days per year
      • 99.9% ~ the server is unavailable for about 8.76 hours per year
      • 99.999% ~ the server is unavailable for about 5.3 minutes per year
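The downtime figures above follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a year of 365 days (8760 hours); the class and method names are illustrative only:

```java
class Downtime {
    // Assumption: 365 days per year, leap years ignored.
    static final double HOURS_PER_YEAR = 365 * 24;

    /** Expected unavailable hours per year for an availability given in percent. */
    static double downtimeHoursPerYear(double availabilityPercent) {
        return (100.0 - availabilityPercent) / 100.0 * HOURS_PER_YEAR;
    }

    public static void main(String[] args) {
        for (double a : new double[] {98.3, 99.9, 99.999}) {
            double hours = downtimeHoursPerYear(a);
            // Print the same downtime in hours, days and minutes.
            System.out.printf("%.3f%% -> %.2f h = %.2f days = %.1f min per year%n",
                    a, hours, hours / 24, hours * 60);
        }
    }
}
```

For 98.3% this gives about 148.9 hours (roughly 6.2 days), which matches the figure on the slide.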

Why are Servers not available?

  • server failure
      • hardware
      • operating system
  • software failure
      • Web server software
      • modules
      • databases
  • network failure
      • parts of the Internet are not available
      • network partitions
  • content failure
      • e.g. file not found (404)
      • documents are only partly transferred
  • request load too high (too many requests)
      • network bandwidth too small
      • server performance too low (RAM, processor, hard drives, ...)
  • maintenance

Solutions: Hardware/Software/Content Failure

  • redundant servers
      • server cluster, geographically distributed
  • redundancy in server hardware
      • uninterruptible power supply (UPS)
      • hard drives (RAID)
      • network interfaces
      • hot swap (change broken parts while the system is running)
  • monitor systems
      • reply time, completeness of answer
      • automated action in case of a problem (mount another drive, reboot the system, start another machine, ...)
  • monitor the environment
      • temperature
  • staff must not be a "single point of failure"
      • system administrator, webmaster, editor

Solution: Request Load is too High

  • increase server performance
  • split the domain into sub-domains
      • put each sub-domain on a separate machine
      • example 1, by content: transform www.shop.com into www.cd.shop.com, www.video.shop.com, www.software.shop.com
      • example 2, by location: transform www.company.com into www.us.company.com, www.asia.company.com, www.eu.company.com
  • make a cluster of servers: web farms, load balancing

Solution: Network Partitions

  • replicate content
  • geographically distributed servers
      • servers on different continents
  • organizationally distributed servers
      • servers in different backbones
  • servers distributed across different legal systems ???

Load Balancing

  • algorithms
      • Round Robin: predefined order
      • Least Connection: the server with the minimal number of connections is selected
      • Observed: on each server an agent monitors the load; the server with minimal load is selected
      • Priority: request/content has an assigned priority (e.g. preference for orders, low priority for support)
      • Ratio: server and request have assigned weights (e.g. capabilities/performance of a server, cost to handle a request)
      • Fastest: the server that reacts first gets the request
      • Predictive: using statistics, the load of a server is predicted; based on that a server is selected
  • geographical distribution
      • the nearest server is selected
      • the server with the cheapest link is selected
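Two of the algorithms listed above, Round Robin and Least Connection, can be sketched in a few lines. This is only an illustration under simplifying assumptions: a single dispatcher thread, a fixed server list, and hypothetical server names such as "web1"; it is not how any particular load-balancing product works.

```java
import java.util.List;

class Balancer {
    private final List<String> servers;
    private final int[] connections; // open connections per server
    private int next = 0;            // current round-robin position

    Balancer(List<String> servers) {
        this.servers = servers;
        this.connections = new int[servers.size()];
    }

    /** Round Robin: servers are used in a predefined, cyclic order. */
    String roundRobin() {
        String s = servers.get(next);
        next = (next + 1) % servers.size();
        return s;
    }

    /** Least Connection: pick the server with the fewest open connections. */
    String leastConnection() {
        int best = 0;
        for (int i = 1; i < connections.length; i++) {
            if (connections[i] < connections[best]) best = i;
        }
        connections[best]++; // the chosen server now holds one more connection
        return servers.get(best);
    }

    /** Call when a connection to the given server has been closed. */
    void release(String server) {
        int i = servers.indexOf(server);
        if (i >= 0 && connections[i] > 0) connections[i]--;
    }

    public static void main(String[] args) {
        Balancer b = new Balancer(List.of("web1", "web2", "web3"));
        for (int i = 0; i < 4; i++) System.out.println("round robin: " + b.roundRobin());
        for (int i = 0; i < 4; i++) System.out.println("least connection: " + b.leastConnection());
    }
}
```

Round Robin needs no knowledge of the servers' state, whereas Least Connection requires the dispatcher to track when connections are opened and closed.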
Load Balancing and Sessions

  • session / state / cookies in distributed architectures
  • approaches to preserve sessions:
      • do not use / support sessions
      • requests from one client are always sent to the same server (at least during a session)
      • replicate session / state information within the server cluster or farm
      • specific distribution hardware that supports state / cookies
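The second approach, always sending one client to the same server, can be approximated by hashing a client identifier onto the server list. A minimal sketch; the identifier (an IP address or a session cookie value) and the server names are assumptions for illustration:

```java
import java.util.List;

class StickyDispatcher {
    private final List<String> servers;

    StickyDispatcher(List<String> servers) {
        this.servers = servers;
    }

    /** Maps a client identifier to a fixed server, so repeated requests land on the same machine. */
    String serverFor(String clientId) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(clientId.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        StickyDispatcher d = new StickyDispatcher(List.of("web1", "web2", "web3"));
        for (String client : new String[] {"129.13.170.1", "10.0.0.7", "129.13.170.1"}) {
            System.out.println(client + " -> " + d.serverFor(client));
        }
    }
}
```

The mapping is stable only while the server list stays unchanged; adding or removing a server reassigns clients, which is one reason the slide also lists session replication as an alternative.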

Server Farm

  • a number of servers provide one site
      • to cope with a large number of requests
      • to minimize the mean reply time
      • to increase fault tolerance
  • replicating the site
      • group of servers with identical resources
      • load-balanced dispatching of incoming requests
      • distribution algorithms
  • partitioning the site
      • group of servers with disjoint / exclusive resources
      • dispatching based on the requested URL
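For a partitioned site, dispatching based on the requested URL can be as simple as a prefix table. The sketch below assumes hypothetical path prefixes ("/cd/", "/video/") and server names; the first matching prefix wins, and unmatched requests go to a default server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UrlDispatcher {
    // Insertion order is preserved, so routes are checked in the order they were added.
    private final Map<String, String> routes = new LinkedHashMap<>();
    private final String defaultServer;

    UrlDispatcher(String defaultServer) {
        this.defaultServer = defaultServer;
    }

    void addRoute(String pathPrefix, String server) {
        routes.put(pathPrefix, server);
    }

    /** Returns the server responsible for the requested path. */
    String serverFor(String path) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            if (path.startsWith(route.getKey())) return route.getValue();
        }
        return defaultServer;
    }

    public static void main(String[] args) {
        UrlDispatcher d = new UrlDispatcher("web-main");
        d.addRoute("/cd/", "web-cd");
        d.addRoute("/video/", "web-video");
        System.out.println(d.serverFor("/cd/charts.html"));
        System.out.println(d.serverFor("/index.html"));
    }
}
```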

Server Farm – Basic Architecture

[Figure: server farm with Web Server 1, Web Server 2, Web Server 3 and several File Servers; labels: Internet, LAN, Farm LAN]

Solution: Example Big/IP

  • http://www.f5labs.com/bigip/index.html
  • 8000 connections/s
  • 90 Mbps

Solution: Example 3DNS Geographically Distributed Server I

http://www.f5labs.com/3dns/index.html

Solution: Example 3DNS Geographically Distributed Server II

http://www.f5labs.com/3dns/index.html

Proxy Server

  • server and client
      • looks like a server to the browser
      • looks like a client to the server
  • cache
      • reduces the transfer volume
  • keeps a log file (who has visited which sites? when?)
  • check and filter
      • certain content is not handed on (e.g. based on URLs, keywords, ...)
      • documents are filtered, parts removed, e.g. www.webwasher.com
      • virus check for downloads
  • on the fly, pages are extended with additional information and functionality (e.g. advertisements, banners, ...)

Proxy Performance

  • caching
      • hard drive
      • RAM
      • processor performance (comparison)
  • firewall, virus detection
      • processing power
      • I/O throughput

Table of Contents

  • 1. Architecture
  • 2. Web Server
  • 3. Web Client
  • 4. Performance and Efficiency

What Does a Web Client Do? (I)

  • read and parse the URL
      • extract the server name, get the server address (DNS)
      • extract the file / resource name
  • set up a TCP connection to the server or to the proxy
  • create the HTTP request
  • send the HTTP request
  • wait for the HTTP reply
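The client steps listed above can be sketched with the standard Java networking classes. This is a minimal illustration, not a full client: it assumes plain HTTP without a proxy, sends an HTTP/1.0 GET, and only prints the status line of the reply. The class name MiniClient and its helper methods are hypothetical.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class MiniClient {
    /** Port from the URL, or 80 as the HTTP default when none is given. */
    static int portOf(URI uri) {
        return uri.getPort() == -1 ? 80 : uri.getPort();
    }

    /** Resource name from the URL; an empty path is requested as "/". */
    static String resourceOf(URI uri) {
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) path = "/";
        return uri.getRawQuery() == null ? path : path + "?" + uri.getRawQuery();
    }

    /** Creates a minimal HTTP/1.0 GET request. */
    static String buildRequest(URI uri) {
        return "GET " + resourceOf(uri) + " HTTP/1.0\r\n"
                + "Host: " + uri.getHost() + "\r\n\r\n";
    }

    public static void main(String[] args) throws Exception {
        // Read and parse the URL.
        URI uri = URI.create(args.length > 0 ? args[0] : "http://www.teco.edu/index.html");
        System.out.print(buildRequest(uri));
        if (args.length == 0) return; // no network access unless a URL is given

        InetAddress address = InetAddress.getByName(uri.getHost()); // DNS lookup
        try (Socket socket = new Socket(address, portOf(uri))) {    // TCP connection
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(uri).getBytes(StandardCharsets.US_ASCII)); // send request
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            System.out.println(in.readLine()); // wait for the reply, print its status line
        }
    }
}
```

Run without arguments, the program only prints the request it would send; with a URL argument it performs the DNS lookup, connects, and prints the first line of the server's reply.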

What Does a Web Client Do? (II)

  • receive the HTTP reply
  • analyze and interpret the HTTP reply
      • HTTP parser
      • HTML parser
  • process / visualize the content received
      • parsing HTML and other media types
      • rendering
      • execute programs or run scripts
      • delegate handling to extensions (if necessary)

Web Browser

  • specific Web client
  • interface between the human user and the Web
  • visualizes HTML objects and often other standard media formats (e.g. GIF, JPEG)
  • clicking on URLs or entering URLs triggers an HTTP request/reply
  • browser extensions
      • helper applications
      • Netscape plug-ins
      • Java Virtual Machine, Microsoft ActiveX

Functional Components

  • communication
      • request documents from the server
      • provide them as stream or object
  • data analysis
      • parse the header
      • parse the DTD and the documents
      • data structure to represent the document
  • presentation
      • visualize parts of the document (HTML, GIF, JPEG, ...)
      • keep track of the location of presented components
  • interaction
      • when the user interacts with components, trigger the appropriate action

Graphical Browsers, e.g. IE5, Amaya, Mosaic

Graphical Browser - WebTV

More on the Viewer & Download http://developer.webtv.net/

Other Graphical Browser

Text Browser, e.g. Lynx

Audio/Voice Browser

  • acoustic "rendering"
      • content of a page is spoken (synthesized)
  • e.g. pwWebSpeak (http://www.prodworks.com/pwwebspeak/)
      • browser for blind and visually impaired users
      • navigation based on the semantics of the content (sentence, paragraph, ...)
      • uses a speech synthesizer
      • support for forms (e-commerce support)
      • support for client-side maps
      • no support for Java and JavaScript
      • discontinued
  • Web Accessibility Initiative (WAI)
      • enable/ease access to the Web for disabled users
      • http://www.w3.org/WAI/

Browser Ideology?

"Anyone who slaps a 'this page is best viewed with Browser X' label on a Web page appears to be yearning for the bad old days, before the Web, when you had very little chance of reading a document written on another computer, another word processor, or another network."

  • Tim Berners-Lee in Technology Review, July 1996

Functions of a Standard Browser

  • navigation
      • open, back, forward, ...
      • history
      • bookmarks
  • options on presentation
      • font, colors, ...
      • images on/off
      • browser style sheet
  • services
      • printing
      • saving: HTML, text, including images
      • (external) services: email, conference, news, ...

Further Design Aspects

  • single-threaded client
      • one request after another
      • no concurrency
  • multi-threaded/concurrent client
      • per document
      • components within the document (e.g. images) are processed with a separate thread
  • multi-session client
      • multiple sessions in parallel
      • usually in separate windows

Telnet as Browser

    #> telnet www.teco.edu 80 [RETURN]
    Trying 129.13.170.1...
    Connected to teco01a.teco.uni-karlsruhe.de.
    Escape character is '^]'.
    GET /index.html HTTP/1.0 [RETURN]
    [RETURN]
    HTTP/1.1 200 OK
    Date: Mon, 08 Mar 1999 20:56:14 GMT
    Server: Apache/1.2.1
    Connection: close
    Content-Type: text/html

    <html>
    <head>
    <title>Telecooperation Office (TecO)</title>
    <SCRIPT LANGUAGE="JavaScript">
    ...

Browser/Client - Java

class library in Java, examples:

  • java.net.URL
      • getHost()
      • getPort()
      • getProtocol()
      • openConnection()
  • java.net.URLConnection
      • getContent()
      • getHeaderField(String)
      • getContentType()
  • java.net.URLEncoder
      • deals with special characters in a URL
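A short sketch of how some of the listed methods behave. The class UrlParts and its two helpers are illustrative names only; the URL is the one used elsewhere in these slides, and no network connection is opened.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UrlParts {
    /** Splits a URL into protocol, host and port using java.net.URL. */
    static String describe(String spec) {
        try {
            URL u = URI.create(spec).toURL();
            return u.getProtocol() + " | " + u.getHost() + " | " + u.getPort();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec + " is not a parseable URL", e);
        }
    }

    /** Encodes a form or query value so it can be placed safely in a URL. */
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.teco.edu:80/index.html"));
        System.out.println(encode("web engineering & more"));
    }
}
```

Note that getPort() returns -1 when the URL does not name a port explicitly, and that URLEncoder applies form encoding, so a space becomes "+".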

Java-Client Example

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLConnection;

    public class viewsource2 {
        public static void main(String[] args) {
            if (args.length == 0) {
                System.err.println("usage: java viewsource2 <URL>");
                return;
            }
            try {
                // Open the URL for reading
                URL u = new URL(args[0]);
                URLConnection uc = u.openConnection();
                // Read the reply line by line; BufferedReader replaces the
                // deprecated DataInputStream.readLine()
                try (BufferedReader theHTML = new BufferedReader(
                        new InputStreamReader(uc.getInputStream()))) {
                    String thisLine;
                    while ((thisLine = theHTML.readLine()) != null) {
                        System.out.println(thisLine);
                    }
                }
            } catch (MalformedURLException e) {
                System.err.println(args[0] + " is not a parseable URL");
                System.err.println(e);
            } catch (IOException e) {
                System.err.println(e);
            }
        }
    }

source (adapted): E. R. Harold. Java Network Programming. O'Reilly, 1997.

Browser/Client – in Perl

  • several libraries
      • Socket library
      • LWP library
  • simple programming

    #!/usr/bin/perl
    use LWP::Simple;
    print (get $ARGV[0]);

  • useful for automation of clients, e.g.
      • recursive programs to mirror sites
      • implementing robots for search engines
      • simple load test software
      • scripts to check/monitor servers
  • C. Wong. Web Client Programming with Perl. O'Reilly, 1997.

Browser/Client – VisualBasic I

  • to implement a graphical web browser with specific functionality
  • VisualBasic program based on the Microsoft Internet Controls
  • component: WebBrowser
  • selected functions
      • Browser.Navigate2 URL
      • Browser.GoBack
      • Browser_BeforeNavigate2
      • Browser_NavigateComplete2
  • http://www.teco.edu/lehre/webe/beispiele/webbrowser1.zip

Browser/Client – VisualBasic II

Browser/Client – Unix Netscape Remote Control

  • (remote) control for the Netscape browser

    netscape -remote 'openURL(http://www.teco.edu)'

  • selection of commands:
      • openURL (URL)
      • openURL (URL, new-window)
      • saveAs ( )
      • saveAs (Output-File)
      • mailto (a, b, c)
      • addBookmark (URL, Title)
  • http://home.netscape.com/newsref/std/x-remote.html

Browser Software

collection of Browser software

  • http://wdvl.internet.com/Software/Browsers/
  • http://browserwatch.internet.com/browsers.html
  • www.browser.org

most widely used browsers

(figures estimated 01/01 from log files of several servers)

  • Microsoft Internet Explorer ~ 60%
  • Netscape Navigator / Communicator ~30%

Amaya

  • W3C reference implementation
  • Browser and editor
  • works strictly on DTD and the tree structure of the HTML/XML document

Other Clients

  • programs that send HTTP requests
      • meta search programs, e.g. WebFerret http://www.ferretsoft.com/netferret/index.html
      • index agents, robots
      • mirror tools, http-grep
  • part of proxies
      • acts as server towards the user
      • acts as client towards other servers
      • often implements only parts of the HTTP protocol
  • applications
      • filters (on content, URL, ...)
      • post-processing of content

Specific Clients

  • for other content types, e.g.
      • streaming audio, streaming video
      • VRML
      • MathML
      • ...
  • on different devices
      • PDAs, e.g. PalmPilot, Psion, HandheldPC, WinCE
      • phones, e.g. Nokia Communicator
  • web appliances
      • augmented reality: Active Badge, intelligent environments
      • web-enabled consumer devices: TV, ..., fridge

Table of Contents

  • 1. Architecture
  • 2. Web Server
  • 3. Web Client
  • 4. Performance and Efficiency

Increasing Efficiency

  • server side
      • multi-threading
      • hardware
      • caching
      • server farms
  • client side
      • multi-threading
      • caching
      • pre-fetch
  • in the proxy
      • caching
      • filtering
  • the protocol
      • HTTP-NG
      • FTP
      • compression

Caching - Metrics

  • hit rate
      • percentage of documents delivered from the cache
  • miss rate
      • 1 - hit rate
  • saved bandwidth / download volume
      • download volume (e.g. in Mbytes) saved because of the cache
  • improvement of the access time (median)
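The first three metrics above can be computed from simple counters. A minimal sketch with an illustrative class name; it assumes the cache reports, for every request, whether the document came from the cache and how large it was.

```java
class CacheStats {
    private long hits;
    private long misses;
    private long bytesFromCache;

    /** Records one request; bytes is the size of the delivered document. */
    void record(boolean servedFromCache, long bytes) {
        if (servedFromCache) {
            hits++;
            bytesFromCache += bytes; // volume that did not have to be downloaded
        } else {
            misses++;
        }
    }

    /** Fraction of documents delivered from the cache (0.0 if nothing was recorded). */
    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    /** Fraction of documents fetched from the origin server; equals 1 - hit rate. */
    double missRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) misses / total;
    }

    long savedBytes() {
        return bytesFromCache;
    }

    public static void main(String[] args) {
        CacheStats stats = new CacheStats();
        stats.record(true, 1000);
        stats.record(false, 500);
        stats.record(true, 200);
        stats.record(true, 300);
        System.out.println("hit rate:    " + stats.hitRate());
        System.out.println("miss rate:   " + stats.missRate());
        System.out.println("saved bytes: " + stats.savedBytes());
    }
}
```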

Caching at the Server

  • in the address space of the server process
      • keep the file names and the assigned handles in an index
      • file may reside inside the processor cache
      • problem when the files are accessed by another program (delete, write, update)
      • cooperative cache for several server processes, e.g. shared memory
      • keep "sticky files" separately
  • on the disk of the server
      • static data is usually there anyway
      • cache for content that is generated (external programs/databases)

Caching at the Client

  • in the address space of the Web browser
      • useful within sessions, e.g. back button
      • cooperative, for multi-threaded clients, e.g. one cache for multiple browser windows
  • on the hard disk at the client
      • persistent cache (used for several successive sessions)
      • often done for each user individually (questionable!)
      • shared cache for all users: problem of concurrent access
  • checking whether the data in the cache is still up to date
      • at every access
      • once per session
      • never

Caching in the Network

[Figure: Web client in the intranet, proxy cache, and Web server in the Internet, connected via HTTP]

  • caching proxy
      • e.g. placed at the gateway from the local (company) network into the Internet, or at gateways in the backbone
      • caching for all Web clients in an intranet
  • typical saving in transfer volume ~50%
  • often in connection with a firewall

Cache Control

  • HTTP header
      • Expires
      • Cache-Control: public, must-revalidate, proxy-revalidate, max-age
  • validators
      • Last-Modified, ETag
  • based on the browser or proxy configuration
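How a cache might use the max-age directive can be sketched as follows. This is a simplified illustration with hypothetical names: it looks only at max-age and ignores Expires, the validators and the other directives a real cache has to honour.

```java
import java.util.Locale;

class Freshness {
    /** Returns the max-age value in seconds, or -1 if the directive is absent or malformed. */
    static long maxAge(String cacheControl) {
        for (String directive : cacheControl.split(",")) {
            String d = directive.trim().toLowerCase(Locale.ROOT);
            if (d.startsWith("max-age=")) {
                try {
                    return Long.parseLong(d.substring("max-age=".length()).trim());
                } catch (NumberFormatException e) {
                    return -1; // malformed value: treat as absent
                }
            }
        }
        return -1;
    }

    /** A cached copy counts as fresh while its age is below max-age. */
    static boolean isFresh(String cacheControl, long ageSeconds) {
        long limit = maxAge(cacheControl);
        return limit >= 0 && ageSeconds < limit;
    }

    public static void main(String[] args) {
        String header = "public, max-age=3600";
        System.out.println("max-age: " + maxAge(header));
        System.out.println("fresh after 100 s:  " + isFresh(header, 100));
        System.out.println("fresh after 7200 s: " + isFresh(header, 7200));
    }
}
```

When the copy is no longer fresh, the cache would revalidate it with the server using the validators (Last-Modified or ETag) instead of serving it directly.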

More Efficient Protocols

  • HTTP-NG
      • keep the connection
      • stateful server
      • longer transactions are better supported
  • FTP
      • large files are often faster to transfer using FTP than HTTP
  • compression
      • e.g. HTTP-NG
      • compressing data on the server
      • decompressing data at the client
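The compress-on-the-server, decompress-at-the-client idea can be illustrated with gzip from the Java standard library. Note the assumption: the slide names HTTP-NG as its example, while this sketch uses gzip as a stand-in to show the round trip and the saving in transfer volume.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class Compression {
    /** Server side: compress the document before sending it. */
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Client side: restore the original document. */
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Repetitive markup, as is typical for HTML, compresses well.
        StringBuilder html = new StringBuilder("<html><body>");
        for (int i = 0; i < 200; i++) html.append("<p>Web Engineering</p>");
        html.append("</body></html>");

        byte[] original = html.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = compress(original);
        System.out.println("original:   " + original.length + " bytes");
        System.out.println("compressed: " + compressed.length + " bytes");
        System.out.println("round trip ok: "
                + java.util.Arrays.equals(original, decompress(compressed)));
    }
}
```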
