

2011-11-30

Web Mining

Andreas Andersson Gustav Strömberg Sandra Stendahl

Introduction

  • Web mining
      o Structure mining
      o Content mining
      o Usage mining
  • Web usage mining
      o Acquire the data
      o Preprocess the data
      o Detect patterns in the data
      o Use the detected patterns - applications
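
As a rough illustration of the "acquire" and "preprocess" steps, the sketch below parses web server log lines and groups each visitor's requests into sessions. The slides do not specify a log format or a session rule, so the Common Log Format, the 30-minute inactivity timeout, and the names `parse_line` and `sessionize` are all assumptions made for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed Common Log Format:
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_RE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')


def parse_line(line):
    """Return (host, timestamp, path, status), or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    host, ts, _method, path, status = m.groups()
    when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
    return host, when, path, int(status)


def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group successful requests per host; a gap longer than `timeout` starts a new session."""
    by_host = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is not None and rec[3] == 200:  # keep only successful requests
            by_host[rec[0]].append((rec[1], rec[2]))

    sessions = []
    for host, hits in by_host.items():
        hits.sort()  # chronological order
        current = [hits[0][1]]
        for (prev_time, _), (time, path) in zip(hits, hits[1:]):
            if time - prev_time > timeout:
                sessions.append((host, current))
                current = []
            current.append(path)
        sessions.append((host, current))
    return sessions
```

Identifying a visitor by host address alone is a simplification; proxies and shared addresses can merge several users into one session.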

Techniques

  • Preprocessing
      o Eliminating Web robots
      o Avoiding mislabeled sessions
  • Mining
      o Indirect association
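
One simple way to approach the robot-elimination step is a heuristic filter over sessions. The sketch below is not the presenters' method: it assumes each session is a dict with hypothetical `user_agent` and `paths` keys, and it flags a session as a robot if its user agent contains a common crawler marker or if it requested `/robots.txt`, the file defined by the robot exclusion standard.

```python
# Assumed substrings that often appear in crawler user agents (not exhaustive).
BOT_MARKERS = ("bot", "crawler", "spider")


def looks_like_robot(session):
    """Heuristic check: crawler-like user agent, or a request for /robots.txt."""
    agent = session.get("user_agent", "").lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return True
    return any(path == "/robots.txt" for path in session["paths"])


def drop_robots(sessions):
    """Return only the sessions that do not look like robot traffic."""
    return [s for s in sessions if not looks_like_robot(s)]
```

Robots that disguise their user agent and ignore `/robots.txt` pass this filter, so it only removes well-behaved crawlers.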

Clustering techniques

  • K-Means
  • AprioriAll
  • Buckshot
  • Suffix tree
  • Fractionation
  • Single pass
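
To make the first technique in the list concrete, here is a minimal K-Means sketch in plain Python. It assumes each session has already been encoded as a numeric vector (for example, 1 if a page was visited and 0 otherwise); that encoding, the function names, and the choice to seed the centroids with the first `k` points are simplifications for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Cluster `points` into `k` groups; returns (assignment, centroids)."""
    # Naive initialisation: the first k points (real implementations
    # usually pick random or k-means++ seeds).
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

The result depends on the initial centroids, so different seeds can produce different clusters.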

Applications

  • Website modifications
  • System improvement
  • Web personalization
  • Business intelligence

Summary

  • Web usage mining: three major parts
  • Preprocessing is a major step to reduce time consumption
  • Four major application areas

Bibliography

  • Pang-Ning Tan, Vipin Kumar, “Mining association patterns in web usage data”, 2002.
  • Magdalini Eirinaki, Michalis Vazirgiannis, “Web mining for web personalization”, 2003.
  • Maria-Luiza Antonie, Osmar R. Zaïane, “Mining positive and negative association rules: An approach for confined rules”, 2004.
  • M. Koster, “A standard for robot exclusion”, http://info.webcrawler.com/mak/projects/robots/norobots.html, 1994.
  • C. Brodley and M.A. Friedl, “Identifying mislabeled training data”, Journal of Artificial Intelligence Research, vol. 11, pp. 131-167, 1999.
  • Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan, “Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data”, 1999.

More bibliography

  • Mike Perkowitz, Oren Etzioni, “Adaptive Web Sites: Automatically Synthesizing Web Pages”, 1998.
  • Tak Woon Yan, Matthew Jacobsen, Hector Garcia-Molina, Umeshwar Dayal, “From User Access Patterns to Dynamic Hypertext Linking”, 1996.
  • Bamshad Mobasher, Robert Cooley, Jaideep Srivastava, “Creating Adaptive Web Sites Through Usage-Based Clustering of URLs”, 1999.
  • BizIntel, http://www.bizintel.se/ (2011).
  • webtrends, http://webtrends.com/ (2011).
  • Samuel Sambasivan, Nick Theodosopoulos, “Advanced data clustering methods of mining web documents”, 2006.
  • R. Agrawal and R. Srikant, “Fast algorithms for mining association rules”, in Proc. of the 20th VLDB Conference, Santiago, Chile, 1994.


Questions?