Text Mining in Search Engines, by DJ Ambler



SLIDE 1

Text Mining in Search Engines

By: DJ Ambler
With special thanks to the Internet

SLIDE 2

Overview

  • What is text mining?
  • How is it used in search engines?
SLIDE 3

Text Mining Definition

  • A way to extract meaning from text
  • Structuring the input text, deriving patterns from it, then evaluating and interpreting the output
  • “High quality” in text mining: usually some combination of relevance, novelty, and interest
SLIDE 4

Text Mining Tasks

  • Text categorization
  • Text clustering
  • Concept/entity extraction
  • Production of granular taxonomies
  • Sentiment analysis
  • Document summarization
  • Entity relation modeling
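
As a rough illustration of the first task in the list, text categorization, the sketch below assigns a document to the category whose example text it most resembles, using cosine similarity over word counts. The category names, the example texts, and the function names are all made up for illustration; real systems typically use trained classifiers rather than one hand-written example per category.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical labeled examples, one per category.
EXAMPLES = {
    "sports": "team wins match goal score player",
    "finance": "stock market price bank shares profit",
}

def categorize(text):
    """Return the category whose example is most similar to the text."""
    doc = bag_of_words(text)
    return max(EXAMPLES, key=lambda c: cosine(doc, bag_of_words(EXAMPLES[c])))
```

For instance, `categorize("the player scored a goal")` picks "sports" because the document shares "player" and "goal" with that example and nothing with the other.
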
SLIDE 5

Parts of a Search Engine

  • Crawler
  • Indexer
  • Ranker
SLIDE 6

Crawler (Spider)

Issues in crawling:

  • 1. What to crawl?
  • 2. How much to crawl?
  • 3. How often to crawl?
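
The three questions above can be sketched as parameters of a small crawler. This is a minimal sketch under stated assumptions, not how any particular search engine works: the "web" is a hypothetical in-memory dictionary instead of real HTTP requests, "what to crawl" is reduced to a URL prefix, "how much" to a page budget, and "how often" to a fixed recrawl interval. All names here are illustrative.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
FAKE_WEB = {
    "http://example.com/": ["http://example.com/a", "http://other.org/"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/"],
    "http://example.com/b": [],
    "http://other.org/": ["http://other.org/x"],
}

def crawl(seed, allowed_prefix, max_pages):
    """Breadth-first crawl limited by a URL prefix and a page budget."""
    seen = {seed}
    queue = deque([seed])
    visited = []
    # "How much to crawl?": stop once the page budget is used up.
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in FAKE_WEB.get(url, []):
            # "What to crawl?": only follow links under the allowed prefix.
            if link.startswith(allowed_prefix) and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

def due_for_recrawl(last_crawled, now, interval):
    """"How often to crawl?": a page is due once the interval has elapsed."""
    return now - last_crawled >= interval
```

With the sample data, `crawl("http://example.com/", "http://example.com/", 10)` visits the three example.com pages and never follows the link to other.org.
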
SLIDE 7

Indexer

  • Stop words
  • Stemming
  • Issues
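
To make stop words and stemming concrete, here is a minimal sketch of an indexer that drops stop words, stems the remaining words, and builds an inverted index (term to document ids). The stop-word list is a tiny illustrative sample, and `naive_stem` is a deliberately crude suffix stripper written for this example, not the Porter stemmer or any library routine.

```python
from collections import defaultdict

# Small illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to"}

def naive_stem(word):
    """Strip a few common English suffixes. This is not the Porter stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, keep alphanumeric tokens, drop stop words, then stem."""
    words = "".join(c if c.isalnum() else " " for c in text.lower()).split()
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

def build_index(docs):
    """Map each stemmed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index
```

Because "searching" and "search" both reduce to "search", a query for either form finds both documents. The same crude rules turn "pages" into "pag", which hints at one kind of issue an indexer faces: stemming can merge or mangle words in ways that hurt results.
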
SLIDE 8

Ranker

  • Receives query
  • Searches index
  • Ranks the pages based on complex algorithms
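
The three steps above can be sketched in a few lines. The index contents and the scoring rule (sum of term occurrences) are assumptions chosen to keep the example short; production rankers combine many more signals.

```python
# Hypothetical prebuilt index: term -> {document id: occurrences of the term}.
INDEX = {
    "text": {"doc1": 3, "doc2": 1},
    "mining": {"doc1": 2, "doc3": 1},
    "search": {"doc2": 4, "doc3": 2},
}

def rank(query):
    """Return the ids of documents matching the query, best match first."""
    scores = {}
    # Step 1: receive the query and split it into terms.
    for term in query.lower().split():
        # Step 2: search the index for each term.
        for doc_id, count in INDEX.get(term, {}).items():
            scores[doc_id] = scores.get(doc_id, 0) + count
    # Step 3: rank by total occurrences, breaking ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

For the query "text mining", doc1 comes first because it contains both terms, and doc2 and doc3 tie with one occurrence each.
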
SLIDE 9

Ranking Criteria

  • Number of matching query words in the page
  • Proximity of matching words to one another
  • Location of terms within the page
  • Location of terms within tags, e.g. <title>, <h1>, link text, body text, etc.

  • Frequency of terms on the page and in general
  • How “fresh” the page is
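
One way to see how criteria like these might combine is a single scoring function. Everything specific below is an assumption made for illustration: the field weights, the smoothed inverse-document-frequency formula standing in for frequency "in general", the 30-day freshness scale, and the choice to simply add the parts together. Real engines do not publish their formulas.

```python
import math

# Hypothetical weights for where a term appears; the values are arbitrary.
FIELD_WEIGHTS = {"title": 3.0, "h1": 2.0, "link": 1.5, "body": 1.0}

def idf(term, doc_freq, total_docs):
    """Terms that appear in fewer documents overall count for more."""
    return math.log((1 + total_docs) / (1 + doc_freq.get(term, 0))) + 1.0

def proximity_bonus(body, query_terms):
    """1 / (smallest gap between two different query terms), or 0.0."""
    words = body.lower().split()
    positions = [(i, w) for i, w in enumerate(words) if w in query_terms]
    best = None
    # The closest pair of different terms is always adjacent in this list.
    for (i, a), (j, b) in zip(positions, positions[1:]):
        if a != b and (best is None or j - i < best):
            best = j - i
    return 1.0 / best if best else 0.0

def score(page, query, doc_freq, total_docs):
    """Combine the criteria into one number; the mix is an arbitrary choice."""
    terms = set(query.lower().split())
    weighted = 0.0
    for field, text in page["fields"].items():
        words = text.lower().split()
        for term in terms:
            # Frequency on the page, weighted by tag location and by rarity.
            tf = words.count(term)
            weighted += FIELD_WEIGHTS.get(field, 1.0) * tf * idf(term, doc_freq, total_docs)
    all_words = set(" ".join(page["fields"].values()).lower().split())
    matched = sum(1 for t in terms if t in all_words)  # matching query words
    proximity = proximity_bonus(page["fields"].get("body", ""), terms)
    fresh = 1.0 / (1.0 + page["age_days"] / 30.0)  # newer pages score higher
    return weighted + matched + proximity + fresh
```

A page is represented here as a dictionary with a `fields` mapping (tag name to text) and an `age_days` number. Under these weights, a recent page with the query words adjacent in its title outscores an old page that mentions them only in the body.
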
SLIDE 10

Sources

  • Cong, G. (n.d.). Introduction to Text Mining and Web Search. Retrieved November 3, 2017.
  • Joshi, H. (n.d.). Search Engines - Text Mining in Action. Retrieved November 3, 2017, from https://www.scribd.com/document/176948623/Search-Engines-Text-Mining-in-Action