Information Needs IR, session 2 CS6200: Information Retrieval - - PowerPoint PPT Presentation

SLIDE 1

CS6200: Information Retrieval

Slides by: Jesse Anderton

Information Needs

IR, session 2

SLIDE 2
  • Information Retrieval is the field of Computer Science concerned with finding the information from a collection that is relevant to a user’s information need, as expressed by a query.
  • This general task has many possible concrete formalizations, which define collection, information need, relevance, and query more precisely.
  • Let’s look at several examples to get a sense of the field’s scope.

Information Retrieval

SLIDE 3
  • Currently, the most important search task is ad hoc search on the Internet.
  • The collection is the set of web pages indexed by the search engine.
  • The information need is the web content the user is looking for.
  • The query is an ordered list of keywords.
  • A document is relevant if it contains text on the same topic as the query.

Ad-Hoc Search
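
To make the four definitions above concrete, here is a minimal sketch in Python. It assumes a toy in-memory collection and approximates "same topic as the query" by counting how often the query keywords occur in each document. That scoring rule, the `tokenize` and `rank` helpers, and the sample documents are all illustrative assumptions, not how any real search engine decides relevance.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def rank(collection, query):
    """Return (doc_id, score) pairs, best first.

    The score is the number of times any query keyword occurs in the
    document. Documents that contain no query keyword are dropped.
    """
    query_terms = tokenize(query)
    results = []
    for doc_id, text in collection.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            results.append((doc_id, score))
    # Sort by descending score, breaking ties by document id.
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical collection: document id -> page text.
collection = {
    "d1": "cheap flights to iceland",
    "d2": "iceland travel guide and iceland weather",
    "d3": "music news and sports scores",
}

ranked = rank(collection, "iceland weather")
print(ranked)
```

Keyword overlap is only a rough stand-in for topical relevance: a page can be on topic without sharing any words with the query, and can share words without being on topic.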

SLIDE 4
  • Vertical Search focuses on information from a particular domain: flights, music, news, sports, etc.
  • The collection might be the set of all airline fares, research papers, or blog posts.
  • An information need can be very specific: “the cost of a flight to Iceland tomorrow”
  • The query may be structured using a web form, providing specific property values to search for.
  • Document relevance is sometimes less ambiguous: matching the search fields.

Vertical Search
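
A structured query of this kind can be sketched as a set of property values that a record must match exactly. The example below is a hedged illustration in Python: the fare records, the field names (`origin`, `destination`, `date`, `price`), and the `search` helper are invented for the example and do not come from any real fare system.

```python
def search(records, **fields):
    """Return the records whose properties equal every requested field value."""
    return [
        record
        for record in records
        if all(record.get(name) == value for name, value in fields.items())
    ]


# Hypothetical collection of airline fares.
fares = [
    {"origin": "BOS", "destination": "KEF", "date": "2024-05-01", "price": 420},
    {"origin": "BOS", "destination": "KEF", "date": "2024-05-02", "price": 380},
    {"origin": "BOS", "destination": "LHR", "date": "2024-05-01", "price": 510},
]

# The "web form" supplies specific property values to search for.
matches = search(fares, destination="KEF", date="2024-05-01")
print([fare["price"] for fare in matches])
```

Here relevance is a yes-or-no test on the search fields, which is why it is less ambiguous than topical relevance in ad hoc search.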

SLIDE 5
  • Enterprise Search is vertical search run against a company’s internal content.
  • The collection is the set of documents, e-mails, forum threads, wiki pages, etc. in the company’s internal network.
  • Information needs, queries, and relevance are typically defined as for ad-hoc search.

Enterprise Search

SLIDE 6
  • Desktop Search focuses on searching the contents of your computer.
  • The collection is the set of files (and contacts, messages, events, etc.) stored on your computer.
  • An information need is generally either a file, or information stored in a file.
  • A query can be a list of keywords as in ad-hoc search, or a list of property values in a custom query language.
  • Relevance is defined as in ad-hoc search.

Desktop Search
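
As a rough illustration of mixing keywords with property values, the Python sketch below parses a query such as `kind:pdf budget`. The `key:value` syntax, the `parse_query` and `matches` helpers, and the file records are assumptions made up for this example; actual desktop search tools each define their own query language.

```python
def parse_query(query):
    """Split a query into free keywords and key:value property filters."""
    keywords, filters = [], {}
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and key and value:
            filters[key.lower()] = value.lower()
        else:
            keywords.append(token.lower())
    return keywords, filters


def matches(file_info, keywords, filters):
    """True if every filter equals the file's property and every keyword
    occurs in the file's name or text."""
    props_ok = all(
        str(file_info.get(name, "")).lower() == value
        for name, value in filters.items()
    )
    haystack = (file_info["name"] + " " + file_info.get("text", "")).lower()
    return props_ok and all(keyword in haystack for keyword in keywords)


# Hypothetical collection of files on one computer.
files = [
    {"name": "budget_2023.pdf", "kind": "pdf", "text": "annual budget summary"},
    {"name": "budget_notes.txt", "kind": "txt", "text": "draft budget notes"},
    {"name": "holiday.jpg", "kind": "image", "text": ""},
]

keywords, filters = parse_query("kind:pdf budget")
hits = [f["name"] for f in files if matches(f, keywords, filters)]
print(hits)
```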

SLIDE 7
  • Peer to Peer Search focuses on finding content shared on peer to peer networks.
  • The collection is the set of all files currently shared by any peer on the network.
  • An information need is a particular file, e.g. a music video.
  • A query is often a keyword list, but may use an extended query language.
  • A document is relevant only if it’s the file the user wanted.

Peer to Peer Search

SLIDE 8
  • Question Answering tries to answer questions posed as normal dialog.
  • Information needs are usually restricted to concisely stated answers.
  • Queries are posed as a single sentence of natural language text.
  • A response is relevant if it answers the question correctly, and if it is expressed clearly (e.g. fluently).

Question Answering

SLIDE 9
  • Information Retrieval is the field of Computer Science concerned with finding the information from a collection that is relevant to a user’s information need, as expressed by a query.
  • A query is an expression of an information need, and not the need itself.
  • Next, we’ll take a look at some of the distinct types of information needs users have.

Wrapping Up