Inferring Searcher Intent


Eugene Agichtein, Emory University
AAAI 2010 Tutorial, 11 July 2010
Tutorial website (for expanded and updated bibliography): http://ir.mathcs.emory.edu/intent_tutorial/
Instructor contact information:



Email: eugene@mathcs.emory.edu Web: http://www.mathcs.emory.edu/~eugene/


Tutorial Overview

  • Part 1: Search Intent Modeling

– Motivation: how intent inference could help search
– Search intent & information seeking behavior in traditional IR
– Searcher models: from eye tracking to clickthrough mining

  • Part 2: Inferring Web Searcher Intent

– Inferring result relevance: clicks
– Richer interaction models: clicks + browsing

  • Part 3: Applications and Extensions

– Implicit feedback for ranking
– Contextualized prediction: session modeling
– Personalization, query suggestion, active learning


About the Instructor

  • Eugene Agichtein (Ah-ghi-sh-tein)

http://www.mathcs.emory.edu/~eugene/

  • Research: Information retrieval and data mining

– Mining search behavior and interactions in web search
– Text mining, information extraction, and question answering

  • Relevant experience:

2006-present: Assistant Professor, Emory University
Summer 2007: Visiting Researcher, Yahoo! Research
2004-06: Postdoc, Microsoft Research
1998-2004: PhD student, Columbia University (Databases/IR)


Outline: Search Intent and Behavior

Motivation: how intent inference could help search

  • Search intent and information seeking behavior

– Classical models of information seeking

  • Web searcher intent
  • Web searcher behavior

– Levels of modeling: micro-, meso-, and macro-levels
– Variations in web searcher behavior
– Click models


Some Key Challenges for Web Search

  • Query interpretation (infer intent)
  • Ranking (high dimensionality)
  • Evaluation (system improvement)
  • Result presentation (information visualization)


Example: Task-Goal-Search Model

Example query: “car safety ratings consumer reports”


Information Retrieval Process Overview


[Figure: the IR process as a cycle: source selection → query formulation → search → ranked list → selection → examination → document delivery, with loops for query reformulation, vocabulary learning, relevance feedback, and source reselection. Example query: “car safety ratings”. Results are returned on a Search Engine Result Page (SERP).]

Credit: Jimmy Lin, Doug Oard, …


Explicit Intentions in Query Logs

  • Match known goals (from ConceptNet) to query logs


Strohmaier et al., K-Cap 2009


Unfortunately, most queries are not so explicit…


Outline: Search Intent and Behavior

Motivation: how intent inference could help search
Search intent and information seeking behavior

– Classical models of information seeking

  • Web Searcher Intent

– Broder
– Rose
– More recent?

  • Web Searcher Behavior

– Levels of modeling: micro-, meso-, and macro-levels
– Variations in web searcher behavior
– Click models

  • Challenges and open questions


Information Seeking Funnel

  • Wandering: the user does not have an information-seeking goal in mind. May have a meta-goal (e.g., “find a topic for my final paper”).

  • Exploring: the user has a general goal (e.g., “learn about the history of communication technology”) but not a plan for how to achieve it.

  • Seeking: the user has started to identify information needs that must be satisfied (e.g., “find out about the role of the telegraph in communication”), but the needs are open-ended.

  • Asking: the user has a very specific information need that corresponds to a closed-class question (“when was the telegraph invented?”).


D. Rose, 2008


Models of Information Seeking

  • “Information-seeking … includes recognizing … the information problem, establishing a plan of search, conducting the search, evaluating the results, and … iterating through the process.” (Marchionini, 1989)

– Query formulation
– Action (query)
– Review results
– Refine query


Adapted from: M. Hearst, SUI, 2009


Reviewing Results: Relevance Clues

  • What makes information or information objects relevant? What do people look for in order to infer relevance?

– Topicality (subject relevance)
– Extrinsic (task- or goal-specific)

  • Information Science “clues research”: uncover and classify the attributes or criteria used for making relevance inferences


Information Scent for Navigation

  • Examine clues where to find useful information


Search results listings must provide the user with clues about which results to click


Dynamic “Berry Picking” Model

  • Information needs change during interactions


[Bates, 1989] M.J. Bates. The design of browsing and berrypicking techniques for the online search interface. Online Review, 13(5):407–431, 1989.


Information Foraging Theory

  • Goal: maximize rate of information gain.
  • Patches of information: websites.
  • Basic problem: should I continue in the current patch or look for another patch? Depends on the expected gain from continuing in the current patch, and on how long to keep searching in that patch.

Pirolli and Card, CHI 1995

slide-9
SLIDE 9

Eugene Agichtein, Emory University 11 July 2010 AAAI 2010 Tutorial: Inferring Searcher Intent 9

AAAI 2010 Tutorial: Inferring Searcher Intent 7/11/2010

Diminishing returns: 80% of users scan only first 3 pages of search results

  • Charnov’s Marginal Value Theorem
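Charnov's patch-leaving rule can be sketched numerically. This is a toy illustration, not from the tutorial: the gain curve g(t), its parameters, and the travel time between patches are all assumed; what the theorem prescribes is only the rule itself, namely to leave a patch when the marginal gain rate drops to the overall average rate of gain.

```python
# Sketch of Charnov's Marginal Value Theorem applied to search "patches".
# Assumed diminishing-returns gain curve: g(t) = G * (1 - exp(-t / tau))
# for time t spent in a patch, plus a fixed travel time between patches.
import math

def gain(t, G=10.0, tau=5.0):
    """Cumulative information gained after t seconds in a patch."""
    return G * (1.0 - math.exp(-t / tau))

def optimal_leave_time(travel=3.0, G=10.0, tau=5.0, dt=0.001):
    """Find t where the marginal rate g'(t) falls to the average rate g(t)/(t+travel)."""
    t = dt
    while t < 100.0:
        marginal = (gain(t + dt, G, tau) - gain(t, G, tau)) / dt
        average = gain(t, G, tau) / (t + travel)
        if marginal <= average:
            return t
        t += dt
    return t

# Classic MVT prediction: longer travel time between patches
# means staying longer in the current patch before leaving.
assert optimal_leave_time(travel=6.0) > optimal_leave_time(travel=2.0)
```

The 80%-of-users observation above fits this picture: once the marginal gain of scanning further result pages drops below what a fresh query (a new patch) would yield, searchers stop.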


Hotel Search

Goal: find the cheapest 4-star hotel in Paris.
Step 1: pick a hotel search site
Step 2: scan the list
Step 3: go to Step 1


Example: Hotel Search (cont’d)


Orienteering vs. Teleporting

  • Orienteering:

– Searcher issues a quick, imprecise query to get to approximately the right region of the information space
– Searchers follow known paths, taking small steps that move them closer to their goal
– Easy: does not require generating a “perfect” query

  • Teleporting:

– Issue a (longer) query to jump directly to the target
– Expert searchers issue longer queries
– Requires more effort and experience
– Until recently, was the dominant IR model


Teevan et al., CHI 2004


Serendipity


Andre et al., CHI 2009


Summary of Models

  • Models: static, berry-picking, information foraging, orienteering, serendipity
  • Classical IR systems research mainly uses the simplest form of relevance (topicality)
  • Open questions:

– How people recognize other kinds of relevance
– How to incorporate other forms of relevance (e.g., user goals/needs/tasks) into IR systems


Part 1: Search Intent and Behavior

Motivation: how intent inference could help search
Search intent and information seeking behavior

Classical models of information seeking

  • Web Searcher Intent

– Broder
– Rose
– More recent?

  • Web Searcher Behavior

– Levels of modeling: micro-, meso-, and macro-levels
– Variations in web searcher behavior
– Click models


Intent Classes (top level only)

User intent taxonomy (Broder 2002)

– Informational: want to learn about something (~40% / 65%)
– Navigational: want to go to that page (~25% / 15%)
– Transactional: want to do something, web-mediated (~35% / 20%)

  • Access a service
  • Downloads
  • Shop

– Gray areas:

  • Find a good hub
  • Exploratory search: “see what’s there”


Example queries: History nonya food, Singapore Airlines, Jakarta weather, Kalimantan satellite images, Nikon Finepix, Car rental Kuala Lumpur [from SIGIR 2008 Tutorial, Baeza-Yates and Jones]
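As a toy illustration of the taxonomy, queries can be routed to the three top-level classes with surface rules. The cue lists and the `classify_intent` helper below are hypothetical, invented for this sketch; real classifiers use query-log and click features rather than keyword lists.

```python
# Minimal, hypothetical rule-based classifier for Broder's (2002)
# top-level intent classes. Cue lists are illustrative assumptions.
NAVIGATIONAL_CUES = {"www", "com", "org", "homepage", "login"}
TRANSACTIONAL_CUES = {"download", "buy", "shop", "order", "rent", "rental", "booking"}

def classify_intent(query: str) -> str:
    # URL-like tokens suggest navigational; action verbs suggest transactional.
    tokens = query.lower().replace(".", " ").split()
    if any(t in NAVIGATIONAL_CUES for t in tokens):
        return "navigational"
    if any(t in TRANSACTIONAL_CUES for t in tokens):
        return "transactional"
    return "informational"  # default to the largest class

print(classify_intent("www.singaporeair.com"))     # navigational
print(classify_intent("car rental kuala lumpur"))  # transactional
print(classify_intent("jakarta weather"))          # informational
```

Even this crude sketch shows why the "gray areas" bullet matters: a query like "nikon finepix" matches no cue and defaults to informational, although the searcher may intend to shop.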


Extended User Goal Taxonomy


Rose et al., 2004


Complex Search

A complex search task refers to cases where:

  • the searcher needs to conduct multiple searches to locate the information sources needed,
  • completing the search task spans multiple sessions (the task is interrupted by other things),
  • the searcher needs to consult multiple sources of information (all the information is not available from one source, e.g., a book, a webpage, a friend),
  • it requires a combination of exploration and more directed information-finding activities,
  • it often requires note-taking (one cannot hold all the information needed to satisfy the final goal in memory), and
  • the specificity of the goal tends to vary during the search process (often starting with exploration).


Aula and Russell, 2008


Complex Search (Cont’d)


Aula and Russell, 2008


Web Search Queries

  • Cultural and educational diversity
  • Short queries and impatient interaction

– Few queries posed and few answers seen (first page)
– Reformulation common

  • Smaller and different vocabulary

– Not “expert” searchers!
– “Which box do I type in?”


Intent Distribution by Topic

[from SIGIR 2008 Tutorial, Baeza-Yates and Jones]


Query Distribution by Demographics

  • Education:
  • Ethnicity:
  • Gender:


[Weber & Castillo, SIGIR 2010]


Query Demographics 2: Demo

Keywords have demographic signatures

– Microsoft adCenter Demographics Prediction: http://adlab.msn.com/DPUI/DPUI.aspx

[Screenshots: adCenter (posters); Quantcast]


Domain-Specific Intents: Named Entities

  • Named entities (persons, organizations, places) are often searched: “Brittany Spears”?
  • Popular phrases by entity type:


[Yin & Shah, WWW 2010]


Example Intent Taxonomy for Musicians

  • Musicians (most popular phrases)


[Yin & Shah, WWW 2010]


Analyzing Searches: Funneling

  • What is the intent of customers that type such queries?
  • Hint: What they searched before/after?

– Search Funnels: http://adlab.msn.com/searchfunnel/
– How can you catch customers earlier?
– What do customers do when they leave?


Part 1: Search Intent and Behavior

Motivation: how intent inference could help search
Search intent and information seeking behavior

Classical models of information seeking

Web Searcher Intent

Broder
Rose
Demographics

  • Web Searcher Behavior

– Levels of modeling: micro-, meso-, and macro-levels
– Variations in web searcher behavior
– Click models


Web Searcher Behavior

  • Meso-level: query, intent, and session characteristics
  • Micro-level: how searchers interact with result pages
  • Macro-level: patterns, trends, and interests


Levels of Understanding Searcher Behavior

  • Micro (eye tracking): lowest level of detail, milliseconds
  • Meso (field studies): mid-level, minutes to days
  • Macro (session analysis): millions of observations, days to months


[Daniel M. Russell, 2007]


Search Behavior: Scales


from: Pirolli, 2008


Information Retrieval Process (User view)

[Figure: the same IR process pipeline, annotated from the user’s view: source selection, query formulation, search, ranked-list examination, document delivery; with loops for query reformulation, vocabulary learning, relevance feedback, and source reselection.]


People Look at Only a Few Results

(Source: iprospect.com, WhitePaper_2006_SearchEngineUserBehavior.pdf)


Snippet Views Depend on Rank

Mean: 3.07 Median: 2.00

[Daniel M. Russell, 2007]


Snippet Views and Clicks Depend on Rank

[from Joachims et al, SIGIR 2005]


“Eyes are a Window to the Soul”

  • Eye tracking gives information about search interests:

– Eye position
– Pupil diameter
– Saccades and fixations


[Figure: eye-tracking setup (camera), with gaze patterns for reading vs. visual search.]


Micro-level: Examining Results

  • Users rapidly scan the search result page
  • What they see in lower summaries may influence judgment of a higher result
  • Spend most time scrutinizing top results 1 and 2

– Trust the ranking


[Daniel M. Russell, 2007]


POM: Partially Observable Model


Wang et al., WSDM 2010


Result Examination (cont’d)

  • Searchers might use the mouse to focus reading attention, to bookmark promising results, or not at all
  • Behavior varies with task difficulty and user expertise


[K. Rodden, X. Fu, A. Aula, and I. Spiro, Eye-mouse coordination patterns on web search results pages, Extended Abstracts of ACM CHI 2008]


Result Ex. (cont): Predicting Eye-Mouse coordination


Guo & Agichtein, CHI 2010

[Figure: eye-mouse Euclidean distance over time for three example searches, with the distance threshold and the predicted coordination intervals.]

Coordination patterns, actual vs. predicted: no coordination (30%), bookmarking (25%), eye follows mouse (25%)
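The eye-mouse coordination idea can be sketched as a simple distance threshold: flag the moments when gaze and cursor stay close together. The function, threshold value, and sample trajectories below are illustrative assumptions, not the actual model of Guo & Agichtein.

```python
# Hypothetical sketch of threshold-based eye-mouse coordination detection:
# if the Euclidean distance between gaze and mouse positions is under a
# threshold, mark that sample as "coordinated".
import math

def coordinated(eye_points, mouse_points, threshold=150.0):
    """Per-sample booleans: is the eye-mouse distance below the threshold?"""
    flags = []
    for (ex, ey), (mx, my) in zip(eye_points, mouse_points):
        dist = math.hypot(ex - mx, ey - my)
        flags.append(dist < threshold)
    return flags

# Toy gaze/cursor trajectories in screen pixels.
eye = [(100, 100), (400, 120), (410, 300)]
mouse = [(120, 110), (130, 115), (400, 310)]
print(coordinated(eye, mouse))  # [True, False, True]
```

In practice a learned model replaces the fixed threshold, since (as the percentages above show) a large share of behavior is bookmarking or no coordination at all.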


Macro-Level (Session) Analysis

  • Can examine theoretical user models in light of empirical data:

– Orienteering?
– Foraging?
– Multi-tasking?

  • Search is often a multi-step process:

– Find or navigate to a good site (“orienteering”)
– Browse for the answer there: [actor most oscars] vs. [oscars]

  • Teleporting

– “I wouldn’t use Google for this, I would just go to…”

  • Triangulation

– Draw information from multiple sources and interpolate
– Example: “how long can you last without food?”


Users (sometimes) Multi-task


[Daniel M. Russell, 2007]


Kinds of Search+Browsing Behavior


[Daniel M. Russell, 2007]


Parallel Browsing Behavior [Huang & White, HT 2010]

  • 57% of all tabbed sessions are parallel browsing
  • Can mean multi-tasking
  • Common scenario: “branching” in exploring search results


Search Engine Switching Behavior


White et al., CIKM 2009

  • 4% of all search sessions contained a switching event
  • Switching events:

– 58.6 million switching events in 6-month period

  • 1.4% of all Google / Yahoo! / Live queries followed by switch

– 12.6% of all switching events involved the same query
– Two-thirds of switching events came from the browser search box

  • Users:

– 72.6% of users used multiple engines in the 6-month period
– 50% of users switched search engines within a session


Overview of Search Engine Switching

  • Switching is more frequent in longer sessions

White et al., CIKM 2009


Overview of Switching - Survey

  • 70.5% of survey respondents reported having switched

– Remarkably similar to the 72.6% observed in logs

  • Those who did not switch:

– Were satisfied with the current engine (57.8%)
– Believed no other engine would perform better (24.0%)
– Felt that it was too much effort to switch (6.8%)
– Other reasons included brand loyalty, trust, privacy

  • Within-session switching:

– 24.4% of switching users did so “Often” or “Always”
– 66.8% of switching users did so “Sometimes”

White et al., CIKM 2009


Reasons for Engine Switching

  • Three types of reasons:

– Dissatisfaction with the original engine
– Desire to verify or find additional information
– User preference

Other reasons included:

  • Loyalty to the destination engine
  • Multi-engine applications
  • Hope (!)

White et al., CIKM 2009


Pre-switch Behavior

  • Most common are queries and non-SERP clicks
  • This is the action immediately before the switch
  • What about pre-switch activity across the session?

White et al., CIKM 2009


Pre-switch Behavior (Survey)

“Is there anything about your search behavior immediately preceding a switch that may indicate to an observer that you are about to switch engines?”

  • Common answers:

– Try several small query changes in pretty quick succession
– Go to more than the first page of results, again often in quick succession and often without clicks
– Go back and forth from the SERP to individual results, without spending much time on any
– Click on lots of links, then switch engines for additional info
– Do not immediately click on something

White et al., CIKM 2009


Post-switch Behavior

  • Extending the analysis beyond next action:

– 20% of switches eventually lead to a return to the origin engine
– 6% of switches eventually lead to use of a third engine

  • > 50% led to a result click. Are users satisfied?

White et al., CIKM 2009


Post-Switch Satisfaction

  • Measures of user effort / activity: # Queries, # Actions
  • Measures of interaction quality: % queries with no clicks, # actions to a SAT action (a click with >30 sec dwell)
  • After switching, users issue more queries/actions and seem less satisfied (higher % NoClicks and more actions to SAT)
  • Switching queries may be challenging for search engines

Activity        # Queries (Origin / Dest.)    # Actions (Origin / Dest.)
All Queries     3.14 / 3.70                   9.85 / 11.62
Same Queries    3.08 / 3.73                   9.03 / 10.25

Success         % NoClicks (Origin / Dest.)   # Actions to SAT (Origin / Dest.)
All Queries     49.7 / 52.7                   3.81 / 4.71
Same Queries    54.5 / 59.7                   3.67 / 4.61

White et al., CIKM 2009
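The "# Actions to SAT" metric relies on the dwell-time heuristic: a click counts as satisfied (SAT) when the post-click dwell exceeds 30 seconds. A minimal sketch of that counting rule, with hypothetical session data:

```python
# Sketch of the SAT-action heuristic: count clicks whose post-click
# dwell time exceeds a 30-second threshold. The session data is made up.
def sat_actions(dwell_times, threshold=30.0):
    """Count clicks whose dwell time (seconds) exceeds the SAT threshold."""
    return sum(1 for t in dwell_times if t > threshold)

session = [5.0, 42.0, 12.0, 95.0]   # dwell per clicked result, in seconds
print(sat_actions(session))         # 2 of the 4 clicks count as SAT
```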


Search Behavior: Expertise

  • Some people are more expert at searching than others

– Search expertise, not domain expertise
– Alternative explanation: orienteering vs. teleporting

  • Find characteristics of these “advanced search engine users” in an effort to better understand how they search
  • Understanding what advanced searchers are doing could improve the search experience for everyone


[White & Morris, WWW 2007]


Findings – Post-query browsing

Advanced users:

– Traverse trails faster
– Spend less time viewing each Web page
– Follow query trails with fewer steps
– Revisit pages less often
– “Branch” less often

Feature          p_advanced = 0%   > 0%     ≥ 25%    ≥ 50%    ≥ 75%
Session Secs     701.10    706.21   792.65   903.01   1114.71
Trail Secs       205.39    159.56   156.45   147.91   136.79
Display Secs     36.95     32.94    34.91    33.11    30.67
Num. Steps       4.88      4.72     4.40     4.40     4.39
Num. Revisits    1.20      1.02     1.03     1.03     1.02
Num. Branches    1.55      1.51     1.50     1.47     1.44
%Trails          72.14%    27.86%   .83%     .23%     .05%
%Users           79.90%    20.10%   .79%     .18%     .04%

(columns ordered from non-advanced to more advanced users)

[White & Morris, 2007]


Search Behavior: Demographics

  • Gender differences for the query “wagner”:

– Women: http://en.wikipedia.org/wiki/Richard_Wagner
– Men: http://www.wagnerspraytech.com/

  • Education differences:


[Weber & Castillo, SIGIR 2010]


ReFinding Behavior

  • 40% of queries led to a click on a result that the same user had clicked on in a past search session (Teevan et al., 2007)
  • What’s the URL for this year’s SIGIR 2010?

– It does not really matter; it is faster to re-find it


[From Teevan et al, 2007]


What Is Known About Re-Finding

  • Re-finding recent topic of interest
  • Web re-visitation common [Tauscher & Greenberg]
  • People follow known paths for re-finding

– Search engines likely to be used for re-finding

  • Query log analysis of re-finding

– Query sessions [Jones & Fain]
– Temporal aspects [Sanderson & Dumais]


[From Teevan et al, 2007]


[Figure: breakdown of queries by re-finding behavior, branching on: same query issued before vs. new query; click on previously clicked results, on different results, or on same and different; one click vs. more than one click. Counts: 3100 (24%), 36 (<1%), 635 (5%), 485 (4%), 637 (5%), 4 (<1%), 660 (5%), 7503 (57%). 39% of queries involve re-finding, including navigational re-finding with a different query.]


[From Teevan et al, 2007]


Rank Change Degrades Re-Finding

  • Results change rank over time
  • A change in result rank reduces the probability of a re-click:

– No rank change: 88% chance
– Rank change: 53% chance

  • A rank change also slows the repeat click (compared with the initial search-to-click time):

– No rank change: re-click is faster
– Rank change: re-click is slower

[From Teevan et al, 2007]


Aside: Mobile Search…

  • Not topic of today’s tutorial
  • Some references:
– M. Jones, Mobile Search Tutorial, Mobile HCI, 2009
– K. Church, B. Smyth, K. Bradley, and P. Cotter. A large scale study of European mobile search behaviour. Mobile HCI, 2008
– Kamvar, M., Kellar, M., Patel, R., and Xu, Y. Computers and iPhones and mobile phones, oh my!: a logs-based comparison of search users on different devices. WWW 2009
– Kamvar, M. and Baluja, S. Query suggestions for mobile search: understanding usage patterns. CHI 2008


Part 1: Summary

Understanding user behavior at micro-, meso-, and macro-levels
Theoretical models of information seeking
Web search behavior:

– Levels of detail
– Search intent
– Variations in web searcher behavior
– Keeping found things found


Part 2: Inferring Web Searcher Intent


Tutorial Overview

Part 1: Search Intent Modeling

Motivation: how intent inference could help search
Web search intent & information seeking models
Web searcher behavior models

Part 2: Inferring Web Searcher Intent

– Inferring result relevance: clicks
– Richer interaction models: clicks + browsing
– Contextualizing intent models: personalization


Part 2: Inferring Searcher Intent

  • Inferring result relevance: clicks
  • Richer behavior models:

– SERP presentation info
– Post-search behavior
– Rich interaction models for SERPs

  • Contextualizing intent inference:

– Session-level models
– Personalization


Implicit Feedback

  • Users are often reluctant to provide relevance judgments:

– Some searches are precision-oriented (no “more like this”)
– They’re lazy or annoyed: “Was this document helpful?”

  • Can we gather relevance feedback without requiring the user to do anything?
  • Goal: estimate relevance from behavior

Observable Behavior

Behavior categories × minimum scope (segment, object, or class):

– Examine: view, listen, select (click)
– Retain: print, bookmark, save, purchase, delete, subscribe
– Reference: copy/paste, quote, forward, reply, link, cite
– Annotate: mark up, rate, publish, organize

Clicks as Relevance Feedback

  • Limitations:

– Hard to determine the meaning of a click: if the best result is not displayed, users will click on something anyway
– Presentation bias
– Click duration may be misleading:

  • People leave machines unattended
  • Opening multiple tabs quickly, then reading them all slowly
  • Multitasking
  • Compare above to limitations of explicit feedback:

– Sparse, inconsistent ratings

“Strawman” Click model: No Bias

  • Naive Baseline

– c_di is P(Click = True | Document = d, Position = i)
– r_d is P(Click = True | Document = d)

  • Why this baseline?

– We know that r_d is part of the explanation
– Perhaps, for ranks 9 vs. 10, it is the main explanation
– It is a bad explanation at rank 1 (e.g., from eye tracking)

Attractiveness of summary ~= Relevance of result

[Craswell et al., 2008]
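This baseline can be estimated by simple counting. The sketch below is a minimal illustration in Python; the toy log and its (doc, position, clicked) record format are invented for this example, not taken from the paper.

```python
from collections import defaultdict

def estimate_ctrs(log):
    """Counting estimates for the 'no bias' baseline.

    log: iterable of (doc, position, clicked) impression records.
    Returns (c, r) with
        c[(doc, pos)] ~ P(Click = True | Document = doc, Position = pos)
        r[doc]        ~ P(Click = True | Document = doc), ignoring position.
    """
    imp_dp, clk_dp = defaultdict(int), defaultdict(int)
    imp_d, clk_d = defaultdict(int), defaultdict(int)
    for doc, pos, clicked in log:
        imp_dp[(doc, pos)] += 1
        imp_d[doc] += 1
        if clicked:
            clk_dp[(doc, pos)] += 1
            clk_d[doc] += 1
    c = {k: clk_dp[k] / imp_dp[k] for k in imp_dp}
    r = {d: clk_d[d] / imp_d[d] for d in imp_d}
    return c, r

# Toy log: document "a" shown twice at rank 1 (one click), twice at rank 2.
log = [("a", 1, True), ("a", 1, False), ("a", 2, False), ("a", 2, False)]
c, r = estimate_ctrs(log)
print(c[("a", 1)], c[("a", 2)], r["a"])   # 0.5 0.0 0.25
```

The gap between c at rank 1 and c at rank 2 for the same document is exactly what the position-bias models below try to explain.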

Realistic Click models

  • Clickthrough and subsequent browsing behavior of individual users is influenced by many factors:

– Relevance of a result to a query
– Visual appearance and layout
– Result presentation order
– Context, history, etc.

De-biasing position (first attempt)

Relative clickthrough for queries with known relevant results in position 3 (results in positions 1 and 2 are not relevant)

[Figure: relative click frequency vs. result position (1–10), for all queries and for queries with the known relevant result at position 1 (PTR=1) or position 3 (PTR=3)]

Higher clickthrough at top non-relevant than at top relevant document [Agichtein et al., 2006]

Simple Model: Deviation from Expected

  • Relevance component: deviation from "expected" behavior:

Relevance(q, d) = observed(p) − expected(p), for a click at position p

[Figure: click frequency deviation from expected vs. result position (1–10), for PTR=1 and PTR=3]

[Agichtein et al., 2006]
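A minimal sketch of the deviation computation above; the expected per-position click-through rates below are invented placeholders for illustration, not the values measured in [Agichtein et al., 2006].

```python
# Hypothetical background click-through rates by position, e.g. averaged
# over many queries (these values are made up for illustration).
EXPECTED_CTR = {1: 0.40, 2: 0.16, 3: 0.10, 5: 0.05, 10: 0.02}

def click_deviation(observed_ctr, position):
    """Relevance(q, d) ~ observed(p) - expected(p) for a click at position p."""
    return observed_ctr - EXPECTED_CTR[position]

# A result at rank 3 clicked far more often than expected is likely relevant:
print(round(click_deviation(0.25, 3), 2))   # 0.15
```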

Simple Model: Example

  • CD: distributional model, extends SA+N

– A clickthrough counts only if its frequency exceeds the expected frequency by more than ε

  • Click on result 2 likely "by chance"
  • Induced preferences: 4 > (1, 2, 3, 5), but not 2 > (1, 3)

[Figure: clickthrough frequency deviation vs. result position for an example ranking of 8 results with clicks on positions 2 and 4]

Simple Model Results

Improves precision by discarding “chance” clicks

Another Formulation

  • There are two types of user/interaction:

– Click based on relevance
– Click based on rank (blindly)

  • A.k.a. the OR model:

– Clicks arise from relevance OR position
– Estimate with logistic regression

[Figure: estimated position-bias term b_i vs. rank i = 1..10]

[Craswell et al., 2008]

Linear Examination Hypothesis

  • Users are less likely to look at lower ranks, and therefore less likely to click
  • This is the AND model:

– Clicks arise from relevance AND examination
– Probability of examination does not depend on what else is in the list

[Figure: examination probability x_i vs. rank i = 1..10]

[Craswell et al., 2008]
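Under this hypothesis the predicted click probability factorizes as P(click | d at rank i) = x_i · r_d. A tiny illustration; the examination curve below is invented, not the fitted values from the paper.

```python
# Illustrative examination probabilities by rank (invented, decreasing):
X = [1.0, 0.7, 0.5, 0.4, 0.3]

def p_click(r_d, rank):
    """P(click | d at rank) = x_rank * r_d under the examination hypothesis."""
    return X[rank - 1] * r_d

# The same document (r_d = 0.6) is predicted to be clicked less lower down:
print(p_click(0.6, 1))   # 0.6
print(p_click(0.6, 3))   # 0.3
```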

Cascade Model

  • Users examine the results in rank order
  • At each document d:

– Click with probability r_d
– Or continue with probability (1 − r_d)

[Taylor et al., 2008]

Cascade Model (2)

[Diagram: cascade over results URL1..URL4 for a query: at each rank the user clicks (C_i) with probability r_d, or continues to the next result with probability 1 − r_d]

Cascade Model Example

  • 500 users typed a query
  • 0 click on result A in rank 1
  • 100 click on result B in rank 2
  • 100 click on result C in rank 3
  • Cascade (with no smoothing) says:
  • 0 of 500 clicked A ⇒ r_A = 0
  • 100 of 500 clicked B ⇒ r_B = 0.2
  • 100 of the remaining 400 clicked C ⇒ r_C = 0.25

[Craswell et al., 2008]
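The slide's arithmetic can be reproduced directly. Below is a minimal sketch of cascade estimation from one-click-per-query sessions; the session encoding (clicked rank, or None for no click) is an assumption made for this illustration.

```python
def cascade_estimates(sessions):
    """Estimate cascade-model relevances from one-click-per-query sessions.

    sessions: list of clicked ranks (1-based), or None for no click.
    Under the cascade model, the document at rank i was examined only in
    sessions where no higher-ranked document was clicked, so
        r_i = clicks at rank i / sessions that reached rank i.
    """
    n_ranks = max((c for c in sessions if c), default=0)
    r = []
    remaining = len(sessions)
    for rank in range(1, n_ranks + 1):
        clicks = sum(1 for c in sessions if c == rank)
        r.append(clicks / remaining if remaining else 0.0)
        remaining -= clicks
    return r

# Slide example: 500 sessions, 0 clicks on A@1, 100 on B@2, 100 on C@3.
sessions = [2] * 100 + [3] * 100 + [None] * 300
print(cascade_estimates(sessions))   # [0.0, 0.2, 0.25]
```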

Cascade Model Seems Closest to Reality

Best possible: Given the true click counts for ordering BA

[Craswell et al., 2008]

Dynamic Bayesian Network

  • O. Chapelle & Y. Zhang, A Dynamic Bayesian Network Click Model for Web Search Ranking, WWW 2009

[Diagram: per-result latent variables: did the user examine the URL? was the user attracted to the URL? was the user satisfied by the landing page?]

Dynamic Bayesian Network (Results)

  • Predicted relevance agrees 80% with human relevance judgments
  • O. Chapelle & Y. Zhang, A Dynamic Bayesian Network Click Model for Web Search Ranking, WWW 2009

Use an EM algorithm (similar to forward-backward) to learn the model parameters; some parameters are set manually.


Clicks: Summary So Far

  • Simple model accounts for position bias
  • Bayes Net model: an extension of the Cascade model, shown to work well in practice

– Limitations?

  • Questions?

Capturing a Click in its Context


[Piwowarski et al., 2009]

  • Building query chains: simple model based on time deltas & query similarities
  • Analysing the chains: layered Bayesian Network (BN) model
  • Validation of the model: relevance of clicked documents, predicted by Boosted Trees with features from the BN

Overall process

[Diagram: atomic sessions are grouped into chains using a time threshold and a query-similarity threshold] [Piwowarski et al., 2009]

Layered Bayesian Network

[Piwowarski et al., 2009]

The BN gives the context of a click

P(Chain state | observations) = (0.2, 0.4, 0.01, 0.39, 0)
P(Search state | observations) = (0.1, 0.42, …)
P(Page state | observations) = (0.25, 0.2, …)
P(Click state | observations) = (0.02, 0.5, …)
P([not] Relevant | observations) = (0.4, 0.5)

[Diagram: BN layers: Chain, Search, Page, Click, Relevance]

[Piwowarski et al., 2009]

Features for one click

  • For each clicked document, compute features:

– (BN) Chain/Page/Action/Relevance state distributions
– (BN) Maximum-likelihood configuration and its likelihood
– Word confidence values (averaged for the query)
– Time- and position-related features

  • This is associated with a relevance judgment from an editor and used for learning

[Piwowarski et al., 2009]

Learning with Gradient Boosted Trees

  • Use Gradient Boosted Trees (Friedman, 2001) with a tree depth of 4 (8 for the non-BN-based model)
  • Used disjoint train (BN + GBT training) and test sets

– Two sets of sessions, S1 and S2 (20 million chains), and two sets of queries + relevance judgments, J1 and J2 (about 1000 queries with behavior data)
– Process (repeated 4 times):

  • Learn the BN parameters on S1 + J1
  • Extract the BN features and learn the GBT with S1 + J1
  • Extract the BN features and predict relevance assessments of J2 with sessions of S2

[Piwowarski et al., 2009]

Results: Predicting Relevance of Clicked Docs

[Piwowarski et al., 2009]

Problem: Users click based on result “Snippets”

  • Effect of Caption Features on Clickthrough Inversions, C. Clarke, E. Agichtein, S. Dumais, R. White, SIGIR 2007

[Clarke et al., 2007]

Clickthrough Inversions

[Clarke et al., 2007]

Relevance is Not the Dominant Factor!

[Clarke et al., 2007]

Snippet Features Studied

[Clarke et al., 2007]

Feature Importance

[Clarke et al., 2007]

Important Words in Snippet

[Clarke et al., SIGIR 2007]

Extension: Use Fair Pairs Randomization

[Figure: click data for an example result; the bars should be equal if clicks are unbiased]

[Yue et al., WWW 2010]

Viewing Organic Results vs. Ads

  • Ads and organic results compete for user attention

– Navigational vs. other
– Diversity vs. similarity

Danescu-Niculescu-Mizil et al., WWW 2010


Part 2: Inferring Searcher Intent

  • Inferring result relevance: clicks
  • Richer behavior models:

– SERP presentation info
– Richer interaction models: +presentation, +behavior

Richer Behavior Models

  • Behavior measures of Interest

– Browsing, scrolling, dwell time
– How to estimate relevance?

  • Heuristics
  • Learning-based

– General model: Curious Browser [Fox et al., TOIS 2005]
– Query + browsing: [Agichtein et al., SIGIR 2006]
– Active prediction: [Yun et al., WWW 2010]

Curious Browser

[Fox et al., 2003]

Data Analysis

  • Bayesian modeling at result and session level
  • Trained on 80% and tested on 20%
  • Three levels of SAT – VSAT, PSAT & DSAT
  • Implicit measures: e.g., clickthrough, dwell time, scrolling, exit type

[Fox et al., 2003]

Data Analysis, cont’d

[Fox et al., 2003]

Result-Level Findings

  1. Dwell time, clickthrough, and exit type are the strongest predictors of SAT
  2. Printing and adding to Favorites are highly predictive of SAT when present
  3. Combined measures predict SAT better than clickthrough alone

[Fox et al., 2003]

Result Level Findings, cont’d

[Chart: prediction accuracy for clickthrough only vs. combined measures vs. combined measures with confidence > 0.5 (80–20 train/test split)] [Fox et al., 2003]

Learning Result Preferences in Rich User Interaction Space

  • Observed and Distributional features

– Observed features: aggregated values over all user interactions for each query and result pair
– Distributional features: deviations from the "expected" behavior for the query

  • Represent user interactions as vectors in “Behavior Space”

– Presentation: what a user sees before a click
– Clickthrough: frequency and timing of clicks
– Browsing: what users do after the click

[Agichtein et al., 2006]

Features for Behavior Representation

Sample Behavior Features [Agichtein et al., SIGIR 2006]

Presentation:
– ResultPosition: position of the URL in the current ranking
– QueryTitleOverlap: fraction of query terms in the result title

Clickthrough:
– DeliberationTime: seconds between query and first click
– ClickFrequency: fraction of all clicks landing on the page
– ClickDeviation: deviation from expected click frequency

Browsing:
– DwellTime: result page dwell time
– DwellTimeDeviation: deviation from expected dwell time for the query
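As a rough sketch of how such a behavior-space vector might be assembled: the field names follow the sample features above, but the class itself and the example values are illustrative assumptions, not an artifact of the paper.

```python
from dataclasses import dataclass

@dataclass
class BehaviorFeatures:
    """One (query, result) pair's aggregated interactions (illustrative)."""
    # Presentation
    result_position: int         # position of the URL in the current ranking
    query_title_overlap: float   # fraction of query terms in the result title
    # Clickthrough
    deliberation_time: float     # seconds between query and first click
    click_frequency: float       # fraction of all clicks landing on the page
    click_deviation: float       # deviation from expected click frequency
    # Browsing
    dwell_time: float            # result page dwell time (seconds)
    dwell_time_deviation: float  # deviation from expected dwell time

    def vector(self):
        """Point in 'behavior space' fed to a learned preference model."""
        return [self.result_position, self.query_title_overlap,
                self.deliberation_time, self.click_frequency,
                self.click_deviation, self.dwell_time,
                self.dwell_time_deviation]

x = BehaviorFeatures(3, 0.5, 4.2, 0.18, 0.08, 95.0, 12.5)
print(len(x.vector()))   # 7
```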

Predicting Result Preferences

  • Task: predict pairwise preferences

– A judge will prefer Result A > Result B

  • Models for preference prediction

– Current search engine ranking
– Clickthrough
– Full user behavior model

[Agichtein et al., SIGIR2006]

User Behavior Model

  • Full set of interaction features

– Presentation, clickthrough, browsing

  • Train the model with explicit judgments

– Input: behavior feature vectors for each query-page pair in the rated results
– Use RankNet (Burges et al., ICML 2005) to discover model weights
– Output: a neural net that can assign a "relevance" score to a behavior feature vector

[Agichtein et al., SIGIR2006]

Results: Predicting User Preferences

[Figure: precision vs. recall for the Baseline, SA+N, CD, and UserBehavior preference-prediction methods]

  • Baseline < SA+N < CD << UserBehavior
  • Rich user behavior features result in dramatic improvement

[Agichtein et al., SIGIR2006]


Predicting Queries from Browsing Behavior

  • Identify “Search Trigger” browse-search patterns
  • Distribution of “Search-Browse” patterns:

URLs: movies.about.com/ nationalpriorities.org pds.jpl.nasa.gov/planets

[Cheng et al., WWW 2010]

Summary of Part 2

  • Click data contains important information about the distribution of intents for a query
  • For accurate interpretation, the (many) biases present must be modeled:

– Presentation, demographics, types of intent

Part 3: Applications and Extensions

  • Improving search ranking

– Implicit feedback

  • Predicting Intent and Behavior

– Query suggestion, ads

  • Search personalization

Observable Behavior

Behavior Category | Segment           | Object                           | Class
Examine           | View, Listen      |                                  |
Retain            | Print             | Bookmark, Save, Purchase, Delete | Subscribe
Reference         | Copy/paste, Quote | Forward, Reply, Link             | Cite
Annotate          | Mark up           | Rate, Publish                    | Organize

(Columns give the minimum scope of the behavior: Segment, Object, or Class.)

Eye Tracking

  • Unobtrusive
  • Relatively precise (accuracy: 1° of visual angle)
  • Expensive
  • Mostly used as a "passive" tool for behavior analysis, e.g. visualized by heatmaps
  • We use eye tracking for immediate implicit feedback, taking into account temporal fixation patterns

Using Eye Tracking for Relevance Feedback

1. Starting point: noisy gaze data from the eye tracker
2. Fixation detection and saccade classification
3. Reading (red) and skimming (yellow) detection, line by line

See G. Buscher, A. Dengel, L. van Elst: “Eye Movements as Implicit Relevance Feedback”, in CHI '08

[Buscher et al., 2008]

Three Feedback Methods Compared

Input: viewed documents

Method             | Term weight                | Based on
Baseline           | TF × IDF                   | entire opened documents
Gaze-Filter        | TF × IDF                   | read or skimmed passages
Gaze-Length-Filter | Interest(t) × TF × IDF     | length of coherently read text
Reading Speed      | ReadingScore(t) × TF × IDF | read vs. skimmed passages containing term t

[Buscher et al., 2008]

Eye Tracking-based RF Results

[Buscher et al., 2008]

Instrumenting SERP Interactions: EMU


  • EMU: Firefox + LibX plugin instrumentation, logging to an HTTP log
  • Tracks whitelisted sites, e.g., Emory, Google, Yahoo search
  • All SERP events logged (asynchronous HTTP requests)
  • 150 public-use machines, 5,000+ opted-in users

[Diagram: HTTP log → HTTP server → usage data → data mining & management → train prediction models]


Gui, Agichtein, et al., JCDL 2009

Classifying Research vs. Purchase Intent

  • 12 subjects (grad students and staff) were asked to:

1. Research a product they want to purchase eventually (Research intent)
2. Search for the best deal on an item they want to purchase immediately (Purchase intent)

  • Eye tracking and browser instrumentation were performed in parallel for some of the subjects


Guo & Agichtein, SIGIR 2010

Research Intent

Guo & Agichtein, SIGIR 2010

Purchase Intent

Guo & Agichtein, SIGIR 2010


Contextualized Intent Inference


Guo & Agichtein, SIGIR 2010

Implementation: Conditional Random Field (CRF) Model

Guo & Agichtein, SIGIR 2010


Results: Ad Click Prediction

  • 200%+ precision improvement (within task)

Guo & Agichtein, SIGIR 2010

Application: Learning to Rank from Click Data

[ Joachims 2002 ]


Results

[ Joachims 2002 ]

Summary:
– Learned ranking outperforms all base methods in the experiment
– Learning from clickthrough data is possible
– Relative preferences are useful training data
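[Joachims 2002] trains a ranking SVM on such preference pairs. As a hedged stand-in, the sketch below fits a linear scoring function w·x with a simple perceptron-style update on (preferred, non-preferred) feature-vector pairs; the toy features and the learner itself are illustrative, not the paper's method.

```python
def train_pairwise(pairs, dims, epochs=20, lr=0.1):
    """Fit w so that score(better) > score(worse) for each preference pair."""
    w = [0.0] * dims
    for _ in range(epochs):
        for better, worse in pairs:
            margin = sum(wi * (b - c) for wi, b, c in zip(w, better, worse))
            if margin <= 0:  # misordered pair: nudge w toward the difference
                w = [wi + lr * (b - c) for wi, b, c in zip(w, better, worse)]
    return w

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy features: [bm25, click_frequency] (invented values).
pairs = [([1.0, 0.8], [1.2, 0.1]),   # clicked result preferred despite lower bm25
         ([0.9, 0.7], [1.0, 0.2])]
w = train_pairwise(pairs, dims=2)
assert score(w, [1.0, 0.8]) > score(w, [1.2, 0.1])
```

The learned weights favor the click-frequency feature, mirroring the idea that relative preferences extracted from clicks are useful training data.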

Extension: Query Chains

[Radlinski & Joachims, KDD 2005]


Query Chains (Cont’d)

[Radlinski & Joachims, KDD 2005]

Query Chains (Results)

  • Query Chains add slight improvement over clicks

[Radlinski & Joachims, KDD 2005]


Richer Behavior for Dynamic Ranking

[Agichtein et al., SIGIR2006]

Sample Behavior Features (from Lecture 2) [Agichtein et al., SIGIR 2006]

Presentation:
– ResultPosition: position of the URL in the current ranking
– QueryTitleOverlap: fraction of query terms in the result title

Clickthrough:
– DeliberationTime: seconds between query and first click
– ClickFrequency: fraction of all clicks landing on the page
– ClickDeviation: deviation from expected click frequency

Browsing:
– DwellTime: result page dwell time
– DwellTimeDeviation: deviation from expected dwell time for the query

Feature Merging: Details

  • Value scaling:

– Binning vs. log-linear vs. linear (e.g., μ=0, σ=1)

  • Missing Values:

– 0? (meaning for normalized feature values s.t. μ=0?)

  • “real-time”: significant architecture/system problems

Result URL          | BM25 | PageRank | … | Clicks | DwellTime | …
sigir2007.org       | 2.4  | 0.5      | … | ?      | ?         | …
sigir2006.org       | 1.4  | 1.1      | … | 150    | 145.2     | …
acm.org/sigs/sigir/ | 1.2  | 2        | … | 60     | 23.5      | …

Query: SIGIR, fake results w/ fake feature values [Agichtein et al., SIGIR2006]

Review: NDCG

  • Normalized Discounted Cumulative Gain
  • Multiple levels of relevance
  • DCG:

– contribution of the ith rank position: (2^{y_i} − 1) / log(i + 1)
– Ex: a ranking with gains (1, 3, 1, 0, 1) has DCG score 1/log 2 + 3/log 3 + 1/log 4 + 0/log 5 + 1/log 6 ≈ 5.45

  • NDCG is normalized DCG:

– divide by the DCG of the best possible ranking, so the ideal ranking has NDCG = 1
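A small sketch of the DCG/NDCG computation, using natural log and (2^y − 1) gains so that relevance labels (1, 2, 1, 0, 1) reproduce the ≈ 5.45 example score:

```python
import math

def dcg(gains):
    """DCG with per-rank contribution (2**y - 1) / log(i + 1), i = 1, 2, ..."""
    return sum((2 ** y - 1) / math.log(i + 1)
               for i, y in enumerate(gains, start=1))

def ndcg(gains):
    """Normalize by the DCG of the best possible ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

# The example ranking: relevance labels (1, 2, 1, 0, 1) down the list.
print(round(dcg([1, 2, 1, 0, 1]), 2))   # 5.45
print(ndcg([2, 1, 1, 1, 0]))            # 1.0
```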

Human Judgments

http://jobs.monsterindia.com/details/7902838.html

Results for Incorporating Behavior into Ranking

Method   | MAP   | Gain
RN       | 0.270 |
RN+ALL   | 0.321 | 0.052 (19.13%)
BM25     | 0.236 |
BM25+ALL | 0.292 | 0.056 (23.71%)

[Figure: NDCG at K = 1..10 for RN, Rerank-All, and RN+All]

[Agichtein et al., SIGIR2006]

Which Queries Benefit Most

[Figure: average NDCG gain vs. query frequency]

Most gains are for queries with poor original ranking

[Agichtein et al., SIGIR2006]

Extension to Unseen Queries/Documents: Search Trails

[Bilenko and White, WWW 2008]

  • Trails start with a search engine query
  • Continue until a terminating event

– Another search
– Visit to an unrelated site (social networks, webmail)
– Timeout, browser homepage, browser closing
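A rough sketch of how such trail segmentation might look. The event encoding and the 30-minute timeout are assumptions made for illustration, not details from [Bilenko and White, WWW 2008].

```python
TIMEOUT = 30 * 60  # assumed inactivity timeout, in seconds

def split_trails(events):
    """events: list of (timestamp, kind, payload), kind in
    {'query', 'visit', 'unrelated', 'close'}.  A trail starts at a query and
    ends at another query, an unrelated-site visit, a timeout, or close."""
    trails, current, last_t = [], None, None
    for t, kind, payload in events:
        timed_out = current and last_t is not None and t - last_t > TIMEOUT
        if kind == 'query':
            if current:
                trails.append(current)
            current = [(t, kind, payload)]
        elif current and (kind in ('unrelated', 'close') or timed_out):
            trails.append(current)
            current = None
        elif current:
            current.append((t, kind, payload))
        last_t = t
    if current:
        trails.append(current)
    return trails

events = [(0, 'query', 'jaguar'), (10, 'visit', 'jaguar.com'),
          (20, 'unrelated', 'webmail'), (30, 'query', 'jaguar cat'),
          (40, 'visit', 'wikipedia.org')]
print(len(split_trails(events)))   # 2
```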

Probabilistic Model

  • IR via language modeling [Zhai-Lafferty, Lavrenko]
  • Query-term distribution gives more mass to rare terms
  • Term-website weights combine dwell time and counts

[Bilenko and White, WWW 2008]

Results: Learning to Rank

Add Rel(q, d_i) as a feature to RankNet

[Figure: NDCG@1, NDCG@3, NDCG@10 for Baseline, Baseline+Heuristic, Baseline+Probabilistic, and Baseline+Probabilistic+RW] [Bilenko and White, WWW 2008]

Personalization

Which Queries to Personalize?

  • Personalization benefits ambiguous queries
  • Inter-rater reliability (Fleiss’ kappa)

– Observed agreement (Pa) exceeds expected agreement (Pe)
– κ = (Pa − Pe) / (1 − Pe)

  • Relevance entropy

– Variability in the probability that a result is relevant (Pr)
– S = −Σ Pr log Pr

  • Potential for personalization

– Ideal group ranking differs from the ideal personal ranking
– P4P = 1 − nDCG_group

Teevan, J., S. T. Dumais, and D. J. Liebling. To personalize or not to personalize: modeling queries with variation in user intent. SIGIR 2008

[Teevan et al., 2008]
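A small sketch of the relevance/click-entropy measure S = −Σ Pr log Pr described above; the example click distributions are invented for illustration.

```python
import math

def click_entropy(click_probs):
    """Entropy S = -sum p log2 p over a query's click distribution (in bits)."""
    return 0.0 - sum(p * math.log2(p) for p in click_probs if p > 0)

# Unambiguous query: all users click the same result -> entropy 0 bits.
print(click_entropy([1.0]))        # 0.0
# Ambiguous query: clicks split evenly over four results -> entropy 2 bits.
print(click_entropy([0.25] * 4))   # 2.0
```

High-entropy queries are the ones with high potential for personalization.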

Predicting Ambiguous Queries

Features, by information type and whether search history is used:

Query, no history:
– Query length, contains URL, contains advanced operator
– Time of day issued, number of results (df), number of query suggests, reformulation probability

Query, with history:
– # of times query issued, # of users who issued query
– Avg. time of day issued, avg. number of results, avg. number of query suggests

Results, no history:
– Query clarity, ODP category entropy, number of ODP categories
– Portion of non-HTML results, portion of results from .com/.edu, number of distinct domains, result entropy

Results, with history:
– Avg. click position, avg. seconds to click, avg. clicks per user
– Click entropy, potential for personalization

Teevan, J., S. T. Dumais, and D. J. Liebling. To personalize or not to personalize: modeling queries with variation in user intent. SIGIR 2008

[Teevan et al., 2008]

Mars (Candy) vs. Mars (Planet)

Clustering Query Refinements by User Intent

  • Approach:

– Intent = set of visited documents
– Cluster refinements using document-visit distribution vectors

[Sadikov et al., WWW 2010]

Approaches to Personalization

  • 1. Pitkow et al., 2002
  • 2. Qiu et al., 2006
  • 3. Jeh et al., 2003
  • 4. Teevan et al., 2005
  • 5. Das et al., 2007

Figure adapted from: Personalized search on the world wide web, by Micarelli, A., Gasparetti, F., Sciarrone, F., and Gauch, S., LNCS 2007


Personalization Research

  • Ask the searcher: "Is this relevant?"
  • Look at the searcher's clicks
  • Similarity to content the searcher has seen before

Teevan et al., TOCHI 2010

Ask the Searcher

  • Explicit indicator of relevance
  • Benefits

– Direct insight

  • Drawbacks

– Amount of data limited
– Hard to get answers for the same query
– Unlikely to be available in a real system

Teevan et al., TOCHI 2010


Searcher’s Clicks

  • Implicit, behavior-based indicator of relevance
  • Benefits
    – Possible to collect from all users
  • Drawbacks
    – People click by mistake or get sidetracked
    – Biased towards what is presented

Teevan et al., TOCHI 2010


Similarity to Seen Content

  • Implicit, content-based indicator of relevance
  • Benefits
    – Can collect from all users
    – Can collect for all queries
  • Drawbacks
    – Privacy considerations
    – Measures of textual similarity are noisy

Teevan et al., TOCHI 2010


Evaluating Personalized Search

  • Explicit judgments (offline and in situ)
    – Evaluate components before building the full system
    – NOTE: judgments must reflect what is relevant for you, the individual searcher
  • Deploy the system
    – Collect verbatim feedback, questionnaires, etc.
    – Measure behavioral interactions (e.g., clicks, reformulations, abandonment)
    – Beware click biases: rank order, presentation, etc.
    – Use interleaving to obtain unbiased click comparisons
  • Link implicit and explicit signals (e.g., via the Curious Browser toolbar)
  • From a single query to search sessions and beyond
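The "interleaving for unbiased clicks" point deserves a concrete sketch. Below is a simplified team-draft interleaving scheme, a common variant of the idea; the function names and details of the coin-flip drafting are my own simplification, not a specification from the tutorial:

```python
import random
from collections import Counter

def team_draft_interleave(ranking_a, ranking_b, rng=random):
    """Team-draft interleaving: rankers A and B alternately 'draft' their
    highest not-yet-picked result, with a coin flip deciding who drafts
    first in each round. Clicks on the merged list are credited to the
    ranker that contributed the clicked result, which removes most of the
    position bias from the comparison."""
    interleaved, team = [], {}

    def draft(ranker, ranking):
        for doc in ranking:
            if doc not in team:        # take the best result not yet used
                team[doc] = ranker
                interleaved.append(doc)
                return

    all_docs = set(ranking_a) | set(ranking_b)
    while len(interleaved) < len(all_docs):
        if rng.random() < 0.5:
            draft("A", ranking_a); draft("B", ranking_b)
        else:
            draft("B", ranking_b); draft("A", ranking_a)
    return interleaved, team

def credit_clicks(clicked_docs, team):
    """Tally credited clicks per ranker; more credit -> preferred ranker."""
    return Counter(team[d] for d in clicked_docs)

inter, team = team_draft_interleave(["d1", "d2", "d3"],
                                    ["d2", "d4", "d1"], random.Random(0))
# Every document appears exactly once, tagged with the ranker that drafted it.
assert sorted(inter) == ["d1", "d2", "d3", "d4"]
```

Because both rankers get an equal shot at every rank position in expectation, a preference for one ranker's results shows up directly in the click credit, without the order-bias correction that raw click counts would need.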


User Control in Personalization (Relevance Feedback)


J-S. Ahn, P. Brusilovsky, D. He, and S.Y. Syn. Open user profiles for adaptive news systems: Help or harm? WWW 2007


Personalization Summary

  • Lots of relevant content is ranked low
  • The potential for personalization is high
  • Implicit measures capture explicit variation
    – Behavior-based: highly accurate
    – Content-based: lots of variation
  • Example: personalized search
    – Behavior and content signals work best together
    – Improves search-result clickthrough


New Direction: Active Learning

  • Goal: learn the relevances with as little training data as possible.
  • Search involves a three-step process:
    1. Given relevance estimates, pick a ranking to display to users.
    2. Given a ranking, users provide feedback: user clicks provide pairwise relevance judgments.
    3. Given feedback, update the relevance estimates.


[Radlinski & Joachims, KDD 2007]
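The three-step loop can be sketched as follows. This is a deliberately simplified skeleton: the win/play-count update is a stand-in for the paper's probabilistic model, and `get_clicks` is a hypothetical callback representing "show users the ranking, observe which positions they click":

```python
def active_learning_loop(docs, get_clicks, rounds=50):
    """Skeleton of the three-step active-learning loop (simplified;
    not the method of Radlinski & Joachims, which maintains a proper
    probabilistic relevance model)."""
    wins = {d: 1.0 for d in docs}     # pseudo-counts as a smoothing prior
    plays = {d: 2.0 for d in docs}
    estimate = lambda d: wins[d] / plays[d]
    for _ in range(rounds):
        # Step 1: given relevance estimates, pick a ranking to display.
        ranking = sorted(docs, key=estimate, reverse=True)
        # Step 2: clicks yield pairwise judgments -- a clicked document is
        # preferred over the unclicked documents ranked above it.
        clicked_positions = get_clicks(ranking)
        for pos in clicked_positions:
            for above in range(pos):
                if above not in clicked_positions:
                    # Step 3: update the relevance estimates.
                    winner, loser = ranking[pos], ranking[above]
                    wins[winner] += 1; plays[winner] += 1
                    plays[loser] += 1
    return sorted(docs, key=estimate, reverse=True)

# Simulated user who always clicks the truly best document:
docs = ["d1", "d2", "d3", "d4"]
user = lambda ranking: [ranking.index("d3")]
learned = active_learning_loop(docs, user)
assert learned[0] == "d3"
```

Even this crude update converges quickly here because each click on a low-ranked document simultaneously raises its estimate and lowers the estimates of the documents skipped above it.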


Overview of Approach

  • Available information:
    1. Have an estimate of the relevance of each result.
    2. Can obtain pairwise comparisons of the top few results.
    3. Do not have absolute relevance information.
  • Goal: learn the document relevances quickly.
  • Addresses four questions:
    1. How to represent knowledge about document relevance?
    2. How to maintain this knowledge as we collect data?
    3. Given our knowledge, what is the best ranking?
    4. What rankings do we show users to get useful data?


[Radlinski & Joachims, KDD 2007]
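Questions 1 and 2 can be illustrated with a small data structure that keeps Beta-distributed beliefs about pairwise preferences: the posterior mean answers "which document is likely better?", and the variance flags pairs that still need data. This representation is my illustrative simplification, not the paper's exact model:

```python
class PairwiseKnowledge:
    """Maintain beliefs about pairwise document preferences as Beta
    distributions (a simplified stand-in for the paper's model)."""

    def __init__(self):
        # (i, j) with i < j -> [i's wins + 1, j's wins + 1] (Beta(1,1) prior)
        self.counts = {}

    def observe(self, winner, loser):
        """Record one pairwise judgment, e.g., derived from a click."""
        i, j = sorted((winner, loser))
        a, b = self.counts.get((i, j), [1, 1])
        if winner == i:
            a += 1
        else:
            b += 1
        self.counts[(i, j)] = [a, b]

    def prob_beats(self, x, y):
        """Posterior mean of P(x preferred over y)."""
        i, j = sorted((x, y))
        a, b = self.counts.get((i, j), [1, 1])
        p = a / (a + b)
        return p if x == i else 1 - p

    def uncertainty(self, x, y):
        """Beta variance: pairs with few observations need more data."""
        i, j = sorted((x, y))
        a, b = self.counts.get((i, j), [1, 1])
        n = a + b
        return a * b / (n * n * (n + 1))

k = PairwiseKnowledge()
for _ in range(8):
    k.observe("a", "b")
assert k.prob_beats("a", "b") == 0.9
# The well-observed pair (a, b) is less uncertain than the unseen pair (a, c):
assert k.uncertainty("a", "b") < k.uncertainty("a", "c")
```

The `uncertainty` scores answer question 2's "which documents need more data?" directly, and feed naturally into the exploration strategy of question 4.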


1: Representing Document Relevance

  • Given a fixed query, maintain knowledge about relevance as clicks are observed.
    – This tells us which documents we are sure about, and which ones need more data.

[Radlinski & Joachims, KDD 2007]


4: Getting Useful Data

  • Problem: we could simply present the ranking based on the current best estimate of relevance.
    – But then the data we get would always be about the documents already ranked highly.
  • Instead, optimize the ranking shown to users:
    1. Pick the top two documents to minimize expected future loss.
    2. Append the current-best-estimate ranking below them.

[Radlinski & Joachims, KDD 2007]
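A sketch of this exploration strategy, with "minimize future loss" approximated by "place the pair we are most uncertain about on top" — a simplification of the paper's criterion. `estimate` and `pair_uncertainty` are hypothetical callables supplied by the learner, not part of any published API:

```python
def exploratory_ranking(docs, estimate, pair_uncertainty, k=10):
    """Build the ranking to show users: an informative document pair in the
    top two slots, followed by the current best-estimate ranking. The
    most-uncertain-pair heuristic stands in for the paper's expected-loss
    minimization."""
    best = sorted(docs, key=estimate, reverse=True)
    candidates = best[:min(5, len(best))]   # explore only among plausible docs
    # Step 1: pick the top-two pair whose relative order is least certain.
    i, j = max(((x, y) for x in candidates for y in candidates if x != y),
               key=lambda pair: pair_uncertainty(*pair))
    # Step 2: append the current-best-estimate ranking below them.
    rest = [d for d in best if d not in (i, j)]
    return [i, j] + rest[:max(0, k - 2)]

est = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
unc = lambda x, y: 1.0 if {x, y} == {"b", "c"} else 0.1
ranking = exploratory_ranking(["a", "b", "c", "d"], est.get, unc)
# The uncertain pair (b, c) is promoted to the top for exploration:
assert set(ranking[:2]) == {"b", "c"}
```

Placing the informative pair at the very top matters because users mostly examine and click the first results; lower positions would rarely produce the comparison data the learner needs.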



4: Exploration Strategies

[Radlinski & Joachims, KDD 2007]



Results: TREC Data

[Radlinski & Joachims, KDD 2007]

Takeaway: optimizing for relevance estimates works better than optimizing directly for the ordering.


Tutorial Summary

  • Understanding user behavior at micro-, meso-, and macro-levels
  • Theoretical models of information seeking
  • Web search behavior:
    – Levels of detail
    – Search intent
    – Variations in web searcher behavior
    – Keeping found things found
    – Click models


Tutorial Summary (2)

  • Inferring result relevance: clicks
  • Richer behavior models:
    – SERP presentation information
    – Post-search behavior
    – Rich interaction models for SERPs
  • Contextualizing intent inference:
    – Session-level models
    – Personalization


Inferring Searcher Intent (Information)

Eugene Agichtein, Emory University

  • Tutorial page: http://ir.mathcs.emory.edu/intent_tutorial/
    – See the online version for an expanded and updated bibliography
  • Instructor contact information:
    – Email: eugene@mathcs.emory.edu
    – Homepage: http://www.mathcs.emory.edu/~eugene/


References and Further Reading (1)

  • Hearst, M. Search User Interfaces, 2009, Chapter 3, "Models of the Information Seeking Process": http://searchuserinterfaces.com/
  • Teevan, J., Adar, E., Jones, R., and Potts, M. Information Re-Retrieval: Repeat Queries in Yahoo's Logs, SIGIR 2007
  • Clarke, C., Agichtein, E., Dumais, S., and White, R. W. The Influence of Caption Features on Clickthrough Patterns in Web Search, SIGIR 2007
  • Craswell, N., Zoeter, O., Taylor, M., and Ramsey, B. An Experimental Comparison of Click Position-Bias Models, WSDM 2008
  • Dupret, G. and Piwowarski, B. A User Browsing Model to Predict Search Engine Click Data from Past Observations, SIGIR 2008
  • White, R. and Morris, D. Investigating the Querying and Browsing Behavior of Advanced Search Engine Users, SIGIR 2007



References and Further Reading (3)

  • Kelly, D. and Teevan, J. Implicit Feedback for Inferring User Preference: A Bibliography, SIGIR Forum 37(2), Sep. 2003
  • Joachims, T., Granka, L., Pan, B., Hembrooke, H., and Gay, G. Accurately Interpreting Clickthrough Data as Implicit Feedback, SIGIR 2005
  • Agichtein, E., Brill, E., Dumais, S., and Ragno, R. Learning User Interaction Models for Predicting Web Search Result Preferences, SIGIR 2006
  • Buscher, G., Dengel, A., and van Elst, L. Query Expansion Using Gaze-Based Feedback on the Subdocument Level, SIGIR 2008
  • Chapelle, O. and Zhang, Y. A Dynamic Bayesian Network Click Model for Web Search Ranking, WWW 2009
  • Piwowarski, B., Dupret, G., and Jones, R. Mining User Web Search Activity with Layered Bayesian Networks, or How to Capture a Click in Its Context, WSDM 2009
  • Guo, Q. and Agichtein, E. Ready to Buy or Just Browsing? Detecting Web Searcher Goals from Interaction Data, SIGIR 2010
