
SLIDE 1

Search User Behavior Modeling

Yiqun LIU
Department of Computer Science & Technology
Tsinghua University, Beijing, China
@ASSIA2015, Taipei

SLIDE 2

About the Lecturer

Name: Yiqun LIU (刘奕群/劉奕群)
Associate Professor @ DCST, Tsinghua University (Beijing)
Visiting Research Associate Professor @ SOC, National University of Singapore
Homepage/Personal Info Links:
http://www.thuir.cn/group/~yqliu
https://scholar.google.com/citations?user=NJOnxh4AAAAJ
http://dblp.uni-trier.de/pers/hd/l/Liu:Yiqun

SLIDE 3

The THUIR Group

Department of Computer Science & Technology
Established in 1958, first DCST in China
Ranked 7th according to USNews CS subject rankings
Ranked 33rd according to QS CS subject rankings
THUIR: Information Retrieval @ Tsinghua
http://www.thuir.org/
Focused on IR research since 2001

SLIDE 4

The THUIR Group

Research Interests
Information retrieval models and algorithms
Web search technologies
Cognitive behavior of Web search users
Members
Leader: Prof. Shaoping Ma
Professors: Min Zhang, Yijiang Jin, Yiqun Liu
Students: 10 Ph.D. students, 8 M.S. students, ...

SLIDE 5

The THUIR Group

Cooperation with industry
Tsinghua-Sogou joint lab on Web search technology (since 2006)
Tsinghua-Baidu joint courses: Fundamentals of search engine technology (since 2008), Computational advertising (since 2013)
Tsinghua-Google joint course: Search Engine Product Design and Implementation (since 2009), Google Code University Project
Research projects from Yahoo!, Samsung, Toshiba, etc.

SLIDE 6

The THUIR Group

When Cognitive Psychology meets Web search
Users' information perceiving process on SERPs
E.g. Result Examination Behavior
E.g. Decision Making Behavior (click-through/query reformulation/abandonment/search engine switch)
Applications
Search ranking algorithm: click models, LTR training, …
Search evaluation methodology: evaluation metrics, A/B test, interleaving, …
Search satisfaction prediction: satisfaction, frustration, …

SLIDE 7

How do Search Engines Rank Results

Yahoo LTR task: 700+ ranking signals
Hyperlink, Content relevance, User behavior, Page structure, Freshness, Service stability, …
User behavior helps
An individual user may make mistakes
User crowds usually make much wiser decisions
E.g. the most clicked results

SLIDE 8

User Behavior may be Biased

Example: position bias
Users' behavior may be affected by ranking positions
Modeling this effect is essential for the utility of user behavior data

SLIDE 9

Schedule

Our aim: understand search user behavior and try to model it with computational methods
Morning session (est. 1 hour): Search user behavior in a homogeneous environment
Afternoon session (est. 1 hour): Search user behavior and perception of satisfaction in the presence of heterogeneous results

SLIDE 10

Morning Session: Search Behavior in Homogeneous SERPs

SLIDE 11

Search Behavior in Homogeneous SERPs

1. Querying behavior of search users
2. Clicking/examination behavior of search users
3. Constructing click models

[Figure: Query, then Examine/Click over Result 1, Result 2, Result 3, …, Result 10]

SLIDE 12

1. Search Querying Behavior

1.1 The timing of search behaviors
Data source: Bing, 2012/08-2012/10
[Figure: Query distribution]

Yang Song, Hao Ma, Hongning Wang, and Kuansan Wang. Exploring and Exploiting User Search Behavior on Mobile and Tablet Devices to Improve Search Relevance. WWW 2013.

SLIDE 13

1. Search Querying Behavior

1.2 Distribution of querying behavior: frequency
Power law distribution: hot queries are rare

Huijia Yu, Yiqun Liu, Min Zhang, Liyun Ru and Shaoping Ma. Research in Search Engine User Behavior Based on Log Analysis (in Chinese). Journal of Chinese Information Processing. Vol. 21(1): pp. 109-114, 2007.
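As a rough illustration of the power-law claim above, the sketch below builds a rank-frequency list from a query log and estimates the exponent with a least-squares fit in log-log space. The function names and the toy log are hypothetical and are not taken from the cited study; a simple log-log regression is only one of several ways to estimate such an exponent.

```python
import math
from collections import Counter


def rank_frequency(queries):
    """Return query frequencies sorted from most to least frequent."""
    return sorted(Counter(queries).values(), reverse=True)


def fit_power_law_exponent(frequencies):
    """Negated least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


# Hypothetical toy log, for illustration only.
toy_log = ["weather"] * 8 + ["news"] * 4 + ["maps"] * 2 + ["tsinghua"]
frequencies = rank_frequency(toy_log)
alpha = fit_power_law_exponent(frequencies)
```

A larger exponent means that a few hot queries account for a larger share of the traffic, while the long tail of rare queries stays long.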

SLIDE 14

1. Search Querying Behavior

1.2 Distribution of querying behavior: frequency
Mobile search vs. Tablet search vs. Desktop search
[Figure: query frequency by rank]

Kamvar, M., Baluja, S.: A Large Scale Study of Wireless Search Behavior: Google Mobile Search. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2006)

SLIDE 15

1. Search Querying Behavior

1.3 Distribution of querying behavior: length
Query string: queries segmented by inputted spaces
[Figure: frequency vs. query length on log-log axes, with separate curves for the number of query strings and the number of query words]

Rongwei Cen, Yiqun Liu et al., Search user behavior analysis based on log mining. Journal of Chinese Information Processing (in Chinese). Vol. 24(3): pp. 49-54, 2010.

SLIDE 16

1. Search Querying Behavior

1.3 Distribution of querying behavior: length
Query character: number of characters in queries
[Figure: #Query word and #Query character]

Kamvar, M., Baluja, S.: A Large Scale Study of Wireless Search Behavior: Google Mobile Search. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2006)

SLIDE 17

1. Search Querying Behavior

1.4 Distribution of querying behavior: content
Local queries are not so popular on mobile devices

SLIDE 18

1. Search Querying Behavior

1.4 Distribution of querying behavior: content
Content vs. time of day
Complex queries in querying behavior
English words in Chinese searchers' queries: 17.63%
Queries with operators: 0.73%; queries with URLs: 2.82%

SLIDE 19

1. Search Querying Behavior

1.5 Evolution of search querying behavior
Long-term users vs. ordinary users: What are the differences between long-term and other users?
User behavior evolution: How does the interaction process of search users evolve over several years?

Year   #Users        #Sessions     %long-term-user
2009   60,443,119    123,223,460   0.0306%
2010   92,087,410    231,901,502   0.0201%
2011   129,534,711   344,901,821   0.0143%

Jian Liu, Yiqun Liu, Min Zhang, Shaoping Ma. How Do Users Grow Up along with Search Engines? A Study of Long-term Users' Behavior. In CIKM 2013.

SLIDE 20

1. Search Querying Behavior

1.5 Evolution of search querying behavior
[Figure: long-term users across different years]
[Figure: long-term vs. ordinary users]

SLIDE 21

1. Search Querying Behavior

1.5 Evolution of search querying behavior
Percentage of navigational tasks: long-term users (17.5%); ordinary users (29.0%)
[Figure: number of search queries submitted per session]
[Figure: percentage of sessions that contain two or more clicks]

SLIDE 22

1. Search Querying Behavior

1.5 Evolution of search querying behavior
Query length: long-term users (13.9 => 14.2); ordinary users (14.8 => 16.1)
Question-like queries (identified with 5W1H)
[Figure: percentage of question-like queries]
[Figure: percentage of users who ever adopt question-like queries]

SLIDE 23

1. Search Querying Behavior

1.6 Summary of Findings
Search behavior differs across platforms
Power law distribution: query frequency, query length
Query content: not so many local searches on mobile
Long-term users vs. short-term users
Long-term users through the years

SLIDE 24

2. Clicking/Examination Behavior

2.1 Data Collecting: Clicking behavior
Search click-through logs (e.g. WSCD, SogouQ)
User info: user ID & IP, search device
Query info: query text, time stamp, location, …
Click info: URL, time stamp, …
Search results
Organic results: algorithmic results
Ads results: advertisement results
Query suggestions, Vertical links, …

SLIDE 25

2. Clicking/Examination Behavior

2.1 Data Collecting: Clicking behavior
Data sample from SogouQ

SLIDE 26

2. Clicking/Examination Behavior

2.1 Data Collecting: Examining behavior
Eye-tracking behavior of search users
Strong eye-mind hypothesis: there is no appreciable lag between what is fixated on and what is processed (Just et al., 1980).

http://aeg.knmurthy.netdna-cdn.com/wp-content/uploads/2013/01/tobii.jpg

SLIDE 27

2. Clicking/Examination Behavior

2.1 Data Collecting: Examining behavior
Human reading behavior: fixation vs. saccade
Fixation: spatially stable gazes, each lasting approximately 200-500 milliseconds
Saccade: rapid eye movements that occur between fixations, lasting 40-50 milliseconds
Most existing studies infer examination behavior from eye fixation sequences

SLIDE 28

2. Clicking/Examination Behavior

2.2 Position bias in clicking/examination
A user study on search user behavior organized by Nielsen Group with over 230 participants
Golden Triangle: F-shape heat map in eye fixation sequence
Northwestern: Hot
Southeastern: Cold

SLIDE 29

2. Clicking/Examination Behavior

2.2 Position bias in clicking/examination
Users have a larger chance to examine top-ranked results and then click them

Title: 17.4%
Snippet: 42.1%
Category: 1.9%
URL: 30.4%
Other: 8.2% (includes cached, similar pages, description)

Joachims et al., Eye-tracking analysis of user behavior in WWW search. SIGIR 2005

SLIDE 30

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
Cascade assumption: users tend to examine results from top to bottom
[Figure: mean time of arrival vs. result ranking position]

SLIDE 31

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
Revisiting behavior is also common
Chinese search engine (Sogou): 27.9% of sessions
Non-Chinese search engine (Yandex): 30.4% of sessions

Danqing Xu, Yiqun Liu, et al. Incorporating Revisiting Behaviors into Click Models. WSDM 2012

SLIDE 32

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
[Figure: example examination sequences between start (S) and end (E): long revisit, short revisit, skip and revisit]
Cascade assumption vs. retaining sequential information

SLIDE 33

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
The necessity of retaining sequential information
[Figure: examination (unobserved) and click (observed) sequences between start (S) and end (E) over results 1-8, with clicks on results 1, 6, 2; the data is then reorganized with the cascade assumption]
Problem #1: not the true last click
Problem #2: decision process is missing

SLIDE 34

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
How often do users change the direction of examination between clicks?
[Figure: example sequence of clicked and examined results]

SLIDE 35

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
Locally Unidirectional Examination: users tend to examine search results in a single direction, without direction changes, between their clicks

Chao Wang, Yiqun Liu, et al., Incorporating Non-sequential Behavior into Click Models. In: SIGIR 2015
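To make the notion of a direction change concrete, here is a minimal sketch that counts how often the examination direction flips inside one sequence of examined ranks (for example, the fixations recorded between two clicks). The function name and the input representation are assumptions made for illustration; they are not taken from the cited paper.

```python
def count_direction_changes(positions):
    """Count how often the examination direction flips in a sequence of ranks."""
    changes = 0
    previous_direction = 0  # +1 = moving down the page, -1 = moving up
    for current, following in zip(positions, positions[1:]):
        if following == current:
            continue  # repeated fixation on the same result, no movement
        direction = 1 if following > current else -1
        if previous_direction and direction != previous_direction:
            changes += 1
        previous_direction = direction
    return changes


# Hypothetical segments between two clicks.
top_down = count_direction_changes([1, 2, 3, 4])
bottom_up = count_direction_changes([5, 4, 3, 1])
mixed = count_direction_changes([1, 2, 3, 2, 4])
```

Under Locally Unidirectional Examination, most segments between two clicks would give a count of 0, whether the user moves down or up the page.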

SLIDE 36

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
How far do users' eye fixations jump after examining the current clicked result?
[Figure: example sequence of clicked and examined results]

SLIDE 37

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
Non First-order Examination: between clicks, users often skip a few results and examine a result at some distance from the current one

SLIDE 38

2. Clicking/Examination Behavior

2.3 Examination sequence of search users
Users usually follow a cascade pattern in examination (they examine search results one by one from top to bottom)
It is also common for users to revisit some results (they examine/click a higher-ranked search result after examining/clicking a lower-ranked one)
During revisiting, users usually examine search results from bottom to top with some skips
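The revisit definition above (a higher-ranked result is clicked after a lower-ranked one) can be checked directly on click logs. The following is a minimal sketch under the assumption that each session is a list of clicked ranks in time order; the function names are hypothetical.

```python
def has_revisit(click_ranks):
    """True if some click lands on a higher-ranked result than an earlier click."""
    deepest_rank = 0  # largest rank number clicked so far
    for rank in click_ranks:
        if rank < deepest_rank:
            return True
        deepest_rank = max(deepest_rank, rank)
    return False


def revisit_rate(sessions):
    """Fraction of sessions that contain at least one revisit."""
    return sum(has_revisit(session) for session in sessions) / len(sessions)


# Hypothetical sessions: clicked ranks in the order the clicks happened.
sessions = [[1, 2], [3, 1], [1], [2, 5, 4]]
rate = revisit_rate(sessions)
```

A session-level rate computed this way is the kind of statistic reported for the Sogou and Yandex logs, although the exact definition used in those studies may differ.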

SLIDE 39

2. Clicking/Examination Behavior

2.4 Fixation vs. Examination: a two-stage model
Data collected: click-through, mouse movement, eye movement, explicit feedback on examination.
37 participants, 25 queries (INF:TRA:NAV = 2:2:1)
Datasets are available for research: http://www.thuir.cn/group/~YQLiu/

SLIDE 40

2. Clicking/Examination Behavior

[Figure: experiment procedure. Search tasks with fixed queries; SERP generation from a commercial search engine, with removal of ads, verticals, …; eye-tracker calibration; task completion with eye/mouse logging; collecting user feedback on result comprehension; next search task]

SLIDE 41

2. Clicking/Examination Behavior

2.4 Fixation vs. Examination: a two-stage model
Fixation vs. Examination: eye fixation on a search result is a prerequisite for examining this result

            Fixation=0   Fixation=1
Examine=0   31.61%       28.81%
Examine=1   5.49%        34.09%

Why don't you annotate the fixated results as examined?
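One way to read the fixation/examination percentages above is to turn the joint percentages into conditional probabilities. The short sketch below does that arithmetic; the dictionary layout and function name are illustrative choices, and the numbers are the ones reported on the slide.

```python
# Joint percentages from the slide: keys are (fixation, examine).
joint = {(0, 0): 31.61, (1, 0): 28.81, (0, 1): 5.49, (1, 1): 34.09}


def p_examine_given_fixation(joint, fixation):
    """P(Examine = 1 | Fixation = fixation) from joint percentages."""
    examined = joint[(fixation, 1)]
    total = joint[(fixation, 0)] + joint[(fixation, 1)]
    return examined / total


p_fixated = p_examine_given_fixation(joint, 1)    # about 0.54
p_unfixated = p_examine_given_fixation(joint, 0)  # about 0.15
```

Only a little more than half of the fixated results were reported as examined, which is why annotating every fixated result as examined would overestimate examination.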

SLIDE 42

2. Clicking/Examination Behavior

2.4 Fixation vs. Examination: a two-stage model
Examination vs. Click: examining a search result is a prerequisite for clicking on the result.

            Examine=0    Examine=1
Click=0     59.24%       17.57%
Click=1     1.18%        22.01%

            Fixation=0   Fixation=1
Click=0     34.96%       41.85%
Click=1     2.15%        21.04%

            Fixation=0   Fixation=1
Examine=0   31.61%       28.81%
Examine=1   5.49%        34.09%

[Figure: relations among fixation (F), examination (E), and click (C); preliminary judgment, then more careful judgment]

SLIDE 43

2. Clicking/Examination Behavior

[Figure: two-stage examination model for result n on a SERP. Stage 1 examination: eye fixation (skimming), affected by behavior biases (position, attractiveness, domain, …), followed by a judgment: potentially relevant or not relevant (skip the result). Stage 2 examination: comprehension (reading), affected by behavior biases (position, …) and snippet content (title/abstract), followed by a judgment: relevant (click the result, then landing page reading) or not relevant]

SLIDE 44

2. Clicking/Examination Behavior

2.4 Fixation vs. Examination: a two-stage model
Fixation doesn't necessarily mean examination
Users examine results with a two-stage model
Consistent with information triage: the process of determining the priority of processing
Consistent with selective attention: the process whereby the brain selectively filters out large amounts of sensory information to focus

SLIDE 45

2. Clicking/Examination Behavior

SLIDE 46

2. Clicking/Examination Behavior

2.5 Summary of Findings
Position bias: users pay more attention to higher-ranked results
Sequential examination: in about 70% of cases, users examine results sequentially
Non-sequential examination: in about 30% of cases, examination is non-sequential, and users usually follow the Locally Unidirectional Examination and Non First-order Examination patterns
Two-stage examination: from skimming to reading

SLIDE 47

3. Constructing Click Models

How to improve ranking with user behavior?
Simple solution: click = voting
Problem: position bias
How to estimate relevance without position effect?
[Figure: "Golden Triangle"]

Courtesy of http://hubdesignsmagazine.com/2011/03/27/its-good-to-be-on-the-first-page-of-google/

SLIDE 48

3. Constructing Click Models

3.1 Examination Hypothesis
The likelihood that a user will click on a search result is influenced by
Whether the user examined the search result
Whether the result is attractive/relevant
Examination: the user has comprehended (part of) the result and made a decision on whether to click.
How to estimate the probability of examination?

M. Richardson, et al. Predicting clicks: estimating the click-through rate for new ads. WWW 2007, pp. 521-530.
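The examination hypothesis is commonly written as a factorization of the click probability. The formulation below is a standard one from the click-model literature; the symbols $C_i$ (click on the result at rank $i$), $E_i$ (examination of that result), and $a_i$ (attractiveness/relevance of that result) are notation introduced here for illustration and do not appear on the slide.

```latex
\[
P(C_i = 1) = P(E_i = 1)\, P(C_i = 1 \mid E_i = 1),
\qquad
P(C_i = 1 \mid E_i = 0) = 0,
\qquad
a_i = P(C_i = 1 \mid E_i = 1).
\]
```

Click models differ mainly in how they define $P(E_i = 1)$, which is the question raised at the end of the slide.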

SLIDE 49

3. Constructing Click Models

3.2 Click Models: cascade model
User examines sequentially
User will not stop examining until he/she clicks a result
User will stop immediately after clicking a result
Suitable for navigational or transactional search tasks
[Figure: results 1-9 examined from top to bottom; at rank i the user clicks (Ci) or skips (1-Ci)]
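The three assumptions above imply that a click at rank i requires skipping every result above it. A minimal sketch of that computation follows; the function name and the per-result relevance values are hypothetical, and the relevance list stands in for the click probabilities Ci given examination.

```python
def cascade_click_probabilities(relevances):
    """P(click at rank i) = r_i * prod_{j < i} (1 - r_j) under the cascade model."""
    probabilities = []
    reach = 1.0  # probability that the user reaches (examines) the current rank
    for relevance in relevances:
        probabilities.append(reach * relevance)
        reach *= 1.0 - relevance  # the user continues only if there was no click
    return probabilities


# Hypothetical relevance values for the top three results.
clicks = cascade_click_probabilities([0.5, 0.5, 0.5])
```

Because the user stops after the first click, these probabilities sum to at most 1, which is why the model fits tasks that end with a single click.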

SLIDE 50

3. Constructing Click Models

3.2 Click Models: Dependent click model (DCM)
User examines sequentially
User will not stop examining until he/she clicks a result
User has probability λi of continuing after clicking a result
More powerful and practical than the cascade model
[Figure: from rank i to rank i+1; click (Ci) or skip (1-Ci), then continue (λi) or stop (1-λi)]
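A minimal sketch of how the DCM assumptions translate into examination and click probabilities is shown below. The function name, argument names, and example values are assumptions made for illustration; parameter estimation from logs is not covered here.

```python
def dcm_probabilities(relevances, continuations):
    """Marginal examination and click probabilities under the DCM assumptions."""
    examine = 1.0  # the first result is always examined
    examination, clicks = [], []
    for relevance, lam in zip(relevances, continuations):
        examination.append(examine)
        clicks.append(examine * relevance)
        # Continue for sure after a skip, with probability lam after a click.
        examine *= (1.0 - relevance) + relevance * lam
    return examination, clicks


# Hypothetical values: with all continuation probabilities at 0 the model
# reduces to the cascade model; with all at 1 every result is examined.
as_cascade = dcm_probabilities([0.5, 0.5, 0.5], [0.0, 0.0, 0.0])
always_on = dcm_probabilities([0.5, 0.5, 0.5], [1.0, 1.0, 1.0])
```

The continuation probabilities are what allow sessions with multiple clicks, which the cascade model cannot represent.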

SLIDE 51

3. Constructing Click Models

3.2 Click Models: User Browsing Model (UBM)
The probability of examination depends on both the last clicked position and the distance from that position.
UBM takes the decay of users' attention into consideration
More powerful, but more parameters to be estimated
[Figure: results 1, 2, 3, …, r, …, i, with labels ri and di]
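As a sketch of how the UBM idea can be used, the code below scores one session's click sequence, with the examination probability looked up by (last clicked rank, distance from it). The function name, the `gamma` dictionary layout, and the example values are hypothetical; in practice the parameters would be estimated from logs, for example with EM.

```python
import math


def ubm_log_likelihood(clicks, attractiveness, gamma):
    """Log-likelihood of one session's clicks under a UBM-style model."""
    log_likelihood = 0.0
    last_click = 0  # 0 means no click yet in this session
    for rank, clicked in enumerate(clicks, start=1):
        distance = rank - last_click
        # Click probability = attractiveness * examination probability.
        p_click = attractiveness[rank - 1] * gamma[(last_click, distance)]
        if clicked:
            log_likelihood += math.log(p_click)
            last_click = rank
        else:
            log_likelihood += math.log(1.0 - p_click)
    return log_likelihood


# Hypothetical parameters: every result is examined with certainty.
gamma = {(r, d): 1.0 for r in range(0, 4) for d in range(1, 5)}
score = ubm_log_likelihood([1, 0, 1], [0.5, 0.5, 0.5], gamma)
```

The table of examination parameters grows with the number of (position, distance) pairs, which is the cost in extra parameters mentioned on the slide.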

SLIDE 52

3. Constructing Click Models

3.3 Possible future directions: Mouse movement
When both methods of testing are conducted simultaneously, there is an 84%-88% correlation in the results

SLIDE 53

3. Constructing Click Models

3.3 Possible future directions: Mouse movement

SLIDE 54

Afternoon Session: Search Behavior in Heterogeneous SERPs

SLIDE 55

Search Behavior in Heterogeneous SERPs

How do heterogeneous results affect users' search interaction behavior?
How do heterogeneous results affect users' perception of satisfaction?

SLIDE 56

Search Behavior in Heterogeneous SERPs

Vertical results are everywhere (over 80% of SERPs)

SLIDE 57

1. Vertical Results vs. User Behavior

RQ 1.1 How do verticals affect users' examination behavior on SERPs?
RQ 1.2 How do users examine different sub-components within verticals?
RQ 1.3 Constructing vertical-aware click models

SLIDE 58

1.1 Page-level Examination Behavior

Collecting user behavior data
Task organization: 30 query tasks per participant
Original query + task description
Vertical quality manipulation: off-target queries
35 participants
Aged 18-25, mean = 18.8; students with different majors
32 of 35 recorded valid eye-tracking data

Original Query          Off-target Query      Vertical Type
iTunes download         iTools download       Application download
Nike basketball shoes   Nike football shoes   Image

SLIDE 59

1.1 Page-level Examination Behavior

Presentation style
Textual, Application, Encyclopedia, Image, News
Vertical position
Rank 1st/3rd/5th: top/middle/bottom of the first screen
Vertical quality
On topic (relevant) vs. off topic (irrelevant)

SLIDE 60

1.1 Page-level Examination Behavior

Attention distribution
Eye fixation on results
Variable #1: position
For most ranking positions, verticals draw more attention; organic results receive less
Variable #2: presentation style
Image vertical and application download vertical draw more attention

[Table: attention by result position. First row: no vertical. Then five groups of three rows, one group per vertical type, with the vertical at rank 1, 3, and 5. Group labels on the slide: Textual, News, Encyclopedia, Image-only, Application-download. Asterisks are reproduced as on the slide.]

Vertical rank   Pos. 1     Pos. 2     Pos. 3     Pos. 4     Pos. 5     Pos. 6~10
(no vertical)   46.03%     27.92%     9.80%      5.43%      4.01%      6.81%

1               61.93%*    17.79%     9.36%      5.53%      3.68%      1.71%*
3               43.39%     24.90%     16.13%*    3.05%      8.30%      4.24%
5               41.15%     20.96%     10.25%     14.93%**   4.42%      8.29%

1               61.6%*     24.90%     6.63%      1.25%*     2.21%      3.41%
3               45.15%     20.88%     24.97%**   2.49%      4.00%      2.51%
5               51.61%     17.1%*     9.45%      7.49%      10.89%**   3.47%

1               78.37%**   10.48%**   8.12%      1.22%*     1.15%      0.66%*
3               38.36%     15.88%*    40.5%**    3.70%      0.91%      0.65%*
5               20.47%**   16.41%*    12.88%     10.91%*    34.4%**    4.92%

1               81.36%**   5.7%**     2.57%**    2.49%      5.52%      2.36%
3               21.62%**   19.49%     50.12%**   2.10%      4.18%      2.48%
5               43.62%     17.83%     4.64%      7.34%      25.06%**   1.5%*

1               50.55%     10.99%**   10.48%     10.28%*    3.73%      13.97%**
3               33.19%     18.94%     29.37%**   5.40%      6.80%      6.29%
5               44.17%     19.09%     10.22%     9.47%      7.97%      9.08%

SLIDE 61

1.1 Page-level Examination Behavior

Attractiveness Effect
Certain verticals draw more/earlier attention
[Figure: panels for vertical at Rank 1st, Rank 3rd, Rank 5th]

SLIDE 62

1.1 Page-level Examination Behavior

Variable #3: quality (on topic vs. off topic)

[Table: attention by result position, same layout as the attention-distribution table on Slide 60. First row: no vertical. Then five groups of three rows, one group per vertical type, with the vertical at rank 1, 3, and 5. Group labels on the slide: Textual, News, Encyclopedia, Image-only, Application-download. Asterisks are reproduced as on the slide.]

Vertical rank   Pos. 1     Pos. 2     Pos. 3     Pos. 4     Pos. 5     Pos. 6~10
(no vertical)   46.03%     27.92%     9.80%      5.43%      4.01%      6.81%

1               35.32%     30.07%     9.44%      11.22%**   6.55%      7.39%
3               41.24%     27.35%     9.99%      6.73%      3.48%      11.20%
5               43.26%     22.07%     8.77%      11.61%*    7.79%      6.49%

1               33.09%     31.59%     14.56%     6.66%      4.75%      9.35%
3               43.06%     18.20%     11.68%     10.46%*    6.15%      10.46%
5               54.15%     17.27%*    7.03%      6.16%      7.04%      8.35%

1               42.36%     32.85%     10.31%     6.17%      4.09%      4.22%
3               37.13%     18.22%     21.89%**   8.14%      5.34%      9.28%
5               27.67%**   12.66%**   15.34%     11.43%*    23.42%**   9.48%

1               41.99%     28.48%     12.98%     5.25%      3.85%      7.45%
3               35.86%     17.2%*     35.28%**   6.59%      2.85%      2.22%
5               46.48%     20.91%     13.27%     4.37%      11.03%**   3.95%

1               42.85%     24.88%     14.15%     3.88%      3.66%      10.59%
3               34.28%     33.01%     17.35%*    6.20%      2.75%      6.40%
5               34.91%     30.12%     14.61%     5.78%      4.49%      10.08%

[Second table on the slide: identical to the attention-distribution table on Slide 60]

SLIDE 63

1.1 Page-level Examination Behavior

Examination sequence
Image vertical and application download vertical draw users' attention quite quickly
[Figure: three panels (vertical at 1st, 3rd, and 5th rank); x-axis: result position (1, 2, 3, 4, 5, 6~10); y-axis: percentage of session time]

SLIDE 64

1.1 Page-level Examination Behavior

Attention distribution after vertical examination
After examining a relevant (on-topic) vertical, users tend to pay less attention to other organic results
[Figure: percentage of session time for relevant vertical, irrelevant vertical, and organic, by vertical type (Textual, Encyclopedia, Image-only, Download, News)]

SLIDE 65

1.1 Page-level Examination Behavior

Cut-off Effect
After users have viewed on-topic verticals, they are more likely to decrease their visual attention on the organic results that are below the verticals.

Relevant vertical   Textual   Encyclopedia   Image-only   Application download   News

Position = 3 (Organic: 34.61%)
Vertical            30.13%    16.70%         8.44%        13.04%                 22.61%
Diff                12.95%    51.74%*        75.62%**     62.32%**               34.68%

Position = 5 (Organic: 25.27%)
Vertical            26.30%    19.27%         10.33%       6.21%                  38.69%
Diff                4.09%     23.76%         59.10%*      75.44%*                53.09%
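The Diff values in this table appear to be consistent with a relative difference against the organic value for the same position, that is |organic - vertical| / organic. This reading is an inference from the numbers, not something stated on the slide, and the function name below is hypothetical. A short sketch of the arithmetic:

```python
def relative_difference(organic, vertical):
    """Absolute difference relative to the organic value, in percent."""
    return abs(organic - vertical) / organic * 100.0


# Values for Position = 3 as reported on the slide (organic value: 34.61%).
vertical_values = (30.13, 16.70, 8.44, 13.04, 22.61)
position3 = [relative_difference(34.61, v) for v in vertical_values]
```

The results agree with the reported Diff row to within rounding, which supports reading the table as a comparison against the organic value for each position.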

SLIDE 66

1.1 Page-level Examination Behavior

Spill-over Effect [Arguello et al., CIKM 2014]
Users pay more attention to the organic results after they have examined irrelevant vertical results.

               Textual   Encyclopedia   Image-only   Application download   News

Position = 3
Relevant       61.49%    59.90%         45.14%       52.37%                 54.86%
Irrelevant     86.72%    87.74%         76.81%       63.14%                 64.07%
Diff           41.04%*   46.47%**       70.18%**     20.57%                 16.80%

Position = 5
Relevant       80.97%    59.56%         46.58%       49.64%                 69.81%
Irrelevant     79.92%    91.57%         68.93%       80.01%                 85.93%
Diff           1.30%     53.75%**       47.99%       61.16%**               23.09%*

SLIDE 67

1.1 Page-level Examination Behavior

Spill-over Effect [Arguello et al., CIKM 2014]
Users pay more attention to the organic results after they have examined irrelevant vertical results.
[Figure: relevant vertical vs. irrelevant vertical]

SLIDE 68

1.2 Result-level Examination Behavior

How do users examine different sub-components within the vertical block?
Investigating the attention distribution among different components
Image component vs. textual component
Components with different ranking positions
Discovering the common examination patterns of users
Where to start, where to end the examination process
Helps understand how users judge the quality of verticals

SLIDE 69

1.2 Result-level Examination Behavior

Textual Vertical
No position bias among homogeneous components (items 2-3)
Users pay much attention to Item 1 and the title
Attention distribution: Title 26.81%, Item 1 56.60%, Item 2 7.23%, Item 3 9.36%
[Figure: textual vertical with labeled components (Title, Item 1, Item 2, Item 3)]
[Figure: frequency of examination sequences (1,E), (0,1,E), (1,0,E), (0,E), (1,3,E)]

SLIDE 70

1.2 Result-level Examination Behavior

Encyclopedia Vertical
Users do not start/end with the image component (vs. page-level user behavior)
Attention distribution: Title 38.53%, Image 20.49%, Text 40.98%
[Figure: encyclopedia vertical with labeled components (Title, Image, Text (Item 1))]
[Figure: frequency of examination sequences (0,E), (2,E), (0,2,E), (2,0,2,E), (1,E)]

SLIDE 71

1.2 Result-level Examination Behavior

News Vertical
No position bias among items 2-5
Image component is hardly examined
Attention distribution: Title 13.53%, Image 9.74%, Item 1 39.11%, Item 2 9.08%, Item 3 9.74%, Item 4 10.73%, Item 5 8.09%
[Figure: news vertical with labeled components (Title, Image, Item 1, Items 2-5)]
[Figure: frequency of examination sequences (2,E), (0,E), (2,3,E), (0,2,E), (1,E)]

SLIDE 72

1.2 Result-level Examination Behavior

Image Vertical
Title receives much attention
Position bias exists among homogeneous components (images 1-4)
Image2 receives even more attention than Image1
Attention distribution: Title 34.22%, Image1 22.67%, Image2 24.00%, Image3 9.78%, Image4 9.33%
[Figure: image-only vertical with labeled components (Title, Image1, Image2, Image3, Image4)]
[Figure: frequency of examination sequences (0,E), (1,2,E), (1,2,0,E), (1,E), (2,E)]

SLIDE 73

1.2 Result-level Examination Behavior

Application-download Vertical
Title and text receive the most attention
Image component is hardly examined
Attention distribution: Title 32.82%, Image 8.90%, Text 49.08%, Button1 5.83%, Button2 3.37%
[Figure: application-download vertical with labeled components (Title, Image, Text (Item 1), Button1, Button2)]
[Figure: frequency of examination sequences (2,E), (0,E), (2,0,E), (0,2,E), (2,0,2,E)]

SLIDE 74

1.2 Result-level Examination Behavior

Examination at the result level is quite different from that at the page level
Attractiveness bias: verticals with images receive much attention in page-level examination, while image components are seldom examined within verticals
Position bias: ranking position greatly affects users' attention allocation in page-level examination, while it plays a less important part among homogeneous results within verticals (esp. for textual components)
Possible reasons: skimming vs. reading

SLIDE 75

1.3 Vertical-aware Click Model

Model Construction (based on UBM)
[Figure: the original UBM extended with vertical effects. Labels on the slide: effect on examination; effect on click-through; restart effect; users examine vertical results at first; simplified case: difficult to quantify the effect when not all results are affected]

SLIDE 76

1.3 Vertical-aware Click Model

Experimental results
About 300,000 queries and 11,000,000 sessions collected from a major Chinese search engine

Click/skip perplexity
                       UBM       VCM       VCM Improvement
Text vertical          1.2266    1.2139    +5.58%
Multimedia vertical    1.3735    1.3071    +17.78%
Application vertical   1.1908    1.1601    +16.09%

Log-likelihood
                       UBM       VCM       VCM Improvement
Text vertical          -2.9093   -2.7968   +11.90%
Multimedia vertical    -4.1142   -3.8638   +28.44%
Application vertical   -2.2671   -2.1427   +13.24%
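The improvement figures in these results are consistent, to within rounding, with two conventions that are common in click-model evaluation: perplexity gain measured relative to the ideal perplexity of 1, and log-likelihood gain measured as exp(LL_new - LL_old) - 1. That the slide uses exactly these conventions is an inference from the numbers; the function names below are hypothetical.

```python
import math


def perplexity_improvement(baseline, model):
    """Perplexity gain in percent, relative to the ideal perplexity of 1."""
    return (baseline - model) / (baseline - 1.0) * 100.0


def log_likelihood_improvement(baseline, model):
    """Log-likelihood gain in percent, as exp(model - baseline) - 1."""
    return (math.exp(model - baseline) - 1.0) * 100.0


# Multimedia vertical row, UBM vs. VCM, using the values from the slide.
ppl_gain = perplexity_improvement(1.3735, 1.3071)
ll_gain = log_likelihood_improvement(-4.1142, -3.8638)
```

Measuring perplexity against 1 matters because a perfect model has perplexity 1, so a drop from 1.37 to 1.31 removes a sizable share of the remaining error even though the absolute change looks small.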

SLIDE 77

2. Vertical Results vs. User Satisfaction

Offline Method
E.g. Cranfield methodology (Cleverdon et al., 1966)
Reusability is nice; adopted in TREC, NTCIR benchmarks
Annotators are different from users: e.g. ambiguity
Online Method
E.g. A/B test, Interleaving, Satisfaction feedback, etc.
Usually not reusable, must be performed online
More reliable evaluation results (Al-Maskari et al., 2007; Huffman et al., 2007)

SLIDE 78

2. Vertical Results vs. User Satisfaction

Definition (first proposed by Su et al., 1992)
"Satisfaction can be understood as the fulfillment of a specified desire or goal" (Kelly et al., 2009)
Challenges in search satisfaction research
Subjectivity: different users, different opinions
Heterogeneity: heterogeneous results / vertical components are popular today besides "ten blue links"
Sparsity: interaction behaviors (e.g. click-through) are relatively sparse, which makes the prediction difficult

SLIDE 79

2. Vertical Results vs. User Satisfaction

RQ 2.1 What is the difference between users' perception and annotators' external judgments of search satisfaction?
RQ 2.2 How do heterogeneous results affect users' perception of satisfaction?

SLIDE 80

2.1 Judgment: Users vs. Assessors

Explicit feedback directly from search users
Users report their feelings after search sessions (Field et al., 2010; Guo et al., 2012)
External judgment from professional assessors
Assessors make judgments according to behavior logs (Huffman et al., 2007; Guo et al., 2010)
Subjectivity: are the two kinds of judgments different?
For relevance judgments: users' and external assessors' opinions are different (Verberne et al., 2013)

SLIDE 81

2.1 Judgment: Users vs. Assessors

Search tasks: 30 tasks from NTCIR IMine
10 navigational tasks and 20 informational tasks
Satisfaction manipulation
Ordered SERP: ranking according to relevance
Reverse SERP: reverse ranking according to relevance
Intuitively, users are more satisfied with ordered SERPs
Participants: 40 undergraduate students
External judgment: 3 professional assessors
Relevance kappa: 0.70; satisfaction kappa: 0.41

SLIDE 82

2.1 Judgment: Users vs. Assessors

Distribution of Satisfaction Annotations from Users and Assessors
Users are more positive than assessors

SLIDE 83

2.1 Judgment: Users vs. Assessors

Distribution of Scores on Manipulated SERPs
Both prefer ordered SERPs to reverse SERPs
Users' feedback is less distinguishable
Reverse ranking: the results are not changed, but the user effort required is different

SLIDE 84

2.1 Judgment: Users vs. Assessors

Comparison in Benefit and Cost
Benefit: information gain; Cost: time/effort spent

Users do not care so much about the effort spent
Both users and assessors care about benefit

SLIDE 85

2.1 Judgment: Users vs. Assessors

Summary of Findings
Users are a bit more positive than assessors
Users and assessors make judgments according to different factors
Assessors pay more attention to the effort spent in information gathering (why don’t users care?)

SLIDE 86

2.2 Effect of Heterogeneous Results

Collecting Data
Task organization: 30 query tasks per participant
Original query + task description
Vertical quality manipulation: off-target queries
35 participants: aged 18-25, mean = 18.8, undergraduates
External judgment: 3 professional assessors, Kappa = 0.48

Original Query          Off-target Query      Vertical Type
iTunes download         iTools download       Application download
Nike basketball shoes   Nike football shoes   Image

SLIDE 87

2.2 Effect of Heterogeneous Results

SLIDE 88

2.2 Effect of Heterogeneous Results

Satisfaction distribution of search users
Z-scores are used to get comparable distributions
A large part of query sessions are satisfactory
On-topic verticals obtain the best results; off-topic verticals usually lead to lower scores

[Figure: two bar charts, “Users’ Satisfaction Feedback” and “External Assessors’ Judgment”. x-axis: Percentage Range of Z-Scores, from [0,20%] to [80%,100%]; y-axis: Percentage of Sessions; series: On-topic vertical, Non-vertical, Off-topic vertical]
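
The slide says only that z-scores are used to make the two score distributions comparable. A minimal Python sketch of that step, assuming scores are standardized within each rater group (users, assessors); the function name, the grouping, and the sample scores are illustrative assumptions, not data from the study.

```python
from statistics import mean, pstdev


def zscores(scores):
    """Standardize raw scores to zero mean and unit (population) variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: there is no spread to normalize, so map to 0.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Hypothetical raw scores on two different scales (not data from the study).
user_scores = [5, 4, 5, 3, 5]
assessor_scores = [3, 2, 4, 2, 3]

user_z = zscores(user_scores)
assessor_z = zscores(assessor_scores)
```

After standardization both groups have mean 0 and standard deviation 1, so their session distributions can be binned and plotted on a shared axis even though the raw scales differ.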

SLIDE 89

2.2 Effect of Heterogeneous Results

Effect of presentation style (for users’ judgment)
On-topic verticals bring more satisfaction (encyclopedia, download)
Off-topic verticals lead to less satisfaction (textual, image)

Vertical Type   All Organic   W/ On-topic Vertical   W/ Off-topic Vertical   On-Off Difference
Textual         5.15          5.10 (-0.05)           4.95 (-0.20**)          +0.15*
Encyclopedia    4.46          4.99 (+0.53)           4.67 (+0.21)            +0.32
Image           5.17          5.07 (-0.10)           4.58 (-0.59)            +0.49
Download        4.75          5.25 (+0.50)           4.60 (-0.15)            +0.65
News            4.43          4.34 (-0.09)           4.38 (-0.05)            -0.04

News verticals make little difference to users

SLIDE 90

2.2 Effect of Heterogeneous Results

Effect of presentation style (for external assessors)
Differences between SERPs with on-topic and off-topic results are still significant
Off-topic download verticals bring higher scores (?)

Vertical Type   All Organic   W/ On-topic Vertical   W/ Off-topic Vertical   On-Off Difference
Textual         3.75          3.64 (-0.11)           3.45 (-0.30)            +0.19
Encyclopedia    3.18          3.62 (+0.44)           3.13 (-0.05)            +0.49
Image           3.34          3.59 (+0.25)           3.18 (-0.16)            +0.41
Download        2.85          3.58 (+0.73)           3.31 (+0.46)            +0.27
News            3.10          2.73 (-0.37)           3.02 (-0.08)            -0.29*

News verticals make little difference to users

SLIDE 91

2.2 Effect of Heterogeneous Results

Effects of vertical position
Verticals have a larger effect when placed higher
Top-ranked off-topic results disappoint users

Users

Position   Only organic   On-topic         Off-topic        On-Off Difference
Rank 1     4.79           5.06 (+0.27**)   4.43 (-0.36**)   +0.63**
Rank 3     4.79           4.93 (+0.14)     4.63 (-0.16)     +0.29**
Rank 5     4.79           4.87 (+0.08*)    4.85 (+0.06)     +0.02

Assessors

Position   Only organic   On-topic         Off-topic        On-Off Difference
Rank 1     3.24           3.48 (+0.24*)    3.09 (-0.15)     +0.39**
Rank 3     3.24           3.46 (+0.22**)   3.31 (+0.07)     +0.15
Rank 5     3.24           3.37 (+0.13)     3.27 (+0.03)     +0.10

SLIDE 92

2.2 Effect of Heterogeneous Results

Benefit-cost analysis
On-topic Download verticals are quite worthwhile
Off-topic verticals are not worthwhile and should be avoided

ECG: examined results’ cumulated gain
LER: length of examined result sequence
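
The two measures above can be sketched from a logged examination sequence. This is a minimal illustration under stated assumptions: the slide gives only the names ECG and LER, so the exact definitions used here (a revisited result contributes its gain once; every examination step counts toward the length), the function name, and the sample gains are all hypothetical.

```python
def benefit_and_cost(examined_ranks, gain_by_rank):
    """Return (ECG, LER) for one query session.

    examined_ranks: ranks in the order the user examined them (revisits allowed).
    gain_by_rank:   hypothetical mapping rank -> gain (e.g. a relevance grade).
    """
    # Assumption: a revisited result contributes its gain only once.
    ecg = sum(gain_by_rank[rank] for rank in set(examined_ranks))
    # Assumption: every examination step, including revisits, adds to the cost.
    ler = len(examined_ranks)
    return ecg, ler


# The user examines ranks 1, 2, back to 1, then 4; rank 3 is never examined.
ecg, ler = benefit_and_cost([1, 2, 1, 4], {1: 3, 2: 0, 3: 2, 4: 1})
print(ecg, ler)  # 4 4
```

Under these assumptions, ECG plays the role of benefit and LER the role of cost in the benefit-cost comparison.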

SLIDE 93

2.2 Effect of Heterogeneous Results

Verticals have a strong effect on users’ perception of satisfaction
Vertical-aware features in satisfaction prediction

Feature Group: Click-Through Features (CT)
Features: vertical style; vertical position; mouse arrive vertical time; if vertical is clicked; click vertical time; after vertical time; hover time; vertical dwell time; if other click

Feature Group: Eye-Tracking Features (ET)
Features: number of examined results; length of examined result sequence; fixation arrival time of vertical; fixation duration

SLIDE 94

2.2 Effect of Heterogeneous Results

Prediction Results
Ridge regression, five-fold cross validation, NRMSE
Vertical-aware features bring significant improvement
Larger improvement for user satisfaction predictions

                 NRMSE (Predicting User Satisfaction)   NRMSE (Predicting External Judgment)
Features         Guo, 2012      Jiang, 2015             Guo, 2012      Jiang, 2015
Original         0.259          0.249                   0.193          0.223
Original+CT      0.248**        0.243**                 0.193          0.219
Original+ET      0.246**        0.242**                 0.188          0.201**
Original+CT+ET   0.242**        0.230**                 0.187*         0.193**
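
The evaluation setup named on this slide (ridge regression, five-fold cross validation, NRMSE) can be sketched in plain Python. This is an illustration under assumptions, not the study's code: NRMSE is taken here as RMSE divided by the range of the rating scale (other normalizations exist), the bias term is left unpenalized, and all function names, the `alpha` value, and the `scale_range` value are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ridge(rows, targets, alpha):
    """Minimize ||Xw - y||^2 + alpha * ||w||^2; w[0] is an unpenalized bias."""
    xs = [[1.0] + list(r) for r in rows]
    d = len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    for i in range(1, d):
        gram[i][i] += alpha
    rhs = [sum(x[i] * y for x, y in zip(xs, targets)) for i in range(d)]
    return solve(gram, rhs)


def predict(w, row):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))


def nrmse(y_true, y_pred, scale_range):
    """RMSE divided by the range of the rating scale (assumed normalization)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse ** 0.5 / scale_range


def cross_validated_nrmse(rows, targets, alpha, scale_range, folds=5, seed=0):
    """Average NRMSE over `folds` held-out splits of the shuffled data."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    scores = []
    for k in range(folds):
        test = idx[k::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        w = fit_ridge([rows[i] for i in train], [targets[i] for i in train], alpha)
        preds = [predict(w, rows[i]) for i in test]
        scores.append(nrmse([targets[i] for i in test], preds, scale_range))
    return sum(scores) / folds


# Synthetic data for illustration only: one feature, noise-free linear target.
rows = [[float(i)] for i in range(10)]
targets = [2.0 + 3.0 * i for i in range(10)]
score = cross_validated_nrmse(rows, targets, alpha=1.0, scale_range=4.0)
```

In this setup, comparing feature sets such as Original and Original+CT+ET amounts to calling `cross_validated_nrmse` with different columns in `rows`; a lower value means better prediction.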

SLIDE 95

2.3 Summary of Findings

Users vs. External Assessors
Both care about benefit; effort means more to assessors
Verticals affect users’ perception of satisfaction
Top-ranked on-topic verticals help improve satisfaction
Top-ranked off-topic results disappoint users
News verticals make little difference to users
Search Satisfaction Prediction
Vertical-aware features help improve prediction performance

SLIDE 96

Some Useful Resources

SIGIR2015 Tutorial
http://clickmodels.weebly.com/sigir-2015-tutorials.html
Textbook: Click models
http://www.morganclaypool.com/doi/10.2200/S00654ED1V01Y201507ICR043
Open source projects
Clickmodels: https://github.com/varepsilon/clickmodels; https://github.com/THUIR/PSCMModel
WSCD Dataset: http://research.microsoft.com/en-us/um/people/nickcr/wscd2012/
Sogou Lab: http://www.sogou.com/labs

SLIDE 97

Dataset is available for academic use:

Eye fixations, mouse movement features, clicks, relevance annotation, examination feedback, …

http://www.thuir.cn/group/~YQLiu/
