Modeling User Behavior and Interactions M d li U B h i d I t ti - PowerPoint PPT Presentation

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 5: Search Interfaces + New Directions Eugene Agichtein Emory University 1 Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, Russia)

Lecture 5 Plan 1. Generating result summaries (abstracts) Beyond result list d l l – 2 2. Spelling correction and query suggestion Spelling correction and query suggestion 3 3. New directions in search user interfaces New directions in search user interfaces – Collaborative Search – Collaborative Question Answering Collaborative Question Answering • PhD studies in the U.S. (and in Emory U) ( y ) Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 2

1. Generating Result Summaries g • How to present search results list to a user? • Most commonly, a list of the document titles plus a short summary, aka “10 blue links” 3 Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia

Good Summary Guidelines y • All query terms should appear in the All query terms should appear in the summary, showing their relationship to the retrieved page • When query terms are present in the title, they need not be repeated – allows snippets that do not contain query terms • Highlight query terms in URLs g g q y • Snippets should be readable text, not lists of keywords y Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 4

How to Generate Good Summaries? • The title is typically automatically extracted from document metadata. What about the summaries? – This description is crucial. – User can identify good/relevant hits based on description. • Two main kinds of summaries: – Static summary: always the same, regardless of the query that hit the doc – Dynamic summary : query-dependent attempt to explain why the document was retrieved for the query at hand t h d Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 5

Dynamic Summary Generation y y • Query-dependent document summary Q d d t d t • Simple summarization approach – rank each sentence in a document using a significance factor – select the top sentences for the summary – first proposed by Luhn in 50’s Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 6

Sentence Selection • Significance factor for a sentence is calculated based on the occurrence of significant words th f i ifi t d – If f d,w is the frequency of word w in document d , then w is a significant word if it is not a stopword and a significant word if it is not a stopword and where s d is the number of sentences in document d – text is bracketed by significant words (limit on number of non-significant words in bracket) i ifi t d i b k t) Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 7

Sentence Selection • Significance factor for bracketed text spans is Significance factor for bracketed text spans is computed by dividing the square of the number of significant words in the span by the total number of words • e.g., • Significance factor = 4 2 /7 = 2 3 • Significance factor = 4 /7 = 2.3 Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 8

Dynamic Snippet Generation (Cont’d) y pp ( ) • Involves more features than just significance f factor t • e.g. for a news story, could use – whether the sentence is a heading – whether it is the first or second line of the document – the total number of query terms occurring in the sentence the total number of query terms occurring in the sentence – the number of unique query terms in the sentence – the longest contiguous run of query words in the sentence – a density measure of query words (significance factor) • Weighted combination of features used to rank sentences Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 9

Static Summary Generation y • Web pages are less structured than news Web pages are less structured than news stories – can be difficult to find good summary sentences g y • Snippet sentences are often selected from other sources – metadata associated with the web page • e.g., <meta name="description" content= ...> – external sources such as web directories • e.g., Open Directory Project, http://www.dmoz.org – Wikipedia: summary paragraph, infoboxes, … Wiki di h i f b Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 10

Problem? Very Good Summaries May Not Get Clicks ! Everything you needed is in the summary Everything you needed is in the summary Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, 11 Russia)

Organizing Search Results Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 Query: jaguar List Organization Category Org (SWISH) Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia

System Components Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 web web training running search (offline) (online) results classified SVM SVM web model pages classified Search results Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 13

Text Classification Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 • Text Classification – Assign documents to one or more of a predefined set of categories – E.g., News feeds, Email - spam/no-spam, Web data – Manually vs. automatically • Inductive Learning for Classification – Training set: Manually classified a set of documents Training set: Manually classified a set of documents – Learning: Learn classification models – Classification: Use the model to automatically classify – Classification: Use the model to automatically classify new documents Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 14

Learning & Classification Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 • Support Vector Machine (SVM) – Accurate and efficient for text classification (Dumais Accurate and efficient for text classification (Dumais et al., Joachims) – Model = weighted vector of words • “Automobile” = motorcycle, vehicle, parts, automobile, harley, car, auto, honda, porsche … • “Computers & Internet” = rfc, software, provider, windows, p , , p , , user, users, pc, hosting, os, downloads ... • Hierarchical Models – 1 model for N top level categories 1 d l f N t l l t i – N models for second level categories – Very useful in conjunction w/ user interaction Very useful in conjunction w/ user interaction Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 15

Information Overlay Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 – Use tooltips to show • Summaries of web pages • Category hierarchy Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 16

Expansion of Category Structure Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, 17 Russia)

User Study - Conditions Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 Category Interface List Interface 18 Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia

User Study Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, 19 Russia)

Subjective Results Dumais , S, E. Cutrell, and H. Chen. Optimizing search by showing results in context , CHI 2001 7-point rating scale (1=disagree; 7=agree) Question Question Category Category List List significance significance It was easy to use this software. 6.4 3.9 p<.001 I liked using this software 6.7 4.3 p<.001 I prefer this to my usual Web Search engine 6.4 4.3 p<.001 It was easy to get a good sense of the range of alternatives It t t d f th f lt ti 6 4 6.4 4.2 4 2 p<.001 < 001 I was confident that I could find information if it was there. 6.3 4.4 p<.001 The "More" button was useful 6.5 6.1 n.s. The display of summaries was useful 6.5 6.4 n.s. Average Number of Uses of Feature per Task Interface Features Category List significance Expansing / Collapsing Structure 0.78 0.48 p<.003 Viewing Summaries in Tooltips Viewing Summaries in Tooltips 2 99 2.99 4 60 4.60 p< 001 p<.001 Viewing Web Pages 1.23 1.41 p<.053 Eugene Agichtein, RuSSIR 2009, September 11-15, Petrozavodsk, Russia 20

Modeling User Behavior and Interactions M d li U B h i d I t ti - PowerPoint PPT Presentation

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 5: Search Interfaces + New Directions Eugene Agichtein Emory University 1 Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, Russia) Lecture 5 Plan 1.

Eugene Agichtein g g Emory University Eugene Agichtein RuSSIR 2009: Modeling User Behavior and

Lecture 3: Improving Ranking with Lecture 3: Improving Ranking with Behavior Data Eugene

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 2: Interpreting

RUN groupadd -r user && useradd -r -g user user USER user $ docker run --read-only debian

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 4: Search

BEHAVIOR @ HOME Behavior Basics Simple strategies that can make a big difference! Presented by

Identifying Web Spam Identifying Web Spam With User Behavior Analysis With User Behavior

Modeling of proteins and complexes High resolution Low resolution Modeling of domains Modeling

Virtual Reality Modeling Virtual Reality Modeling from http://www.okino.com/ Modeling Modeling

Annotator https://github.com/okfn/annotator User interactions Protocols Data models User

Molecular Modeling Used as a Molecular Modeling Used as a Probe of Interactions to Study the

APPLIED BEHAVIOR ANALYSIS Specialization Overview Agenda What is Applied Behavior Analysis

Molecular Modeling of Proteins O. Michielin, SIB/LICR Molecular Modeling of Proteins Lecture

Intermolecular interactions and scattering M.H.J. Koch 1 Intermolecular interactions

Part II: (S)RG and Low-Momentum Interactions To understand the properties of complex nuclei from

WGCV/WGISS interactions Ric har d MORE NO WGISS Chair CNE S WGCV / WGISS interactions

System Policy and Procedure Title: Meditech Patient Care Systems (PCS) & Number: SY-AD-009

development in Mozambique drgs Developing, Sustaining, and Managing a Competent Global Health

Europe: A burgeoning market for fracking How far is Europe from widespread fracking?...

Welcome to Samara! INDUSTRIAL PARKS PREOBRAZHENKA CHAPAEVSK 1 INDUSTRIAL PARK

Fagron: 2017 Results R AFAEL P ADILLA , CEO K ARIN DE J ONG , CFO 7 F EBRUARY 2018 Headlines 2017

Academ ic perspective Marie L De Bruin Utrecht Institute for Pharmaceutical Sciences 9th

P LANNED S ECTIONS OF US 19 Project Advisory Committee Meeting December 2, 2015 A J OINT E FFORT OF

Co-digestion of meat-processing by-products, manure and residual glycerin D. Hidalgo*, J. M.

Modeling User Behavior and Interactions M d li U B h i d I t ti - PowerPoint PPT Presentation

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 5: Search Interfaces + New Directions Eugene Agichtein Emory University 1 Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, Russia) Lecture 5 Plan 1.

Eugene Agichtein g g Emory University Eugene Agichtein RuSSIR 2009: Modeling User Behavior and

Lecture 3: Improving Ranking with Lecture 3: Improving Ranking with Behavior Data Eugene

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 2: Interpreting

RUN groupadd -r user &amp;&amp; useradd -r -g user user USER user $ docker run --read-only debian

Modeling User Behavior and Interactions M d li U B h i d I t ti Lecture 4: Search

BEHAVIOR @ HOME Behavior Basics Simple strategies that can make a big difference! Presented by

Identifying Web Spam Identifying Web Spam With User Behavior Analysis With User Behavior

Modeling of proteins and complexes High resolution Low resolution Modeling of domains Modeling

Virtual Reality Modeling Virtual Reality Modeling from http://www.okino.com/ Modeling Modeling

Annotator https://github.com/okfn/annotator User interactions Protocols Data models User

Molecular Modeling Used as a Molecular Modeling Used as a Probe of Interactions to Study the

APPLIED BEHAVIOR ANALYSIS Specialization Overview Agenda What is Applied Behavior Analysis

Molecular Modeling of Proteins O. Michielin, SIB/LICR Molecular Modeling of Proteins Lecture

Intermolecular interactions and scattering M.H.J. Koch 1 Intermolecular interactions

Part II: (S)RG and Low-Momentum Interactions To understand the properties of complex nuclei from

WGCV/WGISS interactions Ric har d MORE NO WGISS Chair CNE S WGCV / WGISS interactions

System Policy and Procedure Title: Meditech Patient Care Systems (PCS) &amp; Number: SY-AD-009

development in Mozambique drgs Developing, Sustaining, and Managing a Competent Global Health

Europe: A burgeoning market for fracking How far is Europe from widespread fracking?...

Welcome to Samara! INDUSTRIAL PARKS PREOBRAZHENKA CHAPAEVSK 1 INDUSTRIAL PARK

Fagron: 2017 Results R AFAEL P ADILLA , CEO K ARIN DE J ONG , CFO 7 F EBRUARY 2018 Headlines 2017

Academ ic perspective Marie L De Bruin Utrecht Institute for Pharmaceutical Sciences 9th

P LANNED S ECTIONS OF US 19 Project Advisory Committee Meeting December 2, 2015 A J OINT E FFORT OF

Co-digestion of meat-processing by-products, manure and residual glycerin D. Hidalgo*, J. M.

RUN groupadd -r user && useradd -r -g user user USER user $ docker run --read-only debian

System Policy and Procedure Title: Meditech Patient Care Systems (PCS) & Number: SY-AD-009