

SLIDE 1

Coping with Noisy Search Experiences

Pierre-Antoine Champin, Peter Briggs, Maurice Coyle, Barry Smyth

LIRIS, Université de Lyon, France / Clarity, University College Dublin, Ireland

16 December 2009

SLIDE 2

Structure of the Talk

1. Context
2. Addressed Problem
3. Proposals and results
4. Perspectives

SLIDE 3

Structure of the Talk

1. Context
   - Recommender Systems
   - Context Aware Recommendation
   - Social Search
2. Addressed Problem
3. Proposals and results
4. Perspectives

SLIDE 4

HeyStaks

HeyStaks is a social, context-aware recommender system for web searches.

SLIDE 5

Recommender Systems

Recommender systems aim at presenting users with information that suits their particular preferences or needs. General-purpose search engines provide results based on an objective measure of relevance w.r.t. the query → the same results for everyone. Recommender systems for web search aim at personalising the results of search engines.

SLIDE 7

HeyStaks as a Recommender System

HeyStaks is an extension for Firefox. It aims at integrating into users' habits rather than forcing them to change. It recognizes result pages from popular search engines, and alters them in order to:

- promote links (move them up in the list),
- insert new links,

based on the user's past search experiences. Past search experiences are acquired by:

- implicit feedback: queries, result click-through
- explicit feedback: tagging pages, voting, sharing

SLIDE 8

Recommendations in HeyStaks

SLIDE 10

Context Aware Recommendation

Search engines provide the same results for every user. Recommender systems provide personalised results... but they provide the same personalisation every time. Nobody is only one user: their needs depend on the context, especially when considering Web searches.

My searches are sometimes related to my research, my teaching, my leisure... → hence the need for different recommendations in different contexts.

SLIDE 11

Search Staks

A search stak is a repository of search experiences, all related to the same context. Users can create as many staks as they need, and manually select the active stak (the current context). The active stak is where search experiences are collected, and where they are tapped to provide recommendations.
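The stak mechanics described above can be sketched as a minimal data model. This is an illustrative sketch only: all class and field names are hypothetical, not HeyStaks' actual implementation.

```python
from collections import defaultdict

class Stak:
    """A repository of search experiences for one context (hypothetical sketch)."""
    def __init__(self, name, public=False):
        self.name = name
        self.public = public      # public staks: anyone can join; private: invite only
        # page URL -> accumulated evidence (implicit clicks, explicit tags/votes)
        self.experiences = defaultdict(lambda: {"clicks": 0, "tags": set(), "votes": 0})

    def record_click(self, url):            # implicit feedback
        self.experiences[url]["clicks"] += 1

    def record_tag(self, url, tag):         # explicit feedback
        self.experiences[url]["tags"].add(tag)

class HeyStaksUser:
    """Users create staks and manually select the active one (current context)."""
    def __init__(self):
        self.staks = {}
        self.active = None

    def create_stak(self, name, public=False):
        self.staks[name] = Stak(name, public)
        return self.staks[name]

    def select_stak(self, name):
        # All subsequent search experiences are filed in the active stak,
        # which is precisely why forgetting this step introduces noise.
        self.active = self.staks[name]
```

The manual `select_stak` step is the weak point the rest of the talk addresses: experiences recorded while the wrong stak is active end up in the wrong repository.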

SLIDE 12

Social Search

Social search is the process of sharing search experiences between like-minded Web searchers. In HeyStaks, social search happens through shared staks: several users can contribute to, and receive recommendations from, the same stak. Staks can be private (only people I invite can join) or public (anyone can join).

SLIDE 13

HeyStaks Portal

SLIDE 14

Structure of the Talk

1. Context
2. Addressed Problem
3. Proposals and results
4. Perspectives

SLIDE 15

The Problem of Stak Selection

Users fail to select the appropriate stak before starting a search. The recommendations they get are then less relevant, and their search experiences are filed in the wrong stak. → HeyStaks ends up with noisy experience repositories, and provides less accurate recommendations (even when the correct stak is selected).

SLIDE 16

Implemented Workarounds

Fall back to the default stak when idle:
- limits the input noise
- potentially reduces context awareness

Signal when other staks provide recommendations:
- improves the relevance of recommendations, despite a wrong active stak
- encourages users to select the right stak

SLIDE 18

Other Possible Solutions

Automatically select the right stak at query time:
- almost impossible if based on the sole query terms
- hazardous if based on the available recommendations
- technically complicated if based on external indicators (time-tracking tools, browsing history...)

Help stak owners to maintain (or curate) their staks:
- use classification techniques
- recommend the correct stak for a page
- pages are easier to classify than queries
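The "recommend the correct stak for a page" idea can be illustrated with a toy classifier. This is a sketch only: it scores a page's terms against each stak with a smoothed multinomial naive-Bayes estimate, whereas the talk's actual experiments use Weka models; all names and the data layout are made up for illustration.

```python
from collections import Counter
import math

def recommend_stak(page_terms, staks):
    """Recommend the stak whose term distribution best matches the page.

    `staks` maps stak name -> list of term lists (one per page already filed).
    Scores use log-probabilities with add-one smoothing over the global vocabulary.
    """
    vocab = {t for pages in staks.values() for terms in pages for t in terms}
    best, best_score = None, -math.inf
    for name, pages in staks.items():
        counts = Counter(t for terms in pages for t in terms)
        total = sum(counts.values())
        # Sum of smoothed log-likelihoods of the page's terms under this stak
        score = sum(math.log((counts[t] + 1) / (total + len(vocab)))
                    for t in page_terms)
        if score > best_score:
            best, best_score = name, score
    return best
```

Classifying a page this way is plausible because, as the slide notes, pages carry far more text than the few terms of a query.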

SLIDE 20

Training the Page Classifier

Most work on coping with noise in recommender systems assumes that a clean training set is available before noise is encountered. We need to find the kernel of each stak: the set (or a subset of) the pages actually relevant to that stak.

How can we find a reliable kernel? How can we evaluate its reliability?

SLIDE 21

Structure of the Talk

1. Context
2. Addressed Problem
3. Proposals and results
   - Clustering
   - Popularity weighting
   - Popularity-based kernel
4. Perspectives

SLIDE 22

Clustering

Idea: use clustering techniques to identify a candidate kernel.
Rationale: kernel pages must be somehow similar, while noisy pages will be heterogeneous.
Problem: huge variability depending on numerous parameters:
- comparing terms or pages
- different similarity measures
- different clustering algorithms
- threshold values
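As an illustration of the clustering idea (not the authors' actual method), the following sketch links pages whose term sets are Jaccard-similar above a threshold and returns the largest connected group as the candidate kernel. The similarity measure, the single-link algorithm and the threshold value are exactly the free parameters the slide warns about.

```python
def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def candidate_kernel(pages, threshold=0.3):
    """Single-link clustering of pages by term-set similarity (illustrative only).

    `pages` maps URL -> set of index terms.  Pages whose pairwise Jaccard
    similarity exceeds `threshold` are linked; the largest group is returned
    as the candidate kernel, on the rationale that kernel pages cluster
    together while noisy pages stay isolated.
    """
    urls = list(pages)
    parent = {u: u for u in urls}

    def find(u):                     # union-find with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for i, u in enumerate(urls):
        for v in urls[i + 1:]:
            if jaccard(pages[u], pages[v]) > threshold:
                parent[find(u)] = find(v)

    clusters = {}
    for u in urls:
        clusters.setdefault(find(u), []).append(u)
    return max(clusters.values(), key=len)
```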

SLIDE 23

Popularity Weighting

Idea: use a measure of the popularity of pages as a proxy for relevance, in order to provide a fuzzy kernel.
Rationale: kernel pages are repeatedly selected, while noisy pages will only be accidentally selected.
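One plausible way to realize this idea, assuming popularity is simply the number of times a page was selected within the stak (the actual measure was presented graphically on the following slides and may differ):

```python
def normalized_popularity(selections):
    """Fuzzy kernel membership from selection counts (assumed measure).

    `selections` maps page URL -> number of times the page was selected
    (clicked, tagged, voted...) within the stak.  Normalizing by the
    most-selected page gives each page a weight in (0, 1]: repeatedly
    selected kernel pages score high, accidentally selected noise scores low.
    """
    top = max(selections.values())
    return {url: n / top for url, n in selections.items()}
```

This yields a weight per page rather than a hard in/out decision, which is what "fuzzy kernel" suggests.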

SLIDE 24

Popularity Measure

[Four build slides presenting the popularity measure graphically; content not captured in the transcript]

SLIDE 28

User Evaluation

Poll: for each of the 20 biggest shared staks, the 15 most popular and the 15 least popular pages were presented in random order to the stak owner, who was asked whether each page is relevant to the stak.

SLIDE 29

Poll Results

[Bar chart: number of documents (10 to 100) per popularity bin (0.00 to 0.90), each bar split into Relevant / I don't know / Irrelevant]

SLIDE 30

Experiment

Classifier: decision tree / naive Bayes
- for each of the 20 biggest shared staks
- trained with every page, weighted by its normalized popularity
- 10-fold cross validation

Accuracy: each page contributes to the accuracy proportionally to its normalized popularity → it is more important for the classifier to recognize popular pages than unpopular ones.
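The popularity-weighted accuracy described above can be written down directly; `predictions`, `truth` and `weights` are hypothetical names for the evaluation data.

```python
def weighted_accuracy(predictions, truth, weights):
    """Accuracy where each page counts proportionally to its popularity.

    `predictions` and `truth` map page URL -> class label; `weights` maps
    page URL -> normalized popularity.  Misclassifying a popular
    (likely-kernel) page costs more than misclassifying an unpopular
    (likely-noise) page.
    """
    total = sum(weights.values())
    correct = sum(w for url, w in weights.items()
                  if predictions[url] == truth[url])
    return correct / total
```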

SLIDE 32

Experimental Results

[Bar chart: weighted accuracy (0.1 to 0.8) of J48, NaiveBayes and ZeroR, comparing popularity-weighted training with boolean unweighted training]

SLIDE 33

Validity of the Measure

[Plot: weighted accuracy (0.1 to 1) as a function of minimum normalized popularity (0.1 to 1), for J48, NaiveBayes and ZeroR]

SLIDE 34

Kernel-Trained Classifier

[Bar chart: weighted accuracy (0.1 to 0.9) for np > 0.6, comparing whole-trained and kernel-trained classifiers for J48, NaiveBayes and ZeroR]
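The kernel-training setup can be sketched as a simple filter on normalized popularity, with the np > 0.6 threshold taken from the accuracy-vs-threshold plot on the previous slide; the data layout here is assumed, not the authors'.

```python
def kernel_training_set(pages, popularity, threshold=0.6):
    """Keep only pages whose normalized popularity exceeds the threshold.

    `pages` maps URL -> feature representation; `popularity` maps
    URL -> normalized popularity.  Training a classifier on this
    high-popularity kernel, instead of the whole noisy stak, is the
    kernel-trained setting compared on this slide.
    """
    return {url: feats for url, feats in pages.items()
            if popularity[url] > threshold}
```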

SLIDE 35

Structure of the Talk

1. Context
2. Addressed Problem
3. Proposals and results
4. Perspectives

SLIDE 36

Further Work

Integrate the classifier into the stak maintenance page:
- Which user interface?
- How to integrate user feedback into the classifier?

Use the classifier to recommend a stak at query time:
- First experiments are not satisfying.

SLIDE 37

Perspectives

Beyond the specific application domain of this work...

Coping with noisy knowledge bases:
- relevant for experience repositories
- relevant for large-scale (Web-based) systems

Using knowledge for a different purpose:
- What other purposes can a knowledge base serve?
- What properties of the knowledge base make it easy or hard to reuse?

Experience is the name everyone gives to their mistakes —Oscar Wilde
