IR in Context of the User: Interactive IR Evaluation
Peter Ingwersen Royal School of LIS Denmark pi@iva.dk – http://www.iva.dk/pi Oslo University College, Norway
Essir2011 1 Ingwersen
Research Frameworks vs. Models Central components of Interactive IR (IIR) The Integrated Cognitive Research Framework for IR
Short-term IR interaction experiments Sample study – Diane Kelly (2005/2007)
Interactive-light session-based IR studies Request types Test persons Design of task-based simulated search situations Relevance and evaluation measures in IIR Sample study – Pia Borlund (2000; 2003b)
Integrating context variables Live systems & (simulated) work tasks Sample Study – Marianne Lykke (Nielsen) (2001; 2004)
Frameworks describe:
Essential objects to study
The relationships of objects
The changes in the objects/relationships that affect the functioning of the system
Promising goals and methods
Frameworks contain (tacit) ontological, conceptual, factual, epistemological, and methodological
The concept model:
A precise (often formal) representation of objects and relationships (or processes) within a framework
Modeling may also in principle encompass human actors and
Frameworks may lead to Research Designs, incl. Research Questions; Experimental Setting; Methodology
[Figure: the laboratory IR framework, with the components Documents, Representation, Database, Search request, Representation, Query, Matching, Query result, Evaluation, Relevance assessment, Recall base and Pseudo RF]
[Figure: job-related work tasks and interests, and non-job-related tasks and interests (daily-life behavior), as context for information behaviour, seeking, and interactive IR behaviour]
Järvelin, fig. 2 (From: The Turn, p. 69)
[Figure: Perceived Task; Personal Seeking style; Organization; Personal Factors; Situational Factors; Information Need analysis; Choice of Action (alternatives); Implementation; Evaluation]
Evaluation outcomes:
a) needs satisfied, task may be completed
b) needs cannot be satisfied
c) further information is needed
[Figure: document selection decision model (From: The Turn, p. 201)]
Document
DIEs (Document Information Elements): Title; Author; Abstract; Journal; Series; Date; Type
Criteria: Topicality; Orientation; Quality; Novelty; Availability; Authority; Relation
Values (Document Values/Worth): Epistemic; Functional; Conditional; Social; Emotional
Decision Rules: Elimination; Multiple criteria; Dominance; Scarcity; Satisfice; Chain
Decision: Acceptance; Maybe; Rejection
Knowledge of: topic; person; journal; document type
Processing: combining; deciding
Laboratory experiments – no test persons, but
Simulations – Log analyses (not treated in presentation)
Laboratory study – with test persons: ‘Ultra-light’ (short-term interaction: 1-2 retrieval runs)
Field experiment – experimental (artificial) situation
Field study – study of natural performance or
Longitudinal studies
Case study – (qualitative) study with few test persons
Independent (the ‘cause’), e.g.,
Interface functionalities; Different IR models; Searcher knowledge
Dependent (the ‘effect’), e.g.,
Performance measures of output (recall/prec.; CumGain; usability)
Controlled (held constant; statistically neutralized; randomized):
Database; Information objects; Search algorithms; Simulated work task situations – Assigned TREC topics; Test persons
Hidden variables (Moderating or Intervening), e.g.,
Variation of test persons’ levels of experience …!!! – see the Integrated Research Framework for IR
[Figure: components of interactive IR in context. Information; IT (Engines, Logics, Algorithms); Interface; Cognitive Actor(s) (team); Org., Cultural and Social Context. Arrow legend: cognitive transformation and influence; interactive communication of cognitive structures (Interaction); cognitive transformation and influence over time. Label: The Lab. Framework]
Dimensions 1-9, each containing multiple variables:
Socio-org. task dimensions; Actor dimensions; Algorithmic dimensions
Variables with values:
Natural Work Tasks (WT) & Org: WT Structure; WT Strategies & Practices; WT Granularity, Size & Complexity; WT Dependencies; WT Requirements; WT Domain & Context
Natural Search Tasks (ST): ST Structure; ST Strategies & Practices; ST Granularity, Size & Complexity; ST Dependencies; ST Requirements; ST Domain & Context
Actor: Domain Knowledge; IS&R Knowledge; Experience on Work Task; Experience on Search Task; Stage in Work Task Execution; Perception of Socio-Org. Context; Sources of Difficulty; Motivation & Emotional State
Perceived Work Tasks: Perceived WT Structure; Perceived WT Strategies & Practices; Perceived WT Granularity, Size & Complexity; Perceived WT Dependencies; Perceived WT Requirements; Perceived WT Domain & Context
Perceived Search Tasks: Perceived Information Need Content; Perceived ST Structure/Type; Perceived ST Strategies & Practices; Perceived ST Specificity & Complexity; Perceived ST Dependencies; Perceived ST Stability; Perceived ST Domain & Context
Document and Source: Document Structure; Document Types; Document Genres; Information Type in Document; Communication Function; Temporal Aspects; Document Sign Language; Layout and Style; Document Isness; Contextual Hyperlink Structure; Human Source (see Actor)
IR Engines (IT Component): Exact Match Models; Best Match Models; Degree of Doc. Structure and Content Used; Use of NLP to Document Indexing; Representation; Use of Weights in Doc. Indexing; Degree of Req. Structure and Content Used; Use of NLP to Request Indexing; Representation; Use of Weights in Requests
IR Interfaces: Domain Model Attributes; System Model Features; User Model Features; System Model Adaption; User Model Building; Request Model Builder; Retrieval Strategy; Response Generation; Feedback Generation; Mapping ST History; Explanation Features; Transformation of Messages; Scheduler
Access and Interaction: Interaction Duration; Actors or Components; Kind of Interaction and Access; Strategies and Tactics; Purpose of Human Communication; Purpose of System Communication; Interaction Mode; Least effort Factors
[Figure: the laboratory IR framework (Documents, Representation, Database, Search request, Representation, Query, Matching, Query result, Evaluation, Relevance assessment, Recall base), with the constraint: Max. ONE Relevance Feedback Run Allowed]
That is why this setting is ‘interactive ultra-light’
Graded relevance assessments possible
Can be used OUTSIDE traditional test collections!
The setting is limited in realism (only 2 runs)
[Figure labels: Pseudo RF; Work task perception; Search task perception; Actor need]
Kelly, D., Dollu, V.D. & Xin Fu (2005). The loquacious user:
Strength:
Easy to apply existing test collections, with …
Relevance assessments existing a priori (as in TREC or INEX)
New relevance assessments possible – with graded assessments
Can lead to more solid interactive investigations in later studies
Weakness:
Are all variable values known?? (people means hidden ones!)
‘Ultra-light’ IIR is limited in realism (1-2 iterations; context
Limited number of documents assessed (per test person)
[Figure: the laboratory IR framework (Documents, Representation, Database, Search request, Representation, Query, Matching, Query result, Evaluation, Relevance assessment, Recall base) placed in Context, with the conditions: MANY Relevance Feedback Runs Allowed; Searcher MUST DO Posteriori Relevance Assessments]
Observation
Thinking (talking) aloud – Introspection
Eye-tracking
Critical incident
Questionnaires
Interviews (structured; open-ended; closed); pre- and/or post-interviews
Focus groups
Diaries – Self reporting
Logging and recording of behavior (system/client logs)
Assessments of relevance
Natural request/ real
Assigned to test person ‘Query’ is the retrieval
Topical (as TREC ‘topics’) Factual ‘Known Item’ Other metadata Simulated Task Situation
‘Sample’ as request Simplistic request formulation
Number depends on goal of research & no. of
Behavioral field study/experiment: many persons
Performance-like field experiment: many search
Note: Sanderson et al. paper: IIIX 2005 on no. of topics
The best design: always > 25 persons
You have 2 x 10 test persons (doctors & med. stud.).
They need to do 12 search jobs per person = 120
or 2 x 20 persons doing 6 jobs each.
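The trade-off between persons and jobs per person can be checked with a small calculation. The sketch below is illustrative only: the function name `search_jobs` is hypothetical, and reading the slide's "= 120" as the number of search jobs per group (10 persons x 12 jobs) is an assumption, since the original line is cut off.

```python
def search_jobs(groups: int, persons_per_group: int, jobs_per_person: int) -> dict:
    """Count test persons and search jobs for a between-groups design."""
    jobs_per_group = persons_per_group * jobs_per_person
    return {
        "persons": groups * persons_per_group,
        "jobs_per_group": jobs_per_group,
        "jobs_total": groups * jobs_per_group,
    }

# Design from the slide: 2 groups of 10 persons, 12 search jobs each.
design_a = search_jobs(groups=2, persons_per_group=10, jobs_per_person=12)
# Alternative from the slide: 2 groups of 20 persons, 6 search jobs each.
design_b = search_jobs(groups=2, persons_per_group=20, jobs_per_person=6)

print(design_a)
print(design_b)
```

Under this reading both designs yield the same number of search jobs per group; the second spreads them over more persons, so each person carries a lighter load.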
Simulated situation: sim A
Simulated work task situation: After your graduation you will be looking for a job in industry. You want information to help you focus your future job
information about employment patterns in industry and what kind of qualifications employers will be looking for from future employees.
Indicative request: Find for instance something about future employment trends in industry, i.e. areas of growth and decline.
[Figure: evaluation in nested contexts. IR components: Docs, Repr, DB, Request, Query, Match, Repr, Result. Seeking: Seeking Task, Seeking Process, Seeking Result. Work: Work Task, Work Process, Task Result. Contexts: IR context; Seeking context; Work task context; Socio-organizational & cultural context]
Evaluation Criteria:
A: Recall, precision, efficiency, (quality of information/process)
B: Usability, Graded rel., CumGain; Quality of information/process
C: Quality of info & work task result; Graded R.
D: Socio-cognitive relevance; Social utility: rating; citations; inlinks;
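Some of the criteria under A and B can be computed directly from a ranked result list and a set of graded relevance assessments. The following is a minimal sketch, not the evaluation code of any study cited here: the document ids, the 0-3 grading scale and the threshold "grade >= 1 counts as relevant" are all hypothetical choices for illustration. Cumulated gain is taken as the running sum of the gain values down the ranking.

```python
from itertools import accumulate

def precision_recall(ranked_ids, relevant_ids):
    """Set-based precision and recall of a retrieved list."""
    retrieved = set(ranked_ids)
    relevant = set(relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def cumulated_gain(gains):
    """Running sum of graded relevance scores down the ranking."""
    return list(accumulate(gains))

# Hypothetical graded assessments (0 = not relevant ... 3 = highly relevant).
grades = {"d1": 3, "d2": 0, "d3": 2, "d4": 1, "d7": 3}
ranking = ["d1", "d2", "d3", "d4"]  # hypothetical system output

gains = [grades.get(doc, 0) for doc in ranking]
cg = cumulated_gain(gains)

# Binary view for precision/recall: any grade >= 1 counts as relevant (assumption).
relevant = {doc for doc, grade in grades.items() if grade >= 1}
precision, recall = precision_recall(ranking, relevant)

print(cg, precision, recall)
```

Graded assessments feed the cumulated gain vector directly, whereas precision and recall need a cut-off that turns grades into a binary judgement; where that cut-off is placed is a design decision of the experiment.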
Realistic assessment behaviour
Indication of users’ subjective impression of system performance
Nielsen, 2006)
Other measurements to be used on Interaction Process:
Display time; No. of requests/queries; Visits & Downloads; Selection patterns; Views & clicks; Social utility assessments; No. of documents assessed; Perceived ease of process; …
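Several of these process measures can be derived from a client or system log. The sketch below assumes a hypothetical log format of (seconds since session start, action, target) tuples, and it approximates display time as the time from opening a document until the next logged event; neither the format nor that definition comes from the presentation.

```python
from collections import Counter

# Hypothetical client log for one search session.
log = [
    (0, "query", "employment trends"),
    (5, "view", "d1"),
    (35, "view", "d3"),
    (50, "query", "industry growth decline"),
    (58, "view", "d7"),
    (90, "end", None),
]

def session_measures(events):
    """Count queries and views, and sum display time per viewed document."""
    counts = Counter(action for _, action, _ in events)
    display_time = {}
    # Pair each event with its successor to get the time spent on it.
    for (start, action, target), (next_start, _, _) in zip(events, events[1:]):
        if action == "view":
            display_time[target] = display_time.get(target, 0) + (next_start - start)
    return {
        "queries": counts["query"],
        "views": counts["view"],
        "display_time": display_time,
    }

measures = session_measures(log)
print(measures)
```

Measures such as perceived ease of process or social utility assessments cannot be read from a log and would still need questionnaires or interviews.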
Independent Variables
[Table of task and actor dimensions repeated on this slide: Natural Work Tasks (WT) & Org; Natural Search Tasks (ST); Actor; Perceived Work Tasks; Perceived Search Tasks]
Controlled Variables
[Table of document and system dimensions repeated on this slide: Document and Source; IR Engines (IT Component); IR Interfaces; Access and Interaction]
Marianne L. Nielsen (2004): Task-based evaluation
Research setting: Danish Pharmaceutical Company
Goal: To observe if a company thesaurus (ontology)
Nielsen, M.L. (2001). A framework for work task based
Made from several association tests with 35 employees
This thesaurus was larger in number of entries (379
20 test persons from the basic and clinical researchers,
3 simulated search task situations (next slide) per
“Blind testing” of the two thesaurus types: test
Natural Work/Search Task Org. setting Perceived Work Task Structure; Complexity (high) Perceived Information Need Database; Retrieval Engine; Interface
[Figure labels: Metadata Struc.; Work task perception; Search task perception; Actor Char.]
Finding synonyms and /or more specific terms Clarifying meaning (in task perspective) of terms
Basic researchers (exploring new drugs) Clinical researchers (clinical drug tests) This also concerns the satisfaction of the use of
In pure ‘laboratory experiments’ only simulations of
If one wishes to stick to existing test collections, with
Requires short-term IR interaction; In the form of ‘laboratory studies’. Number of test persons, search jobs and research setting follow
IR interaction ‘light’ entails session-based IR, with test
Can be carried out as laboratory study or field experiment
Like in ‘ultra-light’ and ‘naturalistic’ IR, number of test
IR interaction ‘light’: assigned realistic requests,
Naturalistic IR interaction assumes natural tasks