What CLIR researchers assume User is User needs Machine happy (or - PowerPoint PPT Presentation

iCLEF 2009 overview
Tags: image_search, multilinguality, interactivity, log_analysis, web2.0
Julio Gonzalo, Víctor Peinado, Paul Clough & Jussi Karlgren
CLEF 2009, Corfu

What CLIR researchers assume
• User needs information.
• Machine searches.
• User is happy (or not).

But finding is a matter of two
• Fast, stupid
• Smart, slow
Room for collaboration!

“Users screw things up”
• Can’t be reset
• Differences between systems disappear
• Differences between interactive systems too!
• Who needs QA systems when you have a search engine and a user?

But CLIR is different

Help!

iCLEF methodology: hypothesis-driven
• Hypothesis
• Reference & contrastive systems, topics, users
• Latin-square pairing between system / topic / user
• Features:
  • Hypothesis-based (vs. operational)
  • Controlled (vs. ecological)
  • Deductive (vs. inductive)
  • Sound
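The Latin-square pairing mentioned in the methodology slide can be illustrated with a minimal sketch. It assumes as many systems as topics and a number of users that is a multiple of that count; the function names and the sample user, system and topic labels are hypothetical and are not taken from the iCLEF guidelines.

```python
from typing import List, Tuple


def latin_square(n: int) -> List[List[int]]:
    """Return an n x n cyclic Latin square: cell (r, c) holds (r + c) % n."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def assign_sessions(
    users: List[str], systems: List[str], topics: List[str]
) -> List[Tuple[str, str, str]]:
    """Build (user, topic, system) triples from a Latin square.

    Assumption for this sketch: len(systems) == len(topics) == n and
    len(users) is a multiple of n. Each user then searches every topic
    once and uses every system once, and within each block of n users
    every topic is searched exactly once with every system.
    """
    n = len(systems)
    if len(topics) != n or len(users) % n != 0:
        raise ValueError("need equal numbers of systems and topics, users in blocks of n")
    square = latin_square(n)
    plan = []
    for u, user in enumerate(users):
        row = square[u % n]  # one Latin-square row per user, reused for each block
        for t, topic in enumerate(topics):
            plan.append((user, topic, systems[row[t]]))
    return plan


# Hypothetical labels, for illustration only.
plan = assign_sessions(["u1", "u2"], ["reference", "contrastive"], ["t1", "t2"])
for user, topic, system in plan:
    print(user, topic, system)
```

Rotating the rows balances system and topic effects across users, which is what lets a difference between the reference and the contrastive system be attributed to the systems rather than to a particular user or topic.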

iCLEF 2001-2005: tasks
On newswire:
• Cross-Language Document Selection
• Cross-Language query formulation and refinement
• Cross-Language Question Answering
On image archives:
• Cross-Language Image search

Practical outcome!

iCLEF 2001-2005: problems
• Unrealistic search scenario, user sample opportunistic
• Experimental design not cost-effective
• Only one aspect of CLIR at a time
• High cost of recruiting, training, observing users

Pick a document for “saffron”

Pick an illustration for “saffron”

Flickr

iCLEF 2006
Topics
• Ad hoc: find as many photographs of (different) European parliaments as possible.
• Creative: find five illustrations for this article about saffron in Italy.
• Visual: what is the name of the beach where this crab is lying?
Methodology
• Participants must propose their own methodology and experiment design.

Explored issues
User’s behaviour
• How do users deal with native / passive / unknown languages?
• Do they actually use CLIR facilities when available?
User’s perceptions
• Satisfaction (all tasks)
• Completeness (creative, ad-hoc)
• Quality (creative)
Search effectiveness
• How many facets were retrieved (creative, ad-hoc)
• Was the image found? (visual)

iCLEF 2008/2009
• Produce reusable dataset: search log analysis task
• Much larger set of users: online game

iCLEF 2008/2009: Log Analysis
Online game: see this image? Find it! (in any of six languages)
• Game interface features multilingual (ML) search assistance
• Users register with a language profile
Dataset: rich search log
• All search interactions
• Explicit success/failure
• Post-search questionnaires
Queries
• Easy to find with the appropriate tags (typically 3 tags)
• Hint mechanism (first target language, then tags)
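Because each session in the log carries an explicit success/failure flag and each user registers a language profile, one basic analysis is the success rate grouped by the user's skill in the target language. The sketch below assumes a hypothetical session record; the field names and the toy sessions are invented for illustration and do not reflect the actual iCLEF log format or results.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Session(NamedTuple):
    """One complete search session (hypothetical schema, not the real log format)."""
    user: str
    target_language: str
    skill: str  # user's skill in the target language: native, active, passive or unknown
    success: bool  # explicit success/failure recorded by the game
    hints_used: int


def success_rate_by_skill(sessions: Iterable[Session]) -> Dict[str, float]:
    """Return the fraction of successful sessions for each skill level."""
    found = defaultdict(int)
    total = defaultdict(int)
    for s in sessions:
        total[s.skill] += 1
        if s.success:
            found[s.skill] += 1
    return {skill: found[skill] / total[skill] for skill in total}


# Toy sessions, invented for this example.
toy_log = [
    Session("u1", "ES", "active", True, 0),
    Session("u1", "DE", "unknown", False, 2),
    Session("u2", "ES", "active", True, 1),
    Session("u2", "NL", "unknown", True, 3),
    Session("u3", "IT", "passive", False, 0),
    Session("u3", "DE", "unknown", False, 1),
    Session("u3", "NL", "unknown", False, 2),
]

for skill, rate in sorted(success_rate_by_skill(toy_log).items()):
    print(f"{skill}: {rate:.2f}")
```

The same grouping can be keyed on other logged fields, such as the number of hints used, to relate search behaviour to success.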

Simultaneous search in six languages

Boolean search with translations

Relevance feedback

Assisted query translation

User profiles

User rank (Hall of Fame)

Group rank

Hint mechanism

Language skills bias in 2008
[Charts: native languages (DE, EN, ES, FR, IT, NL, other); language skills in English (native, active, passive, unknown)]

Language skills bias in 2008
Target language was for the user…
[Pie chart: categories active, passive, unknown; values shown 31%, 55%, 14%]

Selection of topics (images)
• No English annotations (new for 2009)
• Not buried in search results
• Visual cues
• No named entities

Harvested logs
2008
• 312 users / 41 teams
• 5101 complete search sessions
• Linguistics students, photography fans, IR researchers from industry and academia, monitored groups, other
2009
• 130 users / 18 teams
• 2410 complete search sessions
• CS & linguistics students, photography fans, IR researchers from industry and academia, monitored groups, other

Language skills bias in 2009
Target language was for the user…
[Pie chart: categories active, passive, unknown; values shown 0%, 1%, 99%]

Log statistics

Distribution of users

[Charts: native languages; language skills; interface]

Language skills (II): Spanish, English

Language skills (III): Dutch, German

Language skills (and IV): Italian, French

Participants (I): log analysis
University of Alicante
• Goal: correlation between lexical ambiguity in queries and search success
• Methodology: analysis of full search log
UAIC
• Goal: correlations between several search parameters and search success
• Methodology: own set of users, search log analysis
UNED
• Goal: correlation between search strategies and search success
• Methodology: analysis of full search log
SICS
• Goal: study confidence and satisfaction from search logs
• Methodology: analysis of full search log

Participants (II): other strategies
Manchester Metropolitan University
• Goal: focus on users’ trust and confidence to reveal their perceptions of the task.
• Methodology: own set of users, own set of queries, training, observational study, retrospective thinking aloud, questionnaires.
University of North Texas
• Goal: understanding challenges when searching images that have multilingual annotations.
• Methodology: own set of users, training, questionnaires, interviews, observational analysis.

Discussion
• 2008+2009 logs = “iCLEF legacy”
  • 442 users with heterogeneous language skills
  • 7511 search sessions with questionnaires
• iCLEF has been a success in terms of providing insights into interactive CLIR
• … and a failure in terms of gaining adepts?

So long!

And now… the iCLEF Bender Awards
