How do users formulate their queries? A morpho-syntactic analysis


11th European Conference of Medical and Health Libraries. How do users formulate their queries? A morpho-syntactic analysis. Nicolas Ariste Fairon, Life sciences library, University of Liege, 4000 Liège, Belgium <nicolas.fairon@ulg.ac.be>


SLIDE 1

How do users formulate their queries? A morpho-syntactic analysis

Nicolas Ariste Fairon

Life sciences library, University of Liege, 4000 Liège, Belgium <nicolas.fairon@ulg.ac.be>

11th European Conference of Medical and Health Libraries

24th of June, Nicolas Fairon

SLIDE 2

Queries formulated in French natural language

Natural Language Processing: automatic extraction of concepts

Medline search strategy

French MeSH

SLIDE 3

The Facts

Despite the efforts made, many users remain unable to perform an efficient Medline search. Why?

Introduction – Material & Methods – Results - Conclusions

Poor query formulation

Poor knowledge of MeSH terms

Not enough practice

Problems with Boolean operators

SLIDE 4

What exists

Medline interfaces, with interesting features:

Query expansion: searching MeSH and keywords, automatic explosion...

Permuted index

MeSH translations

Elementary tools for natural language searching

SLIDE 5

Natural Language Approach

Analyzing the query to find relevant concepts

Medline interfaces

Natural language           Precision   Recall   Controlled language
Torticollis                83.7%       100%     Torticollis [MeSH]
Congenital torticollis     40.0%       90.0%    Torticollis/cn [MeSH]
Smoking adverse effects    4.2%        44.1%    Smoking/ae [MeSH]

(Diagram labels: complexity, efficiency)

SLIDE 6

What we want to do

SLIDE 7

Materials & Methods

Query submitted by user, then corrected, then semantically tagged

CORPUS: all queries

Analysis: descriptive, concepts extraction

Concepts extraction approaches: dictionary, local grammar, hybrid

Legend: manual steps, automatic steps

SLIDE 8

Collecting queries

Query submitted by user: Je cherche des articles sur le trétement du canser du sein.

Corrected: Je cherche des articles sur le traitement du cancer du sein. ("I am looking for articles on the treatment of breast cancer.")

Semantically tagged: Je cherche des articles sur le {w11s*traitement*} du {w21*cancer du sein*}.

Steps: correcting, then tagging.
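A minimal Python sketch of how the tagged form shown on this slide could be read back by a program. The tag shape `{CODE*text*}` is inferred from the single example above; the meaning of codes such as `w11s` and `w21` is not given in the slides, so they are treated as opaque labels. The function names are hypothetical.

```python
import re

# Assumed tag shape, inferred from the one example on the slide:
# {CODE*concept text*}. CODE (e.g. "w11s", "w21") is kept as an opaque label.
TAG_PATTERN = re.compile(r"\{(\w+)\*(.+?)\*\}")


def extract_tagged_concepts(tagged_query: str) -> list[tuple[str, str]]:
    """Return (tag code, concept text) pairs in order of appearance."""
    return TAG_PATTERN.findall(tagged_query)


def strip_tags(tagged_query: str) -> str:
    """Remove the tag markup, leaving the plain query text."""
    return TAG_PATTERN.sub(lambda m: m.group(2), tagged_query)


tagged = "Je cherche des articles sur le {w11s*traitement*} du {w21*cancer du sein*}."
concepts = extract_tagged_concepts(tagged)
plain = strip_tags(tagged)
print(concepts)
print(plain)
```

Stripping the tags gives back the corrected query, so the tagged version alone is enough to rebuild both the reference concept list and the untagged input.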

SLIDE 9

Manual tagging

To append semantic flags to useful concepts

To identify and keep track of every concept

To evaluate the efficiency of our application

SLIDE 10

The Corpus


A web application stores, for each query:

Raw, corrected, and tagged versions

Medline search history done by a scientific librarian

195 queries formulated by 68 different users

6,985 words
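A possible shape for one corpus entry, sketched in Python. The class and field names are hypothetical and the slides do not describe the actual storage schema. Counting words on the corrected version is also an assumption, since the slides do not say which version the word count refers to.

```python
from dataclasses import dataclass, field


@dataclass
class QueryRecord:
    """One corpus entry (hypothetical field names)."""
    user_id: str
    raw: str
    corrected: str
    tagged: str
    # Search steps recorded by the librarian; left empty in this sketch.
    search_history: list[str] = field(default_factory=list)


def corpus_summary(records: list[QueryRecord]) -> dict[str, int]:
    """Count queries, distinct users, and whitespace-separated words."""
    return {
        "queries": len(records),
        "users": len({r.user_id for r in records}),
        "words": sum(len(r.corrected.split()) for r in records),
    }


record = QueryRecord(
    user_id="u01",  # made-up identifier
    raw="Je cherche des articles sur le trétement du canser du sein.",
    corrected="Je cherche des articles sur le traitement du cancer du sein.",
    tagged="Je cherche des articles sur le {w11s*traitement*} du {w21*cancer du sein*}.",
)
summary = corpus_summary([record])
print(summary)
```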

SLIDE 11

Extracting concepts

Analysis: descriptive, concepts extraction

Concepts extraction approaches: dictionary, local grammar, hybrid

UNITEX resources: dictionaries (French MeSH, hand-made), local grammars

SLIDE 12

Evaluation of automatic extraction

Untagged queries from the corpus, through concepts extraction, give List A

Tagged queries from the corpus give the concepts of List B (reference)

Comparison of List A vs List B: recall and precision
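The comparison of List A with the reference List B can be sketched as a set comparison in Python. This assumes concepts are matched as exact strings; the slides do not say how partial matches were scored. The example lists are made up for illustration.

```python
def precision_recall(extracted: set[str], reference: set[str]) -> tuple[float, float]:
    """Precision = correct / extracted; recall = correct / reference."""
    correct = len(extracted & reference)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Illustrative data only, not taken from the corpus results.
list_a = {"traitement", "cancer"}          # automatic extraction output
list_b = {"traitement", "cancer du sein"}  # reference from the tagged query
precision, recall = precision_recall(list_a, list_b)
print(precision, recall)
```

Here only "traitement" appears in both lists, so one of the two extracted concepts is correct and one of the two reference concepts is found.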

SLIDE 13

Descriptive analysis

464 concepts have been identified

SLIDE 14

Concept extraction: dictionary approach

Applying the MeSH dictionary to queries in order to identify concepts.

[Chart: recall and precision (%) for MeSH terms, subheadings, and keywords]
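A dictionary lookup of this kind can be sketched in Python as a greedy longest-match scan over the query's words. This is not the UNITEX implementation. The dictionary entries are illustrative stand-ins rather than actual French MeSH content, and the function names and the `max_len` parameter are hypothetical.

```python
import re

# Illustrative stand-in entries; the real resource would be the French MeSH.
DICTIONARY = {"traitement", "cancer", "cancer du sein"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def dictionary_lookup(query: str, dictionary: set[str], max_len: int = 4) -> list[str]:
    """At each position, keep the longest word sequence found in the dictionary."""
    tokens = tokenize(query)
    found = []
    i = 0
    while i < len(tokens):
        step = 1
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in dictionary:
                found.append(candidate)
                step = n
                break
        i += step
    return found


query = "Je cherche des articles sur le traitement du cancer du sein."
matches = dictionary_lookup(query, DICTIONARY)
print(matches)
```

Trying the longest candidate first means "cancer du sein" is preferred over the shorter entry "cancer" when both are in the dictionary.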

SLIDE 15

Concept extraction: local grammar approach

Use recognition patterns relying on the morphology and syntax of queries.

[Chart: recall and precision (%) for MeSH terms, subheadings, and keywords]
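UNITEX local grammars are built as graphs, which are not reproduced in the slides. As a rough stand-in, the idea of a syntax-based recognition pattern can be sketched with Python regular expressions. Both patterns below are hypothetical and cover only the request frame of the example query from the corpus slide.

```python
import re

# Hypothetical frame: "... article(s) sur <determiner> <topic>".
REQUEST_FRAME = re.compile(
    r"\barticles?\s+sur\s+(?:(?:les|le|la)\s+|l')(?P<topic>[^.?!]+)",
    re.IGNORECASE,
)
# Hypothetical split of a topic into a head noun and its complement.
HEAD_COMPLEMENT = re.compile(
    r"^(?P<head>\w+)\s+(?:(?:du|de\s+la|des)\s+|de\s+l')(?P<complement>.+)$",
    re.IGNORECASE,
)


def extract_concepts(query: str) -> list[str]:
    """Locate the topic by its syntactic frame, then split head and complement."""
    frame = REQUEST_FRAME.search(query)
    if frame is None:
        return []
    topic = frame.group("topic").strip()
    parts = HEAD_COMPLEMENT.match(topic)
    if parts is None:
        return [topic]
    return [parts.group("head"), parts.group("complement")]


query = "Je cherche des articles sur le traitement du cancer du sein."
concepts = extract_concepts(query)
print(concepts)
```

No dictionary is consulted here: the concepts are delimited purely by where they sit in the sentence, which is why such patterns can also catch terms that no dictionary lists.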

SLIDE 16

Concept extraction: hybrid approach

Using local grammars combined with dictionaries

[Chart: recall and precision (%) for MeSH terms, subheadings, and keywords]
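One possible way to combine the two approaches, sketched in Python. The slides do not describe how the combination was actually done. In this sketch a hypothetical pattern delimits the topic of the query, a dictionary lookup runs inside that span, and leftover content words are kept as free keywords. The dictionary entry, pattern, and function-word list are all illustrative.

```python
import re

# Hypothetical frame standing in for a local grammar.
FRAME = re.compile(
    r"\barticles?\s+sur\s+(?:(?:les|le|la)\s+|l')(?P<topic>[^.?!]+)",
    re.IGNORECASE,
)
FUNCTION_WORDS = {"de", "du", "des", "le", "la", "les", "l", "d"}


def hybrid_extract(query: str, dictionary: set[str], max_len: int = 4) -> list[tuple[str, str]]:
    """Return (concept, source) pairs; source is 'dictionary' or 'keyword'."""
    match = FRAME.search(query)
    if match is None:
        return []
    tokens = re.findall(r"\w+", match.group("topic").lower())
    concepts = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in dictionary:
                concepts.append((candidate, "dictionary"))
                i += n
                break
        else:
            # No dictionary entry starts here: keep content words as keywords.
            if tokens[i] not in FUNCTION_WORDS:
                concepts.append((tokens[i], "keyword"))
            i += 1
    return concepts


query = "Je cherche des articles sur le traitement du cancer du sein."
result = hybrid_extract(query, {"cancer du sein"})  # illustrative one-entry dictionary
print(result)
```

With this split, a term missing from the dictionary ("traitement" in the example) is still recovered as a keyword instead of being lost.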

SLIDE 17

Conclusions

Creating a new interface based on natural language processing involves:

Concept mapping

Concept combination

The hybrid approach shows the best results (dictionaries combined with local grammars)

The quality of the dictionaries influences performance

SLIDE 18

What's next?

Disambiguation of fuzzy MeSH concepts

Combination of the concepts with adequate Boolean operators

Making the tool available to users as a web application

SLIDE 19

Thank you for your attention

nicolas.fairon@ulg.ac.be

Open source tools used for the work and the presentation :