Automatic Analysis of Author Judgment in Scientific Articles Based on Semantic Annotation


SLIDE 1

Automatic Analysis of Author Judgment in Scientific Articles Based on Semantic Annotation

Marc Bertin, Iana Atanassova and Jean-Pierre Desclés

Paris-Sorbonne University, LaLIC Laboratory

20 May 2009 FLAIRS 2009

SLIDE 2

Introduction Implementation Demonstration Evaluation Conclusion

Outline

1. Introduction: Problem, Semantic Annotation, Strategy
2. Implementation: Text processing, Corpus, Semantic Map
3. Demonstration
4. Evaluation: Methodology, Results
5. Conclusion

Marc Bertin, Iana Atanassova and Jean-Pierre Desclés Automatic Analysis of Author Judgment

SLIDE 4

1. How can we use the bibliographic citations of authors in texts?
2. Our aim is to identify relations between authors, and more precisely the nature of these relations, through automatic semantic annotation of citations.
3. There are various motivations to cite, and citations serve many functions. Citation is a complex phenomenon.
4. Citation analysis requires linguistic approaches for the categorization of the relations between authors.

SLIDE 8

1. Localization of relevant text segments:
   - Segmentation into sentences
   - Identification of indexed references in the sentences, using finite state automata
2. Localization of the textual segments most likely to contain the author's judgment of another author:
   - Hypothesis: author judgments are located in the textual space close to an indexed reference.
   - Segments containing references potentially carry information on the type of relation between the authors.
3. Automatic semantic annotation: the Contextual Exploration Method (see Desclés 2006)
4. Exploitation of the annotation and text navigation
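
The first step of this strategy can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the regular expressions, the naive sentence splitter and every function name below are hypothetical, and the patterns only cover reference forms that appear in these slides (numeric `[33]`, coded `[AUT-09]`, and author-year forms such as `(Walker et al 1991)` or `Rivers et al. (1997)`). Regular expressions are used because they compile to finite state automata.

```python
import re

# Hypothetical patterns for indexed references (assumptions, not the
# authors' automata). Each alternative covers one surface form.
NUMERIC = r"\[\d+(?:\s*,\s*\d+)*\]"                    # [33], [20, 21]
CODED = r"\[[A-Z]{2,}-\d{2}\]"                         # [AUT-09]
NAME = r"[A-Z][A-Za-z\-]+"
AUTHORS = rf"{NAME}(?:\s+(?:&|and)\s+{NAME}|\s+et\s+al\.?)?"
YEAR = r"(?:19|20)\d{2}[a-z]?"
BRACKETED = rf"[\(\[]{AUTHORS},?\s+{YEAR}[\)\]]"       # (Walker et al 1991), [Author, 2009]
AUTHOR_THEN_YEAR = rf"{AUTHORS}\s+\({YEAR}\)"          # Rivers et al. (1997)

REFERENCE = re.compile("|".join([NUMERIC, CODED, BRACKETED, AUTHOR_THEN_YEAR]))


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when a capital letter follows."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]


def find_reference_segments(text):
    """Return (sentence, [references]) pairs for sentences that contain a reference."""
    segments = []
    for sentence in split_sentences(text):
        refs = [m.group(0) for m in REFERENCE.finditer(sentence)]
        if refs:
            segments.append((sentence, refs))
    return segments


if __name__ == "__main__":
    sample = (
        "Extraction of DNA from filter samples followed a modification of a "
        "method employed by Fuhrman et al. [33] as described by Abell and "
        "Bowman [20]. The samples were then frozen. Measurements of the Pos "
        "of PBM vesicles by Niemietz & Tyerman (2000) yielded values that "
        "were lower than those measured by Rivers et al. (1997)."
    )
    for sentence, refs in find_reference_segments(sample):
        print(refs, "<-", sentence[:40] + "...")
```

Only sentences that contain at least one match are kept, which mirrors the hypothesis that judgments sit close to an indexed reference. The middle sentence of the sample is filler added for the illustration and is dropped because it cites nothing.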

SLIDE 14

Linguistic approach: the Contextual Exploration Method

Semantic annotation tool: the EXCOM (Multilingual Contextual Exploration) system, developed by the LaLIC Laboratory

The major objective of the EXCOM system is to explore the semantics of texts in order to enhance information extraction and retrieval through automatic annotation of semantic relations.

SLIDE 15

According to our protocol, we consider texts containing indexed references and a bibliography, such as scientific articles, reports, PhD dissertations and other publications.

Our corpora are in French, in the domains of Social Sciences and Humanities.

SLIDE 17

Corpus        Language  Coverage      Format
CALS          fr        33 texts      pdf
LaLIC         fr        8 texts       doc/pdf
ALSIC         fr/eng    1998-2007     pdf/html
TALN          fr/eng    1999-2005     pdf
Intellectica  fr/eng    1991-2002     pdf
IRISA         fr/eng    1984-2006     pdf/ps
PhD Theses    fr        6 PhD theses  pdf

SLIDE 25

Semantic annotation: examples (1)

Result:

"Observations on individual MTs in a microscopic flow cell (Walker et al 1991) showed that polymerizing MTs transit very rapidly (within 14 s) upon dilution, suggesting that the cap size is fairly small, less than 100 subunits."

"Measurements of the Pos of PBM vesicles by Niemietz & Tyerman (2000) yielded values that were lower than those measured by Rivers et al. (1997)."

Method:

"Extraction of DNA from filter samples followed a modification of a method employed by Fuhrman et al. [33] as described by Abell and Bowman [20]."

"This model was based on early observations of a relatively long kinetic lag between tubulin polymerization and GTP hydrolysis (Carlier & Pantaloni 1981)."

SLIDE 31

Semantic annotation: examples (2)

Information:

"It was surmised that this was due to bacterial cells dispersing from particles as the particles decompose and sink, a phenomenon originally proposed by Azam [46]."

Similarity:

"Samuels et al. (35,36) reported similar results with the powdery mildew-cucumber pathosystem, suggesting that insoluble Si deposition is a common phenomenon in both dicots and monocots."

Hypothesis:

"This structure was originally postulated to be a cap of GTP-tubulin (Mitchison & Kirschner 1984a)."
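
The categories illustrated in these examples (Result, Method, Information, Similarity, Hypothesis) suggest how an annotation rule might work: once a sentence is known to contain a reference (the indicator), linguistic clues in the same sentence select the category. The Python sketch below is only a toy illustration of that idea and not the EXCOM rule base: the clue lists are hypothetical, taken solely from the wording of the example sentences, and the indicator pattern is a deliberately crude stand-in for real reference detection.

```python
import re

# Hypothetical clue expressions per category (assumption: drawn only from
# the example sentences on the slides; a real rule base would be far larger).
CLUES = {
    "Result": [r"\bshowed that\b", r"\byielded\b"],
    "Method": [r"\bmethod employed by\b", r"\bwas based on\b"],
    "Information": [r"\boriginally proposed by\b"],
    "Similarity": [r"\bsimilar results\b"],
    "Hypothesis": [r"\bpostulated\b"],
}

# Crude indicator: a bracketed number, parenthesised numbers, or a year
# closing a parenthesis. This is an assumption for the sketch only.
INDICATOR = re.compile(r"\[\d+\]|\(\d+(?:\s*,\s*\d+)*\)|(?:19|20)\d{2}[a-z]?\)")


def annotate(sentence):
    """Return the categories whose clues occur in a sentence that cites something."""
    if not INDICATOR.search(sentence):
        return []  # no indicator, so the sentence is not annotated
    return [
        label
        for label, patterns in CLUES.items()
        if any(re.search(p, sentence, re.IGNORECASE) for p in patterns)
    ]


if __name__ == "__main__":
    example = ("This structure was originally postulated to be a cap of "
               "GTP-tubulin (Mitchison & Kirschner 1984a).")
    print(annotate(example))
```

Requiring the indicator first keeps the clue search local to citing sentences; the same clue word in a sentence without a reference produces no annotation.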

SLIDE 33

Information retrieval

Bibliosemantics

Categorization: PhD theses

SLIDE 36

Evaluations

First evaluation: measuring the accuracy of the retained indicators (the indexed references) identified automatically by the finite state automata.

Second evaluation: holding an agreement session between human judges in order to measure their rate of agreement with the Kappa coefficient.

SLIDE 37

Evaluation 1: Precision and Recall measures

Measuring the capacity of the system to correctly identify the textual segments containing indicators:

Results, taking into consideration only the indexed references in the texts:

Recall   Precision
91.09%   98.91%

Results, also taking into consideration the named entities in the corpus:

Recall   Precision
67.15%   98.91%
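
For reference, precision is the share of retrieved segments that are correct, and recall is the share of expected segments that were retrieved. The short Python sketch below shows the two measures; the counts in it are round hypothetical numbers chosen for the illustration, not the counts behind the figures reported on the slide. It also shows one reading that is consistent with those figures: if the set of expected segments grows (for instance by also counting named entities) while the system's output stays the same, recall falls and precision is unchanged.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw counts; 0.0 when a denominator is empty."""
    retrieved = true_positives + false_positives
    relevant = true_positives + false_negatives
    precision = true_positives / retrieved if retrieved else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts: 90 correct detections, 10 wrong, 10 missed.
    print(precision_recall(90, 10, 10))
    # Same system output, but 50 more expected items are now counted as missed.
    print(precision_recall(90, 10, 60))
```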

SLIDE 38

[Figure: forms of reference ordered by growing complexity of identification: [1], [AUT-09], [Author, 2009], Einstein, Nature. Labels recovered from the figure: Indexed reference, Named entity; Identification: Regular expression, Named entity identification; Norms ISO 690 and ISO 690-2; Epistemology: Frontier knowledge, Core knowledge; Out of context: None, Researcher, Researcher, General comprehension from the domain.]

SLIDE 39

Evaluation 2: Kappa

Cohen's weighted Kappa coefficient (Cohen 1960) provides a method to measure numerically the agreement between two or more observers or methods when the judgments are qualitative in nature.

                   Judge B
Judge A     Correct  Incorrect  Total
Correct          77         10     87
Incorrect         6          7     13
Total            83         17    100

K = 0.83
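
As a worked illustration, the unweighted form of Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the diagonal of the table divided by the total) and p_e is the agreement expected by chance from the row and column totals. The Python sketch below applies that formula to the four cell counts of the table; the function name is hypothetical, and the choice of the unweighted formula is an assumption, since the slide does not show how its value was computed.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square contingency table given as rows."""
    total = sum(sum(row) for row in table)
    # Observed agreement: cases on the diagonal, where both judges agree.
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Chance agreement: product of the marginal proportions, summed per category.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    judgments = [[77, 10],   # first judge: Correct
                 [6, 7]]     # first judge: Incorrect
    print(round(cohen_kappa(judgments), 2))
```

With these counts the observed agreement is 0.84 and the unweighted coefficient comes to about 0.37, so this calculation does not reproduce the reported K = 0.83; the slide gives no further detail on how that value was obtained.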

SLIDE 41

We have already developed:

- Categorization of relations between authors using semantic annotation
- Analysis of the functions of citation
- Categorization of publications or sets of publications

SLIDE 42

We have already developed:

- Relevant information extraction
- Concept identification
- Categorized text syntheses
- Text navigation

SLIDE 43

What can we do next?

- Science policy by establishing guidelines and setting priorities
- Detecting emergence and innovation
- New method for mapping science
- Establishing author networks

SLIDE 44

Thank you for your attention!

Further information:

marc.bertin@paris-sorbonne.fr
iana.atanassova@gmail.com
jean-pierre.descles@paris-sorbonne.fr

Bibliography

Bertin, M.; Desclés, J.-P.; Djioua, B. and Krushkov, Y. 2006. Automatic annotation in text for bibliometrics use. FLAIRS 2006.
Desclés, J.-P. 2006. Contextual exploration processing for discourse automatic annotations of texts. FLAIRS 2006. Invited Speaker.
Moed, H. 2005. Citation Analysis in Research Evaluation. Springer.
Small, H. 1982. Citation context analysis. In B. Dervin and M. Voigt (Eds.), Progress in communication sciences 3:287-310.