

SLIDE 1

Making Sense of Massive Amounts of Scientific Publications: The Scientific Knowledge Miner Project

Francesco Ronzano, Ana Freire, Diego Saez-Trumper, Horacio Saggion

SLIDE 2

20 seconds… 1 paper

"The Rise of Open Access", Science, 04 Oct 2013, Vol. 342, Issue 6154, pp. 58-59

The Scientific Knowledge Miner Project

SLIDE 3

Information Overload (scientific repositories)

SLIDE 4

24,6M 1M 90M 57M

Information Overload (scientific repositories)

SLIDE 5

Sometime between 2017 and 2021, more than half of the papers available globally are expected to be published as Open Access articles.

Lewis, David W. "The inevitability of open access." College & Research Libraries 73.5 (2012): 493-506.

SLIDE 6

The peculiarities of research publications

TITLE, (SUB)SECTION, ABSTRACT, BIBLIOGRAPHIC ENTRY, CAPTION

SLIDE 7

In order to take full advantage of the knowledge present in scientific publications, proper semantic indexing, search and content aggregation approaches are required.

Benefits:

  • Search of new information on specific scientific problems
  • Semi-automatic assessment of papers and research proposals
  • Hypothesis formulation
  • Tracking of scientific and technological advances
  • Scientific intelligence
  • Assisted report and review writing
  • Question answering
  • …

Scientific publications: claims

SLIDE 8

The Scientific Knowledge Miner Project (SKM)

Facilitate the extraction of knowledge from scientific publications across many disciplines. Improve a variety of use cases such as:

  • Citation Characterization
  • Citation Recommendation
  • Summarization

KEY: Papers are enriched with structural, linguistic and semantic information

[Diagram labels: Scientific Publications; SKM; Semantic Information; Datasets; Software applications; Better Scientific Knowledge]

SLIDE 9

The Scientific Knowledge Miner Project (SKM)

The SKM approach to the analysis of scientific literature:

  • Relies on a finer-grained analysis of the contents of publications
  • Is grounded on the automated characterization of a varied set of semantic aspects of papers, including the rhetorical structure or the purpose of citations.

SLIDE 10

The Scientific Knowledge Miner Project (SKM)

Online Scientific Publications → Crawler → Storage → Indexing → Analysis

METADATA + SEMANTIC INFORMATION + METADATA

SLIDE 11

The Scientific Knowledge Miner Project (SKM)

Online Scientific Publications → Crawler → Storage → Indexing → Analysis

METADATA + SEMANTIC INFORMATION + METADATA

CRAWLING

SLIDE 12

Crawling

Database

METADATA: title, author, conference, year, etc.
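
As an illustration of the crawling step, the metadata fields named on this slide (title, author, conference, year) could be stored as one database row per crawled paper. This is a minimal sketch, not the SKM implementation: the slide does not name the database, so SQLite is an assumption, and the table name `paper`, the `url` key and all function names are hypothetical.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open (or create) a small metadata store. SQLite is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS paper (
               url     TEXT PRIMARY KEY,  -- hypothetical key: where the paper was crawled
               title   TEXT NOT NULL,
               authors TEXT NOT NULL,     -- author names joined with "; "
               venue   TEXT,              -- conference or journal
               year    INTEGER
           )"""
    )
    return conn


def save_metadata(conn, url, title, authors, venue, year):
    """Store one crawled record; re-crawling the same URL overwrites the row."""
    conn.execute(
        "INSERT OR REPLACE INTO paper (url, title, authors, venue, year) "
        "VALUES (?, ?, ?, ?, ?)",
        (url, title, "; ".join(authors), venue, year),
    )
    conn.commit()


def titles_from_year(conn, year):
    """Return the titles of all stored papers from a given year, sorted."""
    rows = conn.execute(
        "SELECT title FROM paper WHERE year = ? ORDER BY title", (year,)
    )
    return [row[0] for row in rows]
```

Keying rows on the crawl URL is one simple way to keep repeated crawls idempotent; a DOI would serve the same purpose where available.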

SLIDE 14

The Scientific Knowledge Miner Project (SKM)

Online Scientific Publications → Crawler → Storage → Indexing → Analysis

METADATA + SEMANTIC INFORMATION + METADATA

TEXT ANALYSIS

SLIDE 15

Dr. Inventor Text Mining Framework

http://backingdata.org/dri/library/

  • Integrate and customize text mining tools and on-line services to enable and ease a wide range of scientific publication analyses
  • Papers are enriched with structural, linguistic and semantic information
  • Self-contained library managed by […]
  • Focused on textual content
  • Relying on a shared data model (Java classes) to represent a paper
  • Exposing a convenient API to access the mined information
  • Based on […] to manage textual annotations
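
To make the idea of a shared data model with an access API more concrete, the sketch below shows what such a model might look like. It is illustrative only: the framework itself is described as using Java classes, Python is used here for brevity, and every class, field and method name (`Paper`, `Sentence`, `Citation`, `sentences_by_category`, `citing_sentences`) is hypothetical rather than taken from the Dr. Inventor library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sentence:
    text: str
    # Rhetorical label such as "Approach" or "Background"; None until annotated.
    rhetorical_category: Optional[str] = None


@dataclass
class Citation:
    marker: str                    # inline marker as it appears in the text, e.g. "[3]"
    sentence_index: int            # position of the citing sentence in Paper.sentences
    purpose: Optional[str] = None  # e.g. "Criticism"; None until classified


@dataclass
class Paper:
    title: str
    abstract: str = ""
    sentences: List[Sentence] = field(default_factory=list)
    citations: List[Citation] = field(default_factory=list)

    def sentences_by_category(self, category: str) -> List[Sentence]:
        """Return the sentences annotated with the given rhetorical category."""
        return [s for s in self.sentences if s.rhetorical_category == category]

    def citing_sentences(self) -> List[Sentence]:
        """Return the sentence that contains each inline citation, in citation order."""
        return [self.sentences[c.sentence_index] for c in self.citations]
```

Keeping structural, linguistic and semantic annotations on one object means each analysis module can read and write the same representation instead of exchanging ad hoc formats.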

SLIDE 16

Dr. Inventor Text Mining Framework

  • PDF to text converter
  • Sentence splitter
  • Inline citation spotter
  • Web based reference parser
  • Citation-aware dep. parser
  • Rhetorical annotator
  • Babelfy WSD and Entity Linker
  • Citation Classifier
  • Extractive summarizer

VIZ
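
Two of the modules listed on this slide, the sentence splitter and the inline citation spotter, can be illustrated with a deliberately naive chain. This is a toy sketch under strong assumptions: the regular expressions below are hypothetical stand-ins, not the rules used by the framework, and they only cover numeric markers such as "[3, 7]" and simple author-year markers such as "(Lewis, 2012)".

```python
import re

# Naive sentence boundary: ., ! or ? followed by whitespace and an uppercase letter.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")
# Numeric markers: "[3]" or "[3, 7]".
NUMERIC = re.compile(r"\[\d+(?:\s*,\s*\d+)*\]")
# Author-year markers: "(Lewis, 2012)" or "(Lewis et al., 2012)".
AUTHOR_YEAR = re.compile(r"\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\)")


def split_sentences(text):
    """Split raw text into sentences with the naive boundary rule above."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def spot_citations(sentence):
    """Return the inline citation markers found in one sentence."""
    return NUMERIC.findall(sentence) + AUTHOR_YEAR.findall(sentence)


def run_pipeline(text):
    """Chain the two steps: one record per sentence with its citation markers."""
    return [
        {"sentence": s, "citations": spot_citations(s)}
        for s in split_sentences(text)
    ]
```

Each step consumes the output of the previous one, which is the property that lets later modules (reference parsing, citation classification, summarization) be added to the chain.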

SLIDE 18

The Scientific Knowledge Miner Project (SKM)

Online Scientific Publications → Crawler → Storage → Indexing → Analysis

METADATA + SEMANTIC INFORMATION + METADATA

CONTENT AGGREGATION AND INDEXING

SLIDE 19

Indexing

SLIDE 21

The Scientific Knowledge Miner Project (SKM)

Online Scientific Publications → Crawler → Storage → Indexing → Analysis

METADATA + SEMANTIC INFORMATION + METADATA

EXPLORATORY VISUAL ANALYTICS

SLIDE 22

Analysis

http://backingdata.org/dri/viz/

SLIDE 23

Use Case 1: Citation Characterization

CITATION PURPOSE + 17 sub-purposes

  • Criticism
  • Substantiation
  • Use
  • Comparison
  • Basis
  • Neutral

Enrich citation counts with semantics

Experiment with new metrics: what do others say about one paper?
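
One way to read "enrich citation counts with semantics" is to replace a single citation count with a per-purpose breakdown. The sketch below assumes the purpose label of each incoming citation is already available (in practice it would come from a citation classifier); the six labels are copied from this slide, while the function names and the `criticism_ratio` metric are hypothetical examples, not metrics defined by the project.

```python
from collections import Counter

# Top-level citation purposes as listed on the slide.
PURPOSES = ("Criticism", "Substantiation", "Use", "Comparison", "Basis", "Neutral")


def purpose_profile(labels):
    """Count the incoming citations of one paper per purpose.

    `labels` holds one purpose label per citing sentence.
    """
    unknown = set(labels) - set(PURPOSES)
    if unknown:
        raise ValueError(f"unknown purpose labels: {sorted(unknown)}")
    counts = Counter(labels)
    return {purpose: counts.get(purpose, 0) for purpose in PURPOSES}


def criticism_ratio(profile):
    """Hypothetical metric: share of incoming citations that criticize the paper."""
    total = sum(profile.values())
    return profile["Criticism"] / total if total else 0.0
```

Two papers with the same raw citation count can have very different profiles, which is the kind of distinction a plain count hides.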

SLIDE 24

Use Case 2: Citation Recommendation

SENTENCE RHETORICAL CATEGORY + 3 sub-categories

  • Background
  • Outcome
  • Challenge
  • Approach
  • Future Work

Recommend similar papers / authors

SLIDE 25

Use Case 3: Scientific Document Summarization

SENTENCE SUMMARY RELEVANCE (1 to 5 ratings) and HAND-WRITTEN SUMMARY

Extractive summarization
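
Given per-sentence relevance ratings on the 1 to 5 scale mentioned above, the simplest form of extractive summarization keeps the highest-rated sentences and prints them in document order. This is a minimal sketch of that selection step only: it takes the ratings as input (in a real system they would be predicted by a model trained on such annotations), and the function name and `max_sentences` parameter are hypothetical.

```python
def extractive_summary(sentences, ratings, max_sentences=3):
    """Pick the top-rated sentences and return them in their original order.

    `ratings` holds one relevance score (1 to 5) per sentence.
    Ties are broken in favour of the earlier sentence.
    """
    if len(sentences) != len(ratings):
        raise ValueError("need exactly one rating per sentence")
    # Rank sentence positions by rating (highest first), then by position.
    ranked = sorted(range(len(sentences)), key=lambda i: (-ratings[i], i))
    # Restore document order so the summary reads naturally.
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

Because the output reuses source sentences verbatim, it can be compared directly against sentence-level relevance annotations, while a hand-written summary supports evaluation of the summary as a whole.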

SLIDE 26

Scientific Knowledge Miner (SKM) aims at facilitating the extraction, aggregation and navigation of knowledge from scientific publications.

  • Consolidate the SKM publication mining infrastructure
  • Exploit the semantics of papers to perform large scale investigations of:
      • Alternative metrics to evaluate a paper based on citation semantics
      • Semantically motivated recommendation of scientific publications
      • Summarization of scientific literature

Conclusions and future work

SLIDE 27

Acknowledgements

SLIDE 28

Making Sense of Massive Amounts of Scientific Publications: The Scientific Knowledge Miner Project

{francesco.ronzano, ana.freire, diego.saez, horacio.saggion}@upf.edu