SLIDE 1
Making Sense of Massive Amounts of Scientific Publications: The Scientific Knowledge Miner Project
Francesco Ronzano, Ana Freire, Diego Saez-Trumper, Horacio Saggion
SLIDE 2
20 seconds… 1 paper
The Rise of Open Access. Science, 04 Oct 2013, Vol. 342, Issue 6154, pp. 58-59
The Scientific Knowledge Miner Project
SLIDE 3
Information Overload (scientific repositories)
SLIDE 4
24.6M, 1M, 90M, 57M
Information Overload (scientific repositories)
SLIDE 5
Sometime between 2017 and 2021, more than half of the papers available globally are expected to be published as Open Access articles.
Lewis, David W. "The inevitability of open access." College & Research Libraries 73.5 (2012): 493-506.
SLIDE 6
The peculiarities of research publications
TITLE, (SUB)SECTION, ABSTRACT, BIBLIOGRAPHIC ENTRY, CAPTION
SLIDE 7
In order to take full advantage of the knowledge present in scientific publications, proper semantic indexing, search and content aggregation approaches are required.
Benefits:
- Search for new information on specific scientific problems
- Semi-automatic assessment of papers and research proposals
- Hypothesis formulation
- Tracking of scientific and technological advances
- Scientific intelligence
- Assisted report and review writing
- Question answering
- …
Scientific publications: claims
SLIDE 8
The Scientific Knowledge Miner Project (SKM)
Facilitate the extraction of knowledge from scientific publications across many disciplines. Improve a variety of use cases such as:
- Citation Characterization
- Citation Recommendation
- Summarization
- …
KEY: Papers are enriched with structural, linguistic and semantic information
[Diagram: Scientific Publications, SKM, Semantic Information; outputs: Datasets, Software applications, Better Scientific Knowledge]
SLIDE 9
The Scientific Knowledge Miner Project (SKM)
The SKM approach to the analysis of scientific literature:
- A finer-grained analysis of the contents of publications
- Grounded in the automated characterization of a varied set of semantic aspects of papers, including the rhetorical structure or the purpose of citations.
SLIDE 10
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA]
SLIDE 11
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA; highlighted stage: CRAWLING]
SLIDE 12
Crawling
Database: METADATA (title, author, conference, year, etc.) stored with each paper
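The crawling step can be illustrated with a minimal sketch, assuming the metadata is kept in a SQLite table. The table name `papers`, its columns, the helper names and the sample record are all hypothetical; nothing here is taken from the actual SKM crawler or database.

```python
import sqlite3

def create_store(conn):
    """Create a hypothetical metadata table; the schema is an assumption."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS papers ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, authors TEXT, "
        "venue TEXT, year INTEGER, url TEXT UNIQUE)"
    )

def save_paper(conn, title, authors, venue, year, url):
    """Insert one crawled record; a URL that is already stored is skipped."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO papers (title, authors, venue, year, url) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, "; ".join(authors), venue, year, url),
    )
    conn.commit()
    return cur.rowcount == 1  # True only when a new row was written

# Illustrative record, not real crawl output.
conn = sqlite3.connect(":memory:")
create_store(conn)
record = ("An example paper", ["A. Author", "B. Author"],
          "Example Conference", 2015, "http://example.org/p1.pdf")
first = save_paper(conn, *record)
second = save_paper(conn, *record)  # same URL, so it is not stored twice
stored = conn.execute("SELECT COUNT(*) FROM papers").fetchone()[0]
```

Keying the table on the URL is one simple way to keep a repeated crawl from duplicating records; a real crawler would likely need a more robust identifier.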
SLIDE 13
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA]
SLIDE 14
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA; highlighted stage: TEXT ANALYSIS]
SLIDE 15
- Dr. Inventor Text Mining Framework
http://backingdata.org/dri/library/
- Integrate and customize text mining tools and on-line services to enable and ease a wide range of scientific publication analyses
- Papers are enriched with structural, linguistic and semantic information
- Self-contained library managed by
- Focused on textual content
- Relying on a shared data model (Java classes) to represent a paper
- Exposing a convenient API to access the mined information
- Based on
to manage textual annotations
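The idea of a shared data model with an access API can be sketched as follows. This is only an illustration in Python: the framework's own model is written in Java, and the class names, fields and the `sentences_with` method below are hypothetical, not the Dr. Inventor API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Sentence:
    text: str
    # Mined annotations, e.g. {"rhetorical_category": "Approach"} (hypothetical keys)
    annotations: Dict[str, str] = field(default_factory=dict)

@dataclass
class Section:
    heading: str
    sentences: List[Sentence] = field(default_factory=list)

@dataclass
class Paper:
    title: str
    abstract: str = ""
    sections: List[Section] = field(default_factory=list)
    references: List[str] = field(default_factory=list)

    def sentences_with(self, key, value):
        """Return every sentence whose annotation `key` equals `value`."""
        return [s for sec in self.sections for s in sec.sentences
                if s.annotations.get(key) == value]

# Illustrative paper with two annotated sentences.
paper = Paper(
    title="An example paper",
    sections=[Section("Method", [
        Sentence("We propose X.", {"rhetorical_category": "Approach"}),
        Sentence("Solving X is hard.", {"rhetorical_category": "Challenge"}),
    ])],
)
approach = paper.sentences_with("rhetorical_category", "Approach")
```

Keeping structure (sections, sentences) and mined annotations in one object lets every downstream step read from the same representation.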
SLIDE 16
Text Mining Framework
- PDF to text converter
- Sentence splitter
- Inline citation spotter
- Web-based reference parser
- Citation-aware dependency parser
- Rhetorical annotator
- Babelfy WSD and Entity Linker
- Citation Classifier
- Extractive summarizer
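The components listed above run one after another over the same document. A minimal sketch of that chaining, assuming each stage is a function that adds its output to a shared dictionary, could look like this. The two stages are deliberately naive regular-expression stand-ins for a sentence splitter and an inline citation spotter; the function names are hypothetical and do not reflect the framework's implementation.

```python
import re

def split_sentences(doc):
    """Naive splitter: break after . ! or ? when followed by space and a capital."""
    doc["sentences"] = re.split(r"(?<=[.!?])\s+(?=[A-Z])", doc["text"].strip())
    return doc

def spot_citations(doc):
    """Collect numeric markers such as [12] or [3, 4] found in each sentence."""
    pattern = re.compile(r"\[\d+(?:\s*,\s*\d+)*\]")
    doc["citations"] = [pattern.findall(s) for s in doc["sentences"]]
    return doc

def run_pipeline(text, stages):
    """Apply each stage in order; every stage enriches the same document dict."""
    doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc

result = run_pipeline(
    "Open access is growing [1]. We build on earlier work [2, 3]. No citation here.",
    [split_sentences, spot_citations],
)
```

Further stages (rhetorical annotation, citation classification, summarization) would plug into the same list, each reading what earlier stages produced.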
VIZ
- Dr. Inventor Text Mining Framework
SLIDE 17
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA]
SLIDE 18
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA; highlighted stage: CONTENT AGGREGATION AND INDEXING]
SLIDE 19
Indexing
SLIDE 20
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA]
SLIDE 21
The Scientific Knowledge Miner Project (SKM)
[Pipeline diagram: Online Scientific Publications, Crawler, Storage, Indexing, Analysis; data labels: METADATA, SEMANTIC INFORMATION + METADATA; highlighted stage: EXPLORATORY VISUAL ANALYTICS]
SLIDE 22
Analysis
http://backingdata.org/dri/viz/
SLIDE 23
Use Case 1: Citation Characterization
CITATION PURPOSE (+ 17 sub-purposes): Criticism, Substantiation, Use, Comparison, Basis, Neutral
- Enrich citation counts with semantics
- Experiment with new metrics: what do others say about a paper?
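Enriching citation counts with semantics can be pictured as replacing a single count with a per-purpose breakdown. The sketch below assumes each citation has already been labelled with one of the six purposes named on the slide; the record layout, the function name and the sample data are hypothetical.

```python
from collections import Counter

PURPOSES = ("Criticism", "Substantiation", "Use", "Comparison", "Basis", "Neutral")

def purpose_profile(citations, cited_id):
    """Count the incoming citations of `cited_id`, grouped by purpose."""
    counts = Counter(c["purpose"] for c in citations if c["cited"] == cited_id)
    return {p: counts.get(p, 0) for p in PURPOSES}

# Illustrative, hand-made citation records.
citations = [
    {"citing": "P2", "cited": "P1", "purpose": "Use"},
    {"citing": "P3", "cited": "P1", "purpose": "Criticism"},
    {"citing": "P4", "cited": "P1", "purpose": "Use"},
    {"citing": "P4", "cited": "P2", "purpose": "Neutral"},
]
profile = purpose_profile(citations, "P1")
```

A plain citation count would report 3 for P1; the profile additionally shows that two of those citations use the work and one criticizes it.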
SLIDE 24
Use Case 2: Citation Recommendation
SENTENCE RHETORICAL CATEGORY (+ 3 sub-categories): Background, Outcome, Challenge, Approach, Future Work
- Recommend similar papers / authors
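One way rhetorical categories could feed a recommender is to compare papers only on sentences of a chosen category, for example their Approach sentences. The sketch below uses bag-of-words cosine similarity as a stand-in technique; it is an assumption made for illustration, not the method used in SKM, and the function names and sample papers are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(sentences, category):
    """Token counts over the sentences labelled with `category`."""
    words = Counter()
    for text, label in sentences:
        if label == category:
            words.update(re.findall(r"[a-z]+", text.lower()))
    return words

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(query_id, papers, category="Approach"):
    """Rank the other papers by similarity of their `category` sentences."""
    query = bag_of_words(papers[query_id], category)
    scores = {pid: cosine(query, bag_of_words(sents, category))
              for pid, sents in papers.items() if pid != query_id}
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative papers as (sentence, rhetorical category) pairs.
papers = {
    "P1": [("We train a classifier on citation contexts.", "Approach"),
           ("Results improve.", "Outcome")],
    "P2": [("A classifier is trained on citation sentences.", "Approach")],
    "P3": [("We survey open access policies.", "Approach")],
}
ranking = recommend("P1", papers)
```

Restricting the comparison to one category means two papers are matched on how they work rather than on shared background vocabulary.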
SLIDE 25
Use Case 3: Scientific Document Summarization
SENTENCE SUMMARY RELEVANCE (1 to 5 ratings) and HAND-WRITTEN SUMMARY
Extractive summarization
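Extractive summarization selects existing sentences rather than generating new text. A minimal sketch, assuming every sentence already carries a relevance rating on the 1 to 5 scale mentioned above, keeps the top-rated sentences and returns them in their original order. In practice the ratings would be predicted by a trained model; the function name and the sample ratings here are hypothetical.

```python
def extractive_summary(sentences, k=2):
    """Keep the k highest-rated sentences, returned in document order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentences[i][1], reverse=True)
    keep = sorted(ranked[:k])  # restore the original sentence order
    return [sentences[i][0] for i in keep]

# Illustrative (sentence, relevance rating) pairs.
rated = [
    ("Scientific literature keeps growing.", 3),
    ("We present a text mining framework.", 5),
    ("The paper is organized as follows.", 1),
    ("Evaluation shows improved summaries.", 4),
]
summary = extractive_summary(rated, k=2)
```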
SLIDE 26
Scientific Knowledge Miner (SKM) aims at facilitating the extraction, aggregation and navigation of knowledge from scientific publications.
- Consolidate the SKM publication mining infrastructure
- Exploit the semantics of papers to perform large-scale investigations of:
  - Metrics to evaluate a paper based on citation semantics
  - Semantically motivated recommendation of scientific publications
  - Summarization of scientific literature
Conclusions and future work
SLIDE 27
Acknowledgements
SLIDE 28
Making Sense of Massive Amounts of Scientific Publications: The Scientific Knowledge Miner Project
{francesco.ronzano, ana.freire, diego.saez, horacio.saggion}@upf.edu