Ontology Matching for Patent Classification

SLIDE 1

Ontology Matching for Patent Classification

Christoph Quix, Sandra Geisler, Rihan Hai, Sanchit Alekh
Ontology Matching Workshop@ISWC, October 21, 2017

SLIDE 2

Agenda

Motivation
Ontology Modeling
Overview of the approaches
Evaluation Results
Conclusion

OM 2017 2

SLIDE 3

Motivation for Patent Analysis

A significant amount of information about technological innovations is available only in patents.
Identification of new trends is important for industry & research (short innovation cycles).
Patents can help to find partners for research projects, especially in interdisciplinary research fields such as medical engineering (ME).

  • A project at RWTH Aachen aims at building a recommender system for projects in medical engineering

www.iem.rwth-aachen.de Wikipedia http://dbis.rwth-aachen.de/mi-Mappa

SLIDE 4

Challenges for Patent Analysis

Patents have a special language and terminology.
The International Patent Classification (IPC) scheme is not detailed enough to cover specific areas within a research field.

  • Relevant patents for medical engineering are in A61

Example of patent language: "The computer program is stored on a computer-readable medium comprising software code adapted to perform the steps of the method 100 according some embodiments when executed on a data-processing apparatus."

SLIDE 5

Goals of the mi-Mappa Project

Mapping of patents and their inventors to competence fields in medical engineering. The competence fields were defined by an expert board to identify the innovative areas in medical engineering:

  • Imaging Techniques
  • Prostheses and Implants
  • Telemedicine & Information Systems
  • Operative & Interventional Devices and Systems
  • In-Vitro Diagnostics
  • Special Therapies & Diagnosis Systems

Based on

  • Product-related information of ME products
  • Patents: Text and References
  • Publications: from PubMed, including MeSH terms

SLIDE 7

Modeling of Competence Fields in an Ontology

Coverage of the ME domain in existing ontologies is low.
A new ontology was created according to the NeOn methodology.
It is modeled as an extension of existing ontologies, i.e., it refers to existing classes through equivalence or subclass relationships.

SLIDE 8

Example: Modeling of Imaging Techniques

SLIDE 10

Overall Architecture

Topic modeling using latent Dirichlet allocation (LDA)
Extraction of references to scientific publications from patent data
Lookup of publications in PubMed and retrieval of their MeSH terms
Mapping to the competence field ontology (CFO) by using the alignment between MeSH and CFO
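
The last two steps of the architecture can be illustrated with a minimal sketch. It assumes the MeSH terms of the cited publications have already been retrieved and that an alignment from MeSH terms to CFO classes is available as a dictionary. The alignment entries, class names, confidence values, and the choice to sum confidences per class are all hypothetical and only serve as illustration; the slides do not state how scores are aggregated.

```python
from collections import defaultdict

# Hypothetical alignment: MeSH term -> [(CFO class, confidence)].
# Names and values are invented for illustration only.
ALIGNMENT = {
    "Magnetic Resonance Imaging": [("ImagingTechniques", 0.9)],
    "Prostheses and Implants": [("ProsthesesAndImplants", 1.0)],
    "Telemedicine": [("TelemedicineAndInformationSystems", 0.8)],
}


def map_mesh_to_cfo(mesh_terms, alignment):
    """Sum alignment confidences per CFO class over all MeSH terms.

    Summation is an assumption made for this sketch.
    """
    scores = defaultdict(float)
    for term in mesh_terms:
        # Terms without a correspondence (e.g. very general ones) are skipped.
        for cfo_class, confidence in alignment.get(term, []):
            scores[cfo_class] += confidence
    # Highest-scoring CFO classes first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# MeSH terms collected from the publications cited by one patent (made up).
mesh_terms = [
    "Magnetic Resonance Imaging",
    "Humans",
    "Telemedicine",
    "Magnetic Resonance Imaging",
]
ranking = map_mesh_to_cfo(mesh_terms, ALIGNMENT)
print(ranking)
```

In this sketch a general term such as "Humans" contributes nothing because it has no correspondence in the alignment.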

SLIDE 11

Matching of MeSH and CF Ontology

CFO: 535 classes
MeSH: 281,776 classes
Alignment computed by AgreementMakerLight (AML)

  • Experiments with different settings
  • Simple matcher with low threshold and cardinality filter had best results
  • Often high string similarity between concepts
  • Roughly one mapping per class in CFO on average
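
The idea of a simple string matcher with a threshold and a cardinality filter can be sketched as follows. This is not the AgreementMakerLight implementation; it is a stand-in that uses `difflib.SequenceMatcher` from the Python standard library, a hypothetical threshold of 0.6, and a greedy 1:1 selection as one possible reading of "cardinality filter". The labels are invented.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match(source_labels, target_labels, threshold=0.6):
    """Return a 1:1 alignment as a list of (source, target, score)."""
    # Score every label pair and keep those at or above the threshold.
    candidates = [
        (similarity(s, t), s, t)
        for s in source_labels
        for t in target_labels
    ]
    candidates = [c for c in candidates if c[0] >= threshold]
    # Best-scoring candidates first.
    candidates.sort(reverse=True)

    # Greedy 1:1 cardinality filter: each label is used at most once.
    used_sources, used_targets, alignment = set(), set(), []
    for score, s, t in candidates:
        if s not in used_sources and t not in used_targets:
            alignment.append((s, t, score))
            used_sources.add(s)
            used_targets.add(t)
    return alignment


# Hypothetical labels standing in for CFO and MeSH classes.
cfo_labels = ["Telemedicine", "Endoscopy"]
mesh_labels = ["telemedicine", "Endoscopy", "Endoscopes"]
alignment = match(cfo_labels, mesh_labels)
for source, target, score in alignment:
    print(source, "=", target, round(score, 2))
```

Here "Endoscopes" also passes the threshold for "Endoscopy", but the cardinality filter keeps only the best correspondence per class, which is consistent with roughly one mapping per CFO class.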

SLIDE 12

Overview of approaches

  • 1. Baseline: direct matching of extracted topics to CFO
  • 2. MeSH terms extracted from cited publications are mapped to CFO according to the computed alignment
  • 3. Topic terms are matched with MeSH, then the alignment to CFO is used
  • 4. Combination of #2 and #3
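
One way to picture the combined approach is to merge the per-competence-field scores produced by two individual approaches. The slides do not say how the combination is computed, so the sketch below is an assumption: it keeps the higher score per field and then selects all fields above a hypothetical cut-off of 0.5. All scores are made up.

```python
def combine(scores_a, scores_b):
    """Merge two score dictionaries, keeping the higher score per field."""
    combined = dict(scores_a)
    for field, score in scores_b.items():
        combined[field] = max(score, combined.get(field, 0.0))
    return combined


def select_fields(scores, cutoff=0.5):
    """Return the fields whose score reaches the cut-off, sorted by name."""
    return sorted(field for field, score in scores.items() if score >= cutoff)


# Hypothetical scores for one patent from two different approaches.
scores_from_publications = {"Imaging Techniques": 0.7, "In-Vitro Diagnostics": 0.2}
scores_from_topics = {"Imaging Techniques": 0.4, "Telemedicine & Information Systems": 0.6}

combined = combine(scores_from_publications, scores_from_topics)
selected = select_fields(combined)
print(selected)
```

Taking the maximum lets each approach contribute fields the other one misses, which is one simple way to exploit complementary results.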

SLIDE 14

Evaluation Results

59 patents were assigned to CFs individually by experts.
Multiple CFs can be assigned to a patent (random precision <10%).
Approaches 2 and 3 clearly outperform the baseline (approach 1).
The combined approach has the best performance.
Approaches 2 and 3 are complementary to each other.
A classification approach with SVM achieves about 80%.
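
Because a patent can carry several CFs, the comparison against the expert assignments is a multi-label evaluation. The sketch below computes precision, recall, and F-measure over all patent/CF pairs (micro-averaging). The averaging scheme is an assumption, since the slides do not state which one was used, and the patent identifiers and assignments are invented.

```python
def precision_recall_f1(predicted, expected):
    """Micro-averaged precision, recall and F-measure over patent/CF pairs."""
    tp = fp = fn = 0
    for patent, gold in expected.items():
        pred = predicted.get(patent, set())
        tp += len(pred & gold)   # CFs assigned by both system and experts
        fp += len(pred - gold)   # CFs assigned only by the system
        fn += len(gold - pred)   # CFs assigned only by the experts
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical expert assignments and system output for two patents.
expected = {
    "P1": {"Imaging Techniques"},
    "P2": {"Prostheses and Implants", "In-Vitro Diagnostics"},
}
predicted = {
    "P1": {"Imaging Techniques", "In-Vitro Diagnostics"},
    "P2": {"Prostheses and Implants"},
}

precision, recall, f1 = precision_recall_f1(predicted, expected)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```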

SLIDE 15

Discussion of Results

Quality of the mapping to CFO is still low (F-measure about 50%; >60% after some recent minor improvements and bug fixing).
Publications are annotated with very general MeSH terms (e.g., human, animal).
Computed similarities are very low because several similarities are combined by multiplication; thus, raw values are difficult to interpret (normalization to [0,1]).
Expert mappings are highly subjective.

  • Discussion & redefinition of mappings in expert group
  • Original mappings had only 60-70% F-measure with respect to the new mappings
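
The remark above about very low similarity values can be made concrete with a small sketch: multiplying several similarities that each lie in [0,1] shrinks the result quickly, and rescaling makes the values comparable again. Min-max scaling is used here as one possible normalization to [0,1]; the slides do not say which normalization was applied, and all numbers are invented.

```python
import math


def combined_similarity(similarities):
    """Combine several similarities in [0, 1] by multiplication."""
    return math.prod(similarities)


def min_max_normalize(scores):
    """Rescale scores linearly so the lowest is 0 and the highest is 1."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal; map them to 1.0 by convention in this sketch.
        return {key: 1.0 for key in scores}
    return {key: (value - low) / (high - low) for key, value in scores.items()}


# Hypothetical per-step similarities for three competence fields.
raw = {
    "Imaging Techniques": combined_similarity([0.5, 0.5, 0.5]),
    "Telemedicine & Information Systems": combined_similarity([0.5, 0.25, 0.5]),
    "In-Vitro Diagnostics": combined_similarity([0.25, 0.25, 0.5]),
}
normalized = min_max_normalize(raw)
print(raw)
print(normalized)
```

Even with moderate individual similarities of 0.5, the best raw score is only 0.125, which illustrates why raw values are hard to interpret without rescaling.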

SLIDE 17

Conclusion and Outlook

Patent analysis and classification can be an interesting field for ontology engineering & ontology matching.
Mapping along a semantically rich ontology (MeSH) is significantly better than direct matching (approach 1 vs. 2).
Use of semantic annotations (MeSH terms of publications) can provide additional information.
Next steps

  • Further debugging and re-evaluation of the approach, additional CF "Others"
  • Improvement of SVM classification by using our approach as training data?
  • Larger expert validation is under way
