


  1. Slider: an Efficient Incremental Reasoner
     Jules Chevalier, jules.chevalier@univ-st-etienne.fr
     Laboratoire Hubert Curien, Télécom Saint-Etienne, Université Jean Monnet
     March 2015
     Supervisors: Frédérique Laforest, Christophe Gravier, Julien Subercaze

  2. Summary
     ◮ Introduction
     ◮ State of the art
     ◮ Contribution
     ◮ Experimental results
     ◮ Conclusion


  4. Semantic Web
     ◮ Formalises concepts in order to represent them
     ◮ Standardises this representation
     ◮ Makes it readable by both humans and computers
     ◮ Links the data together
     ◮ Allows automatic operations on the data
        ◮ Integrity constraint validation
        ◮ Querying the knowledge base
        ◮ Extraction of implicit data = Reasoning


  7. Reasoning: Forward Chaining vs Backward Chaining
     ◮ What we know:
        ◮ Abraham father Homer
        ◮ Homer father Liza
        ◮ Homer father Bart
        ◮ Marge mother Liza
        ◮ Marge mother Bart
     ◮ What forward chaining does:
        ◮ Abraham grandfather Liza
        ◮ Abraham grandfather Bart
        ◮ ...
        ◮ Abraham grandfather Liza? → yes
     ◮ What backward chaining does:
        ◮ Abraham grandfather Liza?
        ◮ Abraham father X & X father Liza?
        ◮ Abraham father Homer & Homer father Liza → yes
     [Figure: family tree of Abraham, Homer, Marge, Bart and Liza]
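The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single rule "x father y & y father z → x grandfather z" is taken from the backward-chaining bullets, while the function names `forward_chain` and `backward_chain` are hypothetical and not part of Slider.

```python
# Facts from the slide, stored as (subject, predicate, object) triples.
FACTS = {
    ("Abraham", "father", "Homer"),
    ("Homer", "father", "Liza"),
    ("Homer", "father", "Bart"),
    ("Marge", "mother", "Liza"),
    ("Marge", "mother", "Bart"),
}


def forward_chain(facts):
    """Forward chaining: materialise every grandfather triple up front.

    One pass is enough here because the rule's output (grandfather)
    is never one of its own inputs (father).
    """
    inferred = set(facts)
    for (x, p1, y) in facts:
        for (y2, p2, z) in facts:
            if p1 == "father" and p2 == "father" and y == y2:
                inferred.add((x, "grandfather", z))
    return inferred


def backward_chain(facts, x, z):
    """Backward chaining: answer 'x grandfather z?' by searching on demand
    for a witness y such that 'x father y' and 'y father z' are both known."""
    for (s, p, y) in facts:
        if s == x and p == "father" and (y, "father", z) in facts:
            return True
    return False
```

With forward chaining the query becomes a plain membership test on the materialised set; with backward chaining nothing is stored and the search is repeated for each query.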

  8. Rule-based Reasoning
     Rules
     ◮ An antecedent: the condition that allows the rule to be executed
     ◮ A consequent: the statement inferred
     $$\frac{c_1 \ \mathrm{subClassOf} \ c_2, \quad x \ \mathrm{type} \ c_1}{x \ \mathrm{type} \ c_2} \ (\text{cax-sco})$$
     Fragments
     ◮ A fragment is a set of inference rules
     ◮ Semantic Web standards suggest different predefined fragments (RDFS, OWL Lite, OWL Full, OWL DL, ...)
     ◮ The more expressive a fragment is, the more complex its operations are (from P to NEXPTIME)
     ◮ Choosing a fragment is a trade-off between expressivity and computational complexity
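As a rough illustration of how a single rule such as cax-sco can be applied until nothing new is inferred, here is a minimal Python sketch. The function name `cax_sco` and the sample class names are hypothetical; this naive loop is not how Slider schedules rules.

```python
def cax_sco(triples):
    """Apply cax-sco repeatedly and return the closure.

    Antecedent: (c1 subClassOf c2) and (x type c1)
    Consequent: (x type c2)
    """
    closure = set(triples)
    while True:
        sub_class = {(s, o) for (s, p, o) in closure if p == "subClassOf"}
        new = {
            (x, "type", c2)
            for (x, p, c1) in closure
            if p == "type"
            for (sc1, c2) in sub_class
            if sc1 == c1
        } - closure
        if not new:
            # Fixpoint reached: another pass would infer nothing new.
            return closure
        closure |= new


# Hypothetical sample data, not taken from the slides.
sample = {
    ("Cat", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("tom", "type", "Cat"),
}
result = cax_sco(sample)
```

The second pass is what derives `tom type Animal`: it only becomes reachable once `tom type Mammal` has been inferred, which is why the rule output has to be fed back into the rule.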

  9. Reasoning kinds
     [Figure: three kinds of reasoning: Classical Reasoning, Streaming Reasoning, Incremental Reasoning]


  11. Problem statement
      What we want to do
      ◮ Efficient and scalable incremental forward-chaining reasoning
      What are the problems
      ◮ Rules form a cyclic graph
         ◮ Complexity depends on the fragment!
      ◮ The number of triples generated is hard to predict
         ◮ Complexity also depends on the data!
      ◮ Big Data is not static
         ◮ We need to handle data streams!


  13. Batch reasoning approaches
      WebPie: a Web-scale Parallel Inference Engine
      ◮ 2009 - Jacopo Urbani's thesis [7]
         ◮ Uses MapReduce for OWL Horst and RDFS reasoning
      ◮ 2011 - Fixes some issues to improve OWL Horst reasoning [8]
         ◮ Duplicates limitation
         ◮ Indexing for sameAs
         ◮ Greedy scheduling
         ◮ Cleaner job after some rules, or at the end
      MapResolve [6]
      ◮ Based on WebPie to provide EL+ classification
      ◮ Uses 3 sets of triples: usable, used, inferred
      ◮ Limits overheads, optimises
      ◮ Points out MapReduce limitations

  14. Analysis: MapReduce approaches
      MapReduce Framework
      ◮ Allows distributed tasks to be implemented
      ◮ The Hadoop framework
      ◮ Best suited to batch processing of huge amounts of data
      ◮ MapReduce requires an acyclic dataflow
      ◮ Jobs run in isolation
      ◮ Hadoop distributed file system
      WebPie and MapResolve Contributions
      ◮ Only provide batch reasoning
      ◮ Nodes must wait for each other
      ◮ Generate a lot of duplicates
      ◮ Fragment-dependent
      ◮ Naive partitioning
      ◮ Critical letter on WebPie [5]
      ◮ Unsuitable network shuffling

  15. Incremental solutions
      History Matters: Incremental Ontology Reasoning Using Modules [3]
      ◮ Maintains the classification of ontologies as they evolve
      ◮ Provides encouraging results
      ◮ Not viable for a static hierarchy of ontologies
      ◮ Not suited to a high number of nominals
      Incremental Reasoning in OWL EL without Bookkeeping [4]
      ◮ Handles both addition and deletion of knowledge
      ◮ Incremental classification of the TBox
      ◮ Limited to classification of the TBox
      ◮ Dedicated to the EL+ fragment


  17. Proposed solution: Slider
      ◮ Parallel and Scalable Execution
         ◮ Rules mapped to independent modules
         ◮ Multiple rule instances allowed to run in parallel
      ◮ Duplicates Limitation
         ◮ Shared triple store
         ◮ Vertical partitioning [1] and multiple indexing
      ◮ Data Stream Support
         ◮ Streamed architecture
         ◮ Parallel parsing/reasoning
      ◮ Fragment Customization
         ◮ Dynamic support of rulesets
         ◮ ρdf and RDFS natively supported
         ◮ Extensible to any other fragment

  18. Architecture
      [Figure: Slider architecture. Components: Input Manager, Rules Buffers (Buffer R1, Buffer R2, Buffer R3), Thread Pool running Rule Modules (instances of R1, R2, R3), Distributors (Distributor R1, R2, R3), and the Triple Store (explicit triples, implicit triples, streamed triples; concurrent access). Flow labels: incoming triples, evolving data, new triples.]

  19. Architecture
      Input Manager
      ◮ Receives incoming triples
      ◮ Sends them to
         ◮ The triple store
         ◮ The rules buffers
      Rules Buffers
      ◮ A buffer for each rule
      ◮ Runs the rule when full
      ◮ Runs the rule when timed out
      ◮ Ensures completeness
      Thread Pool
      ◮ Manages a pool of rule instances
      ◮ Ensures scalability
      Rule instance
      ◮ Executes the inference
      ◮ Accesses the triple store concurrently
      Distributor
      ◮ Stores inferred triples
      ◮ Dispatches them to the buffers
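The "run the rule when full / when timed out" behaviour of a rule buffer can be sketched as follows. This is a single-threaded illustration under assumptions: the class name `RuleBuffer`, its parameters, and the `tick` method are hypothetical, the timeout is measured from the last flush, and in Slider the flushed batch would be handed to a rule instance in the thread pool rather than called directly.

```python
import time


class RuleBuffer:
    """Collects triples for one rule and fires the rule on full or on timeout."""

    def __init__(self, run_rule, capacity=3, timeout=0.05, clock=time.monotonic):
        self.run_rule = run_rule    # callback receiving a list of triples
        self.capacity = capacity    # flush as soon as this many triples wait
        self.timeout = timeout      # seconds a partial batch may wait
        self.clock = clock          # injectable clock, handy for testing
        self.pending = []
        self.last_flush = clock()

    def add(self, triple):
        self.pending.append(triple)
        if len(self.pending) >= self.capacity:
            self.flush()

    def tick(self):
        """Called periodically: flush a non-empty buffer that waited too long.

        Without this, triples left in a partially filled buffer would never
        be processed and the inference would be incomplete.
        """
        if self.pending and self.clock() - self.last_flush >= self.timeout:
            self.flush()

    def flush(self):
        batch, self.pending = self.pending, []
        self.last_flush = self.clock()
        self.run_rule(batch)
```

The timeout path is what the slide's "ensures completeness" bullet refers to in this sketch: a stream may stop before a buffer fills, and the remaining triples still have to reach the rule.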

  20. Inference: cax-sco

  21. Triple Store
      Concurrent Access
      ◮ ReentrantReadWriteLocks ensure concurrency
      ◮ Write lock to add triples
      ◮ Read lock for other methods
      Duplicates Elimination
      ◮ Bans duplicates
      ◮ Ensures uniqueness of triples
      Vertical Partitioning
      ◮ HashMap of MultiMaps (from Google's Guava libraries)
      [Figure: triples encoding. The encoded triples (1,2,3), (4,2,5), (6,7,8), (6,7,9) are grouped by predicate: 2 → {1 → 3, 4 → 5} and 7 → {6 → 8, 9}]
      Near-optimal indexing
      ◮ Indexing by predicates, subjects and objects
      ◮ Best trade-off for nearly all rules from the OWL fragments
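A minimal Python sketch of a vertically partitioned store with duplicate elimination is shown below. Several things are assumptions made for illustration: the class name `TripleStore` and its methods are hypothetical, a nested `dict` of sets stands in for the HashMap of Guava MultiMaps, only the predicate-then-subject index is built, and a single `threading.Lock` replaces Java's `ReentrantReadWriteLock` because the Python standard library has no read/write lock.

```python
import threading
from collections import defaultdict


class TripleStore:
    """Triples grouped by predicate, then by subject (vertical partitioning)."""

    def __init__(self):
        # predicate -> subject -> set of objects
        self._by_predicate = defaultdict(lambda: defaultdict(set))
        # One mutex for reads and writes; a simplification of a read/write lock.
        self._lock = threading.Lock()

    def add(self, s, p, o):
        """Insert a triple; return True only if it was not already stored."""
        with self._lock:
            objects = self._by_predicate[p][s]
            if o in objects:
                return False  # duplicate: nothing to store, nothing to re-infer
            objects.add(o)
            return True

    def objects(self, s, p):
        """Return a copy of the objects stored for (s, p), without creating entries."""
        with self._lock:
            by_subject = self._by_predicate.get(p)
            if by_subject is None or s not in by_subject:
                return set()
            return set(by_subject[s])

    def __len__(self):
        with self._lock:
            return sum(
                len(objs)
                for by_subject in self._by_predicate.values()
                for objs in by_subject.values()
            )


store = TripleStore()
for triple in [(1, 2, 3), (4, 2, 5), (6, 7, 8), (6, 7, 9)]:
    store.add(*triple)
```

Returning a boolean from `add` is one way to let a caller skip triples that are already known, so that duplicates are not dispatched to the rule buffers again.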

  22. Rules Dependency Graph
      ◮ Directed graph
      ◮ Vertices represent rules
      ◮ A → B: B can use the output of A
      ◮ Created at initialisation time
      ◮ Used to route new triples by
         ◮ The input manager
         ◮ The distributors
      [Figure: Rules Dependency Graph for ρdf, with vertices Universal Input, PRP-SPO1, PRP-DOM, PRP-RNG, SCM-SCO, SCM-SPO, CAX-SCO, SCM-DOM2, SCM-RNG2]
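One possible way to derive such a graph is to compare the predicates each rule produces with the predicates each rule consumes. The sketch below does this for the ρdf rule names shown on the slide. The consumed/produced sets are an assumption based on the usual OWL 2 RL definitions of these rules, the names `RULES`, `feeds`, `build_graph` and `route` are hypothetical, and the resulting edges over-approximate dependencies, so they may not match the slide's figure exactly.

```python
ANY = "*"  # wildcard: the rule reads or writes triples with an arbitrary predicate

# rule name -> (predicates consumed, predicates produced); assumed definitions
RULES = {
    "cax-sco":  ({"subClassOf", "type"}, {"type"}),
    "scm-sco":  ({"subClassOf"}, {"subClassOf"}),
    "scm-spo":  ({"subPropertyOf"}, {"subPropertyOf"}),
    "scm-dom2": ({"domain", "subPropertyOf"}, {"domain"}),
    "scm-rng2": ({"range", "subPropertyOf"}, {"range"}),
    "prp-dom":  ({"domain", ANY}, {"type"}),
    "prp-rng":  ({"range", ANY}, {"type"}),
    "prp-spo1": ({"subPropertyOf", ANY}, {ANY}),
}


def feeds(outputs, inputs):
    """True if a triple produced with `outputs` may match a pattern in `inputs`."""
    return ANY in outputs or ANY in inputs or bool(outputs & inputs)


def build_graph(rules):
    """Edge A -> B whenever B can use the output of A."""
    return {
        a: {b for b, (ins, _) in rules.items() if feeds(outs, ins)}
        for a, (_, outs) in rules.items()
    }


def route(predicate, rules):
    """Rules that should receive an incoming triple with this predicate."""
    return {name for name, (ins, _) in rules.items() if ANY in ins or predicate in ins}


graph = build_graph(RULES)
```

In this sketch `route` plays the role of the universal input (used by the input manager), while `graph[rule]` is what a distributor would consult to decide which buffers receive the triples inferred by `rule`.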


  25. Experiments
      Baseline
      ◮ OWLIM-SE (Standard Edition)
         ◮ Semantic repository with reasoning features
         ◮ Fastest reasoner available, to the best of our knowledge
         ◮ Outperforms Jena and Sesame
         ◮ Natively supports RDFS; custom rule configuration for ρdf
      Dataset
      ◮ 13 ontologies from 3 sets:
         ◮ 2 real-life ontologies: WordNet and Wikipedia
         ◮ 5 generated by BSBM, from 100,000 to 5 million triples
         ◮ 6 subClassOf ontologies (closure computation, duplicate-intensive)
