Reasoning over Stream Data Using a Rule-Based Programming Language

SLIDE 1

Reasoning over Stream Data Using a Rule-Based Programming Language

May 29, 2018 [iDS Workshop EIT Digital]

Joaquín Arias1,2 Manuel Carro1,2

1IMDEA Software Institute, 2Technical University of Madrid

madrid institute for advanced studies in software development technologies

SLIDE 2

www.software.imdea.org

Motivation

We want efficient automated control systems.

SLIDE 3

Problems

1 The code of an automated system is very complex; it has to:

  • Reason with static knowledge (e.g., regulations for energy injection into the grid).
  • Process dynamic / stream data (e.g., energy market price or wind speed).

2 The code has to be modified due to:

  • New requirements (e.g., linking energy production to the energy market price).
  • Differing requirements (e.g., each country has its own regulations).

3 The bottleneck is on the human side rather than on the machine side [5].

SLIDE 4

Proposal: Stream-TCLP

A rule-based, high-level programming language (based on logic and constraints) that:

+ makes it easier to translate requirements into code.
+ uses constraints to prune the search during the analysis.
+ memoizes the analysis results to reuse them.
+ dynamically recomputes the results that are updated by new stream data.

open_blades(Windmill, T) :- blades_are_closed(Windmill, T-1),
                            not risk(Windmill, T).
risk(Windmill, T) :- near(Windmill, Near),
                     open_blades(Near, T).
risk(Windmill, T) :- T1 #< T2, T2 #< T,
                     damage_detected(Windmill, T1),
                     not repaired(Windmill, T2).

Figure: The query ?- findall(Windmill, (current_time(T), open_blades(Windmill,T)), List) to this program returns a List with the Windmills that can open their blades at T.

SLIDE 5

Fighting against giants? (M. Cervantes. Don Quijote de la Mancha. 1605)

SLIDE 6

Preliminary Result

Modular TCLP: a modular design of TCLP, implemented in Ciao (a high-performance Prolog implementation), which makes it easier to integrate tabling with different constraint solvers.

http://www.cliplab.org/papers/tplp2018-tclp/

s(CASP): extends s(ASP), a novel non-monotonic reasoner developed at UTD, with tabling and constraints using Ciao + Modular TCLP.

https://gitlab.software.imdea.org/joaquin.arias/sCASP

SLIDE 7

Modular TCLP [PPDP 2016, TPLP 2018]

[Figure: architecture diagram showing the tabling engine, the Modular TCLP interface, a Prolog-based CLP solver, an external CLP solver, and the WAM.]

  • Extension of Prolog which combines the benefits of CLP + tabling.
  • Implemented in Ciao and described in [3].
  • Our design facilitates the integration of constraint solvers.
  • A new answer management strategy reduces execution time and space needs.
  • TCLP improves: declarativeness, termination properties, and performance.

SLIDE 8

Modular TCLP I: Facilitate the integration

For each problem we need a specific constraint solver:

  • CLP(Lattice) Operates on partially ordered sets of data.

E.g., inventory management, bearing ⊑ gearbox ⊑ turbine.

  • CLP(Intervals)

E.g., temporal reasoning over intervals, [T1, T2].

  • CLP(Q) Solves linear (in)equations over rationals.

E.g., cost management plan.

SLIDE 9

Modular TCLP I: Facilitate the integration

:- use_package(clpq).
:- active_tclp.

% Project the current constraint store onto the variables V.
store_projection(V,st(V,St)) :- clpqr_dump_constraints(V, V, St).
% Succeed if the current call entails the store of a previous generator.
call_entail(st(V,_),st(V,StGen)) :- clpq_entailed(StGen).
% Compare the current answer with a previously stored answer.
answer_compare(st(V,_),st(V,StAns),=<) :- clpq_entailed(StAns), !.
answer_compare(st(F,St),st(F,StAns), >) :- clpq_meta(StAns), clpq_entailed(St).
% Add the constraints of a stored answer to the current store.
apply_answer(V,st(V,St)) :- clpq_meta(St).

Figure: The TCLP interface of CLP(Q) [8] is a bridge to existing predicates.

SLIDE 10

Modular TCLP II: Entailment

Call / answer entailment check is used to detect more particular queries / answers. c0 entails c1 (c0 ⊑ c1) if any solution of c0 is also a solution of c1.

{X > 5} ⊑ {X > 0}

1 The answers from a more general query / call are reused to answer more particular queries / calls, avoiding re-computation.

2 Call entailment check in the presence of recursive rules may avoid loops.

3 It retains only the most general answers, avoiding repetitions and redundancy.
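
The entailment check and the "keep only the most general answers" policy can be illustrated with a small Python sketch. It is not the Ciao implementation: the class LowerBound, the functions entails and add_answer, and the sample bounds are all assumptions made for illustration, and constraints are restricted to the form {X > bound} over one variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowerBound:
    """Constraint {X > bound} over a single numeric variable (illustrative only)."""
    bound: float


def entails(c0, c1):
    """c0 entails c1 when every solution of c0 is also a solution of c1."""
    return c0.bound >= c1.bound


def add_answer(table, new):
    """Return the answer table after offering `new`, keeping only the most general answers."""
    if any(entails(new, old) for old in table):
        return table  # the new answer is more particular than a stored one: discard it
    # drop stored answers that the new, more general answer subsumes
    return [old for old in table if not entails(old, new)] + [new]


table = []
for bound in (5, 7, 0, 3):
    table = add_answer(table, LowerBound(bound))
```

With these inputs, {X > 7} and {X > 3} are discarded on arrival and {X > 5} is removed once {X > 0} is found, so a single most general answer remains in the table.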

SLIDE 11

Modular TCLP III: Answer Management Strategy

[Figure: bar chart of the number of answers (log scale, 10^1 to 10^6) under the Discard, Remove, and Discard+Remove strategies.]

Figure: Number of answers saved, discarded, removed, and returned for the query step_bound(WindA, WindB, Steps, Limit).

SLIDE 12

Modular TCLP IV: Declarativeness

[Figure: graph over the nodes z, a, b, c, d.]

near(WindA, WindB) :-
    limit(K), D #< K,
    dist(WindA, WindB, D).

dist(X, Y, D) :-
    D1 #> 0, D2 #> 0,
    D #= D1 + D2,
    dist(X, Z, D1),
    edge(Z, Y, D2).
dist(X, Y, D) :-
    edge(X, Y, D).

edge(a, b, D) :-
    D #> 7, D #< 11.
edge(...

Figure: near/2 and dist/3 find the windmills within a distance limit K.
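
As a rough Python analogue of why the bound D #< K matters, the sketch below explores a cyclic graph while memoizing the distances already found. It is only an illustration under stated assumptions: the graph EDGES and its concrete integer weights are invented (the slide uses interval constraints on D), and dist_below and near are hypothetical helper names, not part of TCLP.

```python
# Hypothetical weighted graph containing the cycle a -> b -> c -> a.
EDGES = [("a", "b", 8), ("b", "c", 3), ("c", "a", 2), ("c", "d", 4)]


def dist_below(source, limit):
    """Map each reachable node to the set of path lengths D < limit from source."""
    table = {}                      # memoized answers, playing the role of the table
    frontier = [(source, 0)]
    while frontier:
        node, d = frontier.pop()
        for x, y, w in EDGES:
            if x != node:
                continue
            total = d + w
            # The bound prunes the search: with positive weights, cycles cannot loop forever.
            if total < limit and total not in table.get(y, set()):
                table.setdefault(y, set()).add(total)
                frontier.append((y, total))
    return table


def near(source, limit):
    """Nodes reachable from source with some path length below limit."""
    return set(dist_below(source, limit))
```

For example, near("a", 12) explores a -> b (8) and b -> c (11) and then stops, because every further step reaches or exceeds the limit.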

SLIDE 13

Modular TCLP V: Termination properties & Performance

[Figure: graph over the nodes z, a, b, c, d.]

                                    Prolog     CLP(Q)   Tabling   TCLP(Q)
Without cycles   Left recursion     x          x        2311      1286
                 Right recursion    > 5 min.   5136     3672      2237
With cycles      Left recursion     x          x        x         742
                 Right recursion    x          10992    x         1776

Table: Run time (ms) for dist/3. An 'x' means no termination.

SLIDE 14

However...

... Prolog and TCLP do not handle recursion through negation, where a given problem may have multiple possible models.

open_blades(Windmill, T) :- blades_are_closed(Windmill, T-1),
                            not risk(Windmill, T).
risk(Windmill, T) :- near(Windmill, Near),
                     open_blades(Near, T).

{ open_blades(quijote,T), not open_blades(sancho,T) }
{ not open_blades(quijote,T), open_blades(sancho,T) }
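
The two models can be reproduced with a brute-force stable-model check in Python. This is a sketch under simplifying assumptions, not how s(CASP) works: the program is grounded by hand for two windmills that are near each other, the time argument and blades_are_closed are dropped, atom names are shortened to open(...) and risk(...), and every candidate set is tested with the Gelfond-Lifschitz reduct.

```python
from itertools import combinations

# Ground rules as (head, positive body, negative body); a hand-made simplification.
RULES = [
    ("open(quijote)", [], ["risk(quijote)"]),
    ("risk(quijote)", ["open(sancho)"], []),
    ("open(sancho)", [], ["risk(sancho)"]),
    ("risk(sancho)", ["open(quijote)"], []),
]
ATOMS = sorted({head for head, _, _ in RULES})


def least_model(definite_rules):
    """Least model of a negation-free program, computed by forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(p in model for p in pos):
                model.add(head)
                changed = True
    return model


def is_stable(candidate):
    """A candidate is stable if it equals the least model of its reduct."""
    # Reduct: drop rules whose negated atoms hold in the candidate, then drop the negations.
    reduct = [(h, pos) for h, pos, neg in RULES
              if not any(n in candidate for n in neg)]
    return least_model(reduct) == candidate


def stable_models():
    """Enumerate all subsets of ATOMS and keep the stable ones."""
    return [set(c) for k in range(len(ATOMS) + 1)
            for c in combinations(ATOMS, k) if is_stable(set(c))]
```

Calling stable_models() yields exactly two sets: one where quijote opens its blades (and sancho is at risk) and one where sancho does, matching the two models on the slide.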

SLIDE 15

s(CASP) [ICLP/TPLP 2018]

There are two approaches to implement stable model semantics:

  • Answer Set Programming: however, most ASP systems perform a grounding phase which removes variables and their links / constraints.
  • s(ASP): a top-down execution model which retains variables during execution and in the answer sets (developed at the University of Texas at Dallas).

Our proposal, s(CASP), is a non-monotonic reasoner which extends s(ASP) with constraints and tabling, implemented using Ciao + Modular TCLP.

+ Constraints can be part of the answer set.
+ It returns a justification tree with the rules that support the model.
+ The implementation design is parametric w.r.t. the constraint solver.

SLIDE 16

s(CASP) I: Performance

          s(CASP)   ASP standard   ASP incremental
n = 7     479       3,651          9,885
n = 8     1,499     54,104         174,224
n = 9     5,178     191,267        > 5 min

Table: Run time (ms) needed to return the plan of minimal movements that solves the Towers of Hanoi problem with n disks.

SLIDE 17

s(CASP) I: Performance


  • s(CASP) implements the algorithm that returns the minimal plan.
  • The ASP standard variant returns a plan for a given number of moves (if it exists).
  • The ASP incremental variant recomputes the problem, incrementing the allowed number of moves, until it finds a valid plan.
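
For reference, the minimal plan for n disks has 2^n - 1 moves. The Python sketch below builds it with the classic recursive construction and checks it by simulation. It is not the s(CASP) or ASP encoding used in the benchmark; the peg names and the functions hanoi and is_valid_plan are assumptions for illustration.

```python
def hanoi(n, source="a", target="c", spare="b"):
    """Return the minimal list of (from_peg, to_peg) moves for n disks."""
    if n == 0:
        return []
    return (hanoi(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi(n - 1, spare, target, source))


def is_valid_plan(n, plan):
    """Simulate the plan: no larger disk on a smaller one, all disks end on peg c."""
    pegs = {"a": list(range(n, 0, -1)), "b": [], "c": []}
    for src, dst in plan:
        if not pegs[src]:
            return False            # nothing to move from this peg
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False            # larger disk placed on a smaller one
        pegs[dst].append(disk)
    return pegs["c"] == list(range(n, 0, -1))
```

An incremental search in the style of the third bullet would instead try move bounds 1, 2, 3, ... and stop at the first bound for which a valid plan exists, which is why it repeats work.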

SLIDE 18

Why are these results important? [DC-ICLP 2016]

1 The use of logic languages [2, 5] drastically reduces the overall complexity. E.g., Etalis or Yedalog.

2 Constraints can be used to describe and operate with complex data representations [16]. E.g., using intervals or an ontology.

3 dist/3 is a typical query for the analysis of graph databases and social networks [14]. E.g., Protocol Buffers by Google.

4 Default negation is needed to model the common sense reasoning [7] used by humans for everyday reasoning. E.g., general rules with exceptions.

SLIDE 19

Future work

  • Complex knowledge: ontology hierarchy + ontology constraint solver.

E.g., in preventive maintenance plans, it retrieves more concise answers.

  • Temporal reasoning [1]: Stream-RDF + temporal constraint solver.

E.g., in cause and effect analysis, it allows forward and backward reasoning.

  • TCLP with answer on demand similar to [6, 12, 4].

E.g., in emergency detection, it returns the first answer ASAP.

  • Incremental TCLP as in [10, 13] but dealing with constraints.

E.g., in stream processing, it does not recompute everything from scratch.

  • Define a new semantics for aggregates [9, 11, 15] on recursive TCLP programs.

E.g., it computes the minimal distance discarding irrelevant results.

:- aggregate dist(_,_,min).
dist(X, Y, D) :- dist(X, Z, D1), edge(Z, Y, D2), D is D1 + D2.
dist(X, Y, D) :- edge(X, Y, D).
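
The intended effect of the min aggregate can be sketched in Python as a fixpoint that stores one distance per pair and discards any longer one. This only approximates the proposed semantics: the graph EDGES with its weights and the function min_dist are hypothetical, and positive weights are assumed so that the loop terminates on cyclic graphs.

```python
# Hypothetical weighted graph with a cycle a -> c -> a.
EDGES = {("a", "b"): 8, ("b", "c"): 3, ("a", "c"): 15, ("c", "a"): 2}


def min_dist():
    """Minimal distance for every connected pair, via a left-recursive fixpoint."""
    table = dict(EDGES)             # base case: dist(X, Y, D) :- edge(X, Y, D).
    changed = True
    while changed:
        changed = False
        # recursive case: dist(X, Y, D) :- dist(X, Z, D1), edge(Z, Y, D2), D is D1 + D2.
        for (x, z), d1 in list(table.items()):
            for (z2, y), d2 in EDGES.items():
                if z2 != z:
                    continue
                d = d1 + d2
                # min aggregate: a longer distance for a known pair is discarded
                if d < table.get((x, y), float("inf")):
                    table[(x, y)] = d
                    changed = True
    return table
```

Here the direct edge a -> c of length 15 is replaced by the shorter route through b, and the cycle does not cause non-termination because only strictly smaller distances update the table.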

SLIDE 20

Thanks

SLIDE 21

Bibliography I

[1] James F. Allen. Maintaining Knowledge about Temporal Intervals. Communications of the ACM, 26(11):832–843, 1983.

[2] Darko Anicic, Paul Fodor, Sebastian Rudolph, Roland Stühmer, Nenad Stojanovic, and Rudi Studer. A Rule-Based Language for Complex Event Processing and Reasoning. In International Conference on Web Reasoning and Rule Systems, pages 42–57. Springer, 2010.

[3] J. Arias and M. Carro. Description and Evaluation of a Generic Design to Integrate CLP and Tabled Execution. In 18th Int'l. ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP'16), pages 10–23. ACM Press, September 2016.

[4] P. Chico de Guzmán, M. Carro, and David S. Warren. Swapping Evaluation: A Memory-Scalable Solution for Answer-On-Demand Tabling. Theory and Practice of Logic Programming, 26th Int'l. Conference on Logic Programming (ICLP'10) Special Issue, 10(4–6):401–416, July 2010.

SLIDE 22

Bibliography II

[5] Brian Chin, Daniel von Dincklage, Vuk Ercegovac, Peter Hawkins, Mark S. Miller, Franz Och, Christopher Olston, and Fernando Pereira. Yedalog: Exploring Knowledge at Scale. In LIPIcs-Leibniz International Proceedings in Informatics, volume 32. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.

[6] Juliana Freire, Terrance Swift, and David Scott Warren. Beyond Depth-First Strategies: Improving Tabled Logic Programs through Alternative Scheduling. Journal of Functional and Logic Programming, 1998(3), 1998.

[7] Gopal Gupta, Elmer Salazar, Kyle Marple, Zhuo Chen, and Farhad Shakerin. A case for query-driven predicate answer set programming. In Giles Reger and Dmitriy Traytel, editors, ARCADE 2017. 1st International Workshop on Automated Reasoning: Challenges, Applications, Directions, Exemplary Achievements, volume 51 of EPiC Series in Computing, pages 64–68. EasyChair, 2017.

[8] C. Holzbaur. OFAI CLP(Q,R) Manual, Edition 1.3.3. Technical Report TR-95-09, Austrian Research Institute for Artificial Intelligence, Vienna, 1995.

SLIDE 23

Bibliography III

[9] David B. Kemp and Peter J. Stuckey. Semantics of Logic Programs with Aggregates. In ISLP, volume 91, pages 387–401. Citeseer, 1991.

[10] Danh Le-Phuoc. Operator-Aware Approach for Boosting Performance in RDF Stream Processing. Web Semantics: Science, Services and Agents on the World Wide Web, 2016.

[11] Nikolay Pelov, Marc Denecker, and Maurice Bruynooghe. Well-Founded and Stable Semantics of Logic Programs with Aggregates. TPLP, 7(3):301–353, 2007.

[12] Konstantinos F. Sagonas and Peter J. Stuckey. Just Enough Tabling. In Principles and Practice of Declarative Programming, pages 78–89. ACM, August 2004.

[13] Terrance Swift. Incremental Tabling in Support of Knowledge Representation and Reasoning. Theory and Practice of Logic Programming, 14(4-5):553–567, 2014.

[14] Terrance Swift and David Scott Warren. Tabling with answer subsumption: Implementation, applications and performance. In Tomi Janhunen and Ilkka Niemelä, editors, JELIA, volume 6341 of Lecture Notes in Computer Science, pages 300–312. Springer, 2010.

SLIDE 24

Bibliography IV

[15] Alexander Vandenbroucke, Maciej Pirog, Benoit Desouter, and Tom Schrijvers. Tabling with Sound Answer Subsumption. Theory and Practice of Logic Programming, 32nd Int'l. Conference on Logic Programming (ICLP'16), 16, October 2016.

[16] Youyong Zou, Tim Finin, and Harry Chen. F-OWL: An Inference Engine for Semantic Web. In Formal Approaches to Agent-Based Systems, volume 3228 of Lecture Notes in Computer Science, pages 238–248. Springer Verlag, January 2005.