
Simplified data-flow analysis with Datalog, by François Gauthier



  1. Simplified data-flow analysis with Datalog. by François Gauthier

  2. It all began with a qualifying exam… Among the heap of papers I had to read, there was this one: "Cloning-based context-sensitive pointer alias analysis using binary decision diagrams" by J. Whaley and M. S. Lam, in the Programming Language Design and Implementation (PLDI) conference.

  3. … and a points-to analysis. The authors claimed that the following 4 lines compute a basic points-to analysis:
      vP(v, h) :- vP0(v, h).
      vP(v1, h) :- assign(v1, v2), vP(v2, h).
      hP(h1, f, h2) :- store(v1, f, v2), vP(v1, h1), vP(v2, h2).
      vP(v2, h2) :- load(v1, f, v2), vP(v1, h1), hP(h1, f, h2).

  4. Really?

  5. Yes!

  6. Datalog – Basics
      • Datalog is a logic programming language that is a subset of Prolog.
      • Datalog operates on facts and rules.
      • A fact is declared like this: parent("Bill", "Mary").
      • It can be read either as "Bill is the parent of Mary" or as "Mary is the parent of Bill".
      • The implementer chooses the meaning.

  7. Datalog – Basics (cont.)
      • A Datalog program consists of a set of rules that define new facts.
      • A rule consists of two parts, a head and a body:
          ancestor(?X,?Y) :- parent(?X,?Y).
          ancestor(?X,?Y) :- parent(?X,?Z), ancestor(?Z,?Y).
      • The :- symbol separates the head (left) from the body (right).
      • Commas in the body stand for AND.
      • ? indicates a variable.

  8. Datalog – Understanding rules
      • The rule
          ancestor(?X,?Y) :- parent(?X,?Y).
        reads as: Y is an ancestor of X if Y is a parent of X.
      • Similarly, the rule
          ancestor(?X,?Y) :- parent(?X,?Z), ancestor(?Z,?Y).
        reads as: Y is an ancestor of X if Z is a parent of X and Y is an ancestor of Z.

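  9. Ancestors – Initial facts
      parent("C", "D"). parent("Y", "D").
      parent("C", "Z"). parent("Y", "Z").
      parent("A", "B"). parent("A", "C").
      parent("W", "Y"). parent("W", "X").
      [Figure: family tree. D and Z are the parents of C and Y; B and C are the parents of A; X and Y are the parents of W.]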

  10. Ancestors – Rules and queries
      • Recall the rules of our ancestors program:
          ancestor(?X,?Y) :- parent(?X,?Y).
          ancestor(?X,?Y) :- parent(?X,?Z), ancestor(?Z,?Y).
      • These rules are evaluated iteratively until no new facts are derived (a fixpoint).
      • A query in Datalog is expressed like this:
          ?- ancestor("W", ?Ancestor).
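The iterate-until-fixpoint evaluation described on this slide can be sketched in a few lines of Python. This is a minimal illustration of naive bottom-up evaluation, not how any particular Datalog engine is implemented; the function name `ancestors` and the use of Python sets of tuples to hold relations are assumptions made for the example.

```python
# Facts from the "Ancestors - Initial facts" slide, read as parent(child, parent).
parent = {
    ("C", "D"), ("Y", "D"), ("C", "Z"), ("Y", "Z"),
    ("A", "B"), ("A", "C"), ("W", "Y"), ("W", "X"),
}


def ancestors(parent):
    """Naive bottom-up evaluation of the two ancestor rules (hypothetical helper)."""
    # ancestor(?X,?Y) :- parent(?X,?Y).
    ancestor = set(parent)
    while True:
        # ancestor(?X,?Y) :- parent(?X,?Z), ancestor(?Z,?Y).
        derived = {
            (x, y)
            for (x, z) in parent
            for (z2, y) in ancestor
            if z == z2
        }
        if derived <= ancestor:  # no new facts: the fixpoint is reached
            return ancestor
        ancestor |= derived


ancestor = ancestors(parent)

# Query: ?- ancestor("W", ?Ancestor).
w_ancestors = sorted(y for (x, y) in ancestor if x == "W")
print(w_ancestors)  # ['D', 'X', 'Y', 'Z']
```

The loop stops after the first round that adds nothing, which is guaranteed to happen because the set of possible (x, y) pairs is finite.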

  11. What about data-flow analysis?
      Java code:
          public String name(String type) {
          1:  String a = "Anonymous";
          2:  if (type.equals("cat"))
          3:      a = "Garfield";
          4:  else if (type.equals("dog"))
          5:      a = "Snoopy";
          6:  else
          7:      a = "Blob";
          8:  return a;
          }
      Control-flow graph (figure): 1 → 2; 2 → 3 and 4; 4 → 5 and 7; 3, 5 and 7 → 8.

  12. Reaching definitions – Initial facts
      Java code                                  Initial facts
      public String name(String type) {
      1:  String a = "Anonymous";                assign(1, "a").
      2:  if (type.equals("cat"))
      3:      a = "Garfield";                    assign(3, "a").
      4:  else if (type.equals("dog"))
      5:      a = "Snoopy";                      assign(5, "a").
      6:  else
      7:      a = "Blob";                        assign(7, "a").
      8:  return a;
      }

  13. Reaching definitions – Initial facts
      Initial facts, one per edge of the control-flow graph:
          follows(1,2).
          follows(2,3). follows(2,4).
          follows(4,5). follows(4,7).
          follows(3,8). follows(5,8). follows(7,8).

  14. Reaching definitions – Rules
      reach(?i,?x,?j) :- assign(?i,?x), follows(?i,?j).
      reach(?d,?x,?j) :- reach(?d,?x,?i), follows(?i,?j), !assign(?j,?x).
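To see what these two rules compute on the `name` method, here is a minimal Python sketch that evaluates them over the `assign` and `follows` facts from the preceding slides. It transcribes the rules literally, including the negated test on the target line `?j`. The function name `reaching_definitions` and the set-of-tuples encoding are assumptions made for the example. Because the rules never add to `assign`, the negation can be checked directly against the input facts.

```python
# Facts from the two "Reaching definitions - Initial facts" slides.
assign = {(1, "a"), (3, "a"), (5, "a"), (7, "a")}
follows = {(1, 2), (2, 3), (2, 4), (4, 5), (4, 7), (3, 8), (5, 8), (7, 8)}


def reaching_definitions(assign, follows):
    """Evaluate the two reach rules to a fixpoint (hypothetical helper)."""
    # reach(?i,?x,?j) :- assign(?i,?x), follows(?i,?j).
    reach = {(i, x, j) for (i, x) in assign for (i2, j) in follows if i == i2}
    while True:
        # reach(?d,?x,?j) :- reach(?d,?x,?i), follows(?i,?j), !assign(?j,?x).
        derived = {
            (d, x, j)
            for (d, x, i) in reach
            for (i2, j) in follows
            if i == i2 and (j, x) not in assign
        }
        if derived <= reach:  # no new facts: fixpoint
            return reach
        reach |= derived


reach = reaching_definitions(assign, follows)

# Which definitions of "a" reach the return statement on line 8?
at_return = sorted(d for (d, x, j) in reach if x == "a" and j == 8)
print(at_return)  # [3, 5, 7]
```

The initial definition on line 1 does not reach line 8, since every path to the return statement passes through one of the reassignments on lines 3, 5 or 7.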

  15. Back to points-to…
      Java code:
          o1: Dog snoopy = new Dog();
          o2: Dog odie = new Dog();
          o3: Food f1 = new Food();
              snoopy.food = f1;
              Food f2 = snoopy.food;
              odie.food = f2;
              Dog myDog = odie;
      Representation (figure): variables snoopy, odie, myDog, f1 and f2; heap objects o1, o2 and o3; two "food" field edges.

  16. Initial facts – vPointsTo0
      Java code                          Facts
      o1: Dog snoopy = new Dog();        vPointsTo0("snoopy","o1").
      o2: Dog odie = new Dog();          vPointsTo0("odie","o2").
      o3: Food f1 = new Food();          vPointsTo0("f1","o3").
          snoopy.food = f1;
          Food f2 = snoopy.food;
          odie.food = f2;
          Dog myDog = odie;

  17. Initial facts – store
      Java code                          Facts
      o1: Dog snoopy = new Dog();
      o2: Dog odie = new Dog();
      o3: Food f1 = new Food();
          snoopy.food = f1;              store("snoopy","food","f1").
          Food f2 = snoopy.food;
          odie.food = f2;                store("odie","food","f2").
          Dog myDog = odie;

  18. Initial facts – load
      Java code                          Facts
      o1: Dog snoopy = new Dog();
      o2: Dog odie = new Dog();
      o3: Food f1 = new Food();
          snoopy.food = f1;
          Food f2 = snoopy.food;         load("snoopy","food","f2").
          odie.food = f2;
          Dog myDog = odie;

  19. Initial facts – assign
      Java code                          Facts
      o1: Dog snoopy = new Dog();
      o2: Dog odie = new Dog();
      o3: Food f1 = new Food();
          snoopy.food = f1;
          Food f2 = snoopy.food;
          odie.food = f2;
          Dog myDog = odie;              assign("myDog","odie").

  20. Initial facts – putting it all together
      Java code                          Facts
      o1: Dog snoopy = new Dog();        vPointsTo0("snoopy","o1").
      o2: Dog odie = new Dog();          vPointsTo0("odie","o2").
      o3: Food f1 = new Food();          vPointsTo0("f1","o3").
          snoopy.food = f1;              store("snoopy","food","f1").
          Food f2 = snoopy.food;         load("snoopy","food","f2").
          odie.food = f2;                store("odie","food","f2").
          Dog myDog = odie;              assign("myDog","odie").

  21. Points-to – Rules
      We are interested in finding:
        1. Which heap objects a variable can point to.
        2. Which heap objects a field can point to.
      Outputs will be stored in two relations:
        1. vPointsTo(?v, ?o) – variable v points to object o.
        2. hPointsTo(?o1, ?f, ?o2) – the field f of object o1 points to object o2.

  22. Points-to – Rules (cont.)
      Initialization:
          vPointsTo(?v, ?o) :- vPointsTo0(?v, ?o).
      Assignments (v1 = v2):
          vPointsTo(?v1, ?o) :- assign(?v1, ?v2), vPointsTo(?v2, ?o).

  23. Points-to – Rules (cont.)
      Stores (v1.f = v2):
          hPointsTo(?o1, ?f, ?o2) :- store(?v1, ?f, ?v2), vPointsTo(?v1, ?o1), vPointsTo(?v2, ?o2).

  24. Points-to – Rules (cont.)
      Loads (v2 = v1.f):
          vPointsTo(?v2, ?o2) :- load(?v1, ?f, ?v2), vPointsTo(?v1, ?o1), hPointsTo(?o1, ?f, ?o2).

  25. Points-to – Putting all rules together
      vPointsTo(?v, ?o) :- vPointsTo0(?v, ?o).
      vPointsTo(?v1, ?o) :- assign(?v1, ?v2), vPointsTo(?v2, ?o).
      hPointsTo(?o1, ?f, ?o2) :- store(?v1, ?f, ?v2), vPointsTo(?v1, ?o1), vPointsTo(?v2, ?o2).
      vPointsTo(?v2, ?o2) :- load(?v1, ?f, ?v2), vPointsTo(?v1, ?o1), hPointsTo(?o1, ?f, ?o2).
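As a check on the four rules, the following Python sketch evaluates them over the facts collected for the Dog/Food example. It is an illustrative naive fixpoint loop rather than the BDD-based evaluation used in the Whaley and Lam paper; the function name `points_to` and the short variable names `vp0`, `vp` and `hp` (for vPointsTo0, vPointsTo and hPointsTo) are assumptions made for the example.

```python
# Facts from the "Initial facts - putting it all together" slide.
vp0 = {("snoopy", "o1"), ("odie", "o2"), ("f1", "o3")}
assign = {("myDog", "odie")}
store = {("snoopy", "food", "f1"), ("odie", "food", "f2")}
load = {("snoopy", "food", "f2")}


def points_to(vp0, assign, store, load):
    """Evaluate the four points-to rules to a fixpoint (hypothetical helper)."""
    # vPointsTo(?v, ?o) :- vPointsTo0(?v, ?o).
    vp = set(vp0)
    hp = set()
    while True:
        new_vp = set(vp)
        new_hp = set(hp)
        # vPointsTo(?v1, ?o) :- assign(?v1, ?v2), vPointsTo(?v2, ?o).
        new_vp |= {(v1, o) for (v1, v2) in assign for (v, o) in vp if v == v2}
        # hPointsTo(?o1, ?f, ?o2) :- store(?v1, ?f, ?v2),
        #                            vPointsTo(?v1, ?o1), vPointsTo(?v2, ?o2).
        new_hp |= {
            (o1, f, o2)
            for (v1, f, v2) in store
            for (a, o1) in vp if a == v1
            for (b, o2) in vp if b == v2
        }
        # vPointsTo(?v2, ?o2) :- load(?v1, ?f, ?v2),
        #                        vPointsTo(?v1, ?o1), hPointsTo(?o1, ?f, ?o2).
        new_vp |= {
            (v2, o2)
            for (v1, f, v2) in load
            for (a, o1) in vp if a == v1
            for (h, g, o2) in hp if h == o1 and g == f
        }
        if new_vp == vp and new_hp == hp:  # nothing changed: fixpoint
            return vp, hp
        vp, hp = new_vp, new_hp


vp, hp = points_to(vp0, assign, store, load)
print(sorted(vp))
print(sorted(hp))
```

On these facts the loop derives that myDog points to o2, that f2 points to o3 (through the load of snoopy.food), and that the food field of both o1 and o2 points to o3. The store through odie only fires once f2 has a points-to fact, which is why several rounds are needed.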

  26. Application to security
          function read($file, $privilege) {           // (3) $privilege
            if ($privilege)                             // (4)
              $handle = fopen($file, "r");              // (5) Protected by the 'read' privilege.
            else
              error('You cannot read that file');
          }
          ...
          $file = 'prescriptions.txt';
          ...
          $canRd = user_can('read');                    // (1)
          ...
          read($file, $canRd);                          // (2) $canRd

  27. Results on Moodle 1.9.5
      • Syntactic analysis: 992 security checks detected.
      • Intra-procedural, flow-insensitive: 1062 security checks detected.
      • Intra-procedural, flow-sensitive: 1063 security checks detected (removed an ambiguity).
      • Inter-procedural, flow-insensitive: 1072 security checks detected.

  28. Conclusion
      You can find the Datalog programs I developed (both intra- and inter-procedural) in: "Alias-aware propagation of simple pattern-based properties in PHP applications", SCAM 2012.
      That's all folks!
