Codept, a whole-project dependency analyzer for OCaml
Florian “octachron” Angeletti
INRIA
ICFP 2019, OCaml workshop, 23 August 2019
Discovering dependencies
$ ls
a.ml b.ml c.ml d.ml e.ml f.ml atlas.ml
◮ Discovering project structure
◮ Building project
[Figure: dependency graph over the modules Atlas, A, B, C, D, E, F]
(* Atlas.ml *)
module X = A
module Y = B
module Z = C
module W = D
module R = E
module S = F
$ ocamldep -map atlas.ml a.ml b.ml c.ml d.ml e.ml f.ml
[Figure: dependency graph with the nodes B and A]
(* a.ml *)
(* b.ml *)
module A = struct end
[Figure: dependency graph with the single node A]
(* a.ml *)
module type S = sig module A: sig end end
let f () =
  let module M = struct module A = struct end end in
  (module M : S)
let () =
  let module M = (val f ()) in
  let open M in
  let open A in
  ()
[Figure: dependency graph with the single node A]
(* a.ml *)
module type S = sig
  module type T
  module M: T
end
module F(X:S) = struct module M = X.M end
module X = struct
  module type T = sig module A: sig end end
  module M = struct module A = struct end end
end
Compilation units
A concrete file or pair of files mapped to a module
◮ All compilation units are mapped to a module
◮ Not all modules come from compilation units
...
module A = struct ... end
(* start of the file *)
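As an illustration of the file-to-module mapping, here is a minimal sketch of the default OCaml naming convention (drop the directory and the extension, capitalize the first letter). The function name `module_name_of_file` and the sample paths are hypothetical; this is not codept's implementation.

```ocaml
(* Default convention: "src/a.ml" and "src/a.mli" both map to module A. *)
let module_name_of_file path =
  path
  |> Filename.basename
  |> Filename.remove_extension
  |> String.capitalize_ascii

(* Hypothetical sample paths; an .ml/.mli pair yields the same module name. *)
let units =
  List.map module_name_of_file [ "atlas.ml"; "src/a.ml"; "src/a.mli" ]
```

A submodule such as `module A = struct ... end` written inside a file never goes through this mapping, which is why a name alone does not say whether `A` is a compilation unit.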
Dependency tracking in OCaml
Recognizing compilation units from submodules.
◮ All actual direct dependencies are recorded
◮ So many false positives
◮ Aliases are not tracked
Local analysis
module Sub = struct ... end
include Sub
◮ Every module that is not a submodule of the current compilation unit is assumed to be a compilation unit
◮ Nearly an over-approximation
◮ False positives: post-processing phase
◮ Aliases: manual map tracking with the -map option
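The local heuristic can be sketched on a toy AST: any module name that is used but not bound earlier in the same file is assumed to be a compilation unit. The `item` type and the functions `free_names` and `local_deps` are hypothetical and much simpler than what ocamldep actually does.

```ocaml
(* Hypothetical miniature AST, not ocamldep's or codept's actual types. *)
module Names = Set.Make (String)

type item =
  | Define of string * item list  (* module Name = struct ... end *)
  | Use of string                 (* open Name, include Name, Name.x, ... *)

(* Walk the items in order; [bound] holds the submodules defined so far.
   A used name that is not bound is reported as a free name. *)
let rec free_names bound items =
  let step (bound, free) = function
    | Use name ->
      if Names.mem name bound then (bound, free)
      else (bound, Names.add name free)
    | Define (name, body) ->
      let free = Names.union free (free_names bound body) in
      (Names.add name bound, free)
  in
  snd (List.fold_left step (bound, Names.empty) items)

(* Free names are assumed to be compilation units. *)
let local_deps items = Names.elements (free_names Names.empty items)

(* module Sub = struct open List end  include Sub  open B *)
let example =
  local_deps [ Define ("Sub", [ Use "List" ]); Use "Sub"; Use "B" ]
```

In this sketch `Sub` is correctly ignored, but a name like `A` reached only through `open B` would still be reported, which is the kind of false positive the post-processing phase has to remove.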
How to get precise dependencies for a file A?
What do you need for precise dependencies?
◮ A signature for the universe
◮ How to deal with first-class modules?
◮ A signature for all compilation units
◮ A signature for all dependencies
Codept core idea
◮ A dependency and signature analyzer
◮ ...that can stop on a missing signature and resume later
◮ No warning, exact dependencies
◮ At worst, an over-approximation of dependencies
◮ All analysis results are serializable in machine-readable formats
Secondary goal
◮ Full compatibility with ocamldep
◮ AST simplification
◮ Interruptible interpreter
◮ Dependency orchestration
◮ expressions
◮ patterns
◮ types
◮ classes
◮ modules
◮ module types
type expression =
  | Open of module_expr
  | Include of module_expr
  | SigInclude of module_type
  | Bind of module_expr bind
  | Bind_sig of module_type bind
  | Bind_rec of module_expr bind list
  | Minor of annotation
  | Extension_node of extension
and ...
Full OCaml Parsetree: 960 LOC
Simplified (M2l) AST: 80 LOC
module M = struct
  module X = struct
  end
end


module C = struct end

[module M = [module X = [] (l2.2-l3.5)] (l1.0-l4.3)
module C = [](l7.0-21) ]
◮ How to represent a partial evaluation result: partial result, blocking name, partial AST
◮ Blocking name: a still unknown module name
Zipper
◮ Add holes to the AST data type
◮ Holes are to be filled by the environment
[module M = [module X = [] (l2.2-l3.5)] (l1.0-l4.3)
module C = [](l7.0-21) ]
Computation halted at:
... open B? [module C = [] (l7.0-21)]
...
and module_expr =
  | Ident of Paths.Simple.t
  | Apply of { f: module_expr ; x:module_expr }
...
type 'hole me =
  | Ident: path_in_context me
  | Apply_left: M2l.module_expr
  | Apply_right: module_expr
  | ...
◮ Try to fill all holes
◮ Fail if there is a hole that the environment doesn’t know how to fill
◮ Return the signature and dependencies otherwise
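The stop-and-resume behaviour described above can be sketched with a toy evaluator that halts on the first module whose signature is unknown and keeps the remaining work so that it can be resumed later. Everything here is a simplifying assumption: the `item`, `state` and `outcome` types, the `eval` and `start` functions, and the reduction of a signature to a list of submodule names are illustrative and are not codept's zipper or API. The contents of `a_ml` are also a hypothetical example.

```ocaml
(* Hypothetical miniature of an interruptible evaluation. *)
module Env = Map.Make (String)

type item =
  | Open of string  (* open M: requires the signature of M *)
  | Bind of string  (* module M = struct end *)

(* Saved evaluation state: names in scope, dependencies found so far,
   and the part of the file still to evaluate. *)
type state = { local : string list; deps : string list; rest : item list }

type outcome =
  | Done of string list      (* dependencies, in order of discovery *)
  | Halted of string * state (* missing module name, resumable state *)

(* [units] maps each analysed compilation unit to its submodule names. *)
let rec eval units st =
  match st.rest with
  | [] -> Done (List.rev st.deps)
  | Bind name :: rest ->
    eval units { st with local = name :: st.local; rest }
  | Open name :: rest when List.mem name st.local ->
    eval units { st with rest }
  | Open name :: rest -> (
    match Env.find_opt name units with
    | Some signature ->
      eval units
        { local = signature @ st.local; deps = name :: st.deps; rest }
    | None -> Halted (name, st))

let start items = { local = []; deps = []; rest = items }

(* Hypothetical a.ml: open B, then open A, where b.ml defines submodule A. *)
let a_ml = [ Open "B"; Open "A" ]

(* First attempt: nothing is known about B yet, so evaluation halts. *)
let first = eval Env.empty (start a_ml)

(* Once the signature of B is available, resume from the saved state. *)
let second =
  match first with
  | Halted (_, st) -> eval (Env.add "B" [ "A" ] Env.empty) st
  | Done _ as finished -> finished
```

With the signature of `B` available, `A` resolves to the submodule brought into scope by `open B`, so only `B` is reported as a dependency.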
◮ Different strategies to compute whole-project signature and dependencies
◮ What to do with cycles?
  ◮ Report them?
  ◮ Try to remove them and go on with the rest of the computation?
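One possible orchestration strategy, sketched under simplifying assumptions: repeatedly analyse every unit whose requirements are already resolved, and when no further progress is possible, report the remaining units as blocked by a cycle while keeping the results for the rest. The `unit_info` type, the `solve` function and the sample project are hypothetical, and this is only one of the strategies the slide alludes to.

```ocaml
(* Hypothetical orchestration loop, not codept's data structures. *)
type unit_info = { name : string; needs : string list }

(* Returns the units in a valid analysis order, together with the units
   that could not be resolved because they are part of, or depend on,
   a cycle. *)
let solve units =
  let rec loop resolved pending =
    let ready, blocked =
      List.partition
        (fun u -> List.for_all (fun n -> List.mem n resolved) u.needs)
        pending
    in
    match ready with
    | [] -> (List.rev resolved, List.map (fun u -> u.name) blocked)
    | _ ->
      let resolved =
        List.fold_left (fun acc u -> u.name :: acc) resolved ready
      in
      loop resolved blocked
  in
  loop [] units

(* Hypothetical project: G and H depend on each other. *)
let order, cyclic =
  solve
    [ { name = "A"; needs = [ "B"; "Atlas" ] };
      { name = "Atlas"; needs = [] };
      { name = "B"; needs = [ "Atlas" ] };
      { name = "G"; needs = [ "H" ] };
      { name = "H"; needs = [ "G" ] } ]
```

Here the cycle between `G` and `H` is reported without preventing `Atlas`, `B` and `A` from being ordered.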
[Figure: dependency graph over the modules A, B, C, D, E, F, G, H]
How well does codept fare against its specification?
[Figure: dependency graph over the modules A, B, C, Atlas, D, E, F]
(* Atlas.ml *)
module X = A
module Y = B
module Z = C
module W = D
module R = E
module S = F
[Figure: dependency graph with the nodes A and B]
(* a.ml *)
(* b.ml *)
module A = struct end
[Figure: dependency graph with the single node A]
(* a.ml *)
...
let () =
  let module M = (val f ()) in
  let open M in
  let open A in
  ()
[Warning]: a.ml:l7.6-12, first-class module M was opened while its signature was unknown.
Local solution:
(* a.ml *)
...
let module M: S = (val f ()) in
...
Work-in-progress.
Slower than ocamldep
Not the right question:
◮ Codept executable, fully compatible with ocamldep
◮ Core library, to be published with version 1.0
◮ Too many options, a lighter executable is planned
◮ JSON and sexp formats available
  ◮ for signatures and dependencies
  ◮ for the M2l AST
{ "version" : [0, 10, 3],
  "dependencies" :
    [{ "file" : "a.ml", "deps" : [["C"], ["B"], ["Atlas"]] },
     { "file" : "atlas.ml" },
     { "file" : "b.ml", "deps" : [["C"
     { "file" : "c.ml", "deps" : [["Atlas"]] },
     { "file" : "d.ml", "deps" : [["Atlas"]] },
     { "file" : "e.ml", "deps" : [["Atlas"]] },
     { "file" : "f.ml", "deps" : [["Atlas"]] }],
  "local" :
    [{ "module" : ["A"], "ml" : "a.ml" },
     { "module" : ["Atlas"], "ml" : "atlas.ml" },
     { "module" : ["B"], "ml" : "b.ml" },
     { "module" : ["C"], "ml"
     { "module" : ["D"], "ml" : "d.ml" },
     { "module" : ["E"], "ml"
     { "module" : ["F"], "ml" : "f.ml" }]
}
Support incremental compilation
◮ Library publication
◮ Lightweight executable
◮ Multi-zipper
◮ Dune integration
Dune integration
◮ Not that straightforward: a full new layer of dependency computation
◮ But more opportunities for caching
Past features from the future
◮ Full support for decoupling module names from filenames
◮ Full support for nested namespaces with -nested
Thanks!
(* a.ml *)
module Sub = struct
  module SubSub = struct end
end
(* b.ml *)
module Sub = struct end