Your Program as a Transpiler: Applying Compiler Design to Everyday Programming



slide-1
SLIDE 1

Your Program as a Transpiler

Applying Compiler Design to Everyday Programming

slide-2
SLIDE 2

About Me

  • Edoardo Vacchi @evacchi
  • Research @ University of Milan
  • Research @ UniCredit R&D
  • Drools and jBPM Team @ Red Hat
slide-3
SLIDE 3

Motivation

slide-4
SLIDE 4

Motivation

  • My first task in Red Hat: marshalling backend for jBPM
  • Data model mapping
  • From XML tree model to graph representation
  • Apparently boring, but challenging in a way
slide-5
SLIDE 5

Motivation

  • Language implementation is often seen as a dark art
  • But some design patterns are simple at their core
  • Best practices can be applied to everyday programming
slide-6
SLIDE 6

Motivation (cont'd)

  • Learning about language implementation will give you a different angle to deal with many problems
  • It will lead you to a better understanding of how GraalVM and Quarkus do their magic

slide-7
SLIDE 7

Goals

  • Programs often have a pre-processing phase where you prepare for execution
  • Then there's the actual processing (execution) phase
  • Learn to recognize and structure the pre-processing phase
slide-8
SLIDE 8

Transpilers

slide-9
SLIDE 9

Transpilers vs. Compilers

  • Compiler: translates code written in a language (source code) into code written in a target language (object code). The target language may be at a lower level of abstraction
  • Transpiler: translates code written in a language into code written in another language at the same level of abstraction (Source-to-Source Translator).

slide-10
SLIDE 10

Are transpilers simpler than compilers?

  • Lower-level languages are complex
      • They are not: if anything, they're simple
  • Syntactic sugar is not a higher level of abstraction
      • It is: a concise construct is expanded at compile-time
  • Proper compilers do low-level optimizations
      • You are thinking of optimizing compilers.
slide-11
SLIDE 11

The distinction is moot

  • It is pretty easy to write a crappy compiler, call it a transpiler and feel at peace with yourself
  • Writing a good transpiler is no different or harder than writing a good compiler
  • So, how do you write a good compiler?
slide-12
SLIDE 12

Your Program as a Compiler

Applying Compiler Design to Everyday Programming

slide-13
SLIDE 13

Compiler-like workflows

  • At least two classes of problems can be solved with compiler-like workflows
      • Boot time optimization problems
      • Data transformation problems
slide-15
SLIDE 15

Running Example

Function Orchestration

slide-16
SLIDE 16

Function Orchestration

  • You are building an immutable Dockerized serverless function

[Diagram: two functions, f and g]

slide-17
SLIDE 17

Function Orchestration

  • Problem
      • No standard* way to describe function orchestration yet

* Yes, I know about https://github.com/cncf/wg-serverless

[Diagram: two functions, f and g]

slide-18
SLIDE 18

Solution: Roll your own YAML format

process:
  elements:
    - start: &_1
        name: Start
    - function: &_2
        name: Hello
    - end: &_3
        name: End
    - edge:
        source: *_1
        target: *_2
    - edge:
        source: *_2
        target: *_3

[Diagram: Start → Hello → End]

Congratulations! Enjoy attending conferences worldwide

slide-19
SLIDE 19

Alternate Solution

  • You are describing a workflow
  • There is a perfectly fine standard: BPMN
  • Business Process Model and Notation

[Diagram: Task 1 and Task 2]

slide-20
SLIDE 20

<process id="Minimal" name="Minimal Process">
  <startEvent id="_1" name="Start"/>
  <scriptTask id="_2" name="Hello">
    <script>System.out.println("Hello World");</script>
  </scriptTask>
  <endEvent id="_3" name="End">
    <terminateEventDefinition/>
  </endEvent>
  <sequenceFlow id="_1-_2" sourceRef="_1" targetRef="_2"/>
  <sequenceFlow id="_2-_3" sourceRef="_2" targetRef="_3"/>
</process>

https://github.com/evacchi/ypaat

[Diagram: Start → Hello → End]

slide-22
SLIDE 22

[Diagram: Start → Hello → End]

Downside: Nobody will invite you to their conference to talk about BPM.

Unless you trick them.

slide-23
SLIDE 23

Bonuses for choosing BPMN

  • Standard XML-based serialization format
      • that's not the bonus
  • There is standard tooling to validate and parse
      • that is a bonus
  • Moreover:
      • Different types of nodes included in the main spec
      • Optional spec for laying out nodes on a diagram

[Diagram: Start → Hello → End]

slide-24
SLIDE 24

Goals

  • Read a BPMN workflow
  • Execute that workflow
  • Visualize that workflow

[Diagram: Start → Hello → End]

slide-25
SLIDE 25

Step 1

Recognize your compilation phase

slide-26
SLIDE 26

What's a compilation phase?

  • It's your setup phase.
  • You do it only once before the actual processing begins
slide-27
SLIDE 27

Configuring the application

  • Problem. Use config values from a file/env vars/etc
  • Do you validate config values each time you read them?
  • Compile-time:
      • Read config values into a validated data structure
  • Run-time:
      • Use validated config values
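
A minimal sketch of this split, assuming a hypothetical `AppConfig` record with `host` and `port` keys (neither appears in the talk). Raw strings are validated and coerced once; after that the program only touches the typed object.

```java
import java.util.Map;

// Hypothetical typed configuration; the field names are illustrative only.
record AppConfig(String host, int port) {

    // "Compile-time": read raw strings once, validate them, coerce to typed values.
    static AppConfig compile(Map<String, String> raw) {
        String host = raw.getOrDefault("host", "").trim();
        if (host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        int port;
        try {
            port = Integer.parseInt(raw.getOrDefault("port", "").trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("port must be a number", e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new AppConfig(host, port);
    }
}

class ConfigDemo {
    public static void main(String[] args) {
        // "Run-time": no further validation, just use the typed values.
        AppConfig config = AppConfig.compile(Map.of("host", " localhost ", "port", "8080"));
        System.out.println(config.host() + ":" + config.port()); // prints "localhost:8080"
    }
}
```

Invalid input fails once, at setup, instead of at some arbitrary later read.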
slide-28
SLIDE 28

Data Transformation Pipeline

  • Problem. Manipulate data to produce analytics
  • Compile-time:
      • Define transformations (e.g. map, filter, etc. operations)
      • Decide the execution plan (local, distributed, etc.)
  • Run-time:
      • Evaluate the execution plan
slide-29
SLIDE 29

Example: BPMN Execution

  • Problem. Execute a workflow description.
  • Compile-time:
      • Read BPMN into a visitable structure (StartEvent)
  • Run-time:
      • Visit the structure
      • For each node, execute tasks

[Diagram: Start → Hello → End]

slide-30
SLIDE 30

Example: BPMN Visualization

  • Problem. Visualize a workflow diagram.
  • Compile-time:
      • Read BPMN into a graph
  • Run-time:
      • For each node and edge, draw on a canvas

[Diagram: Start → Hello → End]

slide-31
SLIDE 31

Read BPMN into a Data Structure

  • Full XML Schema Definition* is automatically mapped into Java classes, validated against schema constraints

TDefinitions tdefs = JAXB.unmarshal(
    resource, TDefinitions.class);

* Yes kids, we have working schemas

slide-32
SLIDE 32

BPMN: From Tree to Graph

  • No ordering imposed in the description

<process id="Minimal" name="Minimal Process">
  <sequenceFlow id="_1-_2" sourceRef="_1" targetRef="_2"/>
  <endEvent id="_3" name="End">
    <terminateEventDefinition/>
  </endEvent>
  <sequenceFlow id="_2-_3" sourceRef="_2" targetRef="_3"/>
  <scriptTask id="_2" name="Hello">
    <script>System.out.println("Hello World");</script>
  </scriptTask>
  <startEvent id="_1" name="Start"/>
</process>

Forward References

slide-33
SLIDE 33

<definitions>
  <process id="Minimal" name="Minimal Process">
    <startEvent id="_1" name="Start"/>
    <scriptTask id="_2" name="Hello">
      <script>System.out.println("Hello World");</script>
    </scriptTask>
    <endEvent id="_3" name="End">
      <terminateEventDefinition/>
    </endEvent>
    <sequenceFlow id="_1-_2" sourceRef="_1" targetRef="_2"/>
    <sequenceFlow id="_2-_3" sourceRef="_2" targetRef="_3"/>
  </process>
  <bpmndi:BPMNDiagram>
    <bpmndi:BPMNPlane bpmnElement="SubProcess">
      <bpmndi:BPMNShape bpmnElement="_1">
        <dc:Bounds x="11" y="30" width="48" height="48"/>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="_2">
        <dc:Bounds x="193" y="30" width="80" height="48"/>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="_3">
        <dc:Bounds x="396" y="30" width="48" height="48"/>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNEdge bpmnElement="_1-_2">
        <di:waypoint x="35" y="50"/>
        <di:waypoint x="229" y="50"/>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="_2-_3">
        <di:waypoint x="229" y="50"/>
        <di:waypoint x="441" y="50"/>
      </bpmndi:BPMNEdge>
    </bpmndi:BPMNPlane>
  </bpmndi:BPMNDiagram>
</definitions>

Separate Layout Definition

https://github.com/evacchi/ypaat

slide-35
SLIDE 35

Step 2

Work like a compiler

slide-36
SLIDE 36

Compiling a programming language

  • You start from a text representation of a program
  • The text representation is fed to a parser
  • The parser returns a parse tree
  • The parse tree is refined into an abstract syntax tree (AST)
  • The AST is further refined through intermediate representations (IRs)
  • Up until the final representation is returned
slide-38
SLIDE 38

What makes a compiler a proper compiler

  • Not optimization
  • Compilation Phases
  • You can have as many as you like
slide-39
SLIDE 39
  • Example. A Configuration File

  1. Read file from (class)path
  2. Unmarshall file into a typed object
  3. Sanitize values
  4. Validate values
  5. Coerce to typed values

slide-40
SLIDE 40
  • Example. Produce a Report

  1. Fetch data from different sources
  2. Discard invalid values
  3. Merge into single data stream
  4. Compute aggregates (sums, avgs, etc.)
  5. Generate synthesis data structure
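
The report phases can be sketched as one small function per phase. This is only an illustration: the hard-coded sources, the `Summary` record and the method names are assumptions, not code from the talk.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical report pipeline; each phase is a separate, testable function.
class ReportPhases {

    record Summary(long count, double sum, double average) {}

    // Phase 1: fetch data from different sources (hard-coded here).
    static List<List<Double>> fetch() {
        return List.of(
                Arrays.asList(10.0, null, 20.0),
                Arrays.asList(Double.NaN, 30.0));
    }

    // Phase 2: discard invalid values (nulls and NaN) from each source.
    static List<List<Double>> discardInvalid(List<List<Double>> sources) {
        return sources.stream()
                .map(src -> src.stream()
                        .filter(Objects::nonNull)
                        .filter(v -> !v.isNaN())
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    // Phase 3: merge into a single data stream.
    static List<Double> merge(List<List<Double>> sources) {
        return sources.stream().flatMap(List::stream).collect(Collectors.toList());
    }

    // Phase 4: compute aggregates.
    static DoubleSummaryStatistics aggregate(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).summaryStatistics();
    }

    // Phase 5: generate the synthesis data structure.
    static Summary synthesize(DoubleSummaryStatistics stats) {
        return new Summary(stats.getCount(), stats.getSum(), stats.getAverage());
    }

    public static void main(String[] args) {
        Summary summary = synthesize(aggregate(merge(discardInvalid(fetch()))));
        System.out.println(summary);
    }
}
```

Because every phase returns a plain value, each intermediate result can be inspected or tested on its own.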

slide-41
SLIDE 41
  • Example. A Workflow Engine

  1. Read BPMN file
  2. Collect nodes
  3. Collect edges
  4. Prepare for visit/layout

[Diagram: Start → Hello → End]

slide-42
SLIDE 42

Compilation Phases

  • Better separation of concerns
  • Better testability
      • You can test each intermediate result
  • You can choose when and where each phase gets evaluated
  • More Requirements = More Phases!
slide-43
SLIDE 43

Phase vs Pass

  • Many phases do not necessarily mean as many passes
  • You could do several phases in one pass
  • Logically phases are still distinct
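
One possible way to keep phases distinct while running them in a single pass is function composition. The sketch below assumes three hypothetical phases over configuration strings; the names are illustrative.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical phases; each one is a separate function that can be tested alone.
class OnePassPhases {
    static final Function<String, String> SANITIZE = String::trim;
    static final Function<String, String> VALIDATE = s -> {
        if (!s.matches("-?\\d+")) {
            throw new IllegalArgumentException("not a number: " + s);
        }
        return s;
    };
    static final Function<String, Integer> COERCE = Integer::valueOf;

    // Three logically distinct phases, fused into a single function...
    static final Function<String, Integer> ALL_PHASES =
            SANITIZE.andThen(VALIDATE).andThen(COERCE);

    // ...and applied in one pass over the input.
    static List<Integer> run(List<String> raw) {
        return raw.stream().map(ALL_PHASES).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(" 1", "22 ", " 333 "))); // prints "[1, 22, 333]"
    }
}
```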
slide-44
SLIDE 44

One Pass vs. Multi-Pass

Myth: one pass doing many things is better than doing many passes, each doing one thing

One pass:

for value in config:
    sanitized = sanitize(value)
    validated = validate(sanitized)
    coerced = coerce(validated)

Multi-pass:

for value in config:
    sanitized += sanitize(value)
for value in sanitized:
    validated += validate(value)
for value in validated:
    coerced += coerce(value)

slide-45
SLIDE 45

It is not: Complexity

One pass:

for value in config:
    sanitized = sanitize(value)
    validated = validate(sanitized)
    coerced = coerce(validated)

n times:
    sanitize = 1 op
    validate = 1 op
    coerce = 1 op
(1 op + 1 op + 1 op) × n = 3n

Multi-pass:

for value in config:
    sanitized += sanitize(value)
for value in sanitized:
    validated += validate(value)
for value in validated:
    coerced += coerce(value)

n times: sanitize = n op
n times: validate = n op
n times: coerce = n op
(n + n + n) = 3n
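
The count can be checked with a small sketch that wraps each phase in a counter. The phases here are placeholders chosen only so the code runs; the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Counts how many phase applications each strategy performs on n values.
class PassCost {
    static int ops = 0;

    // Wraps a phase so that every application counts as one operation.
    static UnaryOperator<String> counted(UnaryOperator<String> phase) {
        return v -> {
            ops++;
            return phase.apply(v);
        };
    }

    static final UnaryOperator<String> SANITIZE = counted(String::trim);
    static final UnaryOperator<String> VALIDATE = counted(v -> v);              // placeholder phase
    static final UnaryOperator<String> COERCE = counted(String::toLowerCase);   // placeholder phase

    // One loop, three phases per value.
    static int onePass(List<String> config) {
        ops = 0;
        for (String value : config) {
            COERCE.apply(VALIDATE.apply(SANITIZE.apply(value)));
        }
        return ops;
    }

    // Three loops, one phase per loop.
    static int multiPass(List<String> config) {
        ops = 0;
        List<String> sanitized = new ArrayList<>();
        for (String value : config) sanitized.add(SANITIZE.apply(value));
        List<String> validated = new ArrayList<>();
        for (String value : sanitized) validated.add(VALIDATE.apply(value));
        List<String> coerced = new ArrayList<>();
        for (String value : validated) coerced.add(COERCE.apply(value));
        return ops;
    }

    public static void main(String[] args) {
        List<String> config = List.of(" A ", " B ", " C ", " D ");
        System.out.println(onePass(config) + " vs " + multiPass(config)); // prints "12 vs 12"
    }
}
```

Both strategies apply 3n phase operations. The multi-pass version does allocate intermediate lists, which this count does not capture.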

slide-46
SLIDE 46

Single-pass is not always possible

However, doing everything in one pass may be cumbersome or plain impossible

<process id="Minimal" name="Minimal Process">
  <sequenceFlow id="_1-_2" sourceRef="_1" targetRef="_2"/>
  <endEvent id="_3" name="End">
    <terminateEventDefinition/>
  </endEvent>
  <sequenceFlow id="_2-_3" sourceRef="_2" targetRef="_3"/>
  <scriptTask id="_2" name="Hello">
    <script>System.out.println("Hello World");</script>
  </scriptTask>
  <startEvent id="_1" name="Start"/>
</process>

Forward References
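
A rough sketch of why two passes help with forward references. It uses the JDK's DOM parser rather than the JAXB classes from the talk, and a trimmed copy of the XML; `ForwardRefs` and `resolveEdges` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ForwardRefs {
    // Trimmed version of the process above: sequence flows appear before the nodes they reference.
    static final String XML =
            "<process id=\"Minimal\">"
          + "<sequenceFlow id=\"_1-_2\" sourceRef=\"_1\" targetRef=\"_2\"/>"
          + "<endEvent id=\"_3\" name=\"End\"/>"
          + "<sequenceFlow id=\"_2-_3\" sourceRef=\"_2\" targetRef=\"_3\"/>"
          + "<scriptTask id=\"_2\" name=\"Hello\"/>"
          + "<startEvent id=\"_1\" name=\"Start\"/>"
          + "</process>";

    static List<String> resolveEdges(String xml) {
        NodeList children;
        try {
            children = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement().getChildNodes();
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse process", e);
        }
        // Pass 1: collect every node by id, skipping sequence flows.
        Map<String, String> names = new LinkedHashMap<>();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element e && !e.getTagName().equals("sequenceFlow")) {
                names.put(e.getAttribute("id"), e.getAttribute("name"));
            }
        }
        // Pass 2: all ids are known now, so every reference can be resolved.
        List<String> edges = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element e && e.getTagName().equals("sequenceFlow")) {
                edges.add(names.get(e.getAttribute("sourceRef")) + " -> "
                        + names.get(e.getAttribute("targetRef")));
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(resolveEdges(XML)); // prints "[Start -> Hello, Hello -> End]"
    }
}
```

A single pass would hit the first `sequenceFlow` before either of its endpoints has been seen.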

slide-47
SLIDE 47

Workflow Phases: Evaluation

var resource = getResourceAsStream("/example.bpmn2");
var tdefs = unmarshall(resource, TDefinitions.class);
var graphBuilder = new GraphBuilder();

// collect nodes on the builder
var nodeCollector = new NodeCollector(graphBuilder);
nodeCollector.visitFlowElements(tdefs.getFlowElements());

// collect edges on the builder
var edgeCollector = new EdgeCollector(graphBuilder);
edgeCollector.visitFlowElements(tdefs.getFlowElements());

// prepare graph for visit
var engineGraph = EngineGraph.of(graphBuilder);

// “interpret” the graph
var engine = new Engine(engineGraph);
engine.eval();

https://github.com/evacchi/ypaat

slide-48
SLIDE 48

Workflow Phases: Layout

<?xml version="1.0" encoding="UTF-8"?>
<definitions ...>
  <process id="Minimal" name="Minimal Process">
    <startEvent id="_1" name="Start"/>
    ...
  </process>
  <bpmndi:BPMNDiagram>
    <bpmndi:BPMNPlane bpmnElement="SubProcess">
      <bpmndi:BPMNShape bpmnElement="_1">
        <dc:Bounds x="11" y="30" width="48" height="48"/>
        ...
  </bpmndi:BPMNDiagram>
</definitions>

var resource = getResourceAsStream("/example.bpmn2");
var tdefs = unmarshall(resource, TDefinitions.class);
var graphBuilder = new GraphBuilder();

// collect nodes on the builder
var nodeCollector = new NodeCollector(graphBuilder);
nodeCollector.visitFlowElements(tdefs.getFlowElements());

// collect edges on the builder
var edgeCollector = new EdgeCollector(graphBuilder);
edgeCollector.visitFlowElements(tdefs.getFlowElements());

// extract layout information
var extractor = new LayoutExtractor();
extractor.visit(tdefs);
var index = extractor.index();

// “compile” into buffered image
var canvas = new Canvas(graphBuilder, index);
var bufferedImage = canvas.eval();

https://github.com/evacchi/ypaat

slide-49
SLIDE 49

Visitors

slide-50
SLIDE 50

Data Structures

TFlowElement
  |
  +---- StartEventNode
  |
  +---- EndEventNode
  |
  `---- ScriptTask

slide-51
SLIDE 51

Pattern Matching

nodeCollector.visit(node)

def visit(node: TFlowElement) = {
  node match {
    case StartEventNode(...) => ...
    case EndEventNode(...) => ...
    case ScriptTask(...) => ...
  }
}
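
For comparison, a sketch of the same idea in Java 17 using a sealed interface, records and `instanceof` type patterns. The record types below are hypothetical stand-ins, not the BPMN classes from the talk.

```java
// Hypothetical node hierarchy mirroring the match expression above.
sealed interface FlowElement permits StartEventNode, EndEventNode, ScriptTask {}

record StartEventNode(String id) implements FlowElement {}

record EndEventNode(String id) implements FlowElement {}

record ScriptTask(String id, String script) implements FlowElement {}

class Describe {
    // Each branch tests the type and binds a typed variable in one step.
    static String visit(FlowElement node) {
        if (node instanceof StartEventNode start) {
            return "start " + start.id();
        } else if (node instanceof EndEventNode end) {
            return "end " + end.id();
        } else if (node instanceof ScriptTask task) {
            return "script " + task.id() + ": " + task.script();
        }
        // The hierarchy is sealed, so no other implementation can exist.
        throw new IllegalStateException("unknown element: " + node);
    }

    public static void main(String[] args) {
        System.out.println(visit(new ScriptTask("_2", "hello"))); // prints "script _2: hello"
    }
}
```

Newer Java versions also allow type patterns directly in `switch`, which is closer to the Scala form.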

slide-52
SLIDE 52

The Poor Man's Alternatives

interface Visitor {
    void visit(TFlowElement el);
    void visit(TStartEventNode start);
    void visit(TEndEventNode end);
    void visit(TScriptTask task);
}

interface Visitable {
    void accept(Visitor v);
}

if (node instanceof StartEventNode) {
    StartEventNode evt = (StartEventNode) node;
    ...
} else if (node instanceof EndEventNode) {
    EndEventNode evt = (EndEventNode) node;
    ...
} else if (node instanceof ScriptTask) {
    ScriptTask evt = (ScriptTask) node;
    ...
}

slide-53
SLIDE 53

Visitor Pattern

class NodeCollector implements Visitor {
    void visit(TStartEventNode evt) {
        graphBuilder.add(
            new StartEventNode(evt.getId(), evt));
    }
    void visit(TEndEvent evt) {
        graphBuilder.add(
            new EndEventNode(evt.getId(), evt));
    }
    void visit(TScriptTask task) {
        graphBuilder.add(
            new ScriptTaskNode(task.getId(), task));
    }
}

class EdgeCollector implements Visitor {
    void visit(TSequenceFlow seq) {
        graphBuilder.addEdge(
            seq.getId(),
            seq.getSourceRef(),
            seq.getTargetRef());
    }
}

https://github.com/evacchi/ypaat
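
A self-contained sketch of the double dispatch behind the visitor pattern. The `Start`, `Task` and `End` records and the `IdCollector` are made-up stand-ins for the generated BPMN classes and the collectors above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element types standing in for the generated BPMN classes.
interface Visitor {
    void visit(Start start);
    void visit(Task task);
    void visit(End end);
}

interface Visitable {
    void accept(Visitor v);
}

record Start(String id) implements Visitable {
    // `this` has static type Start, so the visit(Start) overload is chosen.
    public void accept(Visitor v) { v.visit(this); }
}

record Task(String id) implements Visitable {
    public void accept(Visitor v) { v.visit(this); }
}

record End(String id) implements Visitable {
    public void accept(Visitor v) { v.visit(this); }
}

// Collects a label for each element in visiting order.
class IdCollector implements Visitor {
    final List<String> ids = new ArrayList<>();
    public void visit(Start start) { ids.add("start:" + start.id()); }
    public void visit(Task task) { ids.add("task:" + task.id()); }
    public void visit(End end) { ids.add("end:" + end.id()); }
}

class VisitorDemo {
    static List<String> collect(List<Visitable> elements) {
        IdCollector collector = new IdCollector();
        elements.forEach(e -> e.accept(collector));
        return collector.ids;
    }

    public static void main(String[] args) {
        System.out.println(collect(List.of(new Start("_1"), new Task("_2"), new End("_3"))));
    }
}
```

The caller never checks types: `accept` dispatches on the element, `visit` on the visitor.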

slide-54
SLIDE 54

Step 3

Choose a run-time representation

slide-55
SLIDE 55

Workflow Evaluation

  • Choose a representation suitable for evaluation
  • In our case, for each node, we need to get the outgoing edges with the next node to visit
  • The most convenient representation of the graph is adjacency lists
  • adj( p ) = { q | ( p, q ) ∈ edges }

var graphBuilder = new GraphBuilder();
...
// prepare graph for visit
var engineGraph = EngineGraph.of(graphBuilder);
// decorate with an evaluator
var engine = new Engine(engineGraph);
// evaluate the graph by visiting once more
engine.eval();

Map<Node, List<Node>> outgoing;

slide-56
SLIDE 56

Workflow Evaluation

  • The most convenient representation of the graph is adjacency lists
  • adj( p ) ↦ { q | ( p, q ) ∈ edges }
  • Map<Node, List<Node>> outgoing
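
A small sketch of building such adjacency lists from an edge list. Nodes are plain string ids here, and the `Edge` record is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Adjacency {
    // Hypothetical edge type: a pair of node ids.
    record Edge(String source, String target) {}

    // adj(p) = { q | (p, q) in edges }
    static Map<String, List<String>> outgoing(List<Edge> edges) {
        Map<String, List<String>> adj = new LinkedHashMap<>();
        for (Edge e : edges) {
            adj.computeIfAbsent(e.source(), k -> new ArrayList<>()).add(e.target());
            // Make sure nodes without outgoing edges still map to an empty list.
            adj.computeIfAbsent(e.target(), k -> new ArrayList<>());
        }
        return adj;
    }

    public static void main(String[] args) {
        System.out.println(outgoing(List.of(new Edge("_1", "_2"), new Edge("_2", "_3"))));
    }
}
```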
slide-57
SLIDE 57

Evaluation

class Engine implements GraphVisitor {
    void visit(StartEventNode node) {
        logger.info("Process '{}' started.", graph.name());
        graph.outgoing(node).forEach(this::visit);
    }
    void visit(EndEventNode node) {
        logger.info("Process ended.");
        // no outgoing edges
    }
    void visit(ScriptTaskNode node) {
        logger.info("Evaluating script task: {}",
            node.element().getScript().getContent());
        graph.outgoing(node).forEach(this::visit);
    }
    ...
}

https://github.com/evacchi/ypaat
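
Stripped of BPMN types, the evaluation loop is a recursive walk over the adjacency lists. `MiniEngine` below is a hypothetical reduction that records a trace instead of logging; it is not the `Engine` from the repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical engine over node names; a real engine would visit typed graph nodes.
class MiniEngine {
    private final Map<String, List<String>> outgoing;
    private final List<String> trace = new ArrayList<>();

    MiniEngine(Map<String, List<String>> outgoing) {
        this.outgoing = outgoing;
    }

    // Visit a node, record it, then follow each outgoing edge.
    // Assumes the graph is acyclic; a cycle would recurse forever.
    private void visit(String node) {
        trace.add(node);
        outgoing.getOrDefault(node, List.of()).forEach(this::visit);
    }

    List<String> eval(String start) {
        trace.clear();
        visit(start);
        return List.copyOf(trace);
    }

    public static void main(String[] args) {
        var engine = new MiniEngine(Map.of(
                "Start", List.of("Hello"),
                "Hello", List.of("End")));
        System.out.println(engine.eval("Start")); // prints "[Start, Hello, End]"
    }
}
```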

slide-58
SLIDE 58

Workflow Layout

  • In this case, for each node and edge, we need to get the shape and position
  • No particular ordering is required
      • e.g. first render edges and then shapes

<?xml version="1.0" encoding="UTF-8"?>
<definitions ...>
  <process id="Minimal" name="Minimal Process">
    <startEvent id="_1" name="Start"/>
    ...
  </process>
  <bpmndi:BPMNDiagram>
    <bpmndi:BPMNPlane bpmnElement="SubProcess">
      <bpmndi:BPMNShape bpmnElement="_1">
        <dc:Bounds x="11" y="30" width="48" height="48"/>
        ...
  </bpmndi:BPMNDiagram>
</definitions>

var canvas = new Canvas(graph, index);
var bufferedImage = canvas.eval();

void eval() {
    graph.edges().forEach(this::draw);
    graph.nodes().forEach(this::visit);
}

https://github.com/evacchi/ypaat

slide-59
SLIDE 59

Layout

class Canvas implements GraphVisitor {
    void draw(Edge edge) {
        var pts = index.edge(edge.id());
        setStroke(Color.BLACK);
        var left = pts.get(0);
        for (int i = 1; i < pts.size(); i++) {
            var right = pts.get(i);
            drawLine(left.x, left.y, right.x, right.y);
            left = right;
        }
    }
    void visit(StartEventNode node) {
        var shape = shapeOf(node);
        setStroke(Color.BLACK);
        setFill(Color.GREEN);
        drawEllipse(shape.x, shape.y, shape.width, shape.height);
        drawLabel(element.getName());
    }
    ...
}

[Rendered diagram: Start → Hello → End]

slide-60
SLIDE 60

Bonus Step 4

Generate code at compile-time

slide-61
SLIDE 61

The Killer App

  • Move pre-processing out of program run-time
  • Generate code
  • Run-time effectively consists only of pure processing
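
As a toy illustration of generating code ahead of time, the sketch below turns an already ordered list of script tasks into Java source text. `WorkflowCodegen` and `Step` are hypothetical; this is not how Drools or jBPM generate code.

```java
import java.util.List;

// Hypothetical generator: all graph handling happened earlier, only straight-line code is emitted.
class WorkflowCodegen {
    record Step(String name, String script) {}

    static String generate(String className, List<Step> steps) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(className).append(" {\n");
        src.append("    public static void main(String[] args) {\n");
        for (Step step : steps) {
            src.append("        // ").append(step.name()).append("\n");
            src.append("        ").append(step.script()).append("\n");
        }
        src.append("    }\n");
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate("Minimal",
                List.of(new Step("Hello", "System.out.println(\"Hello World\");"))));
    }
}
```

The generated class needs no XML parsing or graph building when it runs; that work was done when the source was produced.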
slide-62
SLIDE 62
slide-63
SLIDE 63

AI and Automation Platform

  • Drools rule engine
  • jBPM workflow platform
  • OptaPlanner constraint solver
slide-64
SLIDE 64

The Submarine Initiative

“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.”
Edsger W. Dijkstra

slide-65
SLIDE 65

GraalVM: “One VM to Rule Them All”

  • Polyglot VM with cross-language JIT
  • Java Bytecode and JVM Languages
  • Dynamic Languages (Truffle API)
  • Native binary compilation (SubstrateVM)
slide-67
SLIDE 67

Native Image: Restrictions

  • Native binary compilation
  • Restriction: “closed-world assumption”
  • No dynamic code loading
  • You must declare classes you want to reflect upon
slide-68
SLIDE 68

Quarkus

slide-69
SLIDE 69

Drools and jBPM

rule R1 when
    // constraints
    $r : Result()
    $p : Person( age >= 18 )
then
    // consequence
    $r.setValue( $p.getName() + " can drink");
end

slide-70
SLIDE 70

Drools DRL

rule R1 when
    // constraints
    $r : Result()
    $p : Person( age >= 18 )
then
    // consequence
    $r.setValue( $p.getName() + " can drink");
end

var r = declarationOf(Result.class, "$r");
var p = declarationOf(Person.class, "$p");
var rule = rule("com.example", "R1").build(
    pattern(r),
    pattern(p)
        .expr("e", p -> p.getAge() >= 18),
        alphaIndexedBy(
            int.class, GREATER_OR_EQUAL, 1,
            this::getAge, 18),
        reactOn("age")),
    on(p, r).execute(
        ($p, $r) -> $r.setValue(
            $p.getName() + " can drink")));

slide-71
SLIDE 71

jBPM

RuleFlowProcessFactory factory =
    RuleFlowProcessFactory.createProcess("demo.orderItems");
factory.variable("order", new ObjectDataType("com.myspace.demo.Order"));
factory.variable("item", new ObjectDataType("java.lang.String"));
factory.name("orderItems");
factory.packageName("com.myspace.demo");
factory.dynamic(false);
factory.version("1.0");
factory.visibility("Private");
factory.metaData("TargetNamespace", "http://www.omg.org/bpmn20");

org.jbpm.ruleflow.core.factory.StartNodeFactory startNode1 = factory.startNode(1);
startNode1.name("Start");
startNode1.done();

org.jbpm.ruleflow.core.factory.ActionNodeFactory actionNode2 = factory.actionNode(2);
actionNode2.name("Show order details");
actionNode2.action(kcontext -> {

slide-72
SLIDE 72

Startup Time

slide-73
SLIDE 73

Conclusion

slide-74
SLIDE 74

Take Aways

  • Process in phases
  • Do more in the pre-processing phase (compile-time)
  • Do less during the processing phase (run-time)
  • In other words, separate what you can do once from what you have to do repeatedly

  • Move all or some of your phases to compile-time
slide-75
SLIDE 75

Resources

  • Full Source Code https://github.com/evacchi/ypaat
  • Your Program as a Transpiler (part I)
      • Improving Application Performance by Applying Compiler Design http://bit.ly/ypaat-performance
  • Other resources
      • Submarine https://github.com/kiegroup/submarine-examples
      • Drools Blog http://blog.athico.com
      • Crafting Interpreters http://craftinginterpreters.com
      • GraalVM.org
      • Quarkus.io

Edoardo Vacchi @evacchi

slide-76
SLIDE 76

Q&A