Excel on the Java VM Generating Fast Code from Spreadsheet Models - - PowerPoint PPT Presentation

excel on the java vm
SMART_READER_LITE
LIVE PREVIEW

Excel on the Java VM Generating Fast Code from Spreadsheet Models - - PowerPoint PPT Presentation

Excel on the Java VM Generating Fast Code from Spreadsheet Models Peter Arrenbrecht codewise.ch / Abacus Research AG Submission ID: 30 AGENDA > Problem: Customization Is Hard > Idea: Let Users Use Spreadsheets > Implementation:


slide-1
SLIDE 1

Excel on the Java VM

Generating Fast Code from Spreadsheet Models

Peter Arrenbrecht codewise.ch / Abacus Research AG Submission ID: 30

slide-2
SLIDE 2

2

AGENDA

> Problem: Customization Is Hard > Idea: Let Users Use Spreadsheets > Implementation: Abacus Formula Compiler

slide-3
SLIDE 3

3

It's All About Customization

> Calculation of key values varies – From user to user – Over time > Examples – Prices and conditions – Shipping rates – Salary components

Variations in employment terms

Variations in local regulations

– Asset valuations – Insurance premiums – Risk estimates – ...

slide-4
SLIDE 4

4

Traditional Customization Approaches

> Custom development (options targeted at needs of specific customers)

Hard to maintain due to code and data bloat

Reduced quality in UI, documentation and testing (“it's only for a single customer”)

Costly and slow for customer when he needs a change

> Parametrization (options trying to capture reusable aspects of specific needs)

Often hard to discover and configure; complex interplay

Testing nightmare due to endless configuration combinations

Hard to maintain due to code and data bloat

Still costly and slow when customer needs unanticipated feature

> Expression builder (generic expressions with access to relevant input values)

Very costly to implement with sufficient flexibility, usability and quality

Many implementations basically just support defining one-liners

Can still lead to testing nightmare

Can be slow when used heavily

> Hard on both you and your customers

slide-5
SLIDE 5

5

A Typical Expression Builder

> Abacus Payroll: Definition of salary components (Lohn 411) > Fairly generic expression builder

Basic set of operators, structured multi-line expressions

Can reference input values and results of other calculations

> Nice for simple cases

Can define simple one-liners quickly

> Hard for complex cases

Grouping? If-then-else? Debugging? Documenting? Change control?

Table lookup

slide-6
SLIDE 6

6

Trying To Support Complex Cases

> Abacus Payroll: Definition of lookup tables (Lohn 421)

Criteria can reference inputs or defined expressions

So can values

> Shortcomings

Separate UI

To and fro with expressions

Again no visual feedback etc.

slide-7
SLIDE 7

Fixing the Expression Builder: For Users

> Modelling power

Complex composition of basic functions (advanced functions are rarely needed)

Named intermediate values to structure and to reuse code

Multiple distinct cases with complex decision logic

Lookup tables (price/rebate by category/threshold) with computed entries

> Debugging / Verification

See effects of different sample input values to gain confidence in correctness

Trace effects through intermediate values

Trace and debug computations on actual data sets

Allow to record test set of input/output values to be verified after system updates

> Documentation

Annotate intermediate steps with to-the-point explanatory text

Just a plain-text field requires too much duplication

> Versioning

Can keep textual log of changes, but preferably keep log of old versions around

slide-8
SLIDE 8

Solution: Spreadsheets (Excel, OpenOffice Calc)

> Familiar to most users (at least those who would do customization) > Very good modelling power

Handling multiple distinct cases can be a little non-visual, but manageable

> Very interactive

Immediate feedback

Dependency tracing

What-if

Goal finding

> Can be documented

Annotate individual cells

Use colors and fonts

> Can be tracked

Keep versions of files around at will (file system, blobs, version control)

Compare versions of files

Integrate changelog

slide-9
SLIDE 9

But we need to add

> A way to specify input and output values

Especially a way to discover available inputs and expected outputs

– Ideas

Excel-/OpenOffice-Plugin

Decoupled: Users name input/output cells arbitrarily, then they can define bindings for these names in a separate step in your app

> Access to actual input values for debugging > Running tests with sets of input and expected output values

slide-10
SLIDE 10

Salary Component Revisited

> Abacus supports same calculcation in a spreadsheet

Immediate feedback

Everything at a glance

Keep notes

Keep versions

slide-11
SLIDE 11

Example

> Abacus Service and Contract Management > Calculate the price of an Abacus ERP module – Lots of compositions of simple functions – Lots of decisions (IF) – Lots of lookup tables – Integrated documentation

Changelog

Notes on individual cells

– Use of colors and fonts

slide-12
SLIDE 12

Fixing the Expression Builder: For Developers

> Integrating the expression builder

Access to input values and provided output values is an external API

Needs all the care that should always go into an external API

Decoupling, stability, discoverability, documentation

> Testing the expression evaluation engine

Usage patterns not predictable

Need very high code coverage

> Testing its use in the application

Users can access input values in any order

Keep input value accessors side-effect free

Then testing input value accessors in isolation should do

> Performance

Usually not critical at first, when used in interactive applications

But can quickly become so for batch updates

slide-13
SLIDE 13

Solution: Abacus Formula Compiler (AFC)

> Compiles formulaic content of spreadsheets to plain Java classes

http://www.formulacompiler.org/

Supports Excel and OpenOffice formats (.xls, .xlsx, .ods)

Supports a number of numeric data types (double, BigDecimal, scaled long)

> Compiler and compiled classes run on a plain JRE

Does not require Excel or OpenOffice to compile or run engines

Generates bytecode directly, so does not need a full JDK (tools.jar)

But integrates a decompiler so you can look at generated code

> Performs well

Compiled classes are JITted like any other Java class

Tuned for heavy use of a bunch of computations

Not for thousands of different computations each used rarely

Cheap instantiation supports thread isolation via per-computation instances

Choice of stateless mode or stateful mode that caches intermediate cell values

slide-14
SLIDE 14

AFC continued (1)

> Integrates well with application code

Pulls input values from specified application-specific Java interfaces

Supplies output values as implementations of application-specific Java interfaces

So can provide custom computation strategy implementations in addition to classically coded default ones

Binding of cells to input and output values can be fully controlled

Optional simple by-name binding of input and output cells

> Handles repeating data

Designated spreadsheet areas are treated as repeating sections

Iterates over iterables and arrays

slide-15
SLIDE 15

Example

> Compiling the Abacus ERP module price calculation – Simple setup – Uses and implements our own interfaces – Can switch to BigDecimal – Decompile to see generated Java code

slide-16
SLIDE 16

AFC continued (2)

> Modular design – Compact runtime for loading and running precompiled engines – Support for different spreadsheet formats – Build spreadsheet models in- memory to replace home- grown expression engines > Extensively tested and documented > Available under GPL and commercially

slide-17
SLIDE 17

AFC Internals

> Compile-time constant folding using interpreter > Template methods – Used by interpreter to compute subexpressions – Decompiled at build-time to generate methods emitting equivalent byte-code > High-level functional language – To implement complex operations

Define SUM in terms of iteration and +

– Interpreted by interpreter – Compiled to low-level constructs by compiler > Yes, we have developer documentation!

http://www.formulacompiler.org/contribute/hacking/index.htm

slide-18
SLIDE 18

Summary

> Users need to customize calculations > Use spreadsheets as their UI

Well known, interactive, traceable

Needs good application integration

> Abacus Formula Compiler was built for this

Integrates efficiently with your interfaces

Compiles to fast code

Can use BigDecimal for financial applications

No Excel, OpenOffice, nor even a JDK needed

In production use at Abacus Research AG

Extensive tests

Comprehensive documentation

GPL and commercial version available

> Still requires discipline to remain testable!

Spreadsheets are free of side-effects – your APIs should be too

> http://www.formulacompiler.org/

slide-19
SLIDE 19

Peter Arrenbrecht http://arrenbrecht.ch/ codewise.ch peter@arrenbrecht.ch Abacus Formula Compiler

http://formulacompiler.org/

Abacus Research AG http://abacus.ch/