scala for business automation
play

Scala For Business Automation Solving Real Business Problems with - PowerPoint PPT Presentation

Scala For Business Automation Solving Real Business Problems with Streams, POI and a Dash of BDD Conor Svensson conor10@gmail.com The Spreadsheet The PCs first killer app The London Whale the model operated through a series of Excel


  1. Scala For Business Automation Solving Real Business Problems with Streams, POI and a Dash of BDD Conor Svensson conor10@gmail.com

  2. The Spreadsheet The PC’s first killer app

  3. The London Whale …the model operated through a series of Excel spreadsheets, which had to be completed manually, by a process of copying and pasting data from one spreadsheet to another… Report of JPMorgan Chase & Co. Management Task Force Regarding 2012 CIO Losses

  4. A brief history… • 1979 VisiCalc - The first killer app for PCs on Apple II • 1983 Lotus 1-2-3 • 1985 Excel 1.0 on Apple • 1987 Excel 2.0 on MS-DOS • 1992 Excel 3.0 on Windows • 1994 StarCalc (Star Office) • 2001 Calc (Open Office) • 2006 Google Spreadsheets • 2007 iWork Numbers

  5. Today

  6. Use cases • Traditional • Accounting/inventory • Project management • Simple data analysis • Reporting • Exotic • Business process workflows • Risk management • Transaction capture • Reconciliations

  7. Why? And then… Then… Initially… “I can solve this problem It becomes business Management/audit/ with a spreadsheet” critical regulators arrive… Avoid more • “W hy do we have • expensive build via these decentralised a proper solution business critical (bureaucratic spreadsheets?” change processes don’t help here) “We need a proper • solution” Lack of • development knowledge

  8. Complex spreadsheets • Multiple lookup tables with reference data • Data from multiple data sources • Series of complex calculations • Output reports • Centralised via a file share (if lucky), multiple copies tend to exist due to e-mail

  9. The result

  10. Automation

  11. 1. Apache POI • Poor Obfuscation Implementation • The Java API for Microsoft Documents • Work with Office files without a license (Excel, Word, PowerPoint, Outlook, Visio, Publisher) • Text extraction applications - web spiders, index builders, content management systems • Business automation…

  12. 2. Scala • Expressive and concise • Streams, for comprehensions, and case classes • Many organisations have not yet made the switch to Java 8 • Slots easily into existing Java dominant environments (e.g. financial institutions)

  13. Streams

  14. For comprehension Instead of: We can use:

  15. Case classes Scala provides: • field accessors • constructors • equals • hashCode • toString • copy

  16. 3. BDD in Scala Examples use ScalaTest’s FlatSpec with Matchers

  17. An example Q: Is there a correlation between Bitcoin and commodity returns?

  18. Source data • Bitcoin prices • ASX SPI 200 (Australian commodities index)

  19. Our model • Reorder data • Create named ranges • Merge price series by date • Calculate: 1. Returns of our price series 2. Mean 3. Standard deviation 4. Covariance 5. Correlation

  20. The Code ExcelReader • Use named ranges or sheet/row/col indices Stats • Statistical library Correlation • Incremental extraction of business logic with Streams/for-comprehensions

  21. ExcelReader

  22. Stats

  23. Correlation

  24. End to end validation Tests that all can understand!!!

  25. Clear test output

  26. Building on this • Rapid prototyping/MVP • Spreadsheet used as initial datasource • Micro-spreadsheet-services, etc… (sorry)

  27. Limitations • VBA evaluations are not supported, only formula evaluations are • Some Excel functions are not yet implemented (but you can contribute easily) • It’s not a silver bullet for avoiding spreadsheets, merely a tool to ease the pain of moving off them • Additional overhead of more granular stream processing

  28. In summary • Useful tools for rapid prototyping/MVP without risking data quality • Reduced implementation error • Spreadsheet can be modelled incrementally • BDD test cases verified against original source • Use FP techniques • Very well suited to streams • Promoting immutability of our data structures

  29. Further resources • https://github.com/conor10/spreadsheets - source code and example code supporting this talk • https://www.quandl.com - Quandl for data • https://poi.apache.org/ - Apache POI Project • http://scalatest.org/ - ScalaTest • Power, D. J., "A Brief History of Spreadsheets", DSSResources.COM, World Wide Web, http://dssresources.com/history/sshistory.html, version 3.6, 08/30/2004 • http://blogs.office.com/2012/09/13/introducing-spreadsheet-controls-in-the- new-office/ - Controls for critical spreadsheets • http://fsprojects.github.io/ExcelProvider/ - F# Excel Type Provider

  30. Questions?

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend