
Designing for Scalability
Patrick Linskey, pcl@apache.org



  1. Designing for Scalability Patrick Linskey pcl@apache.org

  2. Patrick Linskey Apache OpenJPA Committer JPA 1, 2 EG Member EJB3, EJB3.1 EG Member

  3. Agenda  Define and discuss scalability • Vertical • Horizontal  Examine ways to make software scale • Code / Algorithms • Asynchronous Libraries • Other Languages

  4. Scalability  Ability to increase the total number of operations performed in a unit of time  Vertical Scalability: • “Make the machine bigger”  Horizontal Scalability • “Add more machines”

  5. Bottlenecks  Limit the scalability of a system  Intrinsic bottlenecks  Artificial bottlenecks

  6. Example Problem Domain  Financial fund management  Multiple in-house engineering needs • Trade Execution • Trade Settlement • Strategy Definition • Strategy Simulation • Portfolio Risk Analysis

  7. Vertical Scalability Translated into Java: Scaling Within a Machine

  8. Vertical Scale Factors In Your Control  Improve code efficiency • Memory • CPU  Optimize I/O between physical tiers • Web 2.0: beware!  Make code scale across multiple cores / CPUs

  9. Code Optimization Possibilities  Performance and scalability are linked  Scalability: more operations per time unit [Figure: timelines labeled “Quick and dirty” and “Architectural”]

  10. “Scale” Vertically via Code Optimization  Reduce copying, looping, etc. • “Write good code”  SQL statement batching • PreparedStatement.addBatch() • ORM frameworks  Transaction batching • Especially powerful in XA environments • JMS message batching

  11. Synchronization  synchronized is for asynchronous execution • “Execute this block of code in its entirety before others that share this lock”  Modern computers handle high* concurrency • synchronized is often a bottleneck • Avoid synchronization at runtime at all costs  uncontended synchronization is cheap

  12. Write-Once Shared Memory
      class SlowTradeManager {
          private Set types;
          public synchronized Set getTradeTypes() {
              if (types == null)
                  types = loadTypeData();
              return types;
          }
      }

      class FastTradeManager {
          private Set types;
          public Set getTradeTypes() {
              if (types == null)
                  types = loadTypeData();
              return types;
          }
      }
      loadTypeData() might be called more than once
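The unsynchronized variant on this slide is safe only when loading twice is harmless. A minimal sketch of that idea follows, assuming an idempotent load. The class name `LazyTradeTypes`, the trade type values, and the `loads` counter are made up for illustration. The `volatile` modifier is an addition not shown on the slide, included so that a thread which sees a non-null reference also sees a fully built set.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: lock-free lazy initialization of write-once data.
class LazyTradeTypes {
    // Counts loads so the "might be called more than once" effect can be observed.
    static final AtomicInteger loads = new AtomicInteger();

    // volatile is an assumption added here for safe publication.
    private volatile Set<String> types;

    Set<String> getTradeTypes() {
        Set<String> local = types;
        if (local == null) {
            // Two threads may both reach this point. Each builds an equal,
            // immutable set, so it does not matter whose write lands last.
            local = loadTypeData();
            types = local;
        }
        return local;
    }

    // Stand-in for an expensive but idempotent load; the values are invented.
    private static Set<String> loadTypeData() {
        loads.incrementAndGet();
        Set<String> data = new LinkedHashSet<>();
        data.add("BUY");
        data.add("SELL");
        data.add("SHORT");
        return Collections.unmodifiableSet(data);
    }
}
```

The trade-off is a possible duplicate load at startup in exchange for never taking a lock on the read path afterwards.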

  13. Fund Risk Balancing  Problem • Multiple traders act on the same security  Solution • Maintain fund-global position data • Mutable shared state!

  14. Multi-machine solution (circa 1998) [Figure: timeline diagram]

  15. Multi-core / CPU synchronization [Figure: timeline with repeated “sync” points]

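  16. Mutable Shared Memory
      // The JDK's java.util.concurrent.atomic package has no AtomicDouble;
      // Guava's class of that name is assumed here.
      import com.google.common.util.concurrent.AtomicDouble;

      class AggregateFundPosition {
          private final AtomicDouble totalExposure = new AtomicDouble(0);

          public double incrementBy(double amount) {
              while (true) {
                  double old = totalExposure.get();
                  double next = old + amount;
                  // Retry if another thread changed the total in the meantime.
                  if (totalExposure.compareAndSet(old, next))
                      return next;
              }
          }
      }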
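A JDK-only sketch of the same compare-and-set loop, for readers without a third-party `AtomicDouble` on the classpath. It assumes the double can be stored as its raw bits in an `AtomicLong`. The class name `CasExposure` and the `runDemo` helper are hypothetical and exist only to exercise the loop from several threads.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a lock-free running total built on compare-and-set.
class CasExposure {
    // The double is kept as its long bit pattern so AtomicLong can CAS it.
    private final AtomicLong bits = new AtomicLong(Double.doubleToLongBits(0.0));

    double incrementBy(double amount) {
        while (true) {
            long oldBits = bits.get();
            double next = Double.longBitsToDouble(oldBits) + amount;
            // Succeeds only if no other thread updated the value since get().
            if (bits.compareAndSet(oldBits, Double.doubleToLongBits(next))) {
                return next;
            }
        }
    }

    double get() {
        return Double.longBitsToDouble(bits.get());
    }

    // Runs several threads against one instance and returns the final total.
    static double runDemo(int threads, int perThread, double amount) {
        CasExposure exposure = new CasExposure();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    exposure.incrementBy(amount);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return exposure.get();
    }
}
```

No update is lost under contention: a thread whose compare-and-set fails simply re-reads the total and tries again, and no thread ever blocks on a lock.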

  17. Synchronization-free shared state [Figure: timeline with repeated “CAS” operations]

  18. Horizontal Scalability Translated into Java: Scaling Across Machines

  19. Horizontal Scaling: Add More Servers  All doing the same thing  Partitioned by infrastructure layer  Partitioned by application role  Partitioned along data graph boundaries

  20. Build a Farm [Diagram: a load balancer in front of a set of identical App/OS servers]

  21. Slow Down [Diagram: Web, EJB, and App boxes, each on its own OS, with timings of 237ms and 983ms]

  22. Divide and Conquer  Old as time itself • mail, news, telnet all on different servers  You use partitioning every day • Telephone call routing • ATM card transactions • Stock markets • Elevator banks

  23. Break Up Stateful Services [Diagram: four Apps/OS servers labeled “Worldwide Trade Execution, Clearing, Position Analysis”]

  24. Partition Along Application Boundaries [Diagram: Trade Execution, Trade Clearing, and Position Analysis, each with its own Apps/OS servers]

  25. Partition along data set “fault lines” [Diagram: US, Asia, and Europe, each with its own Apps/OS servers]
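One way to picture partitioning along regional fault lines is a router that first picks the region's server pool and then a server within it. This is a sketch under assumptions, not something from the slides: the `RegionRouter` class, the host names, and hashing on an account id are all hypothetical.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: route each request to the server pool that owns its region's data.
class RegionRouter {
    enum Region { US, ASIA, EUROPE }

    private final Map<Region, List<String>> pools = new EnumMap<>(Region.class);

    RegionRouter() {
        // Invented host names; a real deployment would load these from configuration.
        pools.put(Region.US, Arrays.asList("us-app-1", "us-app-2"));
        pools.put(Region.ASIA, Arrays.asList("asia-app-1", "asia-app-2"));
        pools.put(Region.EUROPE, Arrays.asList("europe-app-1", "europe-app-2"));
    }

    // The region picks the partition; the account id picks a server inside it.
    String route(Region region, String accountId) {
        List<String> pool = pools.get(region);
        int index = Math.floorMod(accountId.hashCode(), pool.size());
        return pool.get(index);
    }
}
```

Because the same account always maps to the same host, state for that account never has to be shared across regions.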

  26. Asynchrony in Java  Java is a mostly synchronous environment  Business algorithms often aren’t  Take advantage of this where possible • JMS message queues • java.util.concurrent.ExecutorService • commonj.work.WorkManager • Scheduled jobs
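As a rough illustration of handing work to `java.util.concurrent.ExecutorService`, the sketch below submits each trade as a separate task and collects the results afterwards. The `AsyncSettlement` class, the `settle` method, and the pool size of 2 are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: run a slow step asynchronously and gather results later.
class AsyncSettlement {
    // Stand-in for a slow operation that does not need to block the caller.
    static String settle(String tradeId) {
        return "settled:" + tradeId;
    }

    static List<String> settleAll(List<String> tradeIds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Submit everything first so the tasks can overlap.
            List<Future<String>> pending = new ArrayList<>();
            for (String id : tradeIds) {
                Callable<String> task = () -> settle(id);
                pending.add(pool.submit(task));
            }
            // Collect in submission order; get() waits for each task to finish.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                try {
                    results.add(future.get());
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real application the caller would usually do other work between submitting and collecting, which is where the benefit of asynchrony comes from.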

  27. Async Tasks and Resource Utilization  Good JMS servers / ExecutorServices / WorkManagers do resource tuning and optimization • Limit threads allocated to async processing • Configure priority of async vs. sync (i.e., HTTP request) [Chart: resource utilization (0 to 100) for “Trade Execution and Strategy Definition” and “Trade Settlement Strategy Analysis”, annotated “async tasks throttled” and “async task backlog handled”]
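Limiting the threads given to asynchronous work can be sketched with a fixed-size pool: extra tasks wait in the pool's queue instead of competing with synchronous request handling. The `ThrottledAsync` class and the 5 ms sleep that stands in for real work are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: cap the number of threads used for async tasks.
class ThrottledAsync {
    // Runs `tasks` jobs on at most `maxThreads` threads and returns the
    // highest number of jobs that were ever running at the same time.
    static int peakConcurrency(int maxThreads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(5); // stand-in for real async work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    running.decrementAndGet();
                    done.countDown();
                });
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }
}
```

However many tasks are queued, the measured peak never exceeds the pool size, so the backlog is worked off without starving other threads.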

  28. Adapt Requirements to Concurrency  Identify slow-running / expensive parts of the user experience  Work with requirements team to replace these with asynchronous processes • Website usage statistics generated nightly instead of on-demand • Dynamic PDF delivery via email instead of embedded web content

  29. Starting from Scratch

  30. Choose Your Toolset  Java makes synchronization easy • ... but synchronization != scalability  Other languages avoid shared state • Rely on message-passing instead

  31. Erlang: Functional, Asynchronous, Mature  Designed for concurrency in the language • Parallel execution • Intrinsic hot-redeploy • State can only be assigned once  Communication happens via message-passing between actors • No threads, no shared state! • JMS-like behavior; language-native syntax
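The actor idea can be approximated in Java for comparison, although this is only an imitation of what Erlang provides natively. In the sketch below, one thread owns the state and everything else talks to it through a mailbox queue. The `PositionActor` class and its message shape are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: state owned by one thread, reached only through messages.
class PositionActor {
    // A message either adds an amount or, when reply is set, asks for the total.
    private static final class Message {
        final double amount;
        final CompletableFuture<Double> reply;

        Message(double amount, CompletableFuture<Double> reply) {
            this.amount = amount;
            this.reply = reply;
        }
    }

    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;

    PositionActor() {
        worker = new Thread(() -> {
            double total = 0; // never visible to any other thread
            try {
                while (true) {
                    Message message = mailbox.take();
                    if (message.reply != null) {
                        message.reply.complete(total);
                    } else {
                        total += message.amount;
                    }
                }
            } catch (InterruptedException e) {
                // Interrupted by stop(): leave the loop and end the thread.
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    void add(double amount) {
        mailbox.add(new Message(amount, null));
    }

    double total() {
        CompletableFuture<Double> reply = new CompletableFuture<>();
        mailbox.add(new Message(0, reply));
        return reply.join();
    }

    void stop() {
        worker.interrupt();
    }
}
```

Callers never touch the total directly, so no lock or compare-and-set is needed around it; ordering comes from the mailbox.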

  32. Scala: Functional Programming for the JVM  Java-integrated • Designed by Java stalwart Martin Odersky  JVM-optimized  Supports Erlang-style concurrency

  33. Compute Grids  Federate your data around a cluster  Decompose your algorithm into serializable work items  Let the compute grid send your work items to the data
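The three steps on this slide can be sketched without any grid product: data is split into partitions, the work item is serializable, and the item is shipped to each partition rather than the data being pulled to the caller. Everything here is an assumption for illustration, including the `MiniGrid` class, the `WorkItem` interface, and the in-memory serialization round trip that stands in for a network hop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: send a serializable work item to each data partition.
class MiniGrid {
    interface WorkItem extends Serializable {
        double run(List<Double> localData);
    }

    // Example work item: sum the exposure values held by one partition.
    static final class SumExposure implements WorkItem {
        private static final long serialVersionUID = 1L;

        @Override
        public double run(List<Double> localData) {
            double sum = 0;
            for (double value : localData) {
                sum += value;
            }
            return sum;
        }
    }

    // Simulates shipping the item to another node by serializing and
    // deserializing it in memory.
    static WorkItem ship(WorkItem item) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(item);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (WorkItem) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs a shipped copy of the item against every partition and combines the results.
    static double runEverywhere(WorkItem item, Map<String, List<Double>> partitions) {
        double total = 0;
        for (List<Double> data : partitions.values()) {
            total += ship(item).run(data);
        }
        return total;
    }
}
```

Only the small work item and the per-partition result cross the simulated wire; the partition data itself stays where it lives.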

  34. Decision Factors  What are your application requirements? • How many concurrent operations? • How big of a workload? • What sorts of SLAs?  Tolerance of deployment complexity? • How about your operations, QA teams?

  35. Recap
       Concepts • Scalability • Bottlenecks • Synchronization • Asynchrony vs. concurrency • Compare-and-set • Application Partitioning • Synchronous tasks vs. asynchronous tasks
       Technology • java.util.concurrent • j.u.concurrent.atomic • Operation batching (Transactions, SQL) • JMS; Executor; WorkManager • Scala and Erlang • Hibernate Shards • OpenJPA Slice

  36. Questions Patrick Linskey pcl@apache.org
