
High Performance Network Applications in the Capital Markets


  1. High Performance Network Applications in the Capital Markets
     Todd L. Montgomery, VP Architecture, Messaging Business Unit (@toddlmontgomery)

  2. Why do Developers use Messaging? Message-Oriented Middleware (MOM)
     • Abstraction (Pub-Sub, Req/Resp, Queuing)
       • Separate physical systems from communication
       • Easily modify logic and scale applications
     • Functionality
       • Guaranteed delivery, fault tolerance, load balancing…
     • Efficiency
       • Well-designed messaging systems reduce infrastructure
     • Leverage broad, deep and detailed expertise
       • Focus on core competencies, faster time-to-market

  3. Market Data Growth: Data Deluge
     > 1 Terabyte of Data per Day
     [Chart: aggregated one-minute peak messages-per-second rates (in thousands) for Arca, CTS, CQS, OPRA, NQDS, Dec-00 through Dec-11; series: Total, Options, Equities; largest value shown: 7,174]

  4. The Trader: Why Latency Matters
     [Diagram: a fast path and a slow path from a common starting line, INFA at 40.00]
     • Fast (Ultra Messaging): Market Data → Feed Handler → Execution. Got INFA at 40.00.
     • Slow (TIBCO RV and EMS): Market Data → Feed Handler → Execution. Got INFA at 41.00. You Lost!

  5. The Exchange: Why Latency Matters
     [Diagram: a Trader exchanging Order, Acknowledge, and Cancel messages with exchanges Alpha (Ultra Messaging), Tango (TIBCO RV / EMS), and Hotel (Homegrown)]
     You Both Lost!

  6. (Ultra) Low Latency Timeline: Race to Zero
     Less than 8 years, 10,000x-100,000x decrease!
     Application-to-application latency:
     • ≤2003 (Ethernet): 4-5 ms
     • 2004 (Ethernet): 200 μs
     • 2008 (IB): 10 μs
     • 2010 (10G, IB): 2 μs
     • 2010 (IPC): <400 ns
     • 2011 (ITC): <50 ns
     IPC/ITC only; limited by CPU!
     Predictions – Technology
     • <1 μs Eth (2012)
     • <500 ns Eth (2015)
     • <100 ns Eth (2020)
     Predictions – Technique
     • <100 ns IPC (2012)
     • 1G mps ITC (2012)

  7. Legacy Messaging Designs: Before 2004
     • Daemon-Based Design: 6 Data Hops
     • Broker-Based Design: 4 Data Hops

  8. 2004 – Need for a State Change: More Efficient, More Scalable, More MORE …
     • Motivations / Challenges
       • Systems not scaling to today's (let alone tomorrow's!) demands
       • Systems not resilient to failure
     • Trends
       • Need Efficiency, Need Consolidation, More with Less, Need Competitive Advantage (No Vendor Innovation)
     • Broker-based Solutions are a Bottleneck
       • Broker is a source of contention that limits scaling
       • Broker failure is disastrous to latency and stability
     Remove the Broker from the Message Path!

  9. Shared Nothing Messaging: MOM for Today's Demands
     • Peer-to-Peer Messaging
       • No broker, no daemons
       • Direct connectivity between sources and receivers
     • Parallel Persistence
       • Broker out of message path and off to the side
       • Broker consulted only for recovery
     • Evolution of Queuing
       • Single Messaging API across all Use Cases
       • Source-based (vs. Immediate), Event Driven
       • No need for separate Queuing (or PTP) API

  10. Topic Resolution: Connecting Sources and Receivers (Peer-to-Peer)
     Traditionally, brokers handled the task of providing transparent connectivity between sources and receivers.
     Separate the message delivery path and the topic discovery mechanism!
     “Service” Location Paradigms
     • Static – manual, difficult scaling with topics
     • Server-based – (non)caching variants
     • Multicast – (un)reliable variants
     Avoid including topic string in each message!
     [Diagram: sources (S) and receivers (R) labeled with topics X, Y, and Z]
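The separation on slide 10 can be illustrated with a small sketch. It is not the product's API: `advertisement`, `resolver_cache`, `resolver_add`, `resolver_lookup`, `msg_header`, and the example transport string are all hypothetical names chosen for illustration. The idea shown is that the topic string is exchanged once during discovery, while every data message carries only a compact numeric id.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOPICS 16

/* Hypothetical advertisement learned during topic resolution. */
typedef struct {
    char     topic[32];     /* topic string, exchanged only during discovery */
    char     transport[32]; /* where the source can be reached (assumed format) */
    uint32_t topic_id;      /* compact id carried in every data message */
} advertisement;

typedef struct {
    advertisement entries[MAX_TOPICS];
    size_t        count;
} resolver_cache;

/* Record an advertisement; returns 0 on success, -1 if the cache is full. */
int resolver_add(resolver_cache *c, const char *topic,
                 const char *transport, uint32_t topic_id)
{
    if (c->count >= MAX_TOPICS) return -1;
    advertisement *a = &c->entries[c->count++];
    snprintf(a->topic, sizeof a->topic, "%s", topic);
    snprintf(a->transport, sizeof a->transport, "%s", transport);
    a->topic_id = topic_id;
    return 0;
}

/* Receiver side: look the topic up once, at subscribe time. */
const advertisement *resolver_lookup(const resolver_cache *c, const char *topic)
{
    for (size_t i = 0; i < c->count; i++)
        if (strcmp(c->entries[i].topic, topic) == 0)
            return &c->entries[i];
    return NULL;
}

/* Data-path header: fixed size, no topic string. */
typedef struct {
    uint32_t topic_id;
    uint32_t length;
} msg_header;
```

A receiver would call `resolver_lookup` when it subscribes, connect to the advertised transport, and from then on match incoming messages on `topic_id` alone.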

  11. Data Transport Choices: Customization of Connectivity
     • Transport Types – No One Size Fits All!
       • Unicast (Optimize for single receivers)
         • TCP (with varying buffering behaviors), Reliable Unicast (without congestion control)
       • Multicast (Optimize for multiple receivers)
         • (Un)Reliable Multicast (NAK-based)
       • Intra-Host (Optimize for lowest latency)
         • IPC (Shared Memory), Inter-Thread (ITC)
     • Source Configuration
       • Runtime choice

  12. Less Controlled Infrastructures: Architecture for Conflation and Rate Adaptation
     All Receivers are Not Equal!
     • Desktops
     • Web (HTML5/WebSockets Ideally)
     • Mobile Apps
     Rate Adaptation
     • “Non-”Intelligent Data Drops …
       • Tail, Oldest, Head, etc.
     • Per-Topic vs. Per-Receiver vs. Per-Connection
     Conflation
     • Conflate Data from multiple buffered messages into one
     • Data Representation Specific
     Need Per-Receiver backpressure in order to adapt. TCP provides ideal flow and congestion control in these environments and thus ideal backpressure signaling.
     [Diagram: source S feeding a DS component that serves receivers R over TCP]
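One simple form of conflation is last-value conflation, sketched below under stated assumptions: one slot per (receiver, topic) pair, a newer update simply replaces an undelivered older one, and delivery happens when the receiver's connection can accept data again. The names `conflation_slot`, `slot_offer`, and `slot_take` are hypothetical. Real conflation is data-representation specific, as the slide notes, and may merge fields rather than overwrite.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 64

/* One conflation slot per (receiver, topic): holds only the newest update. */
typedef struct {
    char     payload[MAX_PAYLOAD];
    size_t   length;
    int      pending;  /* 1 if an undelivered update is buffered */
    unsigned dropped;  /* older updates overwritten before delivery */
} conflation_slot;

/* Called for each update from the source. Overwrites any undelivered
 * update. Returns -1 if the update does not fit in the slot. */
int slot_offer(conflation_slot *s, const void *data, size_t length)
{
    if (length > MAX_PAYLOAD) return -1;
    if (s->pending) s->dropped++;
    memcpy(s->payload, data, length);
    s->length = length;
    s->pending = 1;
    return 0;
}

/* Called when the receiver can accept data again (backpressure released).
 * Copies out the newest update; returns its length, or 0 if nothing is
 * pending or the output buffer is too small. */
size_t slot_take(conflation_slot *s, void *out, size_t out_size)
{
    if (!s->pending || s->length > out_size) return 0;
    memcpy(out, s->payload, s->length);
    s->pending = 0;
    return s->length;
}
```

With three price updates offered while a receiver is blocked, a single `slot_take` returns only the latest price and `dropped` records that two older updates were conflated away.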

  13. Traditional Persistence: Store and Forward Architecture
     Receiver/Delivery Durability
     • Receiver can crash or go down gracefully without loss of messages upon restart
     • Recovery is the act of restarting and recovering missed messages
     • Durability can be extended to Sources also
     Brokered Architecture Limits
     • Broker is point of contention
     • Slow receiver impacts source and, more importantly, other receivers
     • Broker typically SAN backed (scaling limited)
     • Recovery is “pushed” to receiver by broker
     Deployments can only scale by adding brokers and splitting the topic space.
     [Diagram: source S sending through a Broker to receivers R]

  14. Parallel Persistence: Durable Delivery without Penalty
     Store not in the Message Path
     • Stores receive data in parallel to receivers
     • Consumption Feedback (ACKs) are out-of-band
     • Recovery can occur in parallel to “live” data delivery
     Receiver-driven recovery
     • Receivers pull data from stores
     • Stores maintain much less state and do much less
       • No need to track receiver recovery, for example
     • Recovery does not impact source or other non-recovering receivers
     • Dissemination from source to stores and receivers uses normal peer-to-peer messaging
     Store ≠ Broker: Stores do less work, maintain less state, and can scale!
     [Diagram: source S sending to three Stores and to receivers R in parallel; receivers send Consumption Information to the Stores]

  15. Quorum: Shared Nothing Approach to Persistence
     Resiliency
     • Avoids “Split-Brain” (majority must be reachable post failure)
     • Stores persist locally and independently
     • Only need Quorum (majority) to withstand failure of minority
     • Zero Latency Failover – no need to stop or change behavior
     Performance
     • Per-Message Striping (+50% per store as shown)
     [Diagram: source S sending Messages 1, 2, 3, 4, … striped across Store 1, Store 2, and Store 3]
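The majority rule on slide 15 reduces to a little arithmetic, sketched here. `quorum_size` and `is_stable` follow directly from "majority". The striping rule in `store_takes` is an assumption made for illustration, since the slide's diagram is not recoverable: each message goes to a majority of consecutive stores starting at `seq % n_stores`. With three stores that gives each store two of every three messages, which is one reading of "+50% per store".

```c
#include <stddef.h>

/* Smallest majority of n_stores. */
size_t quorum_size(size_t n_stores)
{
    return n_stores / 2 + 1;
}

/* A message counts as stable once a majority of stores has acknowledged it. */
int is_stable(size_t acks, size_t n_stores)
{
    return acks >= quorum_size(n_stores);
}

/* Assumed striping rule (not taken from the slides): message `seq` is
 * written to quorum_size(n_stores) consecutive stores, starting at store
 * seq % n_stores. Returns 1 if `store` should persist message `seq`.
 * Requires n_stores > 0. */
int store_takes(unsigned long seq, size_t store, size_t n_stores)
{
    size_t first  = (size_t)(seq % n_stores);
    size_t offset = (store + n_stores - first) % n_stores;
    return offset < quorum_size(n_stores);
}
```

Under this rule with three stores, message 0 goes to stores 0 and 1, message 1 to stores 1 and 2, and message 2 to stores 2 and 0.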

  16. Consensus: Receiver Recovery and Arbitration
     Receiver Recovery
     • Receivers ask Stores for message consumption status and take majority or highest (arbitration)
     • Receivers “pull” messages from stores
       • Load balancing across Stores to spread out impact of recovery
       • Rate of recovery up to individual receivers
       • Rate of recovery not bound by individual store
     • Handling the “live” stream from the Source
       • Ignore it or Buffer it (up to individual receiver)
       • Seamless cutover from recovery to live
     • Source too fast?
       • Receiver can ignore live stream and pull from stores at slower pace
     [Diagram: source S, three Stores, one Live receiver R and one Recovering receiver R]
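The arbitration step on slide 16 ("take majority or highest") can be sketched as two small functions over the sequence numbers each store reports as last consumed. All names here are hypothetical, and the round-robin retrieval policy in `store_for_retrieval` is an assumption used to illustrate spreading recovery load, not a description of the actual product.

```c
#include <stddef.h>
#include <stdint.h>

/* Highest sequence number reported by any store (0 if no store replied). */
uint64_t arbitrate_highest(const uint64_t *reported, size_t n)
{
    uint64_t best = 0;
    for (size_t i = 0; i < n; i++)
        if (reported[i] > best) best = reported[i];
    return best;
}

/* Value reported by a strict majority of stores. Returns 1 and writes
 * *out if such a value exists, otherwise returns 0 and leaves *out alone. */
int arbitrate_majority(const uint64_t *reported, size_t n, uint64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        size_t votes = 0;
        for (size_t j = 0; j < n; j++)
            if (reported[j] == reported[i]) votes++;
        if (votes > n / 2) {
            *out = reported[i];
            return 1;
        }
    }
    return 0;
}

/* Assumed policy: pull message `seq` from store seq % n so that recovery
 * traffic is spread across stores. Requires n > 0. */
size_t store_for_retrieval(uint64_t seq, size_t n)
{
    return (size_t)(seq % n);
}
```

A receiver might try `arbitrate_majority` first and fall back to `arbitrate_highest` when the stores disagree, then pull the missing range at its own pace.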

  17. Messaging API – Sending: Simplifying the Semantics – Publish/Subscribe
     Immediate Sends
       send("topic A", data, length);
       send("topic B", dataB, lengthB);
       …
     JMS: Create MessageProducer without Destination and specify Destination on each send
     Source-Based Sends
       srcA = create_src("topic A");
       srcB = create_src("topic B");
       …
       send(srcA, data, length);
       send(srcB, dataB, lengthB);
       …
       delete_src(srcA);
       delete_src(srcB);
     JMS: Create Topic and TopicPublisher
     Source-Based APIs can leverage Topic Resolution in order to reduce message path latency.

  18. Messaging API – Receiving: Simplifying the Semantics – Publish/Subscribe
     Event-Driven Reception
     How do you handle receiving on thousands to millions of topics?
       int msg_proc(msg *m, void *cd)
       {
           /* handle m based on cd value (rA_state or rB_state) and/or m contents */
       }
       …
       rcv1 = create_rcv("topic A", msg_proc, rA_state);
       rcv2 = create_rcv("topic B", msg_proc, rB_state);
       …
     JMS: Create Topic and TopicSubscriber, attach MessageListener
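To make the event-driven pattern on slide 18 concrete, here is a minimal sketch of what might sit behind `create_rcv`. The slide only names `msg_proc` and `create_rcv`; the `msg` layout, the `rcv` table, and `dispatch` are assumptions for illustration, and the linear scan would need to become a hash lookup (or an id-indexed table) to cope with thousands to millions of topics.

```c
#include <stddef.h>
#include <string.h>

/* Assumed message layout; the slide does not define it. */
typedef struct {
    const char *topic;
    const void *data;
    size_t      length;
} msg;

typedef int (*msg_cb)(msg *m, void *cd);

typedef struct {
    const char *topic;
    msg_cb      cb;
    void       *cd;   /* per-receiver client data handed back to the callback */
} rcv;

#define MAX_RCVS 8
static rcv    rcv_table[MAX_RCVS];
static size_t rcv_count;

/* Register a callback plus client data for a topic; NULL if the table is full. */
rcv *create_rcv(const char *topic, msg_cb cb, void *cd)
{
    if (rcv_count >= MAX_RCVS) return NULL;
    rcv *r = &rcv_table[rcv_count++];
    r->topic = topic;
    r->cb = cb;
    r->cd = cd;
    return r;
}

/* Called by a (hypothetical) transport layer for each arriving message.
 * Returns the number of callbacks invoked. */
int dispatch(msg *m)
{
    int invoked = 0;
    for (size_t i = 0; i < rcv_count; i++) {
        if (strcmp(rcv_table[i].topic, m->topic) == 0) {
            rcv_table[i].cb(m, rcv_table[i].cd);
            invoked++;
        }
    }
    return invoked;
}

/* One callback shared by many topics; per-topic state arrives through cd.
 * Here the state is just a message counter. */
int msg_proc(msg *m, void *cd)
{
    unsigned *counter = cd;
    (void)m;
    (*counter)++;
    return 0;
}
```

The application never polls per topic: it registers once, and the same `msg_proc` serves every topic, distinguishing them by the client-data pointer.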

  19. Queuing Semantics: Load Balancing + De-Coupling
     • What semantics are needed for Queuing?
       • Load Balancing (Once-and-Only-Once)
       • Decoupling
         • Source Rate vs. Receiver Consumption Rate
         • Source Lifetime vs. Receiver Lifetime
     • What APIs are needed for Queuing?
       • JMS has the Point-to-Point API
       • PTP and Pub/Sub share most calls and interfaces
     Does this need to be different from Pub/Sub?!?

  20. Queuing is Dead, Long Live Queuing! No Need For Point-to-Point to be Different
     Replace “Queue” with “Topic”
     • Queuing: Sources send to Queues; Receivers receive from Queues
     • Publish/Subscribe: Sources send to Topics; Receivers receive from Topics
     Single Semantic – Publish/Subscribe
     • A queue can be considered a topic
     • Need Load Balancing per topic
     • Need Rate and Lifetime Decoupling per topic
     Point-to-Point API – Redundant
     • Subsume the PTP receive call into Pub/Sub

  21. Persistence + Queuing Semantics: Load Balancing + De-Coupling
     Load Balancing
     • Assignment separate from Data Dissemination
     • Source Assigned
       • Receivers up-to-pace
       • Consumption can backpressure source
     • Store Assigned
       • Receivers request messages (i.e. pull)
       • Assignments sent out-of-band from Data
     Rate and Lifetime Decoupling
     • Already Done by Parallel Persistence!
     [Diagram: source S, three Stores, and receivers R exchanging Assignment and Consumption information]
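The "Source Assigned" variant on slide 21 can be sketched as a windowed round-robin assigner. This is an illustrative assumption, not the product's algorithm: each message is assigned to exactly one receiver, each receiver may hold at most `window` unconsumed messages, and when every receiver is at its window the source is told to wait, which is how consumption can backpressure the source. `assigner`, `assign_next`, and `on_consumed` are hypothetical names.

```c
#include <stddef.h>

#define MAX_RECEIVERS 8

typedef struct {
    size_t   n_receivers;
    unsigned outstanding[MAX_RECEIVERS]; /* assigned but not yet consumed */
    unsigned window;                     /* max outstanding per receiver */
    size_t   next;                       /* round-robin cursor */
} assigner;

/* Pick a receiver for the next message (once-and-only-once assignment).
 * Returns the receiver index, or -1 when every receiver is at its window,
 * which the source treats as backpressure. */
int assign_next(assigner *a)
{
    for (size_t tried = 0; tried < a->n_receivers; tried++) {
        size_t r = (a->next + tried) % a->n_receivers;
        if (a->outstanding[r] < a->window) {
            a->outstanding[r]++;
            a->next = (r + 1) % a->n_receivers;
            return (int)r;
        }
    }
    return -1;
}

/* Consumption feedback from receiver r frees one slot in its window. */
void on_consumed(assigner *a, size_t r)
{
    if (r < a->n_receivers && a->outstanding[r] > 0)
        a->outstanding[r]--;
}
```

Only the assignment decision lives here; the data itself would still travel over the normal peer-to-peer path, keeping assignment separate from dissemination as the slide describes.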

