NUMA-Friendly Stack (using Delegation and Elimination) - Irina Calciu


  1. NUMA-Friendly Stack (using Delegation and Elimination). Irina Calciu, Justin Gottschlich, Maurice Herlihy. HotPar '13

  2. Trends for Future Architectures

  3. Uniform Memory Access (UMA)

  4. Non-Uniform Memory Access (NUMA). [Diagram: four NUMA nodes, each with multiple cores and a shared last-level cache, connected by an interconnect.] Cache coherency is maintained between caches on different NUMA nodes.

  5. Overview
     • Motivation
     • Algorithms
     • Results
     • Conclusions

  6. Delegation. [Diagram: clients on NUMA node 0 and NUMA node 1, one server thread, and a sequential stack (SEQ STACK).]

  7. Delegation. [Diagram: each NUMA node has its own set of slots; Clients 1-4 are on NUMA node 0 and Clients 5-8 on NUMA node 1. The server loops through all slots and operates on the sequential stack (SEQ STACK).]
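The slot layout on slides 6 and 7 can be sketched as follows. This is a minimal single-threaded model, not the authors' implementation: the names `Slot`, `DelegatedStack`, and `serve_once` are hypothetical, and one slot per client is an assumption made to keep the example small. It shows only the server's scan order and how a request in a slot becomes a response.

```python
from typing import Any, List, Optional, Tuple


class Slot:
    """One mailbox (hypothetical name): a client writes request, the server writes response."""

    def __init__(self) -> None:
        self.request: Optional[Tuple[str, Any]] = None    # ("push", value) or ("pop", None)
        self.response: Optional[Tuple[bool, Any]] = None  # (ok, value)


class DelegatedStack:
    """Sequential stack touched only by the server; slots are grouped by NUMA node."""

    def __init__(self, nodes: int, clients_per_node: int) -> None:
        self.items: List[Any] = []  # the sequential stack
        self.slots = [[Slot() for _ in range(clients_per_node)] for _ in range(nodes)]

    def serve_once(self) -> int:
        """One server pass over every slot; returns the number of requests served."""
        served = 0
        for node_slots in self.slots:      # node by node
            for slot in node_slots:        # slot by slot within a node
                if slot.request is None:
                    continue               # empty slot, nothing to do
                op, value = slot.request
                if op == "push":
                    self.items.append(value)
                    slot.response = (True, None)
                elif self.items:
                    slot.response = (True, self.items.pop())
                else:
                    slot.response = (False, None)  # pop on an empty stack
                slot.request = None        # slot is free for the next request
                served += 1
        return served
```

In this model a client on node 1 that posts a pop after a client on node 0 has posted a push receives the pushed value on the same pass, because the server reaches node 0's slots first.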

  8. Elimination, Rendezvous

  9. Local Rendezvous. [Diagram: NUMA node 0, NUMA node 1, and the stack.]

  10. Delegation + Elimination. [Diagram: clients on NUMA node 0 and NUMA node 1, one server thread, and a sequential stack (SEQ STACK).]

  11. Delegation + LOCAL Elimination. [Diagram: clients on NUMA node 0 and NUMA node 1, one server thread, and a sequential stack (SEQ STACK).]
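To illustrate what "local" elimination means, here is a rough batch model. It is a sketch under stated assumptions: a real implementation pairs threads concurrently through a rendezvous structure, while this version takes a snapshot of pending operations per node. The function name `eliminate_locally` and the `Op` tuple format are hypothetical. A push and a pop cancel only when both come from the same NUMA node; everything left over would go to the server.

```python
from typing import Any, Dict, List, Tuple

Op = Tuple[str, Any]  # ("push", value) or ("pop", None); format assumed for this sketch


def eliminate_locally(
    pending: Dict[int, List[Op]],
) -> Tuple[List[Tuple[int, Any]], Dict[int, List[Op]]]:
    """Pair pushes with pops on the same node only.

    Returns (pairs, leftovers): pairs lists (node, value) handed directly
    from a push to a pop; leftovers maps node -> operations that still
    need the server.
    """
    pairs: List[Tuple[int, Any]] = []
    leftovers: Dict[int, List[Op]] = {}
    for node, ops in pending.items():
        pushes = [value for op, value in ops if op == "push"]
        pops = sum(1 for op, _ in ops if op == "pop")
        matched = min(len(pushes), pops)  # each pair needs one push and one pop
        pairs.extend((node, value) for value in pushes[:matched])
        leftovers[node] = (
            [("push", value) for value in pushes[matched:]]
            + [("pop", None)] * (pops - matched)
        )
    return pairs, leftovers
```

With two pushes and one pop pending on node 0 and one pop pending on node 1, only one pair is eliminated: the spare push on node 0 is not matched with the pop on node 1, so both go to the server. The more unbalanced the mix, the fewer pairs can form.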

  12. Effect of Elimination. [Figure: throughput (higher is better); panels: 90% push / 10% pop and 50% push / 50% pop.]

  13. Effect of Delegation. [Figure: throughput (higher is better); panels: 90% push / 10% pop and 50% push / 50% pop.]

  14. Number of Slots. [Figure: throughput (higher is better); panels: 90% push / 10% pop and 50% push / 50% pop.]

  15. Workloads: Balanced vs. Unbalanced. [Figure: throughput (higher is better); panels: 70% push / 30% pop and 50% push / 50% pop.]

  16. Advantages
      • Memory and cache locality
      • Reduced bus traffic
      • Increased parallelism through elimination

  17. Drawbacks
      • Communication cost between clients and the server thread
        o Insignificant compared to the benefits
      • Serializing otherwise parallel data structures
        o Parallelism through elimination
      • Elimination opportunities decrease as the workload becomes more unbalanced

  18. Open Questions
      • Are there other data structures where we can use delegation and elimination?
      • Are there data structures where direct access is much better?
      • What can we do for those data structures?

  19. Thank you! Questions?

  20. References
      • A Scalable Lock-free Stack Algorithm: http://www.inf.ufsc.br/~dovicchi/pos-ed/pos/artigos/p206-hendler.pdf
      • Flat Combining and the Synchronization-Parallelism Tradeoff: http://www.cs.bgu.ac.il/~hendlerd/papers/flat-combining.pdf
      • Fast and Scalable Rendezvousing: http://www.cs.tau.ac.il/~afek/rendezvous.pdf

  21. Cache to Cache Traffic. [Figure; an arrow marks the better direction.]

  22. Coefficient of Variation. [Figure; an arrow marks the better direction.]
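For reference, the coefficient of variation is the standard deviation divided by the mean. The slide does not state which quantity it is taken over; assuming, purely for illustration, per-thread throughput values x_1, ..., x_n, the standard definition reads:

```latex
c_v = \frac{\sigma}{\mu},
\qquad
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x_i - \mu\right)^2}
```

Under that assumed reading, a smaller value would mean the work is spread more evenly across threads.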

  23. Flat Combining

  24. Delegation (server and client steps, in time order)
      SERVER: loop through all slots; if a slot has a message: take message, process message, send response.
      CLIENT: find corresponding slot (by NUMA node and cpuid); post message; wait for response; get response.
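The handshake on slide 24 can be sketched with two threads. This is an illustrative model rather than the authors' code: it uses Python `threading.Event` objects where a native implementation would more likely spin on shared memory, and it assumes one client per slot. The names `Slot`, `client_call`, and `server_loop` are hypothetical.

```python
import threading
from typing import Any, List


class Slot:
    """Mailbox shared by one client and the server (hypothetical layout)."""

    def __init__(self) -> None:
        self.request: Any = None
        self.response: Any = None
        self.has_request = threading.Event()
        self.has_response = threading.Event()


def client_call(slot: Slot, op: str, value: Any = None) -> Any:
    slot.request = (op, value)
    slot.has_request.set()       # post message
    slot.has_response.wait()     # wait for response
    slot.has_response.clear()
    return slot.response         # get response


def server_loop(slots: List[Slot], stack: List[Any], stop: threading.Event) -> None:
    while not stop.is_set():
        for slot in slots:                   # loop through all slots
            if not slot.has_request.is_set():
                continue                     # no message in this slot
            slot.has_request.clear()
            op, value = slot.request         # take message
            if op == "push":                 # process message
                stack.append(value)
                result = None
            else:
                # None doubles as "stack was empty" in this sketch
                result = stack.pop() if stack else None
            slot.response = result
            slot.has_response.set()          # send response
```

Only the server thread ever touches `stack`, so the stack itself needs no synchronization; the per-slot events are the only communication between threads.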

  25. Delegation (with elimination; server and client steps, in time order)
      SERVER: loop through all slots; if a slot has a message: take message, process message, send response.
      CLIENT: find corresponding slot (by NUMA node and cpuid); try_elimination: if (eliminate) return; post message; wait for response; get response; else try_elimination.

  26. Delegation (with elimination and slot lock; server and client steps, in time order)
      SERVER: loop through all slots; if a slot has a message: take message, process message, send response.
      CLIENT: find corresponding slot (by NUMA node and cpuid); try_elimination: if (eliminate) return; if (acquire slot lock): post message, wait for response, get response, release slot lock; else try_elimination.
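The client-side control flow on slide 26 can be sketched as a retry loop. This is a sketch under assumptions: `client_op` is a hypothetical name, and the elimination attempt and the delegation round trip are passed in as callables (`try_eliminate`, `delegate`) so that only the control flow is shown. The slot lock is modeled with a non-blocking `threading.Lock.acquire`.

```python
import threading
from typing import Any, Callable, Tuple


def client_op(
    op: Any,
    try_eliminate: Callable[[Any], Tuple[bool, Any]],
    slot_lock: threading.Lock,
    delegate: Callable[[Any], Any],
) -> Tuple[str, Any]:
    """Alternate between elimination and delegation until one succeeds.

    try_eliminate(op) returns (eliminated, result).
    delegate(op) posts the message, waits, and returns the server's response.
    """
    while True:
        eliminated, result = try_eliminate(op)
        if eliminated:
            return ("eliminated", result)      # never touched the server
        if slot_lock.acquire(blocking=False):  # try to acquire the slot lock
            try:
                return ("delegated", delegate(op))
            finally:
                slot_lock.release()            # release slot lock
        # Slot is busy with another client: go back and try elimination again.
```

The non-blocking acquire is the point of the design as drawn on the slide: a client that finds its slot occupied does not wait on the lock but spends that time looking for an elimination partner.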

  27. Open Questions
      • Performance
      • Scalability
      • Power
