Hornet: An Efficient Data Structure for Dynamic Sparse Graphs and Matrices

  1. Hornet: An Efficient Data Structure for Dynamic Sparse Graphs and Matrices
     Oded Green

  2. Hornet
     • A scalable and dynamic data structure for
       – Sparse data
       – Graph algorithms
       – Linear algebra based problems
     • Formerly known as cuSTINGER
       – Hornet initialization is hundreds of times faster
       – Hornet updates are 4X-10X faster
       – The Hornet data structure is more robust and more scalable than cuSTINGER
     • Essentially a dynamic CSR data structure
     • Easy to use
     Oded Green, GTC-18

  3. “Separation of powers”
     • Dynamic graph data structure and dynamic graph algorithms are in two different repositories
       – Easy to integrate with external libraries
       – Can also be used with matrices
     • This talk focuses on the data structure

  4. Graph Primitives – Upfront summary
     • Great performance for static and dynamic graph algorithms
     • Scalable
     • Simple to use
     • Will discuss algorithm framework later today
       – 1:00pm
       – Same room as this talk

  5. Hornet – Upfront Summary
     • Can support over 150 million updates per second
     • Can easily scale to graphs with billions of vertices
     • CSR comparison
       – Initialization is relatively inexpensive: usually less than 3X slower
       – Hornet requires 30% more storage
       – Identical performance
     • COO (edge-list) comparison
       – Hornet requires 20% less storage
       – Hornet has better locality

  6. Big Data problems need Graph Analysis
     Communication networks:
     • World-wide connectivity
     • High velocity changes
     • Different types of extracted data:
       – Physical communication network
       – Person-to-person communication network
     Financial networks:
     • Transactions between players
     • Different transaction types (property graph)
     Health-care networks:
     • Various players
     • Pattern matching and epidemic monitoring
     • Problem sizes have doubled in the last 5 years.

  7. Hornet Properties
     ✓ A simple programming model
     ✓ Enables algorithm designers to implement dynamic & streaming graph algorithms with ease
     ✓ Can easily grow to 1000X its initial size (no restart needed)
     ✓ Millions of updates per second to the graph
     ✓ Updates are not bottlenecks for analytics
     ✓ Automated data management
     ✓ Transfers data between host and device automatically
     ✓ Reduces fragmentation
     ✓ Supports memory reclamation
     • Scalable data structure
     cuSTINGER paper: [Green & Bader, HPEC, 2016]: cuSTINGER: Supporting dynamic graph algorithms for GPUs

  8. Definitions
     • Dynamic graphs
       – Graph can change over time.
       – Changes can be to topology, edges, or vertices.
         • For example, new edges between two vertices.
       – Changes to edge or vertex weights
     • Streaming graphs:
       – Graphs changing at high rates.
       – Hundreds of thousands of updates per second.
     • Dynamic matrices
       – Adding a perturbation to the matrix

  9. Dynamic graph example
     • Only a subset of the entire graph…
     • Dynamic:
       – At time t:
         • v and w become friends.
         • insert_edge(v, w)
       – At time t̂:
         • u and v are no longer friends.
         • delete_edge(u, v)
     • Additional operations include vertex insertions & deletions
     [Figure: vertices u, v, and w]
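The edge insertion and deletion on this slide can be sketched with a small adjacency-set graph. This is only an illustration in Python, not Hornet's API: the class `DynamicGraph`, its method `has_edge`, and the string vertex labels are assumptions made for the example.

```python
from collections import defaultdict


class DynamicGraph:
    """Toy undirected dynamic graph (hypothetical class, not Hornet's API)."""

    def __init__(self):
        self.adj = defaultdict(set)  # vertex -> set of neighbors

    def insert_edge(self, a, b):
        # An undirected edge is stored in both directions.
        self.adj[a].add(b)
        self.adj[b].add(a)

    def delete_edge(self, a, b):
        # discard() does nothing if the edge is already absent.
        self.adj[a].discard(b)
        self.adj[b].discard(a)

    def has_edge(self, a, b):
        return b in self.adj.get(a, ())


g = DynamicGraph()
g.insert_edge("u", "v")  # starting state: u and v are friends
g.insert_edge("v", "w")  # first update: v and w become friends
g.delete_edge("u", "v")  # second update: u and v are no longer friends
```

After both updates the graph contains the edge between v and w but no longer the edge between u and v.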

  10. Widely used graph data structures

      Name                        Pros                            Cons
      Dense adjacency matrix      • Supports updates              • Poor locality
                                                                  • Massive storage requirements
      Linked lists                • Flexible                      • Poor locality
                                                                  • Limited parallelism
                                                                  • Allocation time is costly
      COO (edge list), unsorted   • Has some flexibility          • Poor locality
                                  • Updates are simple            • Stores both the source and destination
                                  • Lots of parallelism
      CSR                         • Uses exact amount of memory   • Inflexible
                                  • Good locality
                                  • Lots of parallelism

      These data structures don’t cut it

  11. Compressed Sparse Row (CSR)
      Pros:
      • Uses exactly the storage required
      • Great locality
        – Good for GPUs
      • Handful of arrays
        – Simple to use and manage
      Cons:
      • Inflexible
      • Network growth unsupported
      • Topology changes unsupported
      • Property graphs not supported

      Src/Row:     0  1  2  3  4  5  6  7
      Offset:      0  2  4  7  9 11 13 14 14
      Dest./Col.:  1  2  0  5  0  3  4  2  6  2  5  1  4  3
      Value:       2  5  2  7  4  1  4  1  2  4  1  7  1  2
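To make the "inflexible" point concrete, here is a minimal Python sketch that uses the offset and destination arrays from this slide. The helper names `neighbors` and `csr_insert_edge` are hypothetical, chosen for illustration. The sketch shows that a vertex's neighbors are a contiguous slice, and that inserting a single edge forces every later entry to move.

```python
# Arrays copied from the slide: 8 vertices, 14 stored edges.
offsets = [0, 2, 4, 7, 9, 11, 13, 14, 14]
dests = [1, 2, 0, 5, 0, 3, 4, 2, 6, 2, 5, 1, 4, 3]


def neighbors(v):
    """Neighbors of v are the slice dests[offsets[v]:offsets[v + 1]]."""
    return dests[offsets[v]:offsets[v + 1]]


def csr_insert_edge(src, dst):
    """Insert (src, dst) in place; return how many entries had to move.

    Hypothetical helper: every entry stored after src's row shifts by one
    slot and every later offset is incremented.
    """
    pos = offsets[src + 1]
    moved = len(dests) - pos
    dests.insert(pos, dst)
    for v in range(src + 1, len(offsets)):
        offsets[v] += 1
    return moved


# Degrees are differences of consecutive offsets.
degrees = [offsets[v + 1] - offsets[v] for v in range(8)]

# Adding one edge to vertex 0 moves the 12 entries that follow its row.
moved = csr_insert_edge(0, 7)
```

The cost of one insertion grows with the number of edges stored after the affected row, which is why a plain CSR layout does not suit high update rates.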

  12. Hornet – A High Level View
      [Figure: user interface with one entry per vertex, each pointing at its own edge block; blocks have over-allocated space at the end]
      Vertex Id:   0  1  2  3  4  5  6  7
      Used:        2  2  3  2  2  2  1  0
      Pointer:     (one per vertex, into the edge blocks)
      Dest./Col.:  3  1  2  0  5  2  6  2  5  1  4  0  3  4
      Value:       2  2  5  2  7  1  2  4  1  7  1  4  1  4
      • Every vertex points at its own array
      • Many edge arrays (blocks)
      • Block size is determined by the number of neighbors (always a power of 2)
      • Extra space is left at the end of the block
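The per-vertex blocks and the power-of-two rule can be sketched as follows. This is a simplified host-side model in Python, not Hornet's implementation: `block_capacity` and `BlockGraph` are hypothetical names, Python lists stand in for GPU memory, and giving an empty vertex a block of size 1 is an assumption. The adjacency lists are read off the CSR example used earlier in the talk.

```python
def block_capacity(degree):
    """Smallest power of two >= degree (assumed to be 1 for degree 0)."""
    cap = 1
    while cap < degree:
        cap *= 2
    return cap


class BlockGraph:
    """Per-vertex edge blocks with power-of-two capacity (illustration)."""

    def __init__(self, adjacency):
        self.used = [len(nbrs) for nbrs in adjacency]
        # Each block is padded with None up to its capacity.
        self.blocks = [
            list(nbrs) + [None] * (block_capacity(len(nbrs)) - len(nbrs))
            for nbrs in adjacency
        ]

    def insert_edge(self, src, dst):
        block = self.blocks[src]
        if self.used[src] == len(block):
            # Block is full: move the vertex to a block twice as large.
            block = block + [None] * len(block)
            self.blocks[src] = block
        block[self.used[src]] = dst
        self.used[src] += 1

    def neighbors(self, v):
        return self.blocks[v][: self.used[v]]


adjacency = [[1, 2], [0, 5], [0, 3, 4], [2, 6], [2, 5], [1, 4], [3], []]
g = BlockGraph(adjacency)
g.insert_edge(2, 7)  # vertex 2 uses 3 of 4 slots, so the edge fits in place
g.insert_edge(0, 7)  # vertex 0 uses 2 of 2 slots, so its block grows to 4
```

Only the affected vertex's block is touched by an insertion; the other vertices' edges stay where they are, in contrast to CSR.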

  13. Hornet – Property Graph Support
      [Figure: same layout as the high level view (Vertex Id 0-7, Used 2 2 3 2 2 2 1 0, Pointer), with additional per-edge rows]
      Dest./Col.:  3  1  2  0  5  2  6  2  5  1  4  0  3  4
      Weight:      2  2  5  2  7  1  2  4  1  7  1  4  1  4
      Further per-edge fields: Type, Time 1, User 1, User 2, …
      • Programmers can add fields per edge
      • Easy to manage for static graph data structures
      • Hornet manages the data movement
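One way to picture per-edge fields is as parallel arrays that must always move together. The Python sketch below is an assumption about the general idea, not Hornet's interface: `EdgeBlock` and its methods are hypothetical, and the `time` values are made up for illustration (the destinations and weights are those of vertex 2 in the slide's example).

```python
class EdgeBlock:
    """One vertex's edges stored as parallel arrays, one per field."""

    def __init__(self, field_names):
        self.fields = {name: [] for name in field_names}

    def append(self, **values):
        # Every edge supplies every field, so the arrays stay aligned.
        if set(values) != set(self.fields):
            raise ValueError("each edge must supply every field")
        for name, value in values.items():
            self.fields[name].append(value)

    def remove(self, index):
        # Swap-with-last deletion, applied to all fields together.
        for column in self.fields.values():
            column[index] = column[-1]
            column.pop()

    def edge(self, index):
        return {name: col[index] for name, col in self.fields.items()}


block = EdgeBlock(["dest", "weight", "time"])
block.append(dest=0, weight=4, time=10)  # time values are illustrative
block.append(dest=3, weight=1, time=11)
block.append(dest=4, weight=4, time=12)
block.remove(0)  # the last edge takes over slot 0 in every field
```

The bookkeeping in `remove` is the kind of data movement that becomes error-prone once edges change dynamically, which is the work the slide says Hornet takes over.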

  14. Hornet in Detail
      [Figure: two layers. User interface: Vertex Id 0-7, with over-allocated space for vertex insertions; Used (#Neighbors/nnz) 2 2 3 2 2 2 1 0; Pointer. Memory manager: block arrays (BA) with bsize = 1, 2, and 4, holding Dest./Col. and Weight entries with over-allocated space for the power-of-two rule, plus a Vec-Tree and Bit status.]
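The idea of a block array with a status bit per block can be sketched as a small pool allocator. This is a guess at the general technique, not Hornet's memory manager: the class `BlockArray` and its methods are hypothetical, and a linear scan over the bits stands in for whatever search structure (the figure mentions a Vec-Tree) the real implementation uses.

```python
class BlockArray:
    """Fixed pool of equally sized blocks with one status bit per block."""

    def __init__(self, block_size, num_blocks):
        self.block_size = block_size
        self.storage = [None] * (block_size * num_blocks)
        self.in_use = [False] * num_blocks  # the "bit status"

    def allocate(self):
        """Return the start offset of a free block, or None if full."""
        for i, used in enumerate(self.in_use):
            if not used:
                self.in_use[i] = True
                return i * self.block_size
        return None

    def release(self, offset):
        """Mark the block starting at offset as free so it can be reused."""
        self.in_use[offset // self.block_size] = False


pool = BlockArray(block_size=2, num_blocks=3)
a = pool.allocate()  # first free block, at offset 0
b = pool.allocate()  # next free block, at offset 2
pool.release(a)      # e.g. the owning vertex moved to a larger block
c = pool.allocate()  # the freed block at offset 0 is handed out again
```

Because all blocks in one pool have the same size, a released block can always be reused by another vertex of the same block size, which is one simple way to support memory reclamation with little fragmentation.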

  15. Hornet Performance
      • Memory utilization
        – Independent of the GPU being used
      • Initialization overhead
      • Update rate

  16. Hornet Performance Analysis
      • All performance analysis is for the P100
        – 56 SMs
        – 3584 SPs
        – 16GB HBM2 memory

  17. Input Graphs
      • DIMACS 10 Graph Implementation Challenge
      • SNAP – Stanford Network Analysis Project
      • Florida Matrix Collection
      The following is only a subset of these graphs:

      Name            Type            Source   |V|      |E| *
      coAuthorsDBLP   Collaboration   DIMACS   299k     1.95M
      as-skitter      Trace route     SNAP     1.69M    11.1M
      kron_21         Random          DIMACS   2M       201M
      cit-patents     Citation        SNAP     3.77M    16.5M
      cage15          Matrix          DIMACS   5.15M    94M
      uk-2002         Webcrawl        DIMACS   18.52M   523M

  18. Memory Utilization - Overall
      [Chart: space efficiency (0% to 100%) for Hornet (2^16), COO, cuSTINGER, and AIM]
      • BlockArrays of size 2^16
      • 70% average utilization relative to CSR
      • Better utilization than COO, cuSTINGER, and AIM
        – AIM allocates all GPU memory
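A rough feel for how the power-of-two rule affects space efficiency can be had from a short calculation. The sketch below is an assumption-laden simplification: it counts only edge slots (used slots divided by allocated slots), ignores the per-vertex arrays and BlockArray granularity that the chart accounts for, and assumes a vertex with no edges still holds a block of size 1. The function names are hypothetical.

```python
def block_capacity(degree):
    """Smallest power of two >= degree (assumed to be 1 for degree 0)."""
    cap = 1
    while cap < degree:
        cap *= 2
    return cap


def space_efficiency(degrees):
    """Fraction of allocated edge slots that actually hold an edge."""
    used = sum(degrees)
    allocated = sum(block_capacity(d) for d in degrees)
    return used / allocated


# Degrees of the 8-vertex example used earlier in the talk.
example = space_efficiency([2, 2, 3, 2, 2, 2, 1, 0])

# Degrees just above a power of two waste the most space.
near_worst = space_efficiency([5, 5, 5, 5])
```

For the 8-vertex example, 14 of 16 allocated slots are used (0.875); with every degree equal to 5, each vertex needs a block of 8 and only 20 of 32 slots are used (0.625).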

  19. Initialization overhead
      [Chart: slowdown versus CSR (log scale, 1 to 1,000) for Hornet and cuSTINGER]
      • Time to initialize the data structure in comparison to CSR
      • In most cases 2X-3X slower
        – One-time penalty
      • Much faster than cuSTINGER

  20. Insertion Rates
      • Supports over 150M updates per second
      • Hornet
        – 4X-10X faster than cuSTINGER
        – Does not have a performance dip like cuSTINGER
      • Scalable growth in update rate
      [Charts: two panels, cuSTINGER and Hornet. Update rate (edges per second, log scale from 10^3 to 10^9) for in-2004, soc-LiveJournal1, cage15, and kron_g500-logn21]

  21. Take away
      • Anything you can do with CSR you can also do with Hornet (the reverse is not true)
      • Supports high update rates
      • Scalable in both data size and performance
      • Simple and high-level programming model
        – See you at 1:00pm
      • Also, look for James Fox’s talk on a cool algorithm for finding the maximal K-Truss in a graph
        – Uses dynamic triangle counting and Hornet’s deletion…

  22. Hornet Team (Current & Alumni)

  23. Thank you
      • Email: ogreen@gatech.edu
      • Hornet:
        – https://github.com/hornet-gt/hornet
      • HornetsNest:
        – https://github.com/hornet-gt/hornetsnest

  24. Backup slides

  25. Memory Utilization - Overall
      [Chart: space efficiency (0% to 100%) for Hornet (2^16), Hornet (2^18), Hornet (2^22), COO, cuSTINGER, and AIM]
      • 70% average utilization relative to CSR
      • Better utilization in comparison to COO, cuSTINGER, and AIM

  26. Part 2: HornetsNest
      • Algorithm framework for the Hornet data structure
        – We support CSR as well
      • All algorithms are implemented using a small set of operations
        – We show that these operators are efficient for static graph algorithms and can be used for dynamic graph algorithms
      • Uses features from C++11 and C++14
