Brie: A Specialized Trie for Concurrent Datalog

By Herbert Jordan, Pavle Subotić, David Zhao, and Bernhard Scholz. Presented at PMAM 2019, 17 February 2019, Washington, DC.


  1. Brie: A Specialized Trie for Concurrent Datalog. Herbert Jordan (1), Pavle Subotić (3), David Zhao (2), and Bernhard Scholz (2). PMAM 2019, 17 February 2019, Washington, DC. 1) University of Innsbruck, 2) University of Sydney, 3) Amazon

  2. Datalog (by Example). Question: Are there cycles? [Figure: a directed graph and its edge relation, a table with the columns "from" and "to".]

  3. Datalog (by Example). Question: Is the graph connected? [Figure: a directed graph and its edge relation, a table with the columns "from" and "to".]

  4. Datalog (by Example). Question: Which nodes are connected? [Figure: a directed graph and its edge relation, a table with the columns "from" and "to".]

  5. Datalog (by Example). Datalog query over the graph's edge relation:
     path(X,Y) :- edge(X,Y).
     path(X,Z) :- path(X,Y), edge(Y,Z).
     [Figure: the directed graph and its edge relation, a table with the columns "from" and "to".]

  6. Datalog
     › Benefits:
       – a concise formalism for powerful data analysis
       – lately, major performance improvements and tool support
     › Applications:
       – database queries
       – program analysis
       – security vulnerability analysis
       – network analysis
     › Typical scale: 100s of relations and rules, billions of tuples, all in-memory

  7. Query Processing: relations are represented as sets of integer tuples; a sequence of rules is processed as relational algebra operations on sets.

  8. Example: evaluating path(X,Z) :- path(X,Y), edge(Y,Z).
     delta ← path
     while (delta ≠ ∅) {
       new ← π(delta ⋈ edge) ∖ path
       path ← path ∪ new
       delta ← new
     }
     Annotation on the loop: computationally expensive and dominating part.
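
The loop on slide 8 can be sketched sequentially in C++. This is a minimal illustration that assumes binary relations over `int` and uses `std::set` as a stand-in for the relation data structure; the names `Tuple`, `Relation`, `transitive_closure`, and `fresh` (for the slide's "new", which is a C++ keyword) are hypothetical and not taken from the Soufflé sources.

```cpp
#include <cassert>
#include <limits>
#include <set>
#include <utility>

using Tuple = std::pair<int, int>;
using Relation = std::set<Tuple>;  // ordered, so range queries are cheap

// Semi-naive evaluation of:
//   path(X,Y) :- edge(X,Y).
//   path(X,Z) :- path(X,Y), edge(Y,Z).
Relation transitive_closure(const Relation& edge) {
    Relation path = edge;   // base rule
    Relation delta = path;  // tuples discovered in the previous round
    while (!delta.empty()) {
        Relation fresh;     // "new" on the slide
        for (const Tuple& p : delta) {
            // Join delta(X,Y) with edge(Y,Z): range query over edges starting at Y.
            const Tuple lo(p.second, std::numeric_limits<int>::min());
            for (auto it = edge.lower_bound(lo);
                 it != edge.end() && it->first == p.second; ++it) {
                const Tuple t(p.first, it->second);  // projection onto (X,Z)
                if (path.count(t) == 0) {            // set difference with path
                    fresh.insert(t);
                }
            }
        }
        path.insert(fresh.begin(), fresh.end());     // path ← path ∪ new
        delta = std::move(fresh);                    // delta ← new
    }
    assert(path.size() >= edge.size());  // the closure always contains edge
    return path;
}
```

Each round joins only the tuples found in the previous round, so the work is dominated by range queries, membership tests, and insertions on the relation, which are the operations the following slides focus on.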

  9. Needed
     › efficient data structure for relations
       – maintain a set of n-dimensional tuples
       – efficient support for insertion, scans, range queries, membership tests, and emptiness checks (well supported by B-trees)
       – efficient synchronization of concurrent inserts (challenging)
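
To make the operations listed on slide 9 concrete, the following sketch expresses them against `std::set`, used here only as a sequential, ordered stand-in (it offers no support for concurrent inserts). The two-column `Tuple` type and the helpers `range_query`, `contains`, and `smallest` are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <limits>
#include <set>
#include <vector>

using Tuple = std::array<int, 2>;  // a 2-dimensional integer tuple
using Relation = std::set<Tuple>;  // ordered lexicographically

// Range query: all tuples whose first component equals x.
std::vector<Tuple> range_query(const Relation& rel, int x) {
    const Tuple lo{x, std::numeric_limits<int>::min()};
    const Tuple hi{x, std::numeric_limits<int>::max()};
    return std::vector<Tuple>(rel.lower_bound(lo), rel.upper_bound(hi));
}

// Membership test.
bool contains(const Relation& rel, const Tuple& t) {
    return rel.count(t) == 1;
}

// First tuple of an ordered scan; requires a non-empty relation.
Tuple smallest(const Relation& rel) {
    assert(!rel.empty());  // emptiness check
    return *rel.begin();
}
```

Insertion is `rel.insert(Tuple{3, 1})`, and a full scan is a range-based `for` loop over `rel`. A concurrent structure has to offer the same operations while also tolerating simultaneous inserts.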

  10. B-tree Issues
      [Figure: example B-tree over the tuples (5,3) (8,2) (1,1) (1,2) (3,2) (4,7) (6,9) (7,4) (8,7) (9,2) (9,4).]
      › Concurrent inserts:
        – require a sophisticated locking scheme
        – while holding locks, costly operations are performed: binary searches and inserts into sorted arrays

  11. Brie

  12. Brie – Inner Node

  13. Brie – Leaf Node

  14. Synchronizing Inserts
      › Insertion:
        1. navigate down the tree, inserting sub-trees on demand using CAS
        2. if the inner node tree needs to grow, introduce a new root node using CAS
        3. add a 1-bit to the leaf-level mask using an atomic bitwise or
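
Steps 1 and 3 of the insertion scheme can be illustrated with a deliberately simplified structure. The sketch below is not the Soufflé implementation: it assumes a single fixed-size root (so step 2, growing the tree with a new root, is omitted), one level of children, and 64-bit leaf masks. The class name `TinyBrie` and its members are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified two-level trie: a fixed root of child pointers over bitmask leaves.
class TinyBrie {
    struct Leaf {
        std::atomic<std::uint64_t> mask{0};  // one bit per stored value
    };
    static constexpr std::size_t kChildren = 1024;
    std::array<std::atomic<Leaf*>, kChildren> children_{};  // all start as nullptr

public:
    static constexpr std::uint32_t kCapacity = kChildren * 64;

    TinyBrie() = default;
    TinyBrie(const TinyBrie&) = delete;
    TinyBrie& operator=(const TinyBrie&) = delete;
    ~TinyBrie() {
        for (auto& child : children_) delete child.load();
    }

    // Returns true if the value was newly inserted. Safe to call concurrently.
    bool insert(std::uint32_t value) {
        assert(value < kCapacity);
        std::atomic<Leaf*>& slot = children_[value / 64];
        Leaf* leaf = slot.load();
        if (leaf == nullptr) {
            // Step 1: create the sub-tree on demand and publish it with CAS.
            Leaf* fresh = new Leaf();
            if (slot.compare_exchange_strong(leaf, fresh)) {
                leaf = fresh;
            } else {
                delete fresh;  // another thread won; leaf now holds its node
            }
        }
        // Step 3: set the bit in the leaf mask with an atomic bitwise or.
        const std::uint64_t bit = std::uint64_t{1} << (value % 64);
        return (leaf->mask.fetch_or(bit) & bit) == 0;
    }

    bool contains(std::uint32_t value) const {
        assert(value < kCapacity);
        const Leaf* leaf = children_[value / 64].load();
        const std::uint64_t bit = std::uint64_t{1} << (value % 64);
        return leaf != nullptr && (leaf->mask.load() & bit) != 0;
    }
};
```

No locks are taken at any point: a thread that loses the CAS race discards its own node and continues with the winner's, and `fetch_or` reports whether the bit was already set, so exactly one of several concurrent inserts of the same value returns true.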

  15. Data Density: performance is density dependent. Density is the ratio of included points to the spanned interval. [Figure: two example point sets, one with high density and one with low density.]
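
As a rough, one-dimensional reading of this definition, density can be computed as the number of stored values divided by the size of the inclusive interval they span. The function below is a sketch under that assumption; the name `density` and the restriction to a single `int` component are illustrative choices, and the slide does not state how the interval is measured for multi-dimensional tuples.

```cpp
#include <cassert>
#include <set>

// Density of a non-empty set of integers: count / (max - min + 1).
double density(const std::set<int>& points) {
    assert(!points.empty());
    // Convert before subtracting to avoid integer overflow on extreme values.
    const double span = static_cast<double>(*points.rbegin()) -
                        static_cast<double>(*points.begin()) + 1.0;
    return static_cast<double>(points.size()) / span;
}
```

Under this reading, the set {1, 2, 3, 4} has density 1, while {0, 7} spans eight values but holds only two, giving 0.25.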

  16. Memory Usage. [Plot: memory [GB] (0 to 2) against elements in million (0 to 100); series: btree and brie at densities 100%, 10%, 5%, 2%, 1%, 0.5%, and 0.1%.]

  17. Sequential Performance. [Plots: million insertions/s against total elements inserted (1000², 2000², 5000², 10000²); panels: ordered insertion, random order insertion; series: std::set, std::hash_set, concurrent btree, brie 0.1%, brie 1%, brie 100%.]

  18. Sequential Performance (2). [Plots: panel "membership test (random order)" shows million queries/s against elements in set and number of queries; panel "full range scan" shows million entries/s against elements in set; x-axis values: 1000², 2000², 5000², 10000²; series: std::set, std::hash_set, concurrent btree, brie 0.1%, brie 1%, brie 100%.]

  19. Parallel Performance. [Plots, logarithmic y-axis: million insertions/s against number of threads (1 to 31); panels: ordered insertion, random order insertion; series: tbb::hash_set, concurrent btree, brie 0.1%, brie 1%, brie 100%; machine: 4x8 core Intel Xeon E5-4650.]

  20. Parallel Performance. Callouts: up to 11x faster than B-trees; up to 15x faster than B-trees. [Plots, linear y-axis: million insertions/s against number of threads (1 to 31); panels: ordered insertion, random order insertion; series: tbb::hash_set, concurrent btree, brie 0.1%, brie 1%, brie 100%; machine: 4x8 core Intel Xeon E5-4650.]

  21. Datalog Query Processing. Callouts: ~4x faster; -50% memory. [Plots: total query time [s] and memory usage [GB] against number of threads (1 to 31); series: btree, brie, mixed; workload: context sensitive var-points-to analysis.]

  22. Conclusion
      › Developed a concurrent set for Datalog relations:
        – trie-derived structure + blocked nodes, which enables fast relational operations
        – low-overhead synchronization: atomic-operation-based synchronization is sufficient
      › Results:
        – up to 5-17x faster for sequential insert and query operations
        – up to 15x faster for parallel insertion operations
        – up to 4x faster and 50% less memory for real-world query processing
      › Future work:
        – investigate other data structures for specialized use cases

  23. Thank you! Visit us at https://souffle-lang.github.io
      Sources: https://github.com/souffle-lang/souffle/blob/master/src/Brie.h

  24. Parallel Performance (backup). [Plots, logarithmic y-axis: million insertions/s against number of threads (1 to 31); panels: ordered insertion, random order insertion; series: tbb::hash_set, concurrent btree, reduction btree, brie 0.1%, brie 1%, brie 100%; machine: 4x8 core Intel Xeon E5-4650.]

  25. Parallel Performance (backup). Callouts: up to 11x faster than B-trees; up to 15x faster than B-trees. [Plots, linear y-axis: million insertions/s against number of threads (1 to 31); panels: ordered insertion, random order insertion; series: tbb::hash_set, concurrent btree, reduction btree, brie 0.1%, brie 1%, brie 100%; machine: 4x8 core Intel Xeon E5-4650.]

  26. Example: evaluating path(X,Z) :- path(X,Y), edge(Y,Z).
      delta ← path
      while (delta ≠ ∅) {
        new ← π(delta ⋈ edge) ∖ path
        path ← path ∪ new
        delta ← new
      }
