s2graph a large scale graph database

S2Graph: A large-scale graph database with HBase

  1. daumkakao S2Graph: A large-scale graph database with HBase

  2. Reference
     1. HBase Conference 2015
        1. http://www.slideshare.net/HBaseCon/use-cases-session-5
        2. https://vimeo.com/128203919
     2. Deview 2015
     3. ApacheCon Big Data Europe
        1. http://sched.co/3ztM
     4. GitHub: https://github.com/daumkakao/s2graph

  3. Our Social Graph
     [Figure: a graph whose edges are user activities (Listen, Advertise, Message, Emoticon, Like, Friend, Style share, Eat, Write, Play, View, Comment, Read, Present, Search, Group). Each edge carries an affinity score and properties such as count, price, rating, length, level, keyword, or size.]

  4. Our Social Graph
     [Figure: the same graph with concrete vertices and edge values, including Music ID 603 (Listen, count: 6), Ad ID 603 (Advertise, ctr: 0.32), Message ID 201 (Message, length: 9), Item ID 13 (Style share: 3), Post ID 97, Game ID 1984 (Play, level: 6), Write (length: 3), Comment (length: 15), and Search (keyword: "HBase"). Each edge carries an affinity score between 1 and 9.]

  5. Technical Challenges
     1. A large social graph that is constantly changing.
        a. Scale: the social network has more than 10 billion edges and 200 million vertices, with 50 million updates on existing edges; user activities add over 1 billion new edges per day.

  6. Technical Challenges (cont)
     2. Low latency for breadth-first search traversal on connected data.
        a. Performance requirement: 20,000 graph-traversing queries per second at peak, with a response time of 100 ms.
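
As a rough back-of-the-envelope reading of these two targets, Little's law (requests in flight = arrival rate × time in system) estimates how many traversal queries are being served at once at peak if each one takes the full 100 ms. The function name is illustrative and not part of S2Graph.

```python
def in_flight_queries(queries_per_second: float, response_time_ms: float) -> float:
    """Little's law: L = lambda * W, with W converted from milliseconds to seconds."""
    return queries_per_second * response_time_ms / 1000.0


# Peak load from the slide: 20,000 queries per second, 100 ms each.
peak_in_flight = in_flight_queries(20_000, 100)
print(peak_in_flight)  # 2000.0
```

Under these assumptions the serving tier has to sustain roughly 2,000 concurrent traversals at peak.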

  7. Technical Challenges (cont)
     3. Real-time update capabilities for viral effects.
        [Figure: activity spreads fast from Person A to Person B, Person C, and Person D through Post, Comment, Sharing, and Mention.]

  8. Technical Challenges (cont)
     4. Support for dynamic ranking logic.
        a. Push strategy: hard to change the data ranking logic dynamically.
        b. Pull strategy: lets users try out various data ranking logics.
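
The difference between the two strategies can be sketched as follows: with a pull strategy the raw edges are stored as they are and ranked only at read time, so the ranking function can be swapped per request. This is a minimal illustration with made-up edge properties (`to`, `affinity`, `ts`) and a hypothetical `pull_rank` helper, not S2Graph's implementation.

```python
from typing import Callable, Dict, List

Edge = Dict[str, float]  # properties of one stored edge (hypothetical schema)


def pull_rank(edges: List[Edge], score: Callable[[Edge], float], limit: int) -> List[Edge]:
    """Pull strategy: rank raw edges at read time with whichever score function is passed in."""
    return sorted(edges, key=score, reverse=True)[:limit]


edges = [
    {"to": 1, "affinity": 3.0, "ts": 100.0},
    {"to": 2, "affinity": 9.0, "ts": 50.0},
    {"to": 3, "affinity": 1.0, "ts": 200.0},
]

# Two ranking logics over the same stored data, chosen at query time.
by_affinity = pull_rank(edges, lambda e: e["affinity"], limit=2)
by_recency = pull_rank(edges, lambda e: e["ts"], limit=2)
```

A push strategy would instead materialize one ranked list at write time, so changing the score function would mean rewriting the stored lists.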

  9. Before
     [Figure: Messaging, SNS, and Blog apps, each connected directly to separate databases for friend relationships, SNS feeds, blog user activities, and messaging.]
     Each app server must know each DB's sharding logic, which results in a highly interconnected architecture.

  10. After
      [Figure: Messaging, SNS, and Blog apps run as stateless app servers that all talk to a single S2Graph DB.]

  11. daumkakao What is S2Graph?

  12. What is S2Graph?
      Storage-as-a-Service + Graph API = Real-time Breadth-First Search
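
To make "breadth-first search" concrete, here is a minimal sketch of a step-wise traversal: each step names an edge label and a per-vertex limit, and the whole frontier is expanded one step at a time. The in-memory dictionary stands in for the storage layer, and the `traverse` function and vertex names are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory store: (source vertex, edge label) -> list of target vertices.
Graph = Dict[Tuple[str, str], List[str]]


def traverse(graph: Graph, start: List[str], steps: List[Tuple[str, int]]) -> List[str]:
    """Expand the frontier one (label, per-vertex limit) step at a time, breadth first."""
    frontier = list(start)
    for label, limit in steps:
        next_frontier: List[str] = []
        seen: Set[str] = set()
        for vertex in frontier:
            # Follow at most `limit` edges with this label from each frontier vertex.
            for target in graph.get((vertex, label), [])[:limit]:
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return frontier


graph = {
    ("user:1", "user_chat_rooms"): ["room:a", "room:b"],
    ("room:a", "chat_room_messages"): ["msg:1", "msg:2"],
    ("room:b", "chat_room_messages"): ["msg:3"],
}
result = traverse(graph, ["user:1"], [("user_chat_rooms", 100), ("chat_room_messages", 10)])
```

Each step touches only the neighbours of the current frontier, which is why per-step limits keep the fan-out, and therefore the latency, bounded.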

  13. Example: Messenger Data Model
      [Figure: a user Participates in a Chat Room; the Chat Room Contains messages.]
      Recent messages in my chat rooms:
          SELECT a.*
          FROM user_chat_rooms a, chat_room_messages b
          WHERE a.user_id = 1
            AND a.chat_room_id = b.chat_room_id
            AND b.created_at >= yesterday

  14. Example: Messenger Data Model
      Recent messages in my chat rooms, as a graph query with one entry per step:
          curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: application/json' -d '
          {
            "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
            "steps": [
              [{"label": "user_chat_rooms", "direction": "out", "limit": 100}],
              [{"label": "chat_room_messages", "direction": "out", "limit": 10, "where": "created_at >= yesterday"}]
            ]
          }
          '
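
Since every example in this deck uses the same request shape, the body can be assembled programmatically. The helper below is a hypothetical convenience, not part of any S2Graph client; it only builds the JSON shown on the slide and does not send it.

```python
import json
from typing import Any, Dict, List


def build_get_edges_query(user_id: int, steps: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a getEdges body: one source vertex plus one single-label entry per step."""
    return {
        "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": user_id}],
        "steps": [[step] for step in steps],
    }


query = build_get_edges_query(1, [
    {"label": "user_chat_rooms", "direction": "out", "limit": 100},
    {"label": "chat_room_messages", "direction": "out", "limit": 10,
     "where": "created_at >= yesterday"},
])
body = json.dumps(query)  # usable as the -d payload of the curl call
```

Building the body with `json.dumps` also avoids the quoting mistakes that are easy to make when hand-writing JSON inside a shell string.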

  15. Example: News Feed (cont)
      [Figure: I am connected to my Friends; friends create/like/share posts (Post 1, Post 2, Post 3).]
      Posts that my friends interacted with:
          SELECT a.*, b.*
          FROM friends a, user_posts b
          WHERE a.user_id = b.user_id
            AND b.updated_at >= yesterday
            AND b.action_type IN ('create', 'like', 'share')

  16. Example: News Feed (cont)
      Posts that my friends interacted with, as a graph query:
          curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: application/json' -d '
          {
            "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
            "steps": [
              [{"label": "friends", "direction": "out", "limit": 100}],
              [{"label": "user_posts", "direction": "out", "limit": 10, "where": "created_at >= yesterday"}]
            ]
          }
          '

  17. Example: Recommendation (User-based CF) (cont)
      [Figure: Similar Users edges are computed in batch; user-product interaction edges (click/buy/like/share) lead to Product 1, Product 2, and Product 3.]
      Products that similar users interacted with recently:
          SELECT a.*, b.*
          FROM similar_users a, user_products b
          WHERE a.sim_user_id = b.user_id
            AND b.updated_at >= yesterday

  18. Example: Recommendation (User-based CF) (cont)
      Products that similar users interacted with recently, as a graph query:
          curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: application/json' -d '
          {
            "filterOut": {
              "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
              "steps": [[{"label": "user_products_interact"}]]
            },
            "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
            "steps": [
              [{"label": "similar_users", "direction": "out", "limit": 100, "where": "similarity > 0.2"}],
              [{"label": "user_products_interact", "direction": "out", "limit": 10, "where": "created_at >= yesterday and price >= 1000"}]
            ]
          }
          '
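
The slide does not spell out what `filterOut` does. A plausible reading, assumed here, is that the `filterOut` query runs separately and its results are removed from the main query's results, so products the user has already interacted with are not recommended again. A minimal sketch of that set difference, with a hypothetical helper name and made-up product IDs:

```python
from typing import Iterable, List


def apply_filter_out(candidates: Iterable[int], already_seen: Iterable[int]) -> List[int]:
    """Keep candidates, in their original order, that are absent from the filterOut results."""
    excluded = set(already_seen)
    return [c for c in candidates if c not in excluded]


# Candidates come from the main steps; already_seen comes from the filterOut steps.
recommended = apply_filter_out([10, 20, 30, 20], already_seen=[20])
```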

  19. Example: Recommendation (Item-based CF) (cont)
      [Figure: user-product interaction edges (click/buy/like/share) lead to products; Similar Products edges, computed in batch, lead from those products to further products.]
      Products that are similar to those I have shown interest in:
          SELECT a.*, b.*
          FROM similar_ a, user_products b
          WHERE a.sim_user_id = b.user_id
            AND b.updated_at >= yesterday

  20. Example: Recommendation (Item-based CF) (cont)
      Products that are similar to those I have shown interest in, as a graph query:
          curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: application/json' -d '
          {
            "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
            "steps": [
              [{"label": "user_products_interact", "direction": "out", "limit": 100, "where": "created_at >= yesterday and price >= 1000"}],
              [{"label": "similar_products", "direction": "out", "limit": 10, "where": "similarity > 0.2"}]
            ]
          }
          '

  21. Example: Recommendation (Content + Most popular) (cont)
      [Figure: user-product interaction edges (click/buy/like/share) lead to Product 1, Product 2, and Product 3; the products belong to Category1 and Category2; each category links to its TopK (k=1) product per timeUnit (day), Product10 or Product20, for Today and Yesterday.]
      Daily top product for each category of the products that I liked:
          SELECT c.*
          FROM user_products a, product_categories b, category_daily_top_products c
          WHERE a.user_id = 1
            AND a.product_id = b.product_id
            AND b.category_id = c.category_id
            AND c.time BETWEEN yesterday AND today

  22. Example: Recommendation (Content + Most popular) (cont)
      Daily top product for each category of the products that I liked, as a graph query:
          curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: application/json' -d '
          {
            "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
            "steps": [
              [{"label": "user_products_interact", "direction": "out", "limit": 100, "where": "created_at >= yesterday and price >= 1000"}],
              [{"label": "product_cates", "direction": "out", "limit": 3}],
              [{"label": "category_products_topK", "direction": "out", "limit": 10}]
            ]
          }
          '

  23. Example: Recommendation (Spreading Activation) (cont)
      [Figure: user-product interaction edges (click/buy/like/share) connect users to Product 1, Product 2, and Product 3.]
      Products interacted with by users who interacted with the products that I interacted with:
          SELECT b.product_id, count(*)
          FROM user_products a, user_products b
          WHERE a.user_id = 1
            AND a.product_id = b.product_id
          GROUP BY b.product_id

  24. Example: Recommendation (Spreading Activation) (cont)
      Products interacted with by users who interacted with the products that I interacted with, as a graph query:
          curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: application/json' -d '
          {
            "srcVertices": [{"serviceName": "s2graph", "columnName": "user_id", "id": 1}],
            "steps": [
              [{"label": "user_products_interact", "direction": "out", "limit": 100, "where": "created_at >= yesterday and price >= 1000"}],
              [{"label": "user_products_interact", "direction": "in", "limit": 10, "where": "created_at >= today"}],
              [{"label": "user_products_interact", "direction": "out", "limit": 10, "where": "created_at >= 1 hour ago"}]
            ]
          }
          '
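
The three hops in this query (my products, the users who also touched them, the products those users touched) can be sketched over a plain list of (user, product) pairs. The function name and sample data are made up, and excluding the products I already interacted with is an added assumption that the slide's query does not make.

```python
from collections import Counter
from typing import List, Tuple


def spreading_activation(interactions: List[Tuple[int, str]], user_id: int) -> Counter:
    """Score products by the number of user -> product -> user -> product paths reaching them."""
    # Hop 1: products I interacted with.
    mine = {p for u, p in interactions if u == user_id}
    # Hop 2: other users on those products; a user appears once per shared product.
    neighbours = [u for u, p in interactions if p in mine and u != user_id]
    # Hop 3: products of those users, skipping the ones I already have (assumption).
    scores: Counter = Counter()
    for neighbour in neighbours:
        for u, p in interactions:
            if u == neighbour and p not in mine:
                scores[p] += 1
    return scores


interactions = [
    (1, "p1"), (1, "p2"),
    (2, "p1"), (2, "p3"),
    (3, "p1"), (3, "p2"), (3, "p3"), (3, "p4"),
]
scores = spreading_activation(interactions, user_id=1)
```

A product reached along more paths gets a higher score, which plays the role of `count(*)` with `GROUP BY` in the SQL version.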
