SLIDE 1

Scuba: Diving into Data at Facebook

Presenter: Lavanya Subramanian

SLIDE 2

Need for Data Analysis

  • Performance monitoring

– Detect unexpected performance drops/rises

  • Pattern mining

– Understand user response to new features

  • Ad revenue monitoring

– Identify regional drops/rises in ad clicks and revenue

SLIDE 3

Data Analysis at Facebook

  • Large data volumes
  • Real-time analysis of this data
  • Key Requirements

– Low latency
– Flexibility
– Scalability

SLIDE 4

Proposed Solution: Scuba

  • Structure

– In-memory database
– Across hundreds of servers

  • How does it work?

– Holds and processes sampled real-time data
– Query interface to access data
– Visualization interface to analyze data

SLIDE 5

Architecture

[Architecture diagram: server and leaf nodes]

SLIDE 6

Data Layout

  • Data stored in tables
  • Data types supported

– Integers, strings, sets of strings, vectors of strings

  • Different compression for different data types

Table Characteristics

  • A table is created when data for it first arrives at a leaf node
  • Tables can have empty columns, which are treated as null
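
A rough sketch of the column types listed on this slide, purely illustrative (the class and field names are hypothetical, not Scuba's actual schema API):

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional, Set, Union

    # The four supported value types: integers, strings,
    # sets of strings, and vectors of strings.
    ColumnValue = Union[int, str, Set[str], List[str]]

    @dataclass
    class Row:
        # Columns a row never received are simply absent.
        columns: Dict[str, ColumnValue] = field(default_factory=dict)

    def read_column(row: Row, name: str) -> Optional[ColumnValue]:
        # An empty (missing) column is treated as null, i.e. None here.
        return row.columns.get(name)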

SLIDE 7

Data Ingestion into Scuba

[Diagram: Scribe delivering data to the leaf nodes]

SLIDE 8

Data Ingestion into Scuba

  • Events are sampled to reduce the data volume
  • Use Scribe, a distributed messaging system to

– Collect, aggregate and deliver data to Scuba

  • For each batch of incoming data

– Pick two leaf nodes at random
– Send the batch to the node with more free memory (see the sketch at the end of this slide)

  • Data compressed and sent to disk
  • Data then read back and stored in memory
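
The batch-placement rule above is a "two random choices" load-balancing heuristic; here is a minimal sketch under that reading (the function and parameter names are made up for illustration):

    import random

    def pick_leaf(leaf_nodes, free_memory):
        """Pick two leaf nodes at random; return the one with more free memory.

        leaf_nodes:  list of leaf-node identifiers
        free_memory: dict mapping node id -> bytes of free memory (assumed known)
        """
        a, b = random.sample(leaf_nodes, 2)
        return a if free_memory[a] >= free_memory[b] else b

Each incoming Scribe batch would then be routed to the node returned by pick_leaf, which keeps memory use roughly balanced without central coordination.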

SLIDE 9

Dealing with Old Data

  • Memory capacity is a concern
  • Need to add new servers every 2-3 weeks
  • Delete data based on

– Age: Sample and preserve a fraction of old data
– Space: When exceeding space limits, delete old data
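
A minimal sketch of the two deletion rules above; the thresholds, field names, and function are illustrative, not Scuba's actual policy code:

    import random

    def prune(rows, now, max_age, keep_fraction, max_rows):
        # Age rule: rows older than max_age are subsampled, keeping
        # only a keep_fraction of them.
        kept = [r for r in rows
                if now - r["time"] <= max_age or random.random() < keep_fraction]
        # Space rule: if still over the space limit, drop the oldest rows.
        if len(kept) > max_rows:
            kept.sort(key=lambda r: r["time"])      # oldest first
            kept = kept[-max_rows:]                 # keep the newest max_rows
        return kept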

SLIDE 10

Querying Scuba

  • Three kinds of interfaces

– Web-based
– SQL
– API to support querying from application code

  • Queries supported

– Different forms of aggregation (see the example query below)
– Percentiles, histograms

  • Joins not supported by Scuba
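
A hedged example of the kind of query the slide describes: a time-filtered scan with grouping and aggregates, and no joins. The table, columns, and percentile syntax are assumptions, not Scuba's documented dialect:

    # Hypothetical table and column names; the aggregate syntax is illustrative.
    example_query = """
        SELECT region, COUNT(*), AVG(latency_ms), PERCENTILE(latency_ms, 95)
        FROM ad_requests
        WHERE time >= 1370000000 AND time < 1370003600
        GROUP BY region
    """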

SLIDE 11

Query Execution

[Diagram: aggregation tree with a root aggregator, intermediate aggregators, leaf aggregators, and leaf nodes]

SLIDE 12

Query Execution

  • Leaf node may or may not contain a table’s data

– Depends on the table size and age

  • Data scanning is usually by time range

– Time is Scuba’s only notion of index

  • Results from a node are omitted if it responds after a timeout

– Small missing pieces of data do not significantly affect accuracy

  • Accuracy of computations matters much less

– Lower response time is the bigger requirement
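
A minimal sketch of the fan-out-and-aggregate flow from slides 11 and 12, including the rule that late results are simply omitted; the structure and names are illustrative, not Scuba's implementation:

    from concurrent.futures import ThreadPoolExecutor, wait

    def execute(query, children, timeout_s):
        """Fan the query out to child nodes and aggregate what returns in time.

        children: callables that scan local data (at a leaf) or recurse one
                  level down the aggregation tree, each returning a partial count.
        """
        pool = ThreadPoolExecutor(max_workers=len(children))
        futures = [pool.submit(child, query) for child in children]
        done, late = wait(futures, timeout=timeout_s)
        for f in late:
            f.cancel()                  # late children are ignored, not awaited
        pool.shutdown(wait=False)
        # Combine the partial aggregates that arrived before the timeout.
        return sum(f.result() for f in done)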

SLIDE 13

Performance Model

  • Breaks down the latencies of the different components
  • Latency is a function of the fanout, the processing time at each aggregator, and the depth of the tree
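
A toy version of such a latency model, purely illustrative (the actual model in the paper has more detail than this):

    def query_latency(depth, fanout, t_leaf, t_agg, t_net):
        """Rough latency estimate for a fan-out aggregation tree.

        depth:  number of aggregator levels between root and leaves
        fanout: children per aggregator
        t_leaf: time for a leaf node to scan its local data
        t_agg:  time for an aggregator to merge one child's partial result
        t_net:  one network hop between levels
        """
        # Each level pays one network hop plus the cost of merging all children.
        per_level = t_net + fanout * t_agg
        return t_leaf + depth * per_level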

SLIDE 14

Experimental Setup and Queries

  • 4 racks of 40 machines
  • Machine configuration

– Intel Xeon E5-2660, 2.2 GHz
– 144 GB DRAM

  • 10 Gb Ethernet
  • Workloads: scan query and time series query

SLIDE 15

Speedup and Scaleup

SLIDE 16

Throughput

SLIDE 17

Discussion

  • Details on the kind of data stored and analyzed
  • Performance numbers for a wider set of queries
  • Are these query throughputs good enough?

– Might be fine for an internal system
