  1. Routing Trillions of Events Per Day @Twitter #ApacheBigData 2017 Lohit VijayaRenu & Gary Steelman @lohitvijayarenu @efsie

  2. In this talk
     1. Event Logs at Twitter
     2. Log Collection
     3. Log Processing
     4. Log Replication
     5. The Future
     6. Questions

  3. Overview

  4. Life of an Event
     (Diagram: Clients → Client Daemon / HTTP Endpoint → Aggregated by Category → Storage (HDFS))
     ● Clients log events specifying a Category name, e.g. ads_view, login_event, ...
     ● Events are grouped together across all clients into the Category
     ● Events are stored on the Hadoop Distributed File System, bucketed every hour into separate directories (see the path sketch below)
       ○ /logs/ads_view/2017/05/01/23
       ○ /logs/login_event/2017/05/01/23
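As a rough illustration of the hourly bucketing above, the snippet below derives the HDFS directory an event of a given category would land in. Only the /logs/<category>/yyyy/MM/dd/HH layout comes from the slide; the helper itself (and the assumption of UTC bucketing) is a sketch, not Twitter's client or daemon code.

```java
// Sketch only: derives the hourly HDFS bucket for a category, matching the
// /logs/<category>/yyyy/MM/dd/HH layout shown on the slide. UTC is an assumption.
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class EventBucket {
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy/MM/dd/HH");

    // e.g. bucketFor("ads_view", t) -> "/logs/ads_view/2017/05/01/23"
    static String bucketFor(String category, ZonedDateTime eventTime) {
        return "/logs/" + category + "/"
                + HOURLY.format(eventTime.withZoneSameInstant(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        System.out.println(bucketFor("login_event", ZonedDateTime.now(ZoneOffset.UTC)));
    }
}
```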

  5. Event Log Stats
     ● >1 Trillion events a day, incoming across millions of clients
     ● ~3 PB of data a day, uncompressed
     ● >600 categories (event groups by category)
     ● <1500 nodes, collocated with HDFS datanodes

  6. Event Log Architecture (diagram, repeated on slides 7-10 as build steps)
     ● Remote clients, clients inside the datacenter, and HTTP clients send events to a local log collection daemon
     ● Log events are aggregated, grouped by Category
     ● The Log Processor writes to Storage (HDFS) and Storage (Streaming)
     ● The Log Replicator copies data to further Storage (HDFS) clusters

  11. Event Log Architecture across datacenters (diagram, repeated on slides 12-13 as build steps)
      ● Events are generated inside DC1 and DC2
      ● Each datacenter has RT Storage (HDFS), DW Storage (HDFS), and Prod Storage (HDFS)
      ● A single Cold Storage (HDFS) cluster is also shown

  14. Collection

  15. Event Log Architecture (diagram repeated from slide 6; this section covers the log collection stage)

  16. Event Collection Overview
      ● Past: Scribe Client Daemon → Scribe Aggregator Daemons
      ● Present: Scribe Client Daemon → Flume Aggregator Daemon
      ● Future: Flume Client Daemon → Flume Aggregator Daemon

  17. Event Collection (Past): Challenges with Scribe
      ● Too many open file handles to HDFS
        ○ 600 categories x 1500 aggregators x 6 per hour ≈ 5.4M files per hour
      ● High IO wait on DataNodes at scale
      ● Max limit on throughput per aggregator
      ● Difficult to track message drops
      ● No longer active open source development

  18. Event Collection (Present): Apache Flume
      (Diagram: Flume Agent = Source → Channel → Sink → HDFS Client)
      ● Well defined interfaces
      ● Open source
      ● Concept of transactions (see the sketch below)
      ● Existing implementations of interfaces
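To make the "concept of transactions" concrete, here is a minimal sketch of a custom sink built on Flume 1.x's Sink/Channel/Transaction interfaces. It only drains the channel; the comment marks where a real deployment would hand events to the HDFS client. Twitter used the stock HDFSEventSink, not a class like this.

```java
// Minimal sketch of Flume's transaction pattern (Flume 1.x APIs). Events leave
// the channel only when the transaction commits; on failure they are rolled
// back and retried, which is what gives the pipeline its delivery guarantees.
import org.apache.flume.Channel;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.sink.AbstractSink;

public class SketchSink extends AbstractSink {
    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = getChannel();
        Transaction txn = channel.getTransaction();
        txn.begin();
        try {
            Event event = channel.take();   // null when the channel is empty
            if (event != null) {
                // A real sink would write event.getBody() to HDFS here
                // (see Flume's HDFSEventSink); this sketch just drops it.
            }
            txn.commit();
            return event == null ? Status.BACKOFF : Status.READY;
        } catch (Exception e) {
            txn.rollback();                 // the event stays in the channel and is retried
            return Status.BACKOFF;
        } finally {
            txn.close();
        }
    }
}
```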

  19. Event Collection (Present): Category Group
      (Diagram: Category 1, Category 2, Category 3 from Agent 1, Agent 2, Agent 3 combined into a Category Group)
      ● Combine multiple related categories into a category group (a toy mapping follows below)
      ● Provide different properties per group
      ● Contains multiple events to generate fewer combined sequence files
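The grouping itself is just a mapping from category to category group. The sketch below hard-codes the ads_group and login_group examples used later in the deck; in practice the mapping would be configuration-driven, and the "default_group" fallback is purely hypothetical.

```java
// Sketch of a category -> category-group mapping, using the example groups
// from the demux slides (ads_group, login_group). Real mappings are config-driven.
import java.util.List;
import java.util.Map;

public class CategoryGroups {
    static final Map<String, List<String>> GROUPS = Map.of(
            "ads_group", List.of("ads_click", "ads_view"),
            "login_group", List.of("login_event"));

    // Reverse lookup: which group does a category belong to?
    static String groupOf(String category) {
        return GROUPS.entrySet().stream()
                .filter(e -> e.getValue().contains(category))
                .map(Map.Entry::getKey)
                .findFirst()
                .orElse("default_group");   // hypothetical catch-all group
    }
}
```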

  20. Event Collection (Present): Aggregator Group
      (Diagram: category groups 1 and 2 hosted on Aggregator Group 1 (Agent 1, Agent 2, Agent 3) and Aggregator Group 2 (... Agent 8))
      ● A set of aggregators hosting the same set of category groups
      ● Easy to manage a group of aggregators hosting a subset of categories

  21. Event Collection (Present): Flume features to support groups
      ● Extend Interceptor to multiplex events into groups (see the sketch below)
      ● Implement Memory Channel Group to have a separate memory channel per category group
      ● ZooKeeper registration per category group for service discovery
      ● Metrics for category groups
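A sketch of what "extend Interceptor to multiplex events into groups" could look like with the stock Flume Interceptor interface: it copies the event's category header into a category-group header that a multiplexing selector or a per-group channel can route on. The header names and the hard-coded mapping are assumptions for illustration, not Twitter's actual implementation.

```java
// Sketch of a grouping interceptor (Flume 1.x). Header names ("category",
// "category_group") and the static mapping are assumptions for illustration.
import java.util.List;
import java.util.Map;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class CategoryGroupInterceptor implements Interceptor {
    private final Map<String, String> categoryToGroup;

    CategoryGroupInterceptor(Map<String, String> categoryToGroup) {
        this.categoryToGroup = categoryToGroup;
    }

    @Override public void initialize() { }

    @Override
    public Event intercept(Event event) {
        String category = event.getHeaders().get("category");
        String group = categoryToGroup.getOrDefault(category, "default_group");
        event.getHeaders().put("category_group", group);   // selector/channel routes on this
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event e : events) {
            intercept(e);
        }
        return events;
    }

    @Override public void close() { }

    // Flume instantiates interceptors through a Builder configured from the agent config.
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            // Hard-coded example mapping; a real builder would read it from configuration.
            return new CategoryGroupInterceptor(Map.of(
                    "ads_click", "ads_group",
                    "ads_view", "ads_group",
                    "login_event", "login_group"));
        }

        @Override public void configure(Context context) { }
    }
}
```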

  22. Event Collection (Present): Flume performance improvements
      ● HDFSEventSink batching increased throughput (5x), reducing spikes on the memory channel (a batched variant of the earlier sink sketch follows)
      ● Implement buffering in HDFSEventSink instead of using SpillableMemoryChannel
      ● Stream events close to network speed
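The batching change amounts to taking many events per transaction instead of one. Below is a batched variant of the earlier sink sketch; the batch size and the flush stub are assumptions, standing in for the writes the real HDFSEventSink performs.

```java
// Sketch of batched draining in a custom sink (Flume 1.x APIs). One transaction
// covers up to BATCH_SIZE events, amortizing commit overhead. The flush step is
// a stub standing in for the HDFS write that HDFSEventSink performs.
import java.util.ArrayList;
import java.util.List;

import org.apache.flume.Channel;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.sink.AbstractSink;

public class BatchedSketchSink extends AbstractSink {
    private static final int BATCH_SIZE = 1000;   // assumed; real values are tuned per deployment

    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = getChannel();
        Transaction txn = channel.getTransaction();
        txn.begin();
        try {
            List<Event> batch = new ArrayList<>(BATCH_SIZE);
            Event event;
            while (batch.size() < BATCH_SIZE && (event = channel.take()) != null) {
                batch.add(event);
            }
            flush(batch);             // must succeed before the commit below
            txn.commit();
            return batch.isEmpty() ? Status.BACKOFF : Status.READY;
        } catch (Exception e) {
            txn.rollback();           // the whole batch stays in the channel and is retried
            return Status.BACKOFF;
        } finally {
            txn.close();
        }
    }

    private void flush(List<Event> batch) {
        // Stub: a real sink appends the batch contents to an HDFS file here.
    }
}
```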

  23. Processing

  24. Event Log Architecture (diagram repeated from slide 6; this section covers the Log Processor)

  25. Log Processor Stats: Processing a Trillion Events per Day
      ● 8 wall clock hours to process one day of data
      ● >1 PB of data per day: the output of cleaned, compressed, consolidated, and converted Flume sequence files
      ● 20-50% disk space saved by processing

  26. Log Processor Needs: Processing a Trillion Events per Day
      ● Make processing log data easier for analytics teams
      ● Disk space is at a premium on analytics clusters
      ● Still too many files, causing increased pressure on the NameNode
      ● Log data is read many times, and different teams all perform the same pre-processing steps on the same data sets

  27. Log Processor Steps (Datacenter 1; diagram repeated on slides 28-29 as build steps)
      ● Category groups → demux jobs → categories
      ● ads_group/yyyy/mm/dd/hh → ads_group_demuxer → ads_click/yyyy/mm/dd/hh, ads_view/yyyy/mm/dd/hh
      ● login_group/yyyy/mm/dd/hh → login_group_demuxer → login_event/yyyy/mm/dd/hh

  30. Log Processor Steps (a toy decode-and-demux sketch follows)
      1. Decode: Base64 encoding from logged data
      2. Demux: category groups into individual categories for easier consumption by analytics teams
      3. Clean: corrupt, empty, or invalid records so data sets are more reliable
      4. Compress: logged data to the highest level to save disk space, from LZO level 3 to LZO level 7
      5. Consolidate: small files to reduce pressure on the NameNode
      6. Convert: some categories into Parquet for fastest use in ad-hoc exploratory tools
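A toy version of steps 1 and 2 (decode and demux): base64-decode each (category, payload) record and route it to its category. The record shape and the in-memory accounting are simplifications; the real pipeline does this inside Tez jobs over Flume SequenceFiles and writes LZO-compressed output per category bucket.

```java
// Toy decode-and-demux: base64-decode each record and route it by category.
// The real pipeline runs this logic in Tez jobs over SequenceFiles and writes
// LZO-compressed files under /logs/<category>/yyyy/mm/dd/hh.
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyDemux {
    public static void main(String[] args) {
        // (category, base64-encoded payload) pairs, standing in for the
        // contents of an ads_group sequence file.
        List<String[]> records = List.of(
                new String[]{"ads_view", Base64.getEncoder().encodeToString("view:123".getBytes())},
                new String[]{"ads_click", Base64.getEncoder().encodeToString("click:456".getBytes())});

        Map<String, Integer> bytesPerCategory = new HashMap<>();
        for (String[] record : records) {
            String category = record[0];
            byte[] payload = Base64.getDecoder().decode(record[1]);   // step 1: decode
            // step 2: demux -- append the payload under its category's hourly bucket
            bytesPerCategory.merge(category, payload.length, Integer::sum);
        }
        System.out.println(bytesPerCategory);   // e.g. {ads_view=8, ads_click=9}
    }
}
```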

  31. Why Base64 Decoding? Legacy choices (a small round-trip example follows)
      ● Scribe's contract amounts to sending a binary blob to a port
      ● Scribe used newline characters to delimit records in a binary blob batch of records
      ● Valid records may include newline characters
      ● Scribe base64-encoded received binary blobs to avoid confusion with the record delimiter
      ● Base64 encoding is no longer necessary because we have moved to one serialized Thrift object per binary blob
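A small round-trip showing why base64 was needed with newline-delimited framing: a record that itself contains a newline would be split into two bogus records, while its base64 form contains no newline, so the batch splits cleanly. The record contents are invented for illustration.

```java
// Demonstrates the legacy framing problem: records are newline-delimited, and
// base64 keeps a record containing '\n' from being split in half.
import java.util.Base64;

public class NewlineFraming {
    public static void main(String[] args) {
        String recordWithNewline = "user=alice\naction=login";   // a valid record containing '\n'
        String otherRecord = "user=bob action=logout";

        // Naive framing: joining raw records with '\n' yields 3 pieces, not 2.
        String rawBatch = recordWithNewline + "\n" + otherRecord;
        System.out.println("raw split count: " + rawBatch.split("\n").length);        // 3

        // Base64 framing: encoded records contain no '\n', so the split is correct.
        Base64.Encoder enc = Base64.getEncoder();
        String encodedBatch = enc.encodeToString(recordWithNewline.getBytes())
                + "\n" + enc.encodeToString(otherRecord.getBytes());
        System.out.println("base64 split count: " + encodedBatch.split("\n").length); // 2
    }
}
```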

  32. Log Demux Visual (diagram repeated on slides 33-34 as build steps)
      /raw/ads_group/yyyy/mm/dd/hh/ads_group_1.seq → DEMUX → /logs/ads_view/yyyy/mm/dd/hh/1.lzo, /logs/ads_click/yyyy/mm/dd/hh/1.lzo

  35. Log Processor Daemon
      ● One log processor daemon per RT Hadoop cluster, where Flume aggregates logs
      ● Primarily responsible for demuxing category groups out of the Flume sequence files
      ● The daemon schedules Tez jobs every hour for every category group in a thread pool (see the scheduling sketch below)
      ● The daemon atomically presents processed category instances so partial data can't be read
      ● Processing proceeds according to criticality of data, or "tiers"
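The daemon's behaviour can be pictured as a fixed-rate timer that, once an hour, submits one demux task per category group to a bounded thread pool, then atomically presents the finished output so readers never see partial data. Everything in this sketch (the runTezDemux and presentAtomically stubs, the pool size, the group list) is an assumption drawn from the bullet points, not the actual daemon.

```java
// Sketch of the hourly scheduling described on the slide: one task per category
// group submitted to a thread pool, with stubbed Tez launch and atomic "present"
// steps. Names, pool sizes, and paths are illustrative assumptions.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LogProcessorDaemonSketch {
    private static final List<String> CATEGORY_GROUPS = List.of("ads_group", "login_group");

    private final ScheduledExecutorService clock = Executors.newSingleThreadScheduledExecutor();
    private final ExecutorService workers = Executors.newFixedThreadPool(4);   // assumed pool size

    void start() {
        // Fire once an hour; each category group gets its own task in the pool.
        clock.scheduleAtFixedRate(() -> {
            for (String group : CATEGORY_GROUPS) {
                workers.submit(() -> processHour(group));
            }
        }, 0, 1, TimeUnit.HOURS);
    }

    private void processHour(String group) {
        // Stub: launch the Tez demux job for this group's latest complete hour,
        // writing into a temporary directory first.
        runTezDemux(group);
        // Stub: atomically present the result, e.g. by renaming the temporary
        // directory to /logs/<category>/yyyy/mm/dd/hh so readers never see partial data.
        presentAtomically(group);
    }

    private void runTezDemux(String group) { /* hypothetical */ }

    private void presentAtomically(String group) { /* hypothetical */ }

    public static void main(String[] args) {
        new LogProcessorDaemonSketch().start();
    }
}
```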

  36. Why Tez?
      ● Some categories are significantly larger than others (KBs vs TBs)
      ● MapReduce demux? Each reducer handles a single category
      ● Streaming demux? Each spout or channel handles a single category
      ● Massive skew in partitioning by category causes long-running tasks, which slows down job completion time
      ● Relatively well understood fault tolerance semantics, similar to MapReduce, Spark, etc.
