Better TV & Broadband with Kafka & Spark
Phill Radley Chief Data Architect British Telecommunications plc
In the beginning (2012)
HaaS: Hadoop as a Service, Admin Group
Early adoption
Spark will replace … Doug Cutting – Sep 2015
Denser Nodes
Doubled #cores, trebled RAM
Same node count
Cluster migration
TV Set Top Box, Broadband Home Hub
TV & BB Data Pipeline Overview
Diagram labels: Gateway Firewall; ESB; Impala; CRM; enrichment data; YARN Cluster; Enrich, Aggregate every …; Spark; Producer, consumer; HDFS; Flume; Hive Tables; Atomic metrics; big XML payload; Kafka Broker; HaaS; Kafka Producer
Data Ingest Kafka - Raw topic
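The pipeline labels above (big XML payload, atomic metrics, enrich, aggregate) suggest a flow that can be sketched roughly as below. This is a minimal plain-Python illustration of the data shapes only, not the deck's Spark or Kafka code. The XML element names, the `CRM` lookup, the function names and the `window_s` parameter are all hypothetical, and the slide's actual aggregation interval is not stated here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical payload: element and attribute names are invented for illustration.
SAMPLE_PAYLOAD = """
<report device="hub-001" ts="1000">
  <metric name="wifi_drops" value="3"/>
  <metric name="reboots" value="1"/>
</report>
"""

# Hypothetical enrichment lookup standing in for CRM data.
CRM = {"hub-001": {"region": "north"}, "stb-042": {"region": "south"}}


def explode(payload):
    """Split one XML payload into a list of atomic metric records."""
    root = ET.fromstring(payload.strip())
    device = root.get("device")
    ts = int(root.get("ts"))
    return [
        {
            "device": device,
            "ts": ts,
            "name": m.get("name"),
            "value": float(m.get("value")),
        }
        for m in root.iter("metric")
    ]


def enrich(record, crm):
    """Attach the region from the lookup; unknown devices get 'unknown'."""
    region = crm.get(record["device"], {}).get("region", "unknown")
    return {**record, "region": region}


def aggregate(records, window_s):
    """Sum metric values per (region, metric name, window start)."""
    totals = defaultdict(float)
    for r in records:
        window_start = r["ts"] - r["ts"] % window_s
        totals[(r["region"], r["name"], window_start)] += r["value"]
    return dict(totals)


records = [enrich(r, CRM) for r in explode(SAMPLE_PAYLOAD)]
summary = aggregate(records, window_s=300)
```

In a deployment like the one in the diagram, the explode step would presumably read from the raw Kafka topic and the enrich and aggregate steps would run as a Spark job on YARN. The sketch keeps everything in memory so the record shapes are easy to inspect.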
Data Serving – Impala Concurrency
Schema Design … on read … DevOps approach
Impala Tuning…
unnecessary physical I/O (these are hot tables; keep them in memory)
Conclusions after months in production…
dozens of ad-hoc self-service analytics and data science users