OpenTSDB + Bigtable: Integrating time series database with Google Cloud Bigtable


slide-1
SLIDE 1

OpenTSDB + Bigtable

Integrating time series database with Google Cloud Bigtable

Danil Zburivsky, Big Data Practice Lead - @zburivsky Christos Soulios, Big Data Architect - @c_soulios

slide-2
SLIDE 2

Pythian specializes in design, implementation, and management of systems that directly contribute to revenue and business success.

History
  • 19 years in business
  • Growing at 30+% per year
  • 400+ employees
  • 300+ customers worldwide
  • HQ Ottawa, Canada - global reach

  • Technology agnostic = trusted advisor
  • Deep expertise: Oracle, Oracle Apps, MySQL, AWS, SQL Server, Cassandra/DataStax, Azure, PostgreSQL, Cloudera, MapR, Hortonworks etc.

Google Premier Partner Status (as of end Aug)
  • 5 Certified Developers (soon to be 12)
  • Dedicated Google Technical Champion
  • Launch partner for: Kubernetes, Dataflow, Cloud SQL, Dataproc
  • Integrated OpenTSDB with Bigtable
  • DW Explorers Program Partner
  • Upcoming BigQuery & Cloud ML Launch Partner

slide-3
SLIDE 3

Time series data

  • (time, metric, value)
  • OS and apps metrics
  • Industrial equipment
  • Web traffic

slide-4
SLIDE 4

Storing time series data is a challenge

  • Volume can be explosive
  • Data arrival and access patterns are different

slide-6
SLIDE 6

Better alternatives: specialized stores

  • NoSQL
  • Data model and storage optimized for time series
  • Separate query language

slide-7
SLIDE 7

OpenTSDB

  • Open source
  • Uses HBase as a data store
  • Data model optimized for time series
  • REST API

<metric_uid><timestamp><tagk1><tagv1>[...<tagkN><tagvN>] <col_t+1>[...<col_t+N>]
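
The row key and column layout above can be made concrete with a short sketch. It assumes OpenTSDB's default widths (3-byte UIDs, a 4-byte base timestamp aligned to the hour, and a 2-byte column qualifier whose upper 12 bits hold the offset in seconds). The UID values and the function names are hypothetical; in a real deployment the UIDs come from OpenTSDB's UID table.

```python
import struct

UID_WIDTH = 3             # assumed default UID width in bytes
ROW_SPAN_SECONDS = 3600   # assumed: one row per series per hour


def uid_bytes(uid: int) -> bytes:
    """Encode a numeric UID as a fixed-width big-endian byte string."""
    return uid.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid: int, timestamp: int, tags: dict) -> bytes:
    """Build <metric_uid><base_timestamp><tagk1><tagv1>...<tagkN><tagvN>."""
    base_time = timestamp - (timestamp % ROW_SPAN_SECONDS)
    key = uid_bytes(metric_uid) + struct.pack(">I", base_time)
    # Tag pairs are ordered by tag key UID so the same series always maps
    # to the same row key.
    for tagk_uid, tagv_uid in sorted(tags.items()):
        key += uid_bytes(tagk_uid) + uid_bytes(tagv_uid)
    return key


def column_qualifier(timestamp: int, flags: int = 0) -> bytes:
    """Encode the offset from the row's base time plus 4 bits of flags."""
    offset = timestamp % ROW_SPAN_SECONDS
    return struct.pack(">H", (offset << 4) | (flags & 0x0F))


# Hypothetical UIDs: metric 1, tag key 1, tag value 1.
key = row_key(metric_uid=1, timestamp=1356998461, tags={1: 1})
qual = column_qualifier(1356998461)
```

Because the metric UID leads the key and the timestamp follows it, all points of one metric within a time range sit next to each other in the table, which is what makes range scans over a time window cheap.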

slide-8
SLIDE 8

OpenTSDB Architecture

[Diagram: Servers send metrics to TSDs over TSD RPC; TSDs read and write HBase over HBase RPC; the Web UI and Scripts/Alerting reach TSDs over HTTP or TSD RPC]
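
As a rough illustration of the HTTP path in this architecture, the sketch below builds a write request for a TSD's `/api/put` endpoint. The host name, metric name, and tag values are made up for the example, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_put_request(host: str, metric: str, timestamp: int,
                      value: float, tags: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying one data point."""
    body = json.dumps({
        "metric": metric,
        "timestamp": timestamp,
        "value": value,
        "tags": tags,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical TSD address and data point.
request = build_put_request(
    host="tsd.example.com:4242",
    metric="sys.cpu.user",
    timestamp=1356998461,
    value=42.5,
    tags={"host": "web01"},
)
payload = json.loads(request.data)
```

Passing `request` to `urllib.request.urlopen` would send it to a running TSD.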

slide-9
SLIDE 9

HBase can be too much

  • HBase requires a full Hadoop setup (3xZK, 2xNN, 3xDN, 2xHMaster, 3xHRegion)
  • HBase tuning is a job for the brave (HFiles, WAL, MemStore, BucketCache, BlockCache)

slide-10
SLIDE 10

HBase can be too much

slide-11
SLIDE 11

But all I wanted was a time series database

slide-12
SLIDE 12

Google Cloud Bigtable

  • Highly Scalable NoSQL database
  • Low latency, high throughput
  • Powers most Google products
  • Available as a Google Cloud Service
slide-13
SLIDE 13

Migrate HBase apps to Cloud Bigtable

  • The Bigtable client is API compatible with HBase client
  • Only replace hbase-client.jar with bigtable-hbase.jar
  • No code changes required!
slide-14
SLIDE 14

Migrate OpenTSDB to Cloud Bigtable

  • OpenTSDB does not use standard hbase-client.jar
  • OpenTSDB is based on AsyncHBase library
slide-15
SLIDE 15

AsyncHBase library

  • Open source HBase client library
  • Multi-threaded: multiple threads use the same instance
  • Fully asynchronous, non-blocking
  • Implements the low-level HBase RPCs
slide-16
SLIDE 16

Detour: Asynchronous programming

slide-17
SLIDE 17

Detour: Why asynchronous?

  • Efficient thread usage
  • Fewer threads = less memory
  • CPU scheduler friendly
  • Extremely high concurrency
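
The points above can be illustrated with a small sketch. AsyncHBase itself is a Java library; this example uses Python's `asyncio` only to show the general idea, and `fake_rpc` is a hypothetical stand-in for a network call. A thousand simulated requests are in flight at once, yet all of them run on a single thread.

```python
import asyncio
import threading


async def fake_rpc(request_id: int):
    """Simulate a non-blocking RPC: waiting on I/O does not hold a thread."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return request_id, threading.get_ident()


async def run_many(count: int):
    """Start all requests concurrently and collect their results."""
    return await asyncio.gather(*(fake_rpc(i) for i in range(count)))


results = asyncio.run(run_many(1000))
thread_ids = {thread_id for _, thread_id in results}
```

With a thread-per-request design the same workload would need on the order of a thousand threads, each with its own stack, which is where the memory and scheduler costs come from.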
slide-18
SLIDE 18

AsyncHBase library

http://www.tsunanet.net/~tsuna/asynchbase/benchmark/viz.html

slide-19
SLIDE 19

AsyncHBase library

“AsyncHBase client differs significantly from HBase's client. Switching to it is not easy as it requires to rewrite all the code that was interacting with any HBase API”

AsyncHBase documentation

slide-20
SLIDE 20

AsyncBigtable library

  • Complete rewrite of AsyncHBase API
  • Uses standard hbase-client for Bigtable access
  • Compatible with the bigtable-hbase API
slide-21
SLIDE 21

AsyncBigtable challenges

  • OpenTSDB jar dependencies
  • AsyncBigtable is not async!
  • BufferedMutator + Threadpool to emulate async
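
The last point can be sketched in general terms. This is not AsyncBigtable's code (the library is written in Java against the HBase client API); it is a minimal Python illustration of the pattern, with `BlockingStore` and `AsyncFacade` as hypothetical names. Writes are buffered and flushed in batches, and blocking calls are handed to a thread pool so the caller gets a future back immediately.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class BlockingStore:
    """Hypothetical stand-in for a synchronous storage client."""

    def __init__(self):
        self.rows = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self.rows.get(key)

    def put_batch(self, batch):
        with self._lock:
            self.rows.update(batch)


class AsyncFacade:
    """Expose a future-based API on top of a blocking client."""

    def __init__(self, store, workers=4, buffer_size=100):
        self._store = store
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._buffer = []
        self._buffer_size = buffer_size
        self._lock = threading.Lock()

    def get(self, key) -> Future:
        # The blocking read runs on a pool thread; the caller is not blocked.
        return self._pool.submit(self._store.get, key)

    def put(self, key, value):
        # Buffer the write; submit a batch only once the buffer is full.
        with self._lock:
            self._buffer.append((key, value))
            if len(self._buffer) < self._buffer_size:
                return None
            batch, self._buffer = self._buffer, []
        return self._pool.submit(self._store.put_batch, batch)

    def flush(self) -> Future:
        # Send whatever is buffered, even if the buffer is not full.
        with self._lock:
            batch, self._buffer = self._buffer, []
        return self._pool.submit(self._store.put_batch, batch)

    def close(self):
        self.flush().result()
        self._pool.shutdown(wait=True)


store = BlockingStore()
client = AsyncFacade(store, buffer_size=2)
pending = [client.put(k, v) for k, v in [("a", 1), ("b", 2), ("c", 3)]]
pending = [f for f in pending if f is not None]
pending.append(client.flush())
for future in pending:
    future.result()  # wait for every submitted batch to land
value = client.get("a").result()
client.close()
```

The trade-off is visible in the sketch: each outstanding call still occupies a pool thread while it blocks, so this emulates an asynchronous API without gaining the thread efficiency of a truly non-blocking client.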
slide-22
SLIDE 22

AsyncBigtable library

slide-23
SLIDE 23

AsyncBigtable library

  • Merged upstream in OpenTSDB v2.3.0
  • http://opentsdb.net/docs/build/html/user_guide/backends/bigtable.html
  • https://github.com/OpenTSDB/asyncbigtable
slide-24
SLIDE 24

Future work

  • Native Bigtable API
  • Fully asynchronous
  • Improve performance
  • Add more unit tests
slide-25
SLIDE 25

Questions?

https://github.com/opentsdb/asyncbigtable