An API for Reading the MySQL Binary Log Lars Thalmann Mats Kindahl - - PowerPoint PPT Presentation

an api for reading the mysql binary log
SMART_READER_LITE
LIVE PREVIEW

An API for Reading the MySQL Binary Log Lars Thalmann Mats Kindahl - - PowerPoint PPT Presentation

<Insert Picture Here> An API for Reading the MySQL Binary Log Lars Thalmann Mats Kindahl Development Director, MySQL Lead Software Engineer, MySQL Replication, Backup & Connectors Replication & Utilities The following is


slide-1
SLIDE 1
slide-2
SLIDE 2

<Insert Picture Here>

Lars Thalmann

Development Director, MySQL Replication, Backup & Connectors

An API for Reading the MySQL Binary Log

Mats Kindahl

Lead Software Engineer, MySQL Replication & Utilities

slide-3
SLIDE 3

3

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any

  • contract. It is not a commitment to deliver any

material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

slide-4
SLIDE 4

4

Outline

  • Replication Architecture
  • Binary logs
  • Binary log event
  • Reading binary log

– Connecting to server – Reading from files

  • Reading events

– Queries – Reading rows (row-based replication) – Other events

slide-5
SLIDE 5

5

Replication Architecture

Master Slave(s)

Clients

Changes

slide-6
SLIDE 6

6

I/O SQL Databa se

Replication Architecture

Client Dump Dump Dump Binary Log Session Session Session I/O SQL Databa se I/O SQL Databa se

Master Slave(s)

slide-7
SLIDE 7

7

I/O SQL Databa se

Replication Architecture

Client Dump Dump Dump Binary Log Relay Log Session Session Session I/O SQL Databa se I/O SQL Database

Master Slave(s)

slide-8
SLIDE 8

8

Replication to other systems

Client Binary Log Dump Session Session Session

Master

Relay Log I/O SQL Database

Slave(s)

HBase SOLR

Full-text indexing Data Mining

? Dump Dump Dump

slide-9
SLIDE 9

9

Transforming events

SOLR API Transformer Server F i l e Subject of our presentation

“Change Data Capture”

slide-10
SLIDE 10

10

Binlog API

  • Library to process replication events
  • API is ready for use
  • Goals:

–Simple –Extensible –Efficient

slide-11
SLIDE 11

11

<Insert Picture Here>

How to capture events Binlog API

  • The replication listener
slide-12
SLIDE 12

12

<Insert Picture Here>

First example

#include <cstdlib> #include <iostream> #include <binlog_api.h> int main(int argc, char *argv[]) { const char *url = “mysql://root@127.0.0.1:3360”; Binary_log binlog(create_transport(url)); binlog.connect(); Binary_log_event *event; while (true) { int result = binlog.wait_for_next_event(&event); if (result == ERR_EOF) break; cout << “ at “ << binlog.get_position() << “ event type “ << event.get_type_code() << endl; } return EXIT_SUCCESS; }

slide-13
SLIDE 13

13

<Insert Picture Here>

Create network transport

#include <cstdlib> #include <iostream> #include <binlog_api.h> int main(int argc, char *argv[]) { const char *url = “mysql://root@127.0.0.1:3360”; Binary_log binlog(create_transport(url)); binlog.connect(); Binary_log_event *event; while (true) { int result = binlog.wait_for_next_event(&event); if (result == ERR_EOF) break; cout << “ at “ << binlog.get_position() << “ event type “ << event.get_type_code() << endl; } return EXIT_SUCCESS; }

slide-14
SLIDE 14

14

<Insert Picture Here>

… or file transport

#include <cstdlib> #include <iostream> #include <binlog_api.h> int main(int argc, char *argv[]) { const char *url = “file:///tmp/binlog.0000001”; Binary_log binlog(create_transport(url)); binlog.connect(); Binary_log_event *event; while (true) { int result = binlog.wait_for_next_event(&event); if (result == ERR_EOF) break; cout << “ at “ << binlog.get_position() << “ event type “ << event.get_type_code() << endl; } return EXIT_SUCCESS; }

slide-15
SLIDE 15

15

<Insert Picture Here>

Connect the transport

#include <cstdlib> #include <iostream> #include <binlog_api.h> int main(int argc, char *argv[]) { const char *url = “file:///tmp/binlog.0000001”; Binary_log binlog(create_transport(url)); binlog.connect(); Binary_log_event *event; while (true) { int result = binlog.wait_for_next_event(&event); if (result == ERR_EOF) break; cout << “ at “ << binlog.get_position() << “ event type “ << event.get_type_code() << endl; } return EXIT_SUCCESS; }

slide-16
SLIDE 16

16

<Insert Picture Here>

Digression: set read position

  • Default: start at beginning
  • Set position explicitly:

if (binlog.set_position(file, pos)) { /* Handle error */ }

slide-17
SLIDE 17

17

<Insert Picture Here>

Read events

#include <cstdlib> #include <iostream> #include <binlog_api.h> int main(int argc, char *argv[]) { const char *url = “file:///tmp/binlog.0000001”; Binary_log binlog(create_transport(url)); binlog.connect(); Binary_log_event *event; while (true) { int result = binlog.wait_for_next_event(&event); if (result == ERR_EOF) break; cout << “ at “ << binlog.get_position() << “ event type “ << event->get_type_code() << endl; } return EXIT_SUCCESS; }

Get event

slide-18
SLIDE 18

18

Steps summary

  • Create a transport

– create_transport

  • Connect to server

– connect

  • Set position

– set_position

  • Start event loop

– wait_for_next_event

slide-19
SLIDE 19

19

<Insert Picture Here>

Reading information in events Binlog API

  • The replication listener
slide-20
SLIDE 20

20

Binlog Event Structure

Post-header Variable Part Com m

  • n Header
  • Common header
  • Generic data
  • Fixed size
  • Post-header
  • Event-specific data
  • Fixed size
  • Variable part
  • Event-specific data
  • Variable size
slide-21
SLIDE 21

21

<Insert Picture Here>

Reading the header

  • Read common header

– header()

  • Access fields

switch (event->header()->type_code) { case QUERY_EVENT: … case USER_VAR_EVENT: … case FORMAT_DESCRIPTION_EVENT: … }

slide-22
SLIDE 22

22

Binlog Event Common Header

19 Bytes

4 bytes

timestamp type_code server_id event_length flags next_position

  • Data common to all events
  • Next Position

– One-after-end of event

  • Timestamp

– Statement start time

  • Flags

– Binlog-in-use – Thread-specific – Suppress “use” – Artificial – Relay-log event

slide-23
SLIDE 23

23

Binlog Event Structure

Post-header Variable Part Com m

  • n Header
  • Common header
  • Generic data
  • Fixed size
  • Post-header
  • Event-specific data
  • Fixed size
  • Variable part
  • Event-specific data
  • Variable size
slide-24
SLIDE 24

24

Query Event

  • Most common event
  • Used for statements
  • Statement logged

literally

– … in almost all cases

query

Common Header thread_id exec_time error_code

std::vector<uint8_t> variables Special case: need to be decoded db_name

slide-25
SLIDE 25

25

<Insert Picture Here>

Reading event data

  • Cast to correct event type
  • Access fields

switch (event->header()->type_code) { case QUERY_EVENT: Query_event *qev = static_cast<Query_event*>(event); cout << qev->query << endl; break; case USER_VAR_EVENT: … case FORMAT_DESCRIPTION_EVENT: … }

slide-26
SLIDE 26

26

<Insert Picture Here>

Event-driven API

slide-27
SLIDE 27

27

<Insert Picture Here>

Event-driven API

  • Content handlers

wait_for_next_event

slide-28
SLIDE 28

28

Saving user-defined variables

class Save_handler : public Content_handler { … }; Save_handler::Map vars; Save_handler save_vars(vars); binlog.content_handler_pipeline()

  • >push_back(&save_vars);
slide-29
SLIDE 29

29

<Insert Picture Here>

User-defined variables

class Save_handler : public Content_handler { public: typedef std::map<std::string, std::string> Map; Save_handler(Map &container) : m_var(container) { } Binary_log_event * process_event(User_var_event *event) { m_var[event->name] = event->value; return NULL; } private: Map &m_var; };

slide-30
SLIDE 30

30

Replace handler

class Replace_vars : public Content_handler { Binary_log_event * process_event(Query_log_event *event) { /* Code to replace variables */ } };

Full example: basic-2.cpp

slide-31
SLIDE 31

31

<Insert Picture Here>

Example two: How to capture live row changes Binlog API

  • The replication listener
slide-32
SLIDE 32

32

Row events in the binlog

Table map Write rows Write rows Write rows Table map Delete rows Transaction Transaction Header Map table definition to table ID We'll cover this soon (trust me) A bunch of rows

slide-33
SLIDE 33

33

Capturing row events

class Row_event_handler : public Content_handler { public: Binary_log_event * process_event(Row_event *event) { switch(ev->header()->type_code) { case WRITE_ROWS_EVENT: case UPDATE_ROWS_EVENT: case DELETE_ROWS_EVENT: ...

slide-34
SLIDE 34

34

Capturing row events

  • The *_ROWS_EVENT

uint64_t table_id; uint16_t flags; uint64_t columns_len; uint32_t null_bits_len; vector<uint8_t> columns_before_image; vector<uint8_t> used_columns; vector<uint8_t> row;

Raw row data Defined in the table map event

slide-35
SLIDE 35

35

Reading rows

  • Wrap raw row data in Row_event_set
  • Iterate over rows using iterator

Row_event_set rows(row_event, table_map_event); Row_event_set::iterator it= rows.begin(); You need to have captured this before!

slide-36
SLIDE 36

36

Reading fields of a row

  • Row_of_fields to iterate fields of a row

–Turns row into row of fields sequence

Row_event_set rows(row_event, table_map_event); for (Row_event_set::iterator it = rows.begin() ; it != rows.end() ; ++it) table_delete(os.str(), Row_of_fields(*it));

slide-37
SLIDE 37

37

Reading fields of a row

  • Iterate over fields in Row_of_fields

void table_delete (..., const Row_of_fields& fields) { Row_of_fields::iterator it= fields.begin(); for (int id = 0 ; it =! fields.end() ; ++it, ++id) { std::string str; Converter().to(str, *it); std::cout << id << "= " << str << std::endl; } }

slide-38
SLIDE 38

38

Decoding a field

  • Iterate over fields in Row_of_fields

void table_delete (..., const Row_of_fields& fields) { Row_of_fields::iterator it= fields.begin(); for (int id = 0 ; it =! fields.end() ; ++it, ++id) { std::string str; Converter().to(str, *it); std::cout << id << "= " << str << std::endl; } }

slide-39
SLIDE 39

39

Summary – what's it for?

  • Replicate to other systems

– Hbase, SOLR, etc.

  • Triggering on specific events

– Call the DBA when tables are dropped? – Monitor objects in database

  • Browsing binary logs

– Extract subset of changes (by table, by execution time, etc.) – Statistics

  • Component in building other solutions

– Point-in-time restore – Sharding / Load-balancing

slide-40
SLIDE 40

40

Summary – what we've covered

  • Reading events
  • Creating content handlers
  • Processing queries
  • Processing rows
  • Reading fields
  • … but there is a lot more
slide-41
SLIDE 41

41

  • Available at labs

– http://labs.mysql.com/

  • Source code available at launchpad

– http://launchpad.net/mysql-replication-listener

  • MySQL High Availability

Get it as free ebook: http://oreilly.com/go/ebookrequest Valid this week, mention event “MySQL Replication Update”

slide-42
SLIDE 42