
What’s New in Alibaba’s X-DB SQL Engine
Min Qiu, Alibaba Group
Santa Clara, California | April 23rd – 25th, 2018

Agenda
• Introduction to X-DB
• Features in X-DB SQL Engine
• Query Plan Cache
• Remote Execution
• Distributed SQL Processing
• Future Work

Introduction to X-DB

What’s X-DB
• Alibaba's in-house distributed database
• MySQL-compatible syntax/interfaces/protocols
• High performance at low cost
• Horizontal scalability with high availability and fault tolerance
• Strong data consistency guarantee
• Globally distributed
• SW/HW co-designed

X-DB Architecture
• Key Components
  - HA (X-Paxos)
  - Global Meta Management (GMS)
  - Storage Engine (X-Engine)
  - Distributed SQL Engine
• Features
  - Self-contained
  - Decoupling storage and compute, allowing separate scaling
  - Multiple replicas allowing multiple reads
[Diagram: Apps connect through XDriver to X-Server instances in AZ1, AZ2 and AZ3. Each X-Server shows GMS, Partition Service, SQL/KV Engine, X-RPC, Coordinator, X-Engine and X-Paxos, with IO to a Distributed FS.]

Features in X-DB SQL Engine

Features in X-DB SQL Engine
• Window Function
• Global Sequence
• Query Plan Cache
• Distributed Query Processing

Query Plan Cache: What and Why?
• Plan is cached to skip compilation effort in subsequent runs
• Parameter binding is probably necessary
• Good for short-running queries
• Good for queries whose plan is insensitive to bind parameters

Query Plan Cache: How?
• Options
  - Full text match (case sensitive)
    select * from t where c = 1  does not match  select * from t where c = 2
  - Parameterized SQL template, i.e. Prepared Statement (PS)
    select * from t where c = ?
• Our Solution: Extension of the existing PS solution
  - Queries from Alibaba's online systems fall into a limited number of patterns
  - No extra work is required to parameterize the SQL template
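The difference between the two matching options can be sketched in a few lines. This is only an illustration: `to_template` is a hypothetical helper that strips literals with a regular expression, whereas the slides say X-DB relies on applications already sending prepared statements, so no such rewriting is needed there.

```python
import re

# Hypothetical literal matcher: single-quoted strings and standalone numbers.
# Digits inside identifiers (e.g. "t1") are left alone because of the \b anchors.
_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'|\b\d+(?:\.\d+)?\b")

def to_template(sql: str) -> str:
    """Return the statement with every literal replaced by '?'."""
    return _LITERAL.sub("?", sql)

q1 = "select * from t where c = 1"
q2 = "select * from t where c = 2"

# Full text match: the two statements are different cache keys.
print(q1 == q2)
# Template match: both statements map to the same cache key.
print(to_template(q1) == to_template(q2))
```

With full text match every new constant produces a new cache entry; with a template, one entry serves the whole query pattern.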

Query Plan Cache
• Problems with MySQL PS
  - PS objects are saved in the client connection, OOM risk
  - No cache invalidation mechanism
  - PS only saves parsing time, optimization is still needed
• Our Enhancement
  - Cache PS and plan-related objects in worker threads
    • JOIN
    • best_ref
    • QEP_TAB
    • …
  - Cache invalidation when the capacity limit is hit

Query Plan Cache: Implementation
• Generate the query plan with first-run parameters
  - Good for cases where the plan is insensitive to input parameters
  - Performance regression is possible if the cached plan is suboptimal for specific parameters
• Parameter substitution
  - New parameter values must be mapped to the correct location in the relevant data structures
• Decouple execution from optimization
• Cache Management
  - New system variable plan_cache_size controls the cache size on each worker thread
  - Use LRU to evict if the memory limit is hit
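A minimal sketch of the cache management described above, under stated assumptions: `PlanCache` and `compile_fn` are hypothetical names, and the limit is counted in entries for simplicity, while the slides describe `plan_cache_size` as a memory limit.

```python
from collections import OrderedDict

class PlanCache:
    """Per-worker plan cache with LRU eviction (illustrative only)."""

    def __init__(self, plan_cache_size):
        self.capacity = plan_cache_size   # assumption: number of entries, not bytes
        self._plans = OrderedDict()       # template -> cached plan, oldest first

    def get_or_compile(self, template, compile_fn, params):
        if template in self._plans:
            # Cache hit: mark as most recently used and skip compilation.
            self._plans.move_to_end(template)
            return self._plans[template]
        # Cache miss: the plan is generated with the first-run parameters.
        plan = compile_fn(template, params)
        self._plans[template] = plan
        if len(self._plans) > self.capacity:
            # Evict the least recently used entry.
            self._plans.popitem(last=False)
        return plan
```

Because the plan is built from whatever parameters arrive first, a later call with very different values reuses the same plan, which is the regression risk noted on the slide.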

Query Plan Cache: Something interesting
• Disable the optimization against const tables
• Prevent an Impossible Plan from being generated at the first run (otherwise the plan built for the NULL value would be reused for later values)
    create table t (c1 INT primary key, c2 INT);
    insert into t values (1, 1);
    prepare stmt from 'select * from t where c1 = ?';
    set @a=NULL; execute stmt using @a;
    set @a=1; execute stmt using @a;
• Disallow the JOIN_TYPE from being set to ref for a between predicate (the first run below has equal bounds, the second does not)
    create table t (c1 INT primary key, c2 INT);
    insert into t values (1, 1), (2,2);
    prepare stmt from 'select * from t where c1 between ? and ?';
    set @a=1; set @b=1; execute stmt using @a, @b;
    set @a=1; set @b=2; execute stmt using @a, @b;

Query Plan Cache: Plan Cache Invalidation
• Cache the schema version (e.g. V1) in the PS when generating the query plan
• The schema version changes (e.g. to V2) when DDL is applied
• Raise a flag when a different schema version is detected at runtime
• Automatic re-prepare when invalidated, transparent to the user
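The version check can be sketched as follows. `Catalog` and `PreparedStatement` are hypothetical stand-ins, and the "plan" is just a string; the point is only that the statement remembers the schema version it was prepared under and re-prepares itself when that version no longer matches.

```python
class Catalog:
    """Holds the current schema version; DDL bumps it (illustrative)."""

    def __init__(self):
        self.schema_version = 1

    def apply_ddl(self):
        self.schema_version += 1


class PreparedStatement:
    def __init__(self, template, catalog):
        self.template = template
        self.catalog = catalog
        self.prepare_count = 0
        self._prepare()

    def _prepare(self):
        # Remember which schema version the cached plan was built against.
        self.cached_version = self.catalog.schema_version
        self.plan = f"plan({self.template})@v{self.cached_version}"
        self.prepare_count += 1

    def execute(self, *params):
        if self.cached_version != self.catalog.schema_version:
            # Stale plan detected at runtime: re-prepare without involving the caller.
            self._prepare()
        return self.plan, params
```

The caller only ever calls `execute`, so the re-prepare stays invisible to it.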

Performance Evaluation: Configuration
• Sysbench
  - oltp_read_only
  - select_random_points
• 100 tables
• 100K records per table
• 500 client connections

Distributed Query Processing

Remote Execution: Architecture
• SQL Request Routing/Forwarding
  - Handle incorrect SQL routing
• Partition Location Identification
• Pass back the result from the remote node
• External dependencies
  - RPC service
  - GMS/LMS

Remote Execution: Execution Scheduling
• Promise/Future async paradigm
• CONTEXT
  - Client communication protocol
  - THD context
• Privilege check skipped on node B (the remote node)

Remote Execution: Exception handling
• Only one forwarding is allowed
• Unsupported SQL requests are prevented from executing
  - Queries that touch data across multiple nodes
  - Transactions across multiple nodes
• Error message/code is passed back to the node that initiated the remote execution
• Perform a refresh operation if the error is caused by an out-of-date location cache
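The forwarding rules above can be sketched with a toy model. Everything here is hypothetical (`Node`, `WrongNode`, the dictionaries standing in for the location cache and GMS); it only illustrates the single-hop limit, the error surfacing on the initiating node, and the cache refresh. Whether X-DB retries automatically after the refresh is not stated in the slides, so this sketch leaves the retry to the caller.

```python
class WrongNode(Exception):
    """Raised by a node that does not own the requested partition."""


class Node:
    def __init__(self, name, owned, location_cache, gms):
        self.name = name
        self.owned = set(owned)                     # partitions stored on this node
        self.location_cache = dict(location_cache)  # partition -> node name, may be stale
        self.gms = gms                              # authoritative partition -> node name
        self.peers = {}                             # node name -> Node

    def handle(self, partition, forwarded=False):
        if partition in self.owned:
            return f"{partition} executed on {self.name}"
        if forwarded:
            # Only one forwarding is allowed: fail instead of hopping again.
            raise WrongNode(f"{self.name} does not own {partition}")
        target = self.peers[self.location_cache[partition]]
        try:
            return target.handle(partition, forwarded=True)
        except WrongNode:
            # The error reaches the initiating node, which refreshes its stale entry.
            self.location_cache[partition] = self.gms[partition]
            raise
```

A request routed through a stale entry fails once; the same request issued again reaches the correct owner.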

Distributed Execution: Architecture
• Query Coordinator
  - Accept SQL request
  - Generate distributed query plan
  - Split the execution pipeline (stage)
  - Schedule stage execution
• Query Worker
  - Execute pipeline with given control information
  - Redistribute data to next stage
• Scheduler
  - Pipeline dependency relation

Distributed Execution: Query Plan on Coordinator
• Generate a single-host query plan first
• Identify the nodes under which an Exchange is inserted if necessary
  - Join
  - Sort
  - Group By
• Record the location where each Exchange node is inserted (index into the QEP_TAB array)

Distributed Execution: Query Plan on Coordinator
• mm_tree is built against the query condition for Partition Pruning
  - e.g. only t1p1, t1p3, t2p2, t2p4 are left after partition pruning
• Partition Location Cache at LMS tells which hosts hold those partitions
  - The cache might be out of sync, which can be detected at execution time
  - LMS is forced to refresh from GMS when it is found to be out of sync
• Table access operations are dispatched to the corresponding hosts by RPC
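As a rough illustration of pruning followed by location lookup, assuming range partitioning on a single integer column: `prune` and `group_by_host` are hypothetical helpers, and a plain interval overlap test stands in for the mm_tree analysis mentioned above.

```python
def prune(partitions, lo, hi):
    """Keep partitions whose range [p_lo, p_hi) overlaps the predicate
    'col BETWEEN lo AND hi' (both bounds inclusive)."""
    return [name for name, (p_lo, p_hi) in partitions.items()
            if p_lo <= hi and lo < p_hi]


def group_by_host(names, location_cache):
    """Group surviving partitions by the host the location cache reports."""
    hosts = {}
    for name in names:
        hosts.setdefault(location_cache[name], []).append(name)
    return hosts


# Hypothetical layout: three range partitions spread over two hosts.
partitions = {"t1p1": (0, 100), "t1p2": (100, 200), "t1p3": (200, 300)}
location_cache = {"t1p1": "host1", "t1p2": "host2", "t1p3": "host1"}

survivors = prune(partitions, 50, 120)
dispatch = group_by_host(survivors, location_cache)
```

Each entry of `dispatch` corresponds to one RPC carrying the table access operations for the partitions on that host.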

Distributed Execution: Modification on optimizer
• Disable "const table" optimization when necessary
• Perform the "lock table" operation as late as possible
  - Currently tables are locked between prepare and optimize
• Avoid diving into the storage engine during optimization
  - Row number estimate
  - Record in range estimate
• Use statistics stored in the global catalog for cost estimation

Distributed Execution: Terminology
• Pipeline
  - A segment (set of operations) in the query plan tree
  - Receives input from the upstream pipeline and generates output for the downstream pipeline
  - The unit of scheduling and execution
• Stage
  - The process in which a pipeline execution is performed
• Task
  - A subset of a stage, which deals with one partition of the data
  - A stage might consist of multiple tasks

Distributed Execution: Pipeline Tree Generation
• The Exchange node is the pipeline boundary
• A pair of In/Out nodes corresponds to each Exchange
  - Exchange Out node at the Producer side
  - Exchange In node at the Consumer side
• The pipeline dependency relationship is represented by the edges in the tree
• The pipeline tree is the input to the scheduler
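Cutting a plan tree at its Exchange nodes can be sketched as below. `PlanNode`, `Pipeline` and `build_pipelines` are hypothetical; the real engine records Exchange positions as indexes into the QEP_TAB array rather than walking a tree of Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    op: str
    children: list = field(default_factory=list)


@dataclass
class Pipeline:
    id: int
    ops: list = field(default_factory=list)
    upstream: list = field(default_factory=list)  # producer pipelines this one waits on


def build_pipelines(root):
    """Split a plan tree into pipelines, using each Exchange as a boundary."""
    pipelines = []

    def new_pipeline():
        p = Pipeline(id=len(pipelines))
        pipelines.append(p)
        return p

    def visit(node, current):
        if node.op == "Exchange":
            # Consumer side keeps an ExchangeIn; producer side starts with ExchangeOut.
            current.ops.append("ExchangeIn")
            producer = new_pipeline()
            producer.ops.append("ExchangeOut")
            current.upstream.append(producer)
            for child in node.children:
                visit(child, producer)
        else:
            current.ops.append(node.op)
            for child in node.children:
                visit(child, current)

    visit(root, new_pipeline())
    return pipelines


plan = PlanNode("Sort", [PlanNode("Exchange", [
    PlanNode("Join", [PlanNode("Scan t1"),
                      PlanNode("Exchange", [PlanNode("Scan t2")])])])])
pipes = build_pipelines(plan)
```

The `upstream` lists form the dependency edges of the pipeline tree that a scheduler would consume.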

Distributed Execution: Query Plan on Worker
• The plan on a worker must be the same as that on the coordinator
  - A pipeline generated on the coordinator must match the same segment on the worker
• The full SQL statement is compiled only once on a worker
  - The plan is cached on the worker with a globally unique job ID as its label

Distributed Execution: Same environment on Coordinator and Workers
• The environment used on the coordinator to generate the query plan must be restored on workers
• Several categories of environment/context information
  - All system variables used by the optimizer
  - All session variables used by the optimizer
  - Statistics used by the optimizer
    • table::file::stats.records
  - Constant values in cost_model objects

Distributed Execution: Two Level Scheduler
• Stage level scheduling
  - Satisfies the dependency restrictions of the pipeline tree
  - Bottom up
  - JOIN::exec() is modified to be reentrant
    • Allows executing a segment of the QEP_TAB array with given start and end indexes
    • Join_buffer/Temp_table must be ready before the segment starts to run
• Task level scheduling
  - A task is divided into a set of sub-tasks
  - Sub-tasks are scheduled to run in parallel
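A minimal sketch of the two levels, with hypothetical names throughout: stages run one after another in bottom-up (post-order) order over the pipeline dependencies, and within a stage the per-partition work runs in parallel on a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def stage_order(root, upstream):
    """Post-order walk: every producer stage comes before its consumer."""
    order = []

    def visit(pipeline):
        for producer in upstream.get(pipeline, []):
            visit(producer)
        order.append(pipeline)

    visit(root)
    return order


def run_query(root, upstream, partitions, task_fn, max_workers=4):
    """Run stages bottom up; run the tasks of one stage in parallel."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage in stage_order(root, upstream):
            # pool.map returns results in input order, one per partition.
            results[stage] = list(
                pool.map(lambda part: task_fn(stage, part), partitions[stage]))
    return results


# Hypothetical three-stage chain: P2 feeds P1, which feeds P0.
upstream = {"P0": ["P1"], "P1": ["P2"]}
partitions = {"P0": [0], "P1": [0, 1], "P2": [0, 1, 2]}
out = run_query("P0", upstream, partitions, lambda s, p: f"{s}/task{p}")
```

Independent sibling stages are still run sequentially here, which matches the slides listing parallel scheduling of independent stages as future work.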

Distributed Execution: Examples
• Without Data Exchange
  - Simple query without group by/order by/subquery
  - Partition key is a prefix of the group by list
  - Partition key is a prefix of the order by list
  - Partition-aware Join
[Diagram: Partition-aware Join Push Down, with a Merge step above the tasks of Stage 0]

Distributed Execution: Examples
• With Data Exchange
  - Broadcast
  - T1 has 3 partitions on 3 nodes
  - T2 has 1 partition on 1 node
  - T2 is small
[Diagram: T2 is copied to each of the three nodes]

Distributed Execution: Examples
• With Data Exchange
  - Shuffle
  - T1 has 4 partitions on 2 nodes
  - T2 has 2 partitions on 2 nodes
[Diagram: each partition is split in two for redistribution: T1P1_1, T1P1_2, T1P2_1, T1P2_2, T1P3_1, T1P3_2, T1P4_1, T1P4_2, T2P1_1, T2P1_2, T2P2_1, T2P2_2]
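The two data exchange patterns in these examples can be sketched as follows, assuming integer join keys and a simple modulo as the redistribution function. Both are assumptions for illustration; the slides do not say how X-DB assigns rows to nodes.

```python
def broadcast(rows, num_nodes):
    """Send a full copy of a small table to every node."""
    return [list(rows) for _ in range(num_nodes)]


def shuffle(rows, key_index, num_nodes):
    """Redistribute rows so that equal join keys land on the same node."""
    buckets = [[] for _ in range(num_nodes)]
    for row in rows:
        # Assumption: integer keys, modulo as the partitioning function.
        buckets[row[key_index] % num_nodes].append(row)
    return buckets


t1 = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
t2 = [(2, "x"), (3, "y")]

t2_copies = broadcast(t2, 3)   # small T2 joined locally against each T1 partition
t1_parts = shuffle(t1, 0, 2)   # both sides repartitioned on the join key
t2_parts = shuffle(t2, 0, 2)
```

Broadcast moves the small table once per node and leaves the large table in place, while shuffle moves both tables but never duplicates rows.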

Future Work

Future Work
• Distributed Processing Enhancement
  - Support distributed query processing requiring data redistribution
  - Advanced Scheduling Options
    • Parallel scheduling for independent stages
    • Location/Resource based scheduling
  - Load Balancer
  - Admission Control
• Enhancement to optimizer
  - Data Redistribution selection
  - Automatic Parallelism selection
• Enhancement to executor
  - Hash Join
  - Sort Merge Join
