Embedded Database Logic Lecture # 15 Database Systems Andy Pavlo - - PowerPoint PPT Presentation

embedded database logic
SMART_READER_LITE
LIVE PREVIEW

Embedded Database Logic Lecture # 15 Database Systems Andy Pavlo - - PowerPoint PPT Presentation

Embedded Database Logic Lecture # 15 Database Systems Andy Pavlo AP AP Computer Science 15-445/15-645 Carnegie Mellon Univ. Fall 2018 2 ADM IN ISTRIVIA Project #3 is due Monday October 19 th Project #4 is due Monday December 10 th


slide-1
SLIDE 1

Database Systems 15-445/15-645 Fall 2018 Andy Pavlo Computer Science Carnegie Mellon Univ.

AP AP

Lecture # 15

Embedded Database Logic

slide-2
SLIDE 2

CMU 15-445/645 (Fall 2018)

ADM IN ISTRIVIA

Project #3 is due Monday October 19th Project #4 is due Monday December 10th Homework #4 is due Monday November 12th

2

slide-3
SLIDE 3

CMU 15-445/645 (Fall 2018)

UPCO M IN G DATABASE EVEN TS

BlazingDB Tech Talk

→ Thursday October 25th @ 12pm → CIC - 4th floor (ISTC Panther Hollow Room)

Brytlyt Tech Talk

→ Thursday November 1st @ 12pm → CIC - 4th floor (ISTC Panther Hollow Room)

3

slide-4
SLIDE 4

CMU 15-445/645 (Fall 2018)

O BSERVATIO N

Until now, we have assumed that all of the logic for an application is located in the application itself. The application has a "conversation" with the DBMS to store/retrieve data.

→ Protocols: JDBC, ODBC

4

slide-5
SLIDE 5

CMU 15-445/645 (Fall 2018)

CO N VERSATIO N AL DATABASE API

5

Application

BEGIN SQL Program Logic SQL Program Logic ⋮ COMMIT

Parser Planner Optimizer Query Execution

slide-6
SLIDE 6

CMU 15-445/645 (Fall 2018)

CO N VERSATIO N AL DATABASE API

5

Application

BEGIN SQL Program Logic SQL Program Logic ⋮ COMMIT

Parser Planner Optimizer Query Execution

slide-7
SLIDE 7

CMU 15-445/645 (Fall 2018)

CO N VERSATIO N AL DATABASE API

5

Application

BEGIN SQL Program Logic SQL Program Logic ⋮ COMMIT

Parser Planner Optimizer Query Execution

slide-8
SLIDE 8

CMU 15-445/645 (Fall 2018)

CO N VERSATIO N AL DATABASE API

5

Application

BEGIN SQL Program Logic SQL Program Logic ⋮ COMMIT

Parser Planner Optimizer Query Execution

slide-9
SLIDE 9

CMU 15-445/645 (Fall 2018)

CO N VERSATIO N AL DATABASE API

5

Application

BEGIN SQL Program Logic SQL Program Logic ⋮ COMMIT

Parser Planner Optimizer Query Execution

slide-10
SLIDE 10

CMU 15-445/645 (Fall 2018)

EM BEDDED DATABASE LO GIC

Move application logic into the DBMS to avoid multiple network round-trips. Potential Benefits

→ Efficiency → Reuse

6

slide-11
SLIDE 11

CMU 15-445/645 (Fall 2018)

TO DAY'S AGEN DA

User-defined Functions Stored Procedures Triggers Change Notifications User-defined Types Views

7

slide-12
SLIDE 12

CMU 15-445/645 (Fall 2018)

USER- DEFIN ED FUN CTIO N S

A user-defined function (UDF) is a function written by the application developer that extends the system's functionality beyond its built-in

  • perations.

→ It takes in input arguments (scalars) → Perform some computation → Return a result (scalars, tables)

8

slide-13
SLIDE 13

CMU 15-445/645 (Fall 2018)

UDF DEFIN ITIO N

Return Types:

→ Scalar Functions: Return a single data value → Table Functions: Return a single result table.

Computation Definition:

→ SQL Functions → External Programming Language

9

slide-14
SLIDE 14

CMU 15-445/645 (Fall 2018)

UDF SQ L FUN CTIO N S

A SQL-based UDF contains a list of SQL statements that the DBMS executes in

  • rder when the UDF is invoked.

→ The function returns whatever the result is of the last query executed;

10

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

CREATE FUNCTION get_foo(int) RETURNS foo AS $$ SELECT * FROM foo WHERE foo.id = $1; $$ LANGUAGE SQL;

slide-15
SLIDE 15

CMU 15-445/645 (Fall 2018)

UDF SQ L FUN CTIO N S

A SQL-based UDF contains a list of SQL statements that the DBMS executes in

  • rder when the UDF is invoked.

→ The function returns whatever the result is of the last query executed;

10

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

CREATE FUNCTION get_foo(int) RETURNS foo AS $$ SELECT * FROM foo WHERE foo.id = $1; $$ LANGUAGE SQL;

Input Args

slide-16
SLIDE 16

CMU 15-445/645 (Fall 2018)

UDF SQ L FUN CTIO N S

A SQL-based UDF contains a list of SQL statements that the DBMS executes in

  • rder when the UDF is invoked.

→ The function returns whatever the result is of the last query executed;

10

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

CREATE FUNCTION get_foo(int) RETURNS foo AS $$ SELECT * FROM foo WHERE foo.id = $1; $$ LANGUAGE SQL;

Return Args

slide-17
SLIDE 17

CMU 15-445/645 (Fall 2018)

UDF SQ L FUN CTIO N S

A SQL-based UDF contains a list of SQL statements that the DBMS executes in

  • rder when the UDF is invoked.

→ The function returns whatever the result is of the last query executed;

10

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

CREATE FUNCTION get_foo(int) RETURNS foo AS $$ SELECT * FROM foo WHERE foo.id = $1; $$ LANGUAGE SQL;

Function Body

slide-18
SLIDE 18

CMU 15-445/645 (Fall 2018)

UDF EXTERN AL PRO GRAM M IN G LAN GUAGE

Some DBMSs support writing UDFs in languages

  • ther than SQL.

→ SQL Standard: SQL/PSM → Oracle/DB2: PL/SQL → Postgres: PL/pgSQL → MSSQL/Sybase: Transact-SQL

Other systems support more common programming languages:

→ Sandbox vs. non-Sandbox

11

slide-19
SLIDE 19

CMU 15-445/645 (Fall 2018)

PL/ PGSQ L EXAM PLE

12

CREATE OR REPLACE FUNCTION get_foo(int) RETURNS SETOF foo AS $$ BEGIN RETURN QUERY SELECT * FROM foo WHERE foo.id = $1; END; $$ LANGUAGE plpgsql;

slide-20
SLIDE 20

CMU 15-445/645 (Fall 2018)

PL/ PGSQ L EXAM PLE (2)

13

CREATE OR REPLACE FUNCTION sum_foo(i int) RETURNS int AS $$ DECLARE foo_rec RECORD; DECLARE out INT; BEGIN

  • ut := 0;

FOR foo_rec IN SELECT id FROM foo WHERE id > i LOOP

  • ut := out + foo_rec.id;

END LOOP; RETURN out; END; $$ LANGUAGE plpgsql;

slide-21
SLIDE 21

CMU 15-445/645 (Fall 2018)

STO RED PRO CEDURES

A stored procedure is a self-contained function that performs more complex logic inside of the DBMS.

→ Can have many input/output parameters. → Can modify the database table/structures. → Not normally used within a SQL query.

Some DBMSs distinguish UDFs vs. stored procedures, but not all.

14

slide-22
SLIDE 22

CMU 15-445/645 (Fall 2018)

STO RED PRO CEDURES

15

Application

BEGIN SQL Program Logic SQL Program Logic ⋮ COMMIT

slide-23
SLIDE 23

CMU 15-445/645 (Fall 2018)

STO RED PRO CEDURES

15

Application

CALL PROC(x=99)

PROC(x)

slide-24
SLIDE 24

CMU 15-445/645 (Fall 2018)

STO RED PRO CEDURES

15

Application

CALL PROC(x=99)

PROC(x)

slide-25
SLIDE 25

CMU 15-445/645 (Fall 2018)

STO RED PRO CEDURE VS. UDF

A UDF is meant to perform a subset of a read-only computation within a query. A stored procedure is meant to perform a complete computation that is independent of a query.

16

slide-26
SLIDE 26

CMU 15-445/645 (Fall 2018)

DATABASE TRIGGERS

A trigger instructs the DBMS to invoke a UDF when some event occurs in the database. The developer has to define:

→ What type of event will cause it to fire. → The scope of the event. → When it fires relative to that event.

17

slide-27
SLIDE 27

CMU 15-445/645 (Fall 2018)

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

TRIGGER EXAM PLE

18

CREATE TABLE foo_audit ( id SERIAL PRIMARY KEY, foo_id INT REFERENCES foo (id),

  • rig_val VARCHAR,

cdate TIMESTAMP );

slide-28
SLIDE 28

CMU 15-445/645 (Fall 2018)

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

TRIGGER EXAM PLE

18

CREATE TABLE foo_audit ( id SERIAL PRIMARY KEY, foo_id INT REFERENCES foo (id),

  • rig_val VARCHAR,

cdate TIMESTAMP ); CREATE OR REPLACE FUNCTION log_foo_updates() RETURNS trigger AS $$ BEGIN IF NEW.val <> OLD.val THEN INSERT INTO foo_audit (foo_id, orig_val, cdate) VALUES (OLD.id, OLD.val, NOW()); END IF; RETURN NEW; END; $$ LANGUAGE plpgsql;

Tuple Versions

slide-29
SLIDE 29

CMU 15-445/645 (Fall 2018)

CREATE TABLE foo ( id INT PRIMARY KEY, val VARCHAR(16) );

TRIGGER EXAM PLE

18

CREATE TABLE foo_audit ( id SERIAL PRIMARY KEY, foo_id INT REFERENCES foo (id),

  • rig_val VARCHAR,

cdate TIMESTAMP ); CREATE OR REPLACE FUNCTION log_foo_updates() RETURNS trigger AS $$ BEGIN IF NEW.val <> OLD.val THEN INSERT INTO foo_audit (foo_id, orig_val, cdate) VALUES (OLD.id, OLD.val, NOW()); END IF; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER foo_updates BEFORE UPDATE ON foo FOR EACH ROW EXECUTE PROCEDURE log_foo_updates();

slide-30
SLIDE 30

CMU 15-445/645 (Fall 2018)

TRIGGER DEFIN ITIO N

Event Type:

→ INSERT → UPDATE → DELETE → TRUNCATE → CREATE → ALTER → DROP

19

Trigger Timing:

→ Before the statement executes. → After the statement executes → Before each row that the statement affects. → After each row that the statement affects. → Instead of the statement.

Event Scope:

→ TABLE → DATABASE → VIEW → SYSTEM

slide-31
SLIDE 31

CMU 15-445/645 (Fall 2018)

CH AN GE N OTIFICATIO N S

A change notification is like a trigger except that the DBMS sends a message to an external entity that something notable has happened in the database.

→ Think a "pub/sub" system. → Can be chained with a trigger to pass along whenever a change occurs.

SQL standard: LISTEN + NOTIFY

20

slide-32
SLIDE 32

CMU 15-445/645 (Fall 2018)

N OTIFICATIO N EXAM PLE

21

CREATE OR REPLACE FUNCTION notify_foo_updates() RETURNS trigger AS $$ DECLARE notification JSON; BEGIN notification = row_to_json(NEW); PERFORM pg_notify('foo_update', notification::text); RETURN NEW; END; $$ LANGUAGE plpgsql;

Notification Payload

slide-33
SLIDE 33

CMU 15-445/645 (Fall 2018)

N OTIFICATIO N EXAM PLE

21

CREATE OR REPLACE FUNCTION notify_foo_updates() RETURNS trigger AS $$ DECLARE notification JSON; BEGIN notification = row_to_json(NEW); PERFORM pg_notify('foo_update', notification::text); RETURN NEW; END; $$ LANGUAGE plpgsql;

Notification Payload

CREATE TRIGGER foo_notify AFTER INSERT ON foo_audit FOR EACH ROW EXECUTE PROCEDURE notify_foo_updates();

slide-34
SLIDE 34

CMU 15-445/645 (Fall 2018)

O BSERVATIO N

All DBMSs support the basic primitive types in the SQL standard. They also support basic arithmetic and string manipulation on them. But what if we want to store data that doesn't match any of the built-in types?

22

coordinate (x, y, label)

slide-35
SLIDE 35

CMU 15-445/645 (Fall 2018)

CO M PLEX TYPES

Approach #1: Attribute Splitting

→ Store each primitive element in the complex type as its own attribute in the table.

Approach #2: Application Serialization

→ Java serialize, Python pickle → Google Protobuf, Facebook Thrift → JSON / XML

23

INSERT INTO location (coord) VALUES ( '{x:10, y:20, label:"OTB"}' ); INSERT INTO locations (x, y, label) VALUES (10, 20, "OTB"); CREATE TABLE locations ( coord JSONB NOT NULL );

slide-36
SLIDE 36

CMU 15-445/645 (Fall 2018)

USER- DEFIN ED TYPES

A user-defined type is a special data type that is defined by the application developer that the DBMS can stored natively.

→ First introduced by Postgres in the 1980s. → Added to the SQL:1999 standard as part of the "object- relational database" extensions.

Sometimes called structured user-defined types

  • r structured types.

24

slide-37
SLIDE 37

CMU 15-445/645 (Fall 2018)

USER- DEFIN ED TYPES

Each DBMS exposes a different API that allows you to create a UDT.

→ Oracle supports PL/SQL. → DB2 supports creating types based on built-in types. → MSSQL/Postgres only support type definition using external languages (.NET, C)

25

CREATE TYPE coordinates AS OBJECT ( x INT NOT NULL, y INT NOT NULL, label VARCHAR(32) NOT NULL );

slide-38
SLIDE 38

CMU 15-445/645 (Fall 2018)

VIEWS

Creates a "virtual" table containing the output from a SELECT query. The view can then be accessed as if it was a real table. This allows programmers to simplify a complex query that is executed often.

→ Won’t make it faster though.

Often used as a mechanism for hiding a subset of a table's attributes from certain users.

26

slide-39
SLIDE 39

CMU 15-445/645 (Fall 2018)

VIEW EXAM PLE (1)

Create a view of the CS student records with just their id, name, and login.

27

Original Table

CREATE VIEW cs_students AS SELECT sid, name, login FROM student WHERE login LIKE '%@cs';

sid name login age gpa 53666 Kanye West kw@cs 40 3.5 53677 Justin Bieber jb@ece 23 2.25 53688 Tone Loc tloc@isr 51 3.8 53699 Andy Pavlo pavlo@cs 36 3.0

slide-40
SLIDE 40

CMU 15-445/645 (Fall 2018)

VIEW EXAM PLE (1)

Create a view of the CS student records with just their id, name, and login.

27

Original Table

CREATE VIEW cs_students AS SELECT sid, name, login FROM student WHERE login LIKE '%@cs';

sid name login age gpa 53666 Kanye West kw@cs 40 3.5 53677 Justin Bieber jb@ece 23 2.25 53688 Tone Loc tloc@isr 51 3.8 53699 Andy Pavlo pavlo@cs 36 3.0 sid name login 53666 Kanye West kw@cs 53699 Andy Pavlo pavlo@cs

SELECT * FROM cs_students;

slide-41
SLIDE 41

CMU 15-445/645 (Fall 2018)

VIEW EXAM PLE (1)

Create a view of the CS student records with just their id, name, and login.

27

Original Table

CREATE VIEW cs_students AS SELECT sid, name, login FROM student WHERE login LIKE '%@cs';

sid name login age gpa 53666 Kanye West kw@cs 40 3.5 53677 Justin Bieber jb@ece 23 2.25 53688 Tone Loc tloc@isr 51 3.8 53699 Andy Pavlo pavlo@cs 36 3.0 sid name login 53666 Kanye West kw@cs 53699 Andy Pavlo pavlo@cs

SELECT * FROM cs_students;

slide-42
SLIDE 42

CMU 15-445/645 (Fall 2018)

VIEW EXAM PLE (2)

Create a view with the average age of all of the students.

28

CREATE VIEW cs_gpa AS SELECT AVG(gpa) AS avg_gpa FROM student WHERE login LIKE '%@cs';

slide-43
SLIDE 43

CMU 15-445/645 (Fall 2018)

VIEWS VS. SELECT IN TO

VIEW

→ Dynamic results are only materialized when needed.

SELECT…INTO

→ Creates static table that does not get updated when student gets updated.

29

CREATE VIEW cs_gpa AS SELECT AVG(gpa) AS avg_gpa FROM student WHERE login LIKE '%@cs'; SELECT AVG(gpa) AS avg_gpa INTO cs_gpa FROM student WHERE login LIKE '%@cs';

slide-44
SLIDE 44

CMU 15-445/645 (Fall 2018)

VIEWS VS. SELECT IN TO

VIEW

→ Dynamic results are only materialized when needed.

SELECT…INTO

→ Creates static table that does not get updated when student gets updated.

29

CREATE VIEW cs_gpa AS SELECT AVG(gpa) AS avg_gpa FROM student WHERE login LIKE '%@cs'; SELECT AVG(gpa) AS avg_gpa INTO cs_gpa FROM student WHERE login LIKE '%@cs';

slide-45
SLIDE 45

CMU 15-445/645 (Fall 2018)

UPDATIN G VIEWS

The SQL-92 standard specifies that an application is allowed to modify a VIEW if it has the following properties:

→ It only contains one base table. → It does not contain grouping, distinction, union, or aggregation.

30

slide-46
SLIDE 46

CMU 15-445/645 (Fall 2018)

M ATERIALIZED VIEWS

Creates a view containing the output from a SELECT query that is automatically updated when the underlying tables change.

31

CREATE MATERIALIZED VIEW cs_gpa AS SELECT AVG(gpa) AS avg_gpa FROM student WHERE login LIKE '%@cs';

slide-47
SLIDE 47

CMU 15-445/645 (Fall 2018)

CO N CLUSIO N

Moving application logic into the DBMS has lots

  • f benefits.

→ Better Efficiency → Reusable across applications

But it has problems:

→ Not portable → DBAs don't like constant change. → Potentially need to maintain different versions.

32

slide-48
SLIDE 48

CMU 15-445/645 (Fall 2018)

N EXT CLASS

TRANSACTIONS!!!

33