Introduction Basics Derivations Extractions Types Inheritance Thank you
Advanced Data Modelling in PostgreSQL Chris Travers Adjust GmbH - - PowerPoint PPT Presentation
Advanced Data Modelling in PostgreSQL Chris Travers Adjust GmbH - - PowerPoint PPT Presentation
Introduction Basics Derivations Extractions Types Inheritance Thank you Advanced Data Modelling in PostgreSQL Chris Travers Adjust GmbH May 16, 2020 Introduction Basics Derivations Extractions Types Inheritance Thank you About Me
Introduction Basics Derivations Extractions Types Inheritance Thank you
About Me
- New contributor to PostgreSQL (one bugfix so far)
- Heads the PostgreSQL-related R&D at Adjust GmbH
- Long-time PostgreSQL user (since 1999)
- Been around the community for a long time.
Introduction Basics Derivations Extractions Types Inheritance Thank you
About Adjust
We are big PostgreSQL users. Over 10PB of data, with near-real-time analytics on 1PB of raw data and 400,000 inbound requests per second. We provide mobile advertisement attribution and analytics services to companies who buy advertising.
Introduction Basics Derivations Extractions Types Inheritance Thank you
What We Will Cover
- Derived Data Modelling
- Custom Data Types
- Data Extracted from Semi-Structured Data
- Inheritance and Polymorphic Relationships
- Multiple Inheritance and Overlapping Sets
Introduction Basics Derivations Extractions Types Inheritance Thank you
Quick Refresher of RDBMS Model
- Relation as a set of corresponding facts
- Relation entry modelled as tuple
- Functional dependencies important for inference derivation
Introduction Basics Derivations Extractions Types Inheritance Thank you
RDBMSs as Inferential Systems
- Functional dependencies imply functions
- Inferences are derived via function composition
Introduction Basics Derivations Extractions Types Inheritance Thank you
ORDBMS Goals
Example Query
SELECT * FROM photos p JOIN locations l ON distance(p.pt, l.pt) < ’100km’::distance WHERE is sunset(p.graphics)
- Distance calculation?
- Distance type?
- Detecting sunsets?
Introduction Basics Derivations Extractions Types Inheritance Thank you
What is needed?
- Custom data types to normalize and compare
- Custom calculations
- Pluggable calculations based on arbitrary data
Introduction Basics Derivations Extractions Types Inheritance Thank you
Calculated Inferences
- All inferences are calculated.
- Some inferences require traversing known facts
- Other inferences can be calculated based on a single known
fact or a single relation entry. This section is about the last category.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Examples of tuple-derived Inferences
- tsvectors for searching notes with a subject, author, and
contents
- Area of a polygon
- Bounding box of a polygon
Given x, we get one and exactly one y, and we can calculate it knowing only x.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example: Searchable Notes
Table
create table note ( id serial primary key, subject text not null, author text not null, contents text not null, ... );
What we want
SELECT * from note WHERE plainto tsquery( ’Postgres is great’ ) @@ tsvector(note)
- - or note.tsvector
Introduction Basics Derivations Extractions Types Inheritance Thank you
The Function
Function
CREATE OR REPLACE FUNCTION tesvector(note) RETURNS TSVECTOR LANGUAGE SQL IMMUTABLE AS $$ SELECT to tsvector(array to string( array[$1.subject $1.contents], ’ ’ )); $$;
Introduction Basics Derivations Extractions Types Inheritance Thank you
But there is more!
- CREATE INDEX ON note using gin (tsvector(note));
- We can change the function and reindex
- Calculated on write, not on read
Introduction Basics Derivations Extractions Types Inheritance Thank you
Fun with notation
Quick: Which of these does not work?
- select tsvector(note) from note;
- select subject(note) from note;
- select note.subject, note.tsvector from note;
- select tsvector from note;
Some of this is due to Postgres’s heritage before SQL.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Exercise
Exercise 1 https: //www.github.com/einhverfr/admcourse/exercises/1 (10 min to complete)
Introduction Basics Derivations Extractions Types Inheritance Thank you
Particular Example: Extracting Data from Structured Text
Introduction Basics Derivations Extractions Types Inheritance Thank you
Desired SQL
We want to pull protein documents that are patents and sort based
- n patent date:
Something like this
select * from published doc d WHERE d.protein = 123 AND d.is patent ORDER BY d.patent date asc LIMIT 100
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example Document
Problem is, this information is the published docs text field.
Example
DB patentd REF 123ZZ12433 REGD 19750305 .... Here the important data is stored in a structured text document along with different documents of different structures.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example is patent function
We want to make sure this is only called on these documents so we create a function which takes in a row of the appropriate type:
is patent
create or replace function is patent(published doc) returns bool strict immutable language sql as $$ select $1.document like ’%DB patentd%’ $$;
- Simple. Part 1 solved.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Part 2: date extraction
patent date function
create or replace function patent date(published doc) returns date strict immutable language plperl as $$ use strict; my @lines = split /\n/, $ [1]; my ($regd) = grep /ˆREGD \d{4}\d{2}\d{2}/ @lines; return unless $regd; $regd =˜ /ˆREGD \d{4}\d{2}\d{2}/; return ”$1-$2-$3”; $$; create index doc patent date on published doc(patent date(published doc));
Introduction Basics Derivations Extractions Types Inheritance Thank you
Exercise 2
Exercise 2 https: //www.github.com/einhverfr/admcourse/exercises/2 (10 min to complete)
Introduction Basics Derivations Extractions Types Inheritance Thank you
Approach here
- Composite types are used for illustration purposes
- Most types should be implemented in C for heavier use
- Indexing is not covered.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Use case 1: Interfaces
Use Cases:
- Fixed-structure complex field in table
- N1NF modelling (arrays of tuples in a field)
- Way of passing data into or out of a user defined function
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example for Contact/tagged text
Type definition
create type full customer as ( customer customer, contacts contact item[], addresses address[] ); Each field is a tuple or an array of tuples.
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example function
Save Function
create or replace function full customer save(full customer in c) returns void language sql strict as $$ insert into customer select ((in c).customer).*; insert into contact item select u.* from unnest(contacts(in c)) u; insert into address select u.* from unnest(addresses(in c)) u; $$;
Introduction Basics Derivations Extractions Types Inheritance Thank you
Exercise 3
Exercise 3 https: //www.github.com/einhverfr/admcourse/exercises/3 (10 min to complete)
Introduction Basics Derivations Extractions Types Inheritance Thank you
Use Case 2: Tables as Types
- Convert a table into another type
- Arithmetic on rows
- Aggregates on rows
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example: work shift
Table Structure
CREATE TABLE work shift ( id serial primary key, employee id int references employee(id), shift start timestamp not null, shift end timestamp not null, ... );
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example Derivation Function
CREATE OR REPLACE FUNCTION date(work shift)
RETURNS date language sql immutable as $$ select $1.shift start::date; $$;
CREATE OR REPLACE FUNCTION timestamp(work shift)
RETURNS interval language sql immutable as $$ select $1.shift end - $1.shift start; $$;
CREATE OR REPLACE FUNCTION add intervals(work shift, work shift)
RETURNS interval language sql immutable as $$ select interval($1) + interval($2)$$;
Introduction Basics Derivations Extractions Types Inheritance Thank you
Example Aggregate
CREATE OR REPLACE AGGREGATE total interval(work shift) as (
sfunc = add intervals, stype = interval, initialcond = ’0 days’ );
Introduction Basics Derivations Extractions Types Inheritance Thank you
Exercise 4
Exercise 4 https: //www.github.com/einhverfr/admcourse/exercises/4 (10 min to complete)
Introduction Basics Derivations Extractions Types Inheritance Thank you
The Polymorphic Relationship Problem
Introduction Basics Derivations Extractions Types Inheritance Thank you
Common example: Notes
- notes in an accounting system
- invoices, customers, vendors, journal entries, etc
- Each of these have their own primary key spaces
Introduction Basics Derivations Extractions Types Inheritance Thank you
What Inheritance Gives You
- a collection for all of the collection entries
- referential integrity enforcement
- ability to differentiate on polymorphic relationship
- Consistent point of management
- casts for common functions like tsvector conversions etc
Introduction Basics Derivations Extractions Types Inheritance Thank you
Gotchas
- Multiple inheritance causes schema management challenges
- Casting can happen at odd points leading to index strangeness
Introduction Basics Derivations Extractions Types Inheritance Thank you
Exercise 5
Exercise 5 https: //www.github.com/einhverfr/admcourse/exercises/5 (10 min to complete)
Introduction Basics Derivations Extractions Types Inheritance Thank you