

slide-1
SLIDE 1

DATABASE SYSTEM IMPLEMENTATION

GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #4: SYSTEM CATALOGS & DATABASE COMPRESSION

slide-2
SLIDE 2

OFFICE HOURS

Prashanth Dintyala

→ Office Hours: Mon, 1:30-2:30 PM → Location: Near KACB 3324 → Email: vdintyala3@gatech.edu

Sonia Matthew

→ Office Hours: Wed, 11-12 AM → Location: Near KACB 3324 → Email: smathew60@gatech.edu

2

slide-3
SLIDE 3

HOMEWORK #1

3

We have posted clarifications on Piazza

→ More clarifications will be provided over time

Separately submit the design document on Gradescope

→ Homework 1 - Design Doc

slide-4
SLIDE 4

TODAY’S AGENDA

System Catalogs
Compression Background
Naïve Compression
OLAP Columnar Compression
OLTP Index Compression

4

slide-5
SLIDE 5

SYSTEM CATALOGS

A group of tables and views that describe the structure of the database.

Each system catalog table contains information about specific elements in the database.

→ Statistics related to queries → Statistics related to data distribution → List of indexes → List of user tables

5

slide-6
SLIDE 6

SYSTEM CATALOGS

Almost every DBMS stores a database's catalog inside of itself (i.e., using its own storage manager).

→ Wrap object abstraction around tuples to avoid invoking SQL queries to retrieve catalog data → Specialized code for "bootstrapping" catalog tables.

The entire DBMS should be aware of transactions in order to automatically provide ACID guarantees for DDL (i.e., schema change) commands and concurrent txns.

6

slide-7
SLIDE 7

SYSTEM CATALOGS

MySQL

→ Special hard-coded scripts to alter the catalog → Non-transactional catalog changes

PostgreSQL

→ Uses SQL commands to alter the catalog

7

slide-8
SLIDE 8

SCHEMA CHANGES

ADD COLUMN:

→ NSM: Copy tuples into new region in memory. → DSM: Just create the new column segment

DROP COLUMN:

→ NSM #1: Copy tuples into new region of memory. → NSM #2: Mark column as "deprecated", clean up later. → DSM: Just drop the column and free memory.

CHANGE COLUMN:

→ Check whether the conversion is allowed to happen. Depends on default values.

8

slide-9
SLIDE 9

SCHEMA CHANGES

Relevant comment on HackerNews:

→ Schema changes fail to complete on databases > 2 TB → Operation requires double the amount of disk storage for copying. Takes close to a month to perform such an operation on large tables

→ The big reason that DDL is slow is because these systems haven't tried to make it fast. So, blame the DB designers! → Wish it was on my day job list of things I can work on.

9

slide-10
SLIDE 10

INDEXES

CREATE INDEX:

→ Scan the entire table and populate the index. → Must not block all other txns during index construction. → Have to record changes made by txns that modified the table while another txn was building the index. → When the scan completes, lock the table and resolve changes that were missed after the scan started.

DROP INDEX:

→ Just drop the index logically from the catalog. → It only becomes "invisible" when the txn that dropped it commits. All existing txns will still have to update it.

10

slide-11
SLIDE 11

OBSERVATION

I/O is the main bottleneck if the DBMS has to fetch data from disk.

→ CPU cost for decompressing data < I/O cost for fetching un-compressed data. Compression always helps.

In-memory DBMSs are more complicated

→ Compressing the database reduces DRAM requirements and processing.

Key trade-off is speed vs. compression ratio

→ In-memory DBMSs (always?) choose speed.

11

slide-12
SLIDE 12

REAL-WORLD DATA CHARACTERISTICS

Data sets tend to have highly skewed distributions for attribute values.

→ Example: Zipfian distribution of the Brown Corpus → Words like “the”, “a” occur very frequently in books

Data sets tend to have high correlation between attributes of the same tuple.

→ Example: Order Date to Ship Date (few days) → (June 5, +5) instead of (June 5, June 10)

12

slide-13
SLIDE 13

DATABASE COMPRESSION

Goal #1: Must produce fixed-length values. Allows us to be efficient while accessing tuples.

Goal #2: Allow the DBMS to postpone decompression as long as possible during query execution. Operate directly on compressed data.

Goal #3: Must be a lossless scheme. No data should be lost during this transformation.

13

slide-14
SLIDE 14

LOSSLESS VS. LOSSY COMPRESSION

When a DBMS uses compression, it is always lossless because people don’t like losing data. Any kind of lossy compression has to be performed at the application level.

→ Example: Sensor data. Readings are taken every second, but we may only store average across every minute.

Some new DBMSs support approximate queries

→ Example: BlinkDB, SnappyData, XDB, Oracle (2017)

14

slide-15
SLIDE 15

ZONE MAPS

Pre-computed aggregates for blocks of data. DBMS can check the zone map first to decide whether it wants to access the block.

15

Original Data

val: 100 200 300 400 400

slide-16
SLIDE 16

ZONE MAPS

Pre-computed aggregates for blocks of data. DBMS can check the zone map first to decide whether it wants to access the block.

16

Zone Map

type: MIN  MAX  AVG  SUM   COUNT
val:  100  400  280  1400  5

Original Data

val: 100 200 300 400 400

slide-17
SLIDE 17

ZONE MAPS

Pre-computed aggregates for blocks of data. DBMS can check the zone map first to decide whether it wants to access the block.

17

Zone Map

type: MIN  MAX  AVG  SUM   COUNT
val:  100  400  280  1400  5

Original Data

val: 100 200 300 400 400

SELECT * FROM table WHERE val > 600
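The check itself is simple. Here is a minimal Python sketch (illustrative only, with hypothetical helper names): build a zone map per block, then consult it before scanning, which lets the query above skip the block entirely.

from dataclasses import dataclass

@dataclass
class ZoneMap:
    min: int
    max: int
    avg: float
    sum: int
    count: int

def build_zone_map(block):
    """Pre-compute aggregates for one block of values."""
    return ZoneMap(min(block), max(block), sum(block) / len(block), sum(block), len(block))

def scan_gt(blocks, zone_maps, threshold):
    """SELECT val FROM ... WHERE val > threshold, skipping blocks via zone maps."""
    for block, zm in zip(blocks, zone_maps):
        if zm.max <= threshold:      # no value in this block can qualify
            continue                 # skip the block without reading it
        yield from (v for v in block if v > threshold)

block = [100, 200, 300, 400, 400]
zm = build_zone_map(block)
print(list(scan_gt([block], [zm], 600)))   # [] -- the block is skipped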

slide-18
SLIDE 18

COMPRESSION GRANULARITY

Choice #1: Block-level

→ Compress a block of tuples for the same table.

Choice #2: Tuple-level

→ Compress the contents of the entire tuple (NSM-only).

Choice #3: Attribute-level

→ Compress a single attribute value within one tuple. → Can target multiple attributes for the same tuple.

Choice #4: Column-level

→ Compress multiple values for one or more attributes stored for multiple tuples (DSM-only).

18

slide-19
SLIDE 19

NAÏVE COMPRESSION

Compress data using a general purpose algorithm. Scope of compression is only based on the data provided as input.

→ LZO (1996), LZ4 (2011), Snappy (2011), Zstd (2015)

Considerations

→ Computational overhead (gzip is super slow) → Compress vs. decompress speed.

19
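As a rough illustration of the speed vs. compression-ratio trade-off, here is a minimal Python sketch using the standard-library zlib (a DEFLATE-based stand-in for the general-purpose compressors named above) at two compression levels; the data and numbers are purely illustrative.

import time
import zlib

# A block of tuples serialized as bytes (toy example).
data = b"GT 4420/6422 database system implementation " * 10000

for level in (1, 9):                       # 1 = fastest, 9 = best ratio
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level={level} ratio={len(data) / len(compressed):.1f}x time={elapsed * 1000:.2f} ms")

assert zlib.decompress(zlib.compress(data)) == data   # lossless round trip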

slide-20
SLIDE 20

NAÏVE COMPRESSION

Choice #1: Entropy Encoding

→ More common sequences use fewer bits to encode, less common sequences use more bits to encode.

Choice #2: Dictionary Encoding

→ Build a lookup table that maps logical identifiers to data chunks. Example: 1 → “the” → Replace those values in the original data with logical identifiers which can later be uncompressed using the lookup table. Example: <1, 5, 3> → “the lookup table”

20

slide-21
SLIDE 21

MYSQL INNODB COMPRESSION

21

Source: MySQL 5.7 Documentation

[Diagram: Buffer Pool and Disk Pages; compressed page0, page1, and page2, each with a modification log]

slide-22
SLIDE 22

MYSQL INNODB COMPRESSION

22

Source: MySQL 5.7 Documentation

[Diagram: Buffer Pool and Disk Pages; compressed page0, page1, and page2 ([1,2,4,8] KB each), each with a modification log]

slide-23
SLIDE 23

MYSQL INNODB COMPRESSION

23

Source: MySQL 5.7 Documentation

[Diagram: compressed page0 and its modification log are brought into the Buffer Pool; compressed page0, page1, and page2 ([1,2,4,8] KB each) remain as Disk Pages]

slide-24
SLIDE 24

MYSQL INNODB COMPRESSION

24

Source: MySQL 5.7 Documentation

[Diagram: updates are appended to the modification log of the compressed page0 in the Buffer Pool; compressed pages ([1,2,4,8] KB each) remain on disk]

slide-25
SLIDE 25

MYSQL INNODB COMPRESSION

25

Source: MySQL 5.7 Documentation

[Diagram: an uncompressed copy of page0 is materialized in the Buffer Pool next to the compressed page0 and its modification log, and updates are applied through it; compressed pages ([1,2,4,8] KB each) remain on disk]

slide-26
SLIDE 26

MYSQL INNODB COMPRESSION

26

Source: MySQL 5.7 Documentation

[Diagram: the uncompressed page0 in the Buffer Pool is 16 KB, while the compressed pages are [1,2,4,8] KB; updates go through the uncompressed copy and the modification log of the compressed page]
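The mechanism behind these diagrams can be sketched as a toy Python model (purely illustrative; it assumes nothing about InnoDB's actual on-disk format): small updates are appended to a per-page modification log so the page does not have to be decompressed and recompressed on every write, and the page is only rebuilt when the log fills up.

import zlib

class CompressedPage:
    """Toy model of a compressed page with an appended modification log."""

    def __init__(self, rows, log_capacity=4):
        self.blob = zlib.compress("\n".join(rows).encode())
        self.mod_log = []                 # uncompressed recent changes
        self.log_capacity = log_capacity

    def update(self, row):
        self.mod_log.append(row)          # cheap: no recompression needed
        if len(self.mod_log) >= self.log_capacity:
            self._rebuild()               # log full: recompress the whole page

    def _rebuild(self):
        rows = zlib.decompress(self.blob).decode().split("\n") + self.mod_log
        self.blob = zlib.compress("\n".join(rows).encode())
        self.mod_log = []

    def read(self):
        """Reads see the compressed rows plus anything still in the log."""
        return zlib.decompress(self.blob).decode().split("\n") + self.mod_log

page = CompressedPage(["row0", "row1"])
page.update("row2")
print(page.read())   # ['row0', 'row1', 'row2']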

slide-27
SLIDE 27

NAÏVE COMPRESSION

The data has to be decompressed first before it can be read and (potentially) modified.

→ This limits the “scope” of the compression scheme.

These schemes also do not consider the high-level meaning or semantics of the data.

→ Example: Relationship between order and shipping dates

27

slide-28
SLIDE 28

OBSERVATION

We can perform exact-match comparisons and natural joins on compressed data if predicates and data are compressed the same way.

→ Range predicates are more tricky…

28

slide-29
SLIDE 29

OBSERVATION

We can perform exact-match comparisons and natural joins on compressed data if predicates and data are compressed the same way.

→ Range predicates are more tricky…

29

SELECT * FROM users WHERE name = 'Andy'

NAME      | SALARY
Andy      | 99999
Prashanth | 88888

slide-30
SLIDE 30

OBSERVATION

We can perform exact-match comparisons and natural joins on compressed data if predicates and data are compressed the same way.

→ Range predicates are more tricky…

30

SELECT * FROM users WHERE name = 'Andy'

Original Data:
NAME      | SALARY
Andy      | 99999
Prashanth | 88888

Compressed Data:
NAME | SALARY
XX   | AA
YY   | BB

slide-31
SLIDE 31

OBSERVATION

We can perform exact-match comparisons and natural joins on compressed data if predicates and data are compressed the same way.

→ Range predicates are more tricky…

31

SELECT * FROM users WHERE name = 'Andy'
SELECT * FROM users WHERE name = XX

Original Data:
NAME      | SALARY
Andy      | 99999
Prashanth | 88888

Compressed Data:
NAME | SALARY
XX   | AA
YY   | BB

slide-32
SLIDE 32

COLUMNAR COMPRESSION

Compression Schemes

→ Run-length Encoding → Bitmap Encoding → Delta Encoding → Incremental Encoding → Mostly Encoding → Dictionary Encoding

32

slide-33
SLIDE 33

RUN-LENGTH ENCODING

Compress runs of the same value in a single column into triplets:

→ The value of the attribute. → The start position in the column segment (offset). → The # of elements in the run (length). → Example: [‘Atlanta’, ‘Atlanta’, ‘Atlanta’] → [‘Atlanta’, 3]

Requires the columns to be sorted intelligently to maximize compression opportunities.

33

DATABASE COMPRESSION SIGMOD RECORD 1993
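A minimal Python sketch of the idea (illustrative only), using the (value, offset, length) triplet layout and the sex column from the following slides:

def rle_encode(column):
    """Compress a column into (value, start offset, run length) triplets."""
    triplets = []
    for offset, value in enumerate(column):
        if triplets and triplets[-1][0] == value and \
           triplets[-1][1] + triplets[-1][2] == offset:
            v, start, length = triplets[-1]
            triplets[-1] = (v, start, length + 1)   # extend the current run
        else:
            triplets.append((value, offset, 1))     # start a new run
    return triplets

def rle_decode(triplets):
    column = []
    for value, _offset, length in triplets:
        column.extend([value] * length)
    return column

sex = ["M", "M", "M", "F", "M", "F", "M", "M"]
print(rle_encode(sex))
# [('M', 0, 3), ('F', 3, 1), ('M', 4, 1), ('F', 5, 1), ('M', 6, 2)]
assert rle_decode(rle_encode(sex)) == sex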

slide-34
SLIDE 34

RUN-LENGTH ENCODING

34

Original Data

id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

slide-35
SLIDE 35

RUN-LENGTH ENCODING

35

Original Data

id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

slide-36
SLIDE 36

RUN-LENGTH ENCODING

36

Compressed Data

id:  1 2 3 4 6 7 8 9
sex: (M,0,3) (F,3,1) (M,4,1) (F,5,1) (M,6,2)

Original Data

id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

RLE Triplet

  • Value
  • Offset
  • Length
slide-37
SLIDE 37

RUN-LENGTH ENCODING

37

Compressed Data

id:  1 2 3 4 6 7 8 9
sex: (M,0,3) (F,3,1) (M,4,1) (F,5,1) (M,6,2)

Original Data

id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

RLE Triplet

  • Value
  • Offset
  • Length
slide-38
SLIDE 38

RUN-LENGTH ENCODING

38

Sorted Data

id:  1 2 3 6 8 9 4 7
sex: M M M M M M F F

RLE Triplet

  • Value
  • Offset
  • Length
slide-39
SLIDE 39

RUN-LENGTH ENCODING

39

Sorted Data

id:  1 2 3 6 8 9 4 7
sex: M M M M M M F F

Compressed Data

id:  1 2 3 6 8 9 4 7
sex: (M,0,6) (F,7,2)

RLE Triplet

  • Value
  • Offset
  • Length
slide-40
SLIDE 40

BITMAP ENCODING

Store a separate Bitmap for each unique value for a particular attribute where an offset in the vector corresponds to a tuple.

→ The ith position in the Bitmap corresponds to the ith tuple in the table. → Typically segmented into chunks to avoid allocating large blocks of contiguous memory.

Only practical if the value cardinality is low.

40

MODEL 204 ARCHITECTURE AND PERFORMANCE High Performance Transaction Systems 1987
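A minimal Python sketch of the idea (illustrative only): one bitmap per distinct value, where the i-th bit is set when the i-th tuple has that value.

from collections import defaultdict

def bitmap_encode(column):
    """Build one bitmap (list of 0/1) per distinct value in the column."""
    bitmaps = defaultdict(lambda: [0] * len(column))
    for i, value in enumerate(column):
        bitmaps[value][i] = 1
    return dict(bitmaps)

def select_eq(bitmaps, value):
    """Return the positions of tuples whose attribute equals `value`."""
    return [i for i, bit in enumerate(bitmaps.get(value, [])) if bit]

sex = ["M", "M", "M", "F", "M", "F", "M", "M"]
bitmaps = bitmap_encode(sex)
print(bitmaps["M"])            # [1, 1, 1, 0, 1, 0, 1, 1]
print(select_eq(bitmaps, "F")) # [3, 5]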

slide-41
SLIDE 41

BITMAP ENCODING

41

Original Data

id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

slide-42
SLIDE 42

BITMAP ENCODING

42

Original Data

id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

slide-43
SLIDE 43

BITMAP ENCODING

43

Original Data:
id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

Compressed Data (one bitmap per distinct value of sex):
id:  1 2 3 4 6 7 8 9
M:   1 1 1 0 1 0 1 1
F:   0 0 0 1 0 1 0 0

slide-44
SLIDE 44

BITMAP ENCODING

44

Original Data:
id:  1 2 3 4 6 7 8 9
sex: M M M F M F M M

Compressed Data (one bitmap per distinct value of sex):
id:  1 2 3 4 6 7 8 9
M:   1 1 1 0 1 0 1 1
F:   0 0 0 1 0 1 0 0

slide-45
SLIDE 45

BITMAP ENCODING: EXAMPLE

45

CREATE TABLE customer_dim ( id INT PRIMARY KEY, name VARCHAR(32), email VARCHAR(64), address VARCHAR(64), zip_code INT );

slide-46
SLIDE 46

BITMAP ENCODING: EXAMPLE

46

CREATE TABLE customer_dim ( id INT PRIMARY KEY, name VARCHAR(32), email VARCHAR(64), address VARCHAR(64), zip_code INT );

slide-47
SLIDE 47

BITMAP ENCODING: EXAMPLE

Assume we have 10 million tuples. 43,000 zip codes in the US.

→ 10000000 × 32 bits = 40 MB → 10000000 × 43000 bits = 53.75 GB

Every time a txn inserts a new tuple, we have to extend 43,000 different bitmaps.

47

CREATE TABLE customer_dim ( id INT PRIMARY KEY, name VARCHAR(32), email VARCHAR(64), address VARCHAR(64), zip_code INT );
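A quick back-of-the-envelope check of those two numbers (using decimal MB/GB):

tuples = 10_000_000
zip_codes = 43_000

plain = tuples * 32 / 8                 # 32-bit zip_code column, in bytes
bitmap = tuples * zip_codes / 8         # one bit per (tuple, zip code) pair

print(plain / 1e6, "MB")                # 40.0 MB
print(bitmap / 1e9, "GB")               # 53.75 GB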

slide-48
SLIDE 48

BITMAP ENCODING: EXAMPLE

Assume we have 10 million tuples. 43,000 zip codes in the US.

→ 10000000 × 32-bits = 40 MB → 10000000 × 43000 = 53.75 GB

Every time a txn inserts a new tuple, we have to extend 43,000 different bitmaps.

48

CREATE TABLE customer_dim ( id INT PRIMARY KEY, name VARCHAR(32), email VARCHAR(64), address VARCHAR(64), zip_code INT );

slide-49
SLIDE 49

BITMAP ENCODING: EXAMPLE

Assume we have 10 million tuples. 43,000 zip codes in the US.

→ 10000000 × 32-bits = 40 MB → 10000000 × 43000 = 53.75 GB

Every time a txn inserts a new tuple, we have to extend 43,000 different bitmaps.

49

CREATE TABLE customer_dim ( id INT PRIMARY KEY, name VARCHAR(32), email VARCHAR(64), address VARCHAR(64), zip_code INT );

slide-50
SLIDE 50

BITMAP ENCODING: EXAMPLE

Assume we have 10 million tuples. 43,000 zip codes in the US.

→ 10000000 × 32-bits = 40 MB → 10000000 × 43000 = 53.75 GB

Every time a txn inserts a new tuple, we have to extend 43,000 different bitmaps.

50

CREATE TABLE customer_dim ( id INT PRIMARY KEY, name VARCHAR(32), email VARCHAR(64), address VARCHAR(64), zip_code INT );

slide-51
SLIDE 51

DELTA ENCODING

Recording the difference between values that follow each other in the same column.

→ The base value can be stored in-line or in a separate look- up table. → Can be combined with RLE to get even better compression ratios.

51

Original Data

time: 12:00 12:01 12:02 12:03 12:04
temp: 99.5  99.4  99.5  99.6  99.4

slide-52
SLIDE 52

DELTA ENCODING

Recording the difference between values that follow each other in the same column.

→ The base value can be stored in-line or in a separate look- up table. → Can be combined with RLE to get even better compression ratios.

52

Original Data

time: 12:00 12:01 12:02 12:03 12:04
temp: 99.5  99.4  99.5  99.6  99.4

slide-53
SLIDE 53

DELTA ENCODING

Recording the difference between values that follow each other in the same column.

→ The base value can be stored in-line or in a separate look- up table. → Can be combined with RLE to get even better compression ratios.

53

Original Data

time: 12:00 12:01 12:02 12:03 12:04
temp: 99.5  99.4  99.5  99.6  99.4

Compressed Data

time: 12:00 +1   +1   +1   +1
temp: 99.5  -0.1 +0.1 +0.1 -0.2
slide-54
SLIDE 54

DELTA ENCODING

Recording the difference between values that follow each other in the same column.

→ The base value can be stored in-line or in a separate look- up table. → Can be combined with RLE to get even better compression ratios.

54

Original Data

time: 12:00 12:01 12:02 12:03 12:04
temp: 99.5  99.4  99.5  99.6  99.4

Compressed Data

time: 12:00 +1   +1   +1   +1
temp: 99.5  -0.1 +0.1 +0.1 -0.2
slide-55
SLIDE 55

DELTA ENCODING

Recording the difference between values that follow each other in the same column.

→ The base value can be stored in-line or in a separate look- up table. → Can be combined with RLE to get even better compression ratios.

55

Original Data

time: 12:00 12:01 12:02 12:03 12:04
temp: 99.5  99.4  99.5  99.6  99.4

Compressed Data (Delta + RLE)

time: 12:00 (+1,4)
temp: 99.5  -0.1 +0.1 +0.1 -0.2

Compressed Data (Delta only)

time: 12:00 +1   +1   +1   +1
temp: 99.5  -0.1 +0.1 +0.1 -0.2
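A minimal Python sketch of delta encoding (illustrative only; the base value is stored in-line, and runs of equal deltas could further be collapsed with RLE as on the slide above):

def delta_encode(column):
    """Store the first value, then the difference to the previous value."""
    encoded = [column[0]]
    for prev, cur in zip(column, column[1:]):
        encoded.append(round(cur - prev, 6))
    return encoded

def delta_decode(encoded):
    column = [encoded[0]]
    for delta in encoded[1:]:
        column.append(round(column[-1] + delta, 6))
    return column

temp = [99.5, 99.4, 99.5, 99.6, 99.4]
print(delta_encode(temp))        # [99.5, -0.1, 0.1, 0.1, -0.2]
assert delta_decode(delta_encode(temp)) == temp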
slide-56
SLIDE 56

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

56

Original Data

rob robbed robbing robot

slide-57
SLIDE 57

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

57

Original Data

rob robbed robbing robot

Common Prefix

slide-58
SLIDE 58

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

58

Original Data

rob robbed robbing robot

Common Prefix

slide-59
SLIDE 59

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

59

Original Data

rob robbed robbing robot

Common Prefix

slide-60
SLIDE 60

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

60

Original Data

rob robbed robbing robot

Common Prefix

slide-61
SLIDE 61

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

61

Original Data

rob robbed robbing robot

Common Prefix

slide-62
SLIDE 62

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

62

Original Data

rob robbed robbing robot

Common Prefix: rob
slide-63
SLIDE 63

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

63

Original Data

rob robbed robbing robot

Common Prefixes (between consecutive values): rob, robb, rob

slide-64
SLIDE 64

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

64

Original Data

rob robbed robbing robot

Common Prefixes (between consecutive values): rob, robb, rob

Compressed Data

Prefix Length | Suffix
      -       | rob
      3       | bed
      4       | ing
      3       | ot

slide-65
SLIDE 65

INCREMENTAL ENCODING

Type of delta encoding whereby common prefixes or suffixes and their lengths are recorded so that they need not be duplicated. This works best with sorted data.

65

Original Data

rob robbed robbing robot

Common Prefixes (between consecutive values): rob, robb, rob

Compressed Data

Prefix Length | Suffix
      -       | rob
      3       | bed
      4       | ing
      3       | ot
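A minimal Python sketch of incremental (front) encoding on a sorted column (illustrative only):

import os

def prefix_encode(sorted_values):
    """Store, for each value, the prefix length shared with the previous
    value and the remaining suffix."""
    encoded, prev = [], ""
    for value in sorted_values:
        prefix_len = len(os.path.commonprefix([prev, value]))
        encoded.append((prefix_len, value[prefix_len:]))
        prev = value
    return encoded

def prefix_decode(encoded):
    values, prev = [], ""
    for prefix_len, suffix in encoded:
        value = prev[:prefix_len] + suffix
        values.append(value)
        prev = value
    return values

words = ["rob", "robbed", "robbing", "robot"]
print(prefix_encode(words))
# [(0, 'rob'), (3, 'bed'), (4, 'ing'), (3, 'ot')]
assert prefix_decode(prefix_encode(words)) == words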

slide-66
SLIDE 66

MOSTLY ENCODING

When the values for an attribute are “mostly” less than the largest size, you can store them as a smaller data type.

→ The remaining values that cannot be compressed are stored in their raw form.

66

Source: Redshift Documentation

Original Data

int64 4 2 6 99999999 8

slide-67
SLIDE 67

MOSTLY ENCODING

When the values for an attribute are “mostly” less than the largest size, you can store them as a smaller data type.

→ The remaining values that cannot be compressed are stored in their raw form.

67

Source: Redshift Documentation

Original Data

int64 4 2 6 99999999 8

Compressed Data

mostly8: 4 2 6 XXX 8

Overflow: offset 3 → value 99999999
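A minimal Python sketch of the idea (illustrative only; Redshift's actual mostly8/mostly16/mostly32 encodings are more involved): values that fit in the smaller type are stored in a compact array, and the exceptions go into a separate overflow table keyed by offset.

def mostly8_encode(column):
    """Store values that fit in 8 bits inline; keep the rest in an overflow map."""
    inline, overflow = [], {}
    for offset, value in enumerate(column):
        if -128 <= value <= 127:
            inline.append(value)
        else:
            inline.append(None)          # placeholder ("XXX" on the slide)
            overflow[offset] = value
    return inline, overflow

def mostly8_decode(inline, overflow):
    return [overflow[i] if v is None else v for i, v in enumerate(inline)]

col = [4, 2, 6, 99999999, 8]
inline, overflow = mostly8_encode(col)
print(inline, overflow)   # [4, 2, 6, None, 8] {3: 99999999}
assert mostly8_decode(inline, overflow) == col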

slide-68
SLIDE 68

DICTIONARY COMPRESSION

Most pervasive compression scheme in DBMSs. Replace frequent patterns with smaller codes. Need to support fast encoding and decoding. Need to also support range queries.

→ Example: SALARY > 100

68

DICTIONARY-BASED ORDER-PRESERVING STRING COMPRESSION FOR MAIN MEMORY COLUMN STORES SIGMOD 2009

slide-69
SLIDE 69

DICTIONARY COMPRESSION

When to construct the dictionary? What is the scope of the dictionary? How do we allow for range queries? How do we enable fast encoding/decoding?

69

slide-70
SLIDE 70

DICTIONARY CONSTRUCTION

Choice #1: All At Once

→ Compute the dictionary for all the tuples at a given point of time. → New tuples must use a separate dictionary, or all tuples must be recomputed.

Choice #2: Incremental

→ Merge new tuples in with an existing dictionary. → Likely requires re-encoding of existing tuples.

70

slide-71
SLIDE 71

DICTIONARY SCOPE

Choice #1: Block-level

→ Only include a subset of tuples within a single table. → Potentially lower compression ratio, but can add new tuples more easily. → Impact of dictionary data corruption is localized

Choice #2: Table-level

→ Construct a dictionary for the entire table. → Better compression ratio, but expensive to update.

Choice #3: Multi-Table

→ Can be either subset or entire tables. → Sometimes helps with joins and set operations.

71

slide-72
SLIDE 72

MULTI-ATTRIBUTE ENCODING

Instead of storing a single value per dictionary entry, store entries that span attributes.

→ I’m not sure any DBMS actually implements this.

72

Original Data

val1: B   A   C   A   B   C   A   B
val2: 101 202 101 202 101 101 202 101

slide-73
SLIDE 73

MULTI-ATTRIBUTE ENCODING

Instead of storing a single value per dictionary entry, store entries that span attributes.

→ I’m not sure any DBMS actually implements this.

73

Original Data:
val1: B   A   C   A   B   C   A   B
val2: 101 202 101 202 101 101 202 101

Compressed Data:
Dictionary: (B, 101) → YY, (A, 202) → XX, (C, 101) → ZZ
Encoded val1+val2 column: YY XX ZZ XX YY ZZ XX YY
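A minimal Python sketch of the idea (illustrative only): the dictionary keys are tuples of attribute values rather than single values.

def multi_attr_encode(rows):
    """Map each (val1, val2) pair to a single compact code."""
    dictionary, codes = {}, []
    for row in rows:                       # row is a (val1, val2) tuple
        if row not in dictionary:
            dictionary[row] = len(dictionary)
        codes.append(dictionary[row])
    return dictionary, codes

rows = [("B", 101), ("A", 202), ("C", 101), ("A", 202),
        ("B", 101), ("C", 101), ("A", 202), ("B", 101)]
dictionary, codes = multi_attr_encode(rows)
print(dictionary)   # {('B', 101): 0, ('A', 202): 1, ('C', 101): 2}
print(codes)        # [0, 1, 2, 1, 0, 2, 1, 0]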

slide-74
SLIDE 74

ENCODING / DECODING

A dictionary needs to support two operations:

→ Encode: For a given uncompressed value, convert it into its compressed form. → Decode: For a given compressed value, convert it back into its original form.

No magic hash function will do this for us. We need two hash tables to support operations in both directions.

74

slide-75
SLIDE 75

ORDER-PRESERVING COMPRESSION

The encoded values need to support sorting in the same order as original values.

75

Original Data

name Andrea Prashanth Andy Dana

slide-76
SLIDE 76

ORDER-PRESERVING COMPRESSION

The encoded values need to support sorting in the same order as original values.

76

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-77
SLIDE 77

ORDER-PRESERVING COMPRESSION

The encoded values need to support sorting in the same order as original values.

77

SELECT * FROM users WHERE name LIKE 'And%'

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-78
SLIDE 78

ORDER-PRESERVING COMPRESSION

The encoded values need to support sorting in the same order as original values.

78

SELECT * FROM users WHERE name LIKE 'And%'

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

SELECT * FROM users WHERE name BETWEEN 10 AND 20
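A minimal Python sketch of order-preserving dictionary encoding and of rewriting a prefix predicate into a range over codes (illustrative only; the helper names are hypothetical):

import bisect

def build_order_preserving_dict(values):
    """Assign codes in sorted order (with gaps left for future inserts)."""
    sorted_values = sorted(set(values))
    encode_map = {v: (i + 1) * 10 for i, v in enumerate(sorted_values)}
    return sorted_values, encode_map

def prefix_to_code_range(sorted_values, encode_map, prefix):
    """Rewrite `name LIKE 'prefix%'` into a [lo, hi] range over codes."""
    lo = bisect.bisect_left(sorted_values, prefix)
    hi = bisect.bisect_right(sorted_values, prefix + "\uffff")
    matching = sorted_values[lo:hi]
    return encode_map[matching[0]], encode_map[matching[-1]]

names = ["Andrea", "Prashanth", "Andy", "Dana"]
sorted_values, encode_map = build_order_preserving_dict(names)
print(encode_map)  # {'Andrea': 10, 'Andy': 20, 'Dana': 30, 'Prashanth': 40}
print(prefix_to_code_range(sorted_values, encode_map, "And"))  # (10, 20)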

slide-79
SLIDE 79

ORDER-PRESERVING COMPRESSION

79

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-80
SLIDE 80

ORDER-PRESERVING COMPRESSION

80

SELECT name FROM users WHERE name LIKE 'And%'

???

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-81
SLIDE 81

ORDER-PRESERVING COMPRESSION

81

SELECT name FROM users WHERE name LIKE 'And%'

Have to perform seq scan

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-82
SLIDE 82

ORDER-PRESERVING COMPRESSION

82

SELECT name FROM users WHERE name LIKE 'And%'
SELECT DISTINCT name FROM users WHERE name LIKE 'And%'

Have to perform seq scan

???

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-83
SLIDE 83

ORDER-PRESERVING COMPRESSION

83

SELECT name FROM users WHERE name LIKE 'And%'
SELECT DISTINCT name FROM users WHERE name LIKE 'And%'

Have to perform seq scan
Only need to access dictionary

Original Data

name Andrea Prashanth Andy Dana

Compressed Data

Dictionary: 10 → Andrea, 20 → Andy, 30 → Dana, 40 → Prashanth
Encoded name column: 10, 40, 20, 30

slide-84
SLIDE 84

DICTIONARY IMPLEMENTATIONS

Hash Table:

→ Fast and compact. → Unable to support range and prefix queries.

B+Tree:

→ Slower than a hash table and takes more memory. → Can support range and prefix queries.

84

slide-85
SLIDE 85

PARTING THOUGHTS

Dictionary encoding is probably the most useful compression scheme because it does not require pre-sorting. The DBMS can combine different approaches for even better compression. It is important to wait as long as possible during query execution to decompress data.

85

slide-86
SLIDE 86

NEXT CLASS

Physical vs. Logical Logging

86