Bitmap Index Design and Evaluation, Chee-Yong Chan - PowerPoint PPT Presentation



slide-1
SLIDE 1 Bitmap Index Design and Evaluation
Chee-Yong Chan, University of Wisconsin-Madison
Yannis Ioannidis, University of Wisconsin-Madison and University of Athens
slide-2
SLIDE 2 Introduction
  • Tremendous growth in Decision Support Systems (DSS).
  • Characteristics of DSS queries: read-mostly, complex, ad hoc, with large foundsets (i.e., high selectivity factors).
  • "Resurrection" of interest in bitmap indexing.
  • Not much known about space-time tradeoffs.
slide-3
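SLIDE 3 Example of a Bitmap Index
    A    B^10 B^9  B^8  B^7  B^6  B^5  B^4  B^3  B^2  B^1  B^0
    4    0    0    0    0    0    0    1    0    0    0    0
    9    0    1    0    0    0    0    0    0    0    0    0
    1    0    0    0    0    0    0    0    0    0    1    0
    3    0    0    0    0    0    0    0    1    0    0    0
    8    0    0    1    0    0    0    0    0    0    0    0
    2    0    0    0    0    0    0    0    0    1    0    0
    10   1    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    1
    7    0    0    0    1    0    0    0    0    0    0    0
    5    0    0    0    0    0    1    0    0    0    0    0
    6    0    0    0    0    1    0    0    0    0    0    0
    3    0    0    0    0    0    0    0    1    0    0    0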
slide-4
SLIDE 4 Bitmap Index (cont.)
  • Value-List Index [O'Neil & Quass, SIGMOD'97].
  • Advantages:
    - Compact representation of index (especially for attributes with low cardinality) ⇒ space and I/O efficient.
    - Bitmap operations (AND, OR, XOR, NOT) are efficiently supported by hardware.
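A minimal sketch of why bitwise operations matter here, assuming Python integers as bit vectors (bit j stands for record j) and the sample column used in the talk's examples. The helper names `value_list_index` and `matching_records` are hypothetical and do not come from the slides.

```python
def value_list_index(values, cardinality):
    """One bitmap per attribute value; bit j of bitmaps[v] is set when record j has value v."""
    bitmaps = [0] * cardinality
    for record, v in enumerate(values):
        bitmaps[v] |= 1 << record
    return bitmaps

def matching_records(bitmap):
    """Decode a bitmap back into a sorted list of record positions."""
    return [j for j in range(bitmap.bit_length()) if bitmap >> j & 1]

values = [4, 9, 1, 3, 8, 2, 10, 0, 7, 5, 6, 3]   # sample attribute column
index = value_list_index(values, cardinality=11)
all_records = (1 << len(values)) - 1             # mask with one bit per record

eq_3 = index[3]                                  # A = 3: read a single bitmap
ne_3 = all_records & ~index[3]                   # A != 3: NOT, masked to real records
le_2 = index[0] | index[1] | index[2]            # A <= 2: OR of three bitmaps

print(matching_records(eq_3))   # [3, 11]
print(matching_records(le_2))   # [2, 5, 7]
```

Each query reduces to a handful of word-wide AND/OR/NOT operations, which is the property the slide attributes to hardware support.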
slide-5
SLIDE 5 Scope of Talk
  • Bitmap Index Design for selection queries of the form $(A \ \mathit{op} \ c)$, where $\mathit{op} \in \{\le, \ge, <, >, =, \ne\}$:
    - Range Query: $\mathit{op} \in \{\le, \ge, <, >\}$.
    - Equality Query: $\mathit{op} \in \{=, \ne\}$.
  • Assumption: Attribute values are in $\{0, 1, 2, \ldots, C-1\}$, where $C$ is the attribute cardinality.
  • 2-Dimensional Framework for Design Space.
  • Space-Time Tradeoff Study.
slide-6
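SLIDE 6 Example of a Value-List Index
    A    B^10 B^9  B^8  B^7  B^6  B^5  B^4  B^3  B^2  B^1  B^0
    4    0    0    0    0    0    0    1    0    0    0    0
    9    0    1    0    0    0    0    0    0    0    0    0
    1    0    0    0    0    0    0    0    0    0    1    0
    3    0    0    0    0    0    0    0    1    0    0    0
    8    0    0    1    0    0    0    0    0    0    0    0
    2    0    0    0    0    0    0    0    0    1    0    0
    10   1    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    1
    7    0    0    0    1    0    0    0    0    0    0    0
    5    0    0    0    0    0    1    0    0    0    0    0
    6    0    0    0    0    1    0    0    0    0    0    0
    3    0    0    0    0    0    0    0    1    0    0    0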
slide-7
SLIDE 7 Design Space of Bitmap Indexes for Selection Queries
  • Design space consists of 2 orthogonal dimensions (inspired by [Wong et al, VLDB'85]):
    1. Attribute Value Decomposition: determines number and size of index components.
    2. Bitmap Encoding Scheme: determines encoding of bitmap components.
  • Index → Component → Bitmap
slide-8
SLIDE 8 1st Dimension: Attribute Value Decomposition
  • Given a sequence of $n$ numbers $\langle b_n, b_{n-1}, \ldots, b_1 \rangle$, each attribute value $A$ is decomposed into $n$ digits $A_n A_{n-1} \ldots A_1$, where $A_i$ is a base-$b_i$ digit.
  • Example: C = 1000 and attribute value A = 256.
    <b_n, ..., b_1>    Decomposition of A
    <1000>             256
    <50, 20>           12(20) + 16
    <32, 32>           8(32) + 0
    <5, 20, 10>        1(20)(10) + 5(10) + 6
  • Each $\langle b_n, b_{n-1}, \ldots, b_1 \rangle$ (base of index) defines an n-component index.
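The decomposition above is a mixed-radix conversion. A minimal sketch, assuming the base is given as a tuple in the slide's order `(b_n, ..., b_1)`; the function names `decompose` and `compose` are hypothetical.

```python
def decompose(value, base):
    """Split value into digits A_n..A_1 for base <b_n, ..., b_1> (most significant first)."""
    digits = []
    for b in reversed(base):        # peel off A_1 first, then A_2, and so on
        digits.append(value % b)
        value //= b
    if value:
        raise ValueError("value does not fit in this base")
    return digits[::-1]

def compose(digits, base):
    """Inverse of decompose: rebuild the value from its digits."""
    value = 0
    for d, b in zip(digits, base):
        value = value * b + d
    return value

# The four bases from the example with A = 256.
for base in [(1000,), (50, 20), (32, 32), (5, 20, 10)]:
    print(base, decompose(256, base))
```

For the base `(5, 20, 10)` this yields the digits 1, 5, 6, matching 1(20)(10) + 5(10) + 6 = 256.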
slide-9
SLIDE 9 Attribute Value Decomposition with Base <3, 4>
    A     A_2 × 4 + A_1     A_2   A_1
    4     1×4+0        →    1     0
    9     2×4+1        →    2     1
    1     0×4+1        →    0     1
    3     0×4+3        →    0     3
    8     2×4+0        →    2     0
    2     0×4+2        →    0     2
    10    2×4+2        →    2     2
    0     0×4+0        →    0     0
    7     1×4+3        →    1     3
    5     1×4+1        →    1     1
    6     1×4+2        →    1     2
    3     0×4+3        →    0     3
slide-10
SLIDE 10 2nd Dimension: Bitmap Encoding Schemes
  • Consider the $i$-th index component with base $b_i$.
  • Two basic ways to encode a value $x$ ($0 \le x < b_i$):
    Encoding Scheme    b_i-bit representation for value x
                       b_i-1  ...  x+1   x   x-1  ...  0
    Equality             0    ...   0    1    0   ...  0
    Range                1    ...   1    1    0   ...  0
  • Equality Encoded Bitmap: $B_i^x = \{\text{records with } A_i = x\}$
  • Range Encoded Bitmap: $B_i^x = \{\text{records with } A_i \le x\}$. $B_i^{b_i - 1}$ is not materialized since all its bits are set to 1.
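The two definitions can be sketched directly in code. This is an illustration under assumptions: bitmaps are plain lists of 0/1 (one entry per record), the sample column and base <3, 4> are taken from the talk's examples, and `encode_component` is a hypothetical helper name.

```python
def encode_component(digits, b, scheme):
    """Bitmaps for one component with base b, returned as a dict mapping x to B^x.

    Under range encoding B^{b-1} is omitted because it would be all ones.
    """
    if scheme == "equality":
        return {x: [int(d == x) for d in digits] for x in range(b)}
    if scheme == "range":
        return {x: [int(d <= x) for d in digits] for x in range(b - 1)}
    raise ValueError("scheme must be 'equality' or 'range'")

values = [4, 9, 1, 3, 8, 2, 10, 0, 7, 5, 6, 3]
a2 = [v // 4 for v in values]      # digit A_2 under base <3, 4>
a1 = [v % 4 for v in values]       # digit A_1 under base <3, 4>

equality_index = {2: encode_component(a2, 3, "equality"),
                  1: encode_component(a1, 4, "equality")}
range_index = {2: encode_component(a2, 3, "range"),
               1: encode_component(a1, 4, "range")}

print(sum(len(c) for c in equality_index.values()))   # 7 bitmaps
print(sum(len(c) for c in range_index.values()))      # 5 bitmaps
```

The counts reflect the definitions: equality encoding keeps b_i bitmaps per component, range encoding keeps b_i - 1.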
slide-11
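SLIDE 11 An Equality-Encoded Base-<3, 4> Index
    A     A_2  A_1    B_2^2  B_2^1  B_2^0    B_1^3  B_1^2  B_1^1  B_1^0
    4     1    0      0      1      0        0      0      0      1
    9     2    1      1      0      0        0      0      1      0
    1     0    1      0      0      1        0      0      1      0
    3     0    3      0      0      1        1      0      0      0
    8     2    0      1      0      0        0      0      0      1
    2     0    2      0      0      1        0      1      0      0
    10    2    2      1      0      0        0      1      0      0
    0     0    0      0      0      1        0      0      0      1
    7     1    3      0      1      0        1      0      0      0
    5     1    1      0      1      0        0      0      1      0
    6     1    2      0      1      0        0      1      0      0
    3     0    3      0      0      1        1      0      0      0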
slide-12
SLIDE 12 A Range-Encoded Base-<3, 4> Index
    A     A_2  A_1    B_2^1  B_2^0    B_1^2  B_1^1  B_1^0
    4     1    0      1      0        1      1      1
    9     2    1      0      0        1      1      0
    1     0    1      1      1        1      1      0
    3     0    3      1      1        0      0      0
    8     2    0      0      0        1      1      1
    2     0    2      1      1        1      0      0
    10    2    2      0      0        1      0      0
    0     0    0      1      1        1      1      1
    7     1    3      1      0        0      0      0
    5     1    1      1      0        1      1      0
    6     1    2      1      0        1      0      0
    3     0    3      1      1        0      0      0
slide-13
SLIDE 13 Design Space of Bitmap Indexes
[Figure: the two-dimensional design space. One axis is the attribute value decomposition, with bases <C>, <b_2, b_1>, <b_3, b_2, b_1>, ..., <b, b, ..., b> (log_b C times); the other axis is the bitmap encoding scheme (Equality, Range). Labeled points: Value-List Index, Bit-Sliced Index.]
slide-14
SLIDE 14 Space-Time Tradeoff Issues
[Figure: space-time tradeoff curve with axes Space and Time. Labeled points: Space-Optimal, Optimal Space-Time Tradeoff (knee), Time-Optimal, and Time-Optimal under Space Constraint S. One region is marked Infeasible Region.]
slide-15
SLIDE 15 Analytical Cost Model
Cost Metrics
    Space: Number of bitmaps.
    Time: Expected number of bitmap scans for a selection query evaluation.
  • Uniform Query Distribution Assumption: Query space $= \{A \ \mathit{op} \ v : \mathit{op} \in \{\le, \ge, <, >, =, \ne\},\ 0 \le v < C\}$, where $C$ is the attribute cardinality.
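A rough sketch of the two metrics, under stated assumptions. The space counts follow the encoding definitions given earlier in the talk (b_i bitmaps per component under equality encoding, b_i - 1 under range encoding). The per-query scan count depends on the evaluation algorithm, which these slides do not spell out, so `scans_for_query` is a hypothetical callback supplied by the caller; all function names here are illustrative.

```python
def space(base, scheme):
    """Number of bitmaps stored by an index with base <b_n, ..., b_1>."""
    if scheme == "equality":
        return sum(base)                    # b_i bitmaps per component
    if scheme == "range":
        return sum(b - 1 for b in base)     # the all-ones bitmap is not stored
    raise ValueError("scheme must be 'equality' or 'range'")

def query_space(cardinality):
    """All (op, v) pairs, each taken as equally likely under the uniform assumption."""
    ops = ("<=", ">=", "<", ">", "=", "!=")
    return [(op, v) for op in ops for v in range(cardinality)]

def expected_scans(cardinality, scans_for_query):
    """Average of a caller-supplied scan count over the whole query space."""
    queries = query_space(cardinality)
    return sum(scans_for_query(op, v) for op, v in queries) / len(queries)

print(space((3, 4), "equality"), space((3, 4), "range"))   # 7 5
print(len(query_space(100)))                               # 600
```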
slide-16
SLIDE 16 Comparison of Encoding Schemes
[Figure: two plots of Time (Expected Number of Bitmap Scans) and Space (Number of Bitmaps) for the Range-Encoded Index and the Equality-Encoded Index. Panels: (a) C = 100, (b) C = 1000.]
slide-17
SLIDE 17 Space-Time Tradeoff Results
  • Class of n-Component Indexes
    - Time-Optimal Index $= \langle \underbrace{2, 2, \ldots, 2}_{n-1}, \lceil C / 2^{n-1} \rceil \rangle$.
    - Space-Optimal Index $= \langle \underbrace{b-1, b-1, \ldots, b-1}_{n-r}, \underbrace{b, b, \ldots, b}_{r} \rangle$, where $b = \lceil \sqrt[n]{C} \rceil$ and $b^{r-1}(b-1)^{n-r+1} < C \le b^{r}(b-1)^{n-r}$.
  • Time-Optimal Index = Single-component index.
  • Space-Optimal Index = Maximal-component index.
  • Knee Index $\approx$ 2-component space-optimal index.
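The two base formulas can be evaluated mechanically. A minimal sketch, assuming C >= 2 and 1 <= n <= ceil(log2 C); bases are returned as tuples in the order `(b_n, ..., b_1)`, and the function names are hypothetical. Integer loops are used instead of floating-point roots to avoid rounding errors.

```python
def time_optimal_base(C, n):
    """<2, ..., 2, ceil(C / 2^(n-1))> with n - 1 leading twos."""
    last = -(-C // 2 ** (n - 1))            # integer ceiling division
    return (2,) * (n - 1) + (last,)

def space_optimal_base(C, n):
    """<b-1, ..., b-1, b, ..., b> with n - r copies of b-1 followed by r copies of b."""
    b = 1
    while b ** n < C:                       # smallest b with b^n >= C, the ceiling of the n-th root
        b += 1
    r = 1
    while b ** r * (b - 1) ** (n - r) < C:  # smallest r satisfying C <= b^r (b-1)^(n-r)
        r += 1
    return (b - 1,) * (n - r) + (b,) * r

for n in (1, 2, 3):
    print(n, time_optimal_base(100, n), space_optimal_base(100, n))
```

For C = 100 this gives the time-optimal bases <100>, <2, 50>, <2, 2, 25> and the space-optimal bases <100>, <10, 10>, <4, 5, 5>.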
slide-18
SLIDE 18 Time-Optimal and Space-Optimal Indexes, C = 100
[Figure: Time (Expected Number of Bitmap Scans) and Space (Number of Bitmaps) for the n-Comp. Time-Optimal Index and the n-Comp. Space-Optimal Index, with points labeled by the number of components n.]
slide-19
SLIDE 19 Knee Index, C = 100
[Figure: Time (Expected Number of Bitmap Scans) and Space (Number of Bitmaps) for the n-Comp. Space-Optimal Index (points labeled by n) and for All Index.]
slide-20
SLIDE 20 Space-Time Tradeoff Results (cont.)
Time-Optimal Index under Space Constraint
  • Search space for the optimal solution is large!
  • A 2-step Heuristic Approach:
    1. Select an initial index that satisfies the space constraint.
    2. Iteratively adjust the base of index to improve its time-efficiency.
  • Heuristic Approach is near-optimal.
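The slide does not say how the initial index is chosen. One plausible reading of step 1, offered only as an assumption, is to take the space-optimal index with the fewest components that fits the budget, counting b_i - 1 bitmaps per component as in range encoding. Step 2 is not sketched because it needs the time cost of a candidate base, which these slides do not give. The names `space_optimal_base` and `initial_index` are hypothetical.

```python
def space_optimal_base(C, n):
    """<b-1, ..., b-1, b, ..., b>: the n-component space-optimal base formula."""
    b = 1
    while b ** n < C:
        b += 1
    r = 1
    while b ** r * (b - 1) ** (n - r) < C:
        r += 1
    return (b - 1,) * (n - r) + (b,) * r

def initial_index(C, max_bitmaps):
    """Step 1 (assumed strategy): fewest-component space-optimal base within the budget.

    Returns None when even the maximal-component index needs too many bitmaps.
    """
    n = 1
    while 2 ** (n - 1) < C:                          # n ranges over 1 .. ceil(log2 C)
        base = space_optimal_base(C, n)
        if sum(b - 1 for b in base) <= max_bitmaps:  # range-encoded bitmap count
            return base
        n += 1
    return None

print(initial_index(100, 20))   # (10, 10)
print(initial_index(100, 10))   # (3, 3, 3, 4)
print(initial_index(100, 5))    # None
```

Preferring fewer components is motivated by the earlier result that the single-component index is time-optimal, so each extra component is added only when the space budget forces it.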
slide-21
SLIDE 21 Storage Schemes for Bitmap Compression
[Figure: three storage layouts for an index of 6 bitmaps, where N = # tuples.]
  • Bitmap-Level Storage (BS): 6 files of N bits each.
  • Component-Level Storage (CS): 2 files of 3N bits each.
  • Index-Level Storage (IS): 1 file of 6N bits.
slide-22
SLIDE 22 Bitmap Compression
  • Experimental Data (from TPC-D Benchmark):
    - Attribute: Lineitem.Qty with C = 50 and 6M tuples.
    - Indexes: 6 n-component space-optimal indexes.
    - Compression code: zlib library (an LZ77 variant).
  • Notation: cBS, cCS, cIS for compressed storage schemes.
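A small sketch of how such a comparison could be set up with Python's `zlib` module. It does not reproduce the experiment: the data is a synthetic uniform column of 8000 values rather than TPC-D, the index is a range-encoded base <7, 8> index, and the layout inside a CS or IS file is assumed to be plain concatenation of bitmaps, which the slides do not specify. All helper names are hypothetical.

```python
import random
import zlib

def pack(bits):
    """Pack a list of booleans into bytes, 8 bits per byte."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def range_bitmaps(values, base):
    """Range-encoded bitmaps grouped by component, for base <b_n, ..., b_1>."""
    components = []
    for b in reversed(base):                    # start with the least significant digit
        digits = [v % b for v in values]
        values = [v // b for v in values]
        # B^{b-1} is all ones, so only x = 0 .. b-2 are stored
        components.append([pack([d <= x for d in digits]) for x in range(b - 1)])
    return components[::-1]

def compressed_size(files):
    """Total bytes after compressing each file separately with zlib."""
    return sum(len(zlib.compress(f)) for f in files)

rng = random.Random(0)
values = [rng.randrange(50) for _ in range(8000)]    # synthetic column with C = 50
components = range_bitmaps(values, (7, 8))

bs_files = [bm for comp in components for bm in comp]   # one file per bitmap
cs_files = [b"".join(comp) for comp in components]      # one file per component
is_files = [b"".join(bs_files)]                         # one file for the whole index

for name, files in (("cBS", bs_files), ("cCS", cs_files), ("cIS", is_files)):
    print(name, len(files), "files,", compressed_size(files), "bytes")
```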
slide-23
SLIDE 23 Compressibility of Storage Schemes (relative to 1-comp. index under BS)
[Figure: Compressibility and Number of Components, n, for BS/CS/IS, cBS, cCS, and cIS.]
slide-24
SLIDE 24 Space-Time Tradeoff of Compressed Indexes
[Figure: Time (secs) and Space (MB) for BS, cBS, and cCS.]
slide-25
SLIDE 25 Conclusions
  • General framework to explore design space of bitmap indexes for selection queries.
  • Study of space-time tradeoff issues offers guidelines for physical database design using bitmap indexes.
  • Future Work
    - Hybrid-encoded bitmap indexes.
    - More general class of selection queries; e.g. $A \in \{v_1, v_2, \ldots, v_n\}$.