SLIDE 1

© Performance Engineering Laboratory
SPIRE 2012, Cartagena

Improved Address-Calculation Coding of Integer Arrays

Jyrki Katajainen^{1,2}, Amr Elmasry^{3}, Jukka Teuhola^{4}

1 University of Copenhagen
2 Jyrki Katajainen and Company
3 Alexandria University
4 University of Turku

SLIDE 2

Problem formulation

Given: an array of integers {xi | i ∈ {1, 2, . . . , n}}
Wanted: a compressed representation with fast random access

Operations:
  • access(i): retrieve xi
  • insert(i, v): insert v before xi
  • delete(i): remove xi

Other operations (omitted in this talk):
  • sum(j): retrieve the prefix sum x1 + · · · + xj
  • search(p): find the rank of the given prefix sum p
  • modify(i, v): change xi to v

Many solutions are known; see the list of references in the paper.

Theoretical approaches:
  • O(1) worst-case-time access
  • overhead of o(n) bits with respect to some measure of compactness
  • complicated

Practical approaches:
  • slower access
  • O(n) bits of overhead
  • implementable
  • fast in practice
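The intended semantics of the operations can be pinned down with a plain uncompressed array. This is only a reference sketch (1-based indices, as in the talk), not any of the paper's compressed structures:

```python
class IntArray:
    """Uncompressed reference implementation fixing the semantics of
    the operations above (1-based indices, as in the talk)."""

    def __init__(self, xs):
        self.xs = list(xs)

    def access(self, i):            # retrieve xi
        return self.xs[i - 1]

    def insert(self, i, v):         # insert v before xi
        self.xs.insert(i - 1, v)

    def delete(self, i):            # remove xi
        del self.xs[i - 1]

    def modify(self, i, v):         # change xi to v
        self.xs[i - 1] = v

    def sum(self, j):               # prefix sum x1 + ... + xj
        return sum(self.xs[:j])

    def search(self, p):            # smallest rank j with sum(j) >= p
        total = 0                   # (linear scan; the compressed
        for j, x in enumerate(self.xs, 1):  # structures do better)
            total += x
            if total >= p:
                return j
        return len(self.xs) + 1
```

The compressed representations discussed in the talk support the same interface while using close to the information-theoretically necessary number of bits.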
SLIDE 3

Measures of compactness

What is optimal?
  n: # integers
  x̂ = max_{i=1}^{n} xi
  s = Σ_{i=1}^{n} xi

Data-aware measure
  Raw representation: Σ_{i=1}^{n} ⌈lg(1 + xi)⌉ bits
  Overhead: in order to support random access we expect to need some more bits

Data-independent measures
  Compact representation: n lg(1 + s/n) + O(n) bits; apply Jensen's inequality to the raw representation and accept a linear overhead
  Lower bound 1: ⌈lg x̂ⁿ⌉ bits, where x̂ⁿ is the number of sequences of n positive integers whose values are at most x̂
  Lower bound 2: ⌈lg (s−1 choose n−1)⌉ bits, where (s−1 choose n−1) is the number of sequences of n positive integers that add up to s
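These measures are easy to evaluate mechanically. The following sketch computes the raw size, the compact bound (with its O(n) term omitted), and lower bound 2 for a given sequence:

```python
import math

def raw_bits(xs):
    # data-aware measure: sum of ceil(lg(1 + xi)) over the sequence
    return sum(math.ceil(math.log2(1 + x)) for x in xs)

def compact_bits(xs):
    # data-independent bound n * lg(1 + s/n); O(n) term omitted
    n, s = len(xs), sum(xs)
    return n * math.log2(1 + s / n)

def lower_bound2(xs):
    # ceil(lg C(s-1, n-1)): log of the number of sequences of
    # n positive integers that add up to s
    n, s = len(xs), sum(xs)
    return math.ceil(math.log2(math.comb(s - 1, n - 1)))
```

Since lg(1 + x) is concave, Jensen's inequality gives raw_bits(xs) ≤ compact_bits(xs) + n for any sequence of n positive integers (the extra n absorbs the ceilings).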

SLIDE 4

Two trivial “solutions”

Uncompressed array a:
  w: size of a machine word
  Space: w · n + O(w) bits
  access(i): a[i]

Access times on my computer (ns per operation):

    n      sequential   random
    2^10   0.89          1.1
    2^15   0.74          1.4
    2^20   0.89          7.1
    2^25   0.74         10.9

  – no compression
  + fast

Fixed-length coding:
  x̂ = max_{i=1}^{n} xi
  β = ⌈lg(1 + x̂)⌉
  Space: β · n + O(w) bits
  access(i):
    • compute the word address
    • read one or two words
    • mask the bits needed

  – one outlier ruins the compactness
  + relatively fast

Q: How would you support insert and delete for these structures?
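A sketch of fixed-length coding with w = 64-bit words: each value occupies exactly β bits of a packed word array, and access performs the three steps above (word address, one-or-two-word read, mask). Indices are 0-based here; the layout is illustrative:

```python
W = 64  # word size in bits

def pack(xs, beta):
    """Pack each x into beta bits; value i occupies the bit range
    [i * beta, (i + 1) * beta) of the word array."""
    n_bits = len(xs) * beta
    words = [0] * ((n_bits + W - 1) // W)
    for i, x in enumerate(xs):
        assert 0 <= x < (1 << beta)
        j, off = divmod(i * beta, W)       # word address + bit offset
        words[j] |= (x << off) & ((1 << W) - 1)
        if off + beta > W:                 # value straddles two words
            words[j + 1] |= x >> (W - off)
    return words

def access(words, beta, i):
    """Read one or two words and mask out the beta bits of x_i."""
    j, off = divmod(i * beta, W)           # compute the word address
    chunk = words[j] >> off                # read the first word
    if off + beta > W:                     # second word needed
        chunk |= words[j + 1] << (W - off)
    return chunk & ((1 << beta) - 1)       # mask the bits needed
```

One outlier ruins the compactness here exactly as the slide says: β is dictated by the maximum value x̂, not by the typical value.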

SLIDE 5

Two examples

Example 1: x1 = n, xi = 1 for i ∈ {2, . . . , n}
  Raw representation: n + O(lg n) bits
  Fixed-length coding: n⌈lg(1 + n)⌉ bits
  Lower bound 1: ⌈n lg n⌉ bits

Example 2: x1 = n², xi = 1 for i ∈ {2, . . . , n}
  Raw representation: n + O(lg n) bits
  Compact representation: n lg n + Θ(n) bits
  Lower bound 1: ⌈2n lg n⌉ bits
  Lower bound 2: n lg n + Θ(n) bits

N.B. All our representations are compact, but we do not claim them to be optimal.
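The effect of the single outlier in the first example is easy to check numerically; a quick sanity check (our own, not from the paper) for n = 1024:

```python
import math

n = 1024
xs = [n] + [1] * (n - 1)                 # example 1: x1 = n, rest 1

# raw representation: ceil(lg(1+n)) bits for x1, 1 bit for each other xi
raw = sum(math.ceil(math.log2(1 + x)) for x in xs)

# fixed-length coding: every value gets beta = ceil(lg(1 + max)) bits
beta = math.ceil(math.log2(1 + max(xs)))
fixed = len(xs) * beta
```

With n = 1024 the raw size is 11 + 1023 = 1034 bits, while fixed-length coding spends 1024 · 11 = 11264 bits: the one outlier inflates the size by roughly a lg n factor.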

SLIDE 6

Our contribution

Teuhola 2011: Interpolative coding of integer sequences supporting log-time random access, Inform. Process. Manag. 47, 5, 742–761
  Space: n lg(1 + s/n) + O(n) bits, i.e. compact
  access: O(lg(n + s)) worst-case time
  insert, delete: not supported

This paper:
  Space: n lg(1 + s/n) + O(n) bits, i.e. compact
  access: O(lg lg(n + s)) worst-case time in the static case, O(lg n) worst-case time in the dynamic case
  insert, delete: O(lg n + w²) worst-case time

Notation:
  n: # integers (assume n ≥ w)
  s: sum of the integers
  w: size of a machine word

SLIDE 7

Address-calculation coding

[Figure: encoding tree over the sequence 5 2 5 2 3 5 14 7 2 9 4 4 1 21, with stored code words 10101, 01110, 1001, 0100, 010, 010, 10, 100]

  • encoding in depth-first order
  • yellow nodes not stored
  • skip subtrees using the formula

Space: compact, by the magical formula
access: O(lg n) worst-case time (assuming that the position of the most significant one bit in a word can be determined in O(1) time)
insert, delete: not supported

Magical formula, with t = ⌈lg(1 + s)⌉:

  B(n, s) = n(t − lg n + 1) + ⌊s(n−1)/2^{t−1}⌋ − t − 1,        if s ≥ n/2
  B(n, s) = 2t + ⌊s(2 − 1/2^{t−1})⌋ − t − 1 + s(lg n − t),     otherwise
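Transcribed directly from the slide (the exact form should be checked against the paper), the magical formula can be written down as:

```python
import math

def B(n, s):
    """Space bound, in bits, of address-calculation coding for n
    positive integers summing to s; transcribed from the slide,
    so treat the exact constants as an assumption."""
    t = math.ceil(math.log2(1 + s))
    if s >= n / 2:
        return (n * (t - math.log2(n) + 1)
                + math.floor(s * (n - 1) / 2 ** (t - 1)) - t - 1)
    return (2 * t + math.floor(s * (2 - 1 / 2 ** (t - 1)))
            - t - 1 + s * (math.log2(n) - t))
```

During access the coder evaluates B for a subtree's (n, s) pair to compute the exact bit address where that subtree's encoding ends, which is what lets it skip subtrees without decoding them.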

SLIDE 8

Indexed address-calculation coding

  c: a tuning parameter, c ≥ 1
  si: sum of the numbers in the ith chunk
  chunk size: k = ⌊c · lg(n + s)⌋
  # chunks: t = ⌈n/k⌉
  root: ⌈lg(1 + s)⌉ bits
  pointer: lg n + lg lg(1 + s/n) + O(1) bits
  chunks: address-calculation coding
  index: fixed-length coding

Analysis:
  roots: ⌈n/k⌉ · ⌈lg(1 + s)⌉ ≤ n/c + O(w) bits
  pointers: ⌈n/k⌉ · (lg n + lg lg(1 + s/n) + O(1)) ≤ n/c + O(w) bits
  chunks: Σ_{i=1}^{t} [k · lg(1 + si/k) + O(k)] ≤ n lg(1 + s/n) + O(n) bits
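The two-level idea can be sketched with a stand-in chunk coder (unary here, purely for illustration; the real structure uses address-calculation coding per chunk): the fixed-length index gives the bit offset of the right chunk in O(1) time, after which only that one chunk is decoded.

```python
import math

def build(xs, c=1):
    """Cut xs into chunks of k = floor(c * lg(n + s)) integers and
    record a pointer (bit offset) per chunk; unary-code each chunk
    as a stand-in for the address-calculation coder."""
    n, s = len(xs), sum(xs)
    k = max(1, math.floor(c * math.log2(n + s)))
    pieces, index, bits = [], [], 0
    for i in range(0, n, k):
        index.append(bits)                     # pointer to chunk i // k
        enc = "".join("0" * x + "1" for x in xs[i:i + k])
        bits += len(enc)
        pieces.append(enc)
    return "".join(pieces), index, k

def access(encoded, index, k, i):              # 0-based access(i)
    pos = index[i // k]                        # O(1) jump via the index
    for _ in range(i % k):                     # decode within the chunk
        pos = encoded.index("1", pos) + 1
    end = encoded.index("1", pos)
    return end - pos                           # unary: value = run of 0s
```

Because a chunk holds only Θ(lg(n + s)) integers, the work after the index jump is bounded by the chunk size rather than by n.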

SLIDE 9

Other applications of indexing

Indexed Elias delta coding
  c: a tuning parameter, c ≥ 1
  chunk size: k = ⌊c · (lg n + lg lg s)⌋
  # chunks: t = ⌈n/k⌉
  pointer: lg n + lg lg(1 + s/n) + O(1) bits
  chunks: Elias delta coding
  index: fixed-length coding
  Space: raw + O(Σ_{i=1}^{n} lg lg xi) bits
  access: O(lg n + lg lg s) worst-case time

Indexed fixed-length coding
  c: a tuning parameter, c ≥ 1
  x̂ = max_{i=1}^{n} xi
  chunk size: k = ⌊c · (lg n + lg lg x̂)⌋
  # chunks: t = ⌈n/k⌉
  pointer: lg n + lg lg(1 + x̂) + O(1) bits
  offsets: fixed-length coding (landmark + offset)
  index: fixed-length coding
  data: raw coding
  Space: raw + O(n lg lg(n + x̂)) bits
  access: O(1) worst-case time
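For reference, Elias delta coding itself (the chunk coder above, not the indexing) can be sketched as follows: a value x ≥ 1 is written as the Elias gamma code of its bit length N, followed by the N − 1 lower-order bits of x. Bit strings stand in for real bit streams here:

```python
def delta_encode(x):
    """Elias delta code of a positive integer x, as a bit string."""
    assert x >= 1
    n = x.bit_length()                  # N = floor(lg x) + 1
    # gamma code of N: (len(N)-1) zeros, then N in binary
    gamma = "0" * (n.bit_length() - 1) + bin(n)[2:]
    return gamma + bin(x)[3:]           # the N-1 low-order bits of x

def delta_decode(bits, pos=0):
    """Decode one value starting at bits[pos]; returns (x, new_pos)."""
    l = 0
    while bits[pos + l] == "0":         # count leading zeros
        l += 1
    n = int(bits[pos + l:pos + 2 * l + 1], 2)   # N, stored in l+1 bits
    pos += 2 * l + 1
    x = (1 << (n - 1)) | int(bits[pos:pos + n - 1] or "0", 2)
    return x, pos + n - 1
```

The code of x takes ⌊lg x⌋ + 2⌊lg(⌊lg x⌋ + 1)⌋ + 1 bits, i.e. raw plus the O(lg lg x) per-integer overhead stated on the slide.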

SLIDE 10

Dynamization

  c: a tuning parameter, c ≥ 1
  w: size of a machine word
  chunk size: k = cw/2 .. 2cw
  # chunks: t = ⌈n/(2cw)⌉ .. ⌈2n/(cw)⌉
  root: w bits
  pointer: w bits
  chunks: address-calculation coding
  index: balanced search tree

Use the zone technique:
  • align chunks to word boundaries
  • keep chunks of the same size in separate zones
  • only w zones
  • maintain zones as rotated arrays (one chunk may be split)

Space: still compact
access: O(lg n) worst-case time (n ≥ w)
insert, delete: O(lg n + w²) worst-case time
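One way to see why same-size chunks help: a zone can be kept dense with a move-last-into-hole trick, so adding or removing a chunk costs O(k) word moves. The class below is an illustrative simplification (the paper's rotated arrays refine this), not the paper's actual structure:

```python
class Zone:
    """A zone stores chunks that all occupy the same number of words
    k, so it can be kept dense: deleting a chunk moves the last chunk
    into the hole in O(k) time."""

    def __init__(self, k):
        self.k = k                     # words per chunk in this zone
        self.buf = []                  # backing array of words

    def add(self, chunk):              # O(k); returns the chunk's slot
        assert len(chunk) == self.k
        self.buf.extend(chunk)
        return len(self.buf) // self.k - 1

    def get(self, slot):
        return self.buf[slot * self.k:(slot + 1) * self.k]

    def remove(self, slot):
        """O(k): fill the hole with the last chunk; report which slot
        moved so the search-tree index can update its pointer."""
        last_slot = len(self.buf) // self.k - 1
        if slot != last_slot:
            self.buf[slot * self.k:(slot + 1) * self.k] = \
                self.buf[last_slot * self.k:]
        del self.buf[last_slot * self.k:]
        return last_slot
```

With only w zones and chunks of O(w) words, rewriting a whole chunk on insert or delete stays within the O(lg n + w²) budget stated above.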

SLIDE 11

Experimental setup

Benchmark data: n integers
  – uniformly distributed
  – exponentially distributed
Repetitions: each experiment repeated r times for sufficiently large r
Reported value: measurement result divided by r × n
Processor: Intel® Xeon® CPU, 1.8 GHz × 2
Programming language: C
Compiler: gcc with optimization -O3
Source code: available from Jukka’s home page

SLIDE 12

Experimental results: Overhead

[Two plots of bits per source integer: one against the range size (2 .. 1024, uniform data), one against λ (1/64 .. 8, exponential data). Series: Indexed modifiable array, Indexed static array, Basic AC-coded array, Entropy.]

– entropy of xi: expected information content of xi
– for a random floating-point number yi, yi ≥ 0: xi = ⌈−ln(1 − yi)/λ⌉
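The exponentially distributed inputs can be regenerated by inverting the CDF as on the slide. Reading the slide's bracket as a ceiling is our assumption, as are the seed and parameter names:

```python
import math
import random

def exp_ints(n, lam, seed=42):
    """n positive integers xi = ceil(-ln(1 - yi) / lam) for uniform
    yi in [0, 1): exponentially distributed benchmark data.
    ('lam' stands for the slide's lambda; the seed is arbitrary.)"""
    rng = random.Random(seed)
    return [math.ceil(-math.log(1 - rng.random()) / lam)
            for _ in range(n)]
```

Smaller λ yields larger integers on average, which is why the overhead plots sweep λ over 1/64 .. 8.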

SLIDE 13

Experimental results: access, search, modify

[Two plots of time per operation (in microseconds) against the number of source integers (1 000 .. 1 000 000). Left: access and search for the basic AC-coded array and the indexed static array. Right: access and modify for the indexed modifiable array.]

– uniformly-distributed integers drawn from [0..63]

SLIDE 14

Further work

Theory
  • Try to understand better the trade-off between the speed of access and the amount of overhead in the data-aware case.

Applications
  • Can some of you convince me that compressed arrays are useful, or even necessary, in some information-retrieval application(s)?

Practice
  • As to the speed of access, we showed that O(lg lg(n + s)) is better than O(lg(n + s)). Can you show that O(1) is better than O(lg lg(n + s))?
  • Independent of the theoretical running time, can one get the efficiency of access closer to that provided by uncompressed arrays?

To do
  • A thorough experimental comparison!