Hardware-Aware Algorithms and Data Structures, Gabriel Moruz, BRICS

SLIDE 1

Hardware-Aware Algorithms and Data Structures

Gabriel Moruz BRICS University of Aarhus

SLIDE 2

Hardware /nm./: “the part of the computer that you can kick.” – Geeky folklore.

Gabriel Moruz: Hardware aware algorithms and data structures

SLIDE 3

Algorithms and Data Structures

  • Algorithm:
      – A finite sequence of steps to solve a problem
      – Is given an input
      – Is required to produce an output
      – Should be efficient

  • Data structure:
      – The "way" in which data is stored
      – Supports operations

SLIDE 4

Example – Searching

  • The problem
      – Input: a sequence of numbers A and an element e
      – Output: YES if e is in A, NO otherwise

  • Dictionary – underlying data structure
      – Static: supports only searches
      – Dynamic: supports searches and updates

  • Why bother
      – Numerous applications: database systems, search engines, implementing sets, sorting, interval trees, orthogonal range searching, line segment intersection, the phone book, the search for the Holy Grail, finding Nemo, the vial of life, cherchez la femme, the lost city of Atlantis, the bad Mafia guys, the dark Mordor, pirates’ treasure chest, etc.

SLIDE 5

Linear Search

  • Consider sequence A to be an array of size n
  • Efficiency: the number of comparisons

[Figure: the unsorted array A, with cell indices above the values]

  • The algorithm
      – Compare the elements in A against e, left to right
      – Stop upon encountering an element equal to e

  • Analysis
      – What if e = 13, or e is not in A?
      – Worst case: all elements in A must be accessed!
      – Why avoid this approach: imagine n = 100,000,000
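The scan described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function name `linear_search` and the sample array are assumptions, and the counter only tracks comparisons against e, the cost measure named on the slide.

```python
def linear_search(a, e):
    """Scan a left to right; return (found, number of comparisons against e)."""
    comparisons = 0
    for x in a:
        comparisons += 1
        if x == e:
            return True, comparisons
    return False, comparisons


# Illustrative data, not the array drawn on the slide.
a = [42, 7, 10, 15, 12, 8, 5, 22]
best_case = linear_search(a, 42)   # key sits in the first cell
worst_case = linear_search(a, 99)  # key absent: every cell is inspected
```

When the key is absent, the comparison count equals the array length, which is the worst case the slide warns about.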

SLIDE 10

What if A is sorted?

[Figure: the sorted array A, with arrows marking the interval to which the search is restricted]

The algorithm – binary search:

  • Compare e against the middle element in A
  • If e is smaller, restrict the search to the left half of A
  • If e is larger, restrict the search to the right half of A
  • Stop when:
      – an element in A matching e is found, or
      – the sequence being searched has one element
  • The searched element: e = 13
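A possible Python rendering of these steps follows. It is a sketch under stated assumptions: the name `binary_search` and the sample array are illustrative, and the loop stops when the interval becomes empty rather than when it holds one element, a common variant of the stopping rule above.

```python
def binary_search(a, e):
    """Search the sorted list a for e; return (found, number of probes)."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if a[mid] == e:
            return True, probes
        if e < a[mid]:
            hi = mid - 1   # restrict to the left half
        else:
            lo = mid + 1   # restrict to the right half
    return False, probes


# Illustrative sorted data.
a = [3, 5, 7, 8, 10, 12, 13, 15, 18, 21, 22, 24, 28, 31, 35, 42]
hit = binary_search(a, 13)
miss = binary_search(a, 4)
```

Each probe halves the interval, so the number of probes stays logarithmic in the array length even when the key is missing.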

SLIDE 12

Analyzing Binary Search

[Figure: the sorted array A, with arrows marking the search interval]

  • Analysis
      – One comparison: search in a sequence of size $n/2$
      – Two comparisons: search in a sequence of size $n/4$
      – $k$ comparisons: search in a sequence of size $n/2^k$
      – Worst case: stop in a sequence of size 1
      – Sequence size $n/2^k = 1$, meaning $k \approx \log_2 n$
      – Conclusion: we need about $\log_2 n$ comparisons

  • Imagine $n = 100{,}000{,}000$: $\log_2 100{,}000{,}000 \approx 26.6$

SLIDE 13

Outline

  • Hardware factors affecting the running time
      – Instructions performed by the microprocessor
      – Branch mispredictions
      – Memory transfers
      – Streaming

  • Hardware factors affecting reliability
      – Memory corruptions

  • Optimal resilient dictionaries

SLIDE 15

Theory vs Practice

In theory, theory and practice are the same.

In practice, theory and practice may be quite different . . .

SLIDE 16

Traditional RAM model

[Figure: a CPU connected to a memory]

  • Consists of a processor and an infinite memory
  • Instructions:
      – Loads/stores of memory cells, assignments, comparisons, simple math operations
      – NO loops!
  • Complexity: given by the number of instructions
  • Not always adequate!

SLIDE 20

Branch Mispredictions – Motivation

  • Input:
      – a: an array of size $2 \times 10^7$, with $a_i \in [1, \ldots, 100]$
      – param: a threshold, $\mathit{param} \in [0, \ldots, 101]$

  • Output:
      – g: the number of elements in a larger than param
      – s: the number of elements in a smaller than or equal to param

  • Algorithm:
      – Compare each element in a against param
      – Use a left-to-right scan

Example array: 72 21 3 45 98 53 87 17 24 33 52 8 81 79 63 48

After the scan: param = 30, g = 11, s = 5
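The counting scan can be written with or without a data-dependent branch. The sketch below shows both forms on the example array; the function names are illustrative assumptions, and Python itself will not expose the hardware effect, so the second form only mirrors the arithmetic formulation one would use in a compiled language.

```python
def count_with_branch(a, param):
    """Left-to-right scan with an if/else on every element."""
    g = s = 0
    for x in a:
        if x > param:      # outcome depends on the data
            g += 1
        else:
            s += 1
    return g, s


def count_without_branch(a, param):
    """Same result; the comparison is added as 0 or 1 instead of steering an if."""
    g = 0
    for x in a:
        g += x > param     # a Python bool adds as 0 or 1
    return g, len(a) - g


a = [72, 21, 3, 45, 98, 53, 87, 17, 24, 33, 52, 8, 81, 79, 63, 48]
result = count_with_branch(a, 30)
```

Both functions execute the same number of comparisons for every value of param, which is why the instruction-count model predicts a flat running time.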

SLIDE 21

Running Time

[Figure, panel "Theory": running time as a function of param]

  • The number of instructions is the same regardless of param

SLIDE 22

Running Time

[Figure, panels "Theory" and "Practice": running time as a function of param; the Practice panel has two curves, "No opt" and "Opt -O3"]

Explanation: branch mispredictions!

SLIDE 29

Pipelining

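  • Each instruction is broken into several stages
  • Stages of different instructions can be executed at the same time
  • Significant gains in running time

[Figure: a pipeline with stages labeled fetch, decode, get ops, execute and write, holding x=1, y=y−1, z=m+n and if(t==0); the instruction that follows the branch is not yet known and is marked "?"]

x=1; y=y-1; z=m+n; if (t==0) printf("It's zero"); else t=0;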
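To put a number on the gain, here is a textbook-style idealized cycle count, offered as an assumption rather than something stated on the slides: every stage takes one cycle, the pipeline never stalls except on a mispredicted branch, and a misprediction is charged a fixed penalty (by default the pipeline depth minus one). The function names are hypothetical.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction runs all its stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages, mispredictions=0, penalty=None):
    """Idealized pipeline: one instruction completes per cycle once the pipe is full.

    Each misprediction adds `penalty` cycles (assumed: n_stages - 1 by default).
    """
    if n_instructions == 0:
        return 0
    if penalty is None:
        penalty = n_stages - 1
    return n_stages + (n_instructions - 1) + mispredictions * penalty


# Four instructions, as in the code fragment above, on a five-stage pipeline.
sequential = cycles_unpipelined(4, 5)
overlapped = cycles_pipelined(4, 5)
with_flush = cycles_pipelined(4, 5, mispredictions=1)
```

Under this model a longer pipeline raises the cost of each misprediction, which is why the branch at the end of the fragment matters.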

SLIDE 30

Branch Predictor

[Figure: the pipeline holding x=1, y=x+1, x=y−2 and if(y==0); a branch predictor decides which instruction fills the slot marked "?"]

  • Modern processors include branch predictors
  • A predictor attempts to predict the direction of each branch
  • Accurate over 90% of the time
  • Significant penalties upon mispredictions
  • Pipelines are getting longer

SLIDE 31

Running Time

[Figure, panels "Theory" and "Practice": running time as a function of param; the Practice panel has two curves, "No opt" and "Opt -O3"]

Many branch mispredictions for param ≈ 50!
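The peak near param ≈ 50 can be reproduced with a toy predictor. The slides do not say which predictor the test machine used; the sketch below assumes a two-bit saturating counter, a standard textbook scheme, and applies it to the branch `x > param` on random data. All names are illustrative.

```python
import random


def mispredictions(a, param):
    """Count mispredictions of the branch `x > param` under a 2-bit saturating counter."""
    state = 2        # states 0 and 1 predict "not taken"; 2 and 3 predict "taken"
    missed = 0
    for x in a:
        taken = x > param
        predicted_taken = state >= 2
        if predicted_taken != taken:
            missed += 1
        # Move the counter one step towards the observed outcome.
        state = min(3, state + 1) if taken else max(0, state - 1)
    return missed


rng = random.Random(0)                       # fixed seed for repeatability
a = [rng.randint(1, 100) for _ in range(20_000)]
rates = {p: mispredictions(a, p) / len(a) for p in (0, 25, 50, 75, 101)}
```

With param at either extreme the branch always goes the same way and is predicted almost perfectly; near the middle of the value range the outcome is close to a coin flip and the misprediction rate is highest.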

SLIDE 32

Memory Transfers and Streaming

SLIDE 33

Memory Hierarchy – Motivation

Simple algorithm:

  • Consider an array of size n
  • Perform r element accesses circularly
  • n is a parameter, r is fixed

Accesses per element (apm) for r = 20:

  • n = 2: apm = 10
  • n = 4: apm = 5
  • n = 5: apm = 4
  • n = 10: apm = 2
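The circular access pattern and the resulting accesses-per-element figures can be checked with a short sketch; the function name is an illustrative assumption.

```python
def accesses_per_element(n, r):
    """Perform r accesses circularly over an array of size n; return per-cell counts."""
    counts = [0] * n
    for i in range(r):
        counts[i % n] += 1   # wrap around at the end of the array
    return counts


apm = {n: accesses_per_element(n, 20) for n in (2, 4, 5, 10)}
```

Since r is fixed, the total work is the same for every n; only the number of distinct cells touched changes.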

SLIDE 34

Running Time

[Figure, panel "Theory": running time as a function of log n]

  • The number of instructions is the same regardless of n
  • The number of memory accesses is also the same
  • Branch mispredictions don’t stand in the way

SLIDE 35

Running Time

[Figure, panels "Theory" and "Practice": running time as a function of log n]

Explanation: memory hierarchy!

SLIDE 36

Memory Hierarchy

[Figure: the memory hierarchy, from the CPU through L1 cache, L2 cache and RAM to the hard disk, with axes labeled "Speed" and "Size"]

  • Each level is larger and slower than the previous
  • Transfers are done only between consecutive levels
  • Transfer large blocks of data at once
  • Real bottleneck: between memory and disk
  • Bad news: data sets are getting huge

SLIDE 37

Running Time

[Figure, panels "Practice" and "Same chart zoomed": running time as a function of log n]

Many memory transfers when n exceeds memory!
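A toy two-level model makes the jump visible. The sketch below is an assumption-laden illustration, not the experiment from the talk: it models a fast memory holding a fixed number of blocks with LRU replacement and counts how many blocks must be loaded during the circular scan. The function and parameter names are hypothetical.

```python
from collections import OrderedDict


def block_transfers(n, r, memory_blocks, block_size):
    """Count block loads for r circular accesses to an array of n elements.

    Assumed model: fast memory holds `memory_blocks` blocks, replaced by LRU.
    """
    cache = OrderedDict()    # block id -> None, least recently used first
    transfers = 0
    for i in range(r):
        block = (i % n) // block_size
        if block in cache:
            cache.move_to_end(block)        # mark as most recently used
        else:
            transfers += 1                  # block must be fetched
            cache[block] = None
            if len(cache) > memory_blocks:
                cache.popitem(last=False)   # evict the least recently used block
    return transfers


fits = block_transfers(n=64, r=1024, memory_blocks=16, block_size=8)
too_big = block_transfers(n=256, r=1024, memory_blocks=16, block_size=8)
```

When the array fits, each block is loaded once; when it does not, the circular scan under LRU evicts every block just before it is needed again, so every pass reloads the whole array.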

SLIDE 38

Streaming

  • Data access is done only sequentially
  • Don’t want to store all the data; use only a small memory
  • One-pass streaming
      – Data comes on the fly: sensor data, IP monitoring
      – Use a single pass and get as much out of it as possible
  • Multi-pass streaming
      – Modern disks offer fast sequential access
      – A tempting approach for really massive data sets

SLIDE 40

Soft Memory Errors

  • Today’s memories:
      – Small, complex, high frequency, low voltage
      – The price to pay: reliability

  • Soft memory errors:
      – A bit flip, implying cell corruption
      – Caused by radiation, power failures, cosmic rays

  • Good news:
      – They don’t happen often (every few months)

  • Bad news:
      – They happen often in large clusters
      – The soft memory error rate is increasing

SLIDE 42

Soft Memory Errors – Applications

[Govindavajhala and Appel ’03]

Applications:

  • Break JVM
  • Insecure cryptographic protocols, smart-cards

SLIDE 44

Contributions

1. On the Adaptiveness of Quicksort. G. S. Brodal, R. Fagerberg, and G. Moruz. In Proc. 7th Workshop on Algorithm Engineering and Experiments (ALENEX), 2005.

2. Cache-Aware and Cache-Oblivious Adaptive Sorting. G. S. Brodal, R. Fagerberg, and G. Moruz. In Proc. Int. Colloquium on Automata, Languages, and Programming, 2005.

3. Tradeoffs Between Branch Mispredictions and Comparisons for Sorting Algorithms. G. S. Brodal and G. Moruz. In Proc. 9th Int. Workshop on Algorithms and Data Structures (WADS), 2005.

4. Skewed Binary Search Trees. G. S. Brodal and G. Moruz. In Proc. 14th Annual European Symposium on Algorithms (ESA), 2006.

5. Adapting Parallel Algorithms to the W-Stream Model, with Applications to Graph Problems. C. Demetrescu, B. Escoffier, G. Moruz, and A. Ribichini. In Proc. 32nd Int. Symposium on Mathematical Foundations of Computer Science (MFCS), 2007.

6. Resilient Priority Queues. A. G. Jørgensen, G. Moruz, and T. Mølhave. In Proc. 10th Int. Workshop on Algorithms and Data Structures (WADS), 2007.

7. Optimal Resilient Dynamic Dictionaries. G. S. Brodal, R. Fagerberg, I. Finocchi, F. Grandoni, G. F. Italiano, A. G. Jørgensen, G. Moruz, and T. Mølhave. In Proc. 15th Annual European Symposium on Algorithms (ESA), 2007. To appear.

SLIDE 45

ESA ’07

Submissions:

  • Titles: Optimal resilient dictionaries; Resilient Search Trees: Randomization and Prejudice
  • Author groups: G. S. Brodal, R. Fagerberg, A. G. Jørgensen, G. Moruz, and T. Mølhave; I. Finocchi, F. Grandoni, and G. F. Italiano

Reviewers deciding . . .

Acceptance notification:

  • Optimal resilient dictionaries
  • G. S. Brodal, R. Fagerberg, I. Finocchi, F. Grandoni, G. F. Italiano, A. G. Jørgensen, G. Moruz, and T. Mølhave

SLIDE 47

Faulty-Memory RAM

[Finocchi and Italiano ’04]

  • A regular RAM with possibly corrupted cells
  • Bad news:
      – Memory corruptions occur at any time and at any place
      – Corruptions are performed by an adversary
      – Corrupted and uncorrupted cells can’t be distinguished
      – No space increase (asymptotically)
  • Good news:
      – Assumption: at most δ corruptions
      – O(1) corruption-free cells (reliable CPU registers)

SLIDE 48

Resilient Algorithms

  • Work correctly for uncorrupted values
  • Searching:

[Figure: three versions of an array that contains corrupted cells, each queried with the same search key]

Search key e = 13.

SLIDE 49

Resilient Results

[Finocchi and Italiano ’04, Finocchi et al. ’06, Finocchi et al. ’07, Jørgensen et al. ’07]

  • Sorting: $\Theta(n \log n + \delta^2)$
  • Static dictionaries:
      – Randomized: $\Theta(\log n + \delta)$ expected time
      – Deterministic: $\Omega(\log n + \delta)$, $O(\log n + \delta^{1+\epsilon})$ worst case
  • Search trees: amortized $O(\log n + \delta^2)$ time per operation
  • Priority queues: amortized $O(\log n + \delta)$ time per operation

Our paper:

  • Randomized static dictionary: $\Theta(\log n + \delta)$ expected time
  • Deterministic static dictionary: $O(\log n + \delta)$ worst-case time
  • Deterministic dynamic dictionary: $O(\log n + \delta)$ worst-case time for searches, $O(\log n + \delta)$ amortized time for updates

SLIDE 51

Classical Binary Search

[Figure: classical binary search on the sorted array A after a memory corruption, with arrows showing the direction taken at each step]

Search key e = 3

Problems

  • The adversary can mislead the search
  • The answer may be wrong
  • The search may end very far from the correct location
  • A single corruption suffices!
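A small experiment shows how one corrupted cell is enough. The array and the corrupted position below are illustrative assumptions, not the exact contents of the slide's figure; the search itself is the textbook binary search.

```python
def binary_search(a, e):
    """Classical binary search on a list assumed to be sorted."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == e:
            return True
        if e < a[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return False


a = [3, 5, 7, 8, 10, 12, 13, 15, 18, 21, 22, 24, 28, 31, 35, 42]

corrupted = list(a)
corrupted[7] = 1    # a single corrupted cell: the first probed value 15 becomes 1

clean_answer = binary_search(a, 3)
faulty_answer = binary_search(corrupted, 3)
```

The key 3 is still stored, uncorrupted, in the first cell, but the very first probe sends the search into the right half, where it can never be found.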

SLIDE 52

Randomized Static Dictionary

  1. Split the input into $2\delta$ disjoint sequences $S_1, \ldots, S_{2\delta}$
  2. Perform a classic binary search on a random $S_k$
  3. Check whether the search was misled by corruptions
  4. If the search was misled, restart from step 2 with a new $S_k$

[Figure: for δ = 2, the sorted array A = 1 4 7 9 12 13 16 18 21 25 27 30 32 33 37 39 42 44 45 46 is split into S1 = 1 12 21 32 42, S2 = 4 13 25 33 44, S3 = 7 16 27 37 45 and S4 = 9 18 30 39 46]
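Step 1 can be sketched as follows. Reading the figure, each subsequence takes every 2δ-th element of A; the code assumes that interleaved layout. The function name is illustrative, and the list indices are 0-based, so `subsequences[0]` plays the role of S1.

```python
def split_into_subsequences(a, delta):
    """Split a into 2*delta interleaved subsequences (every 2*delta-th element)."""
    stride = 2 * delta
    return [a[k::stride] for k in range(stride)]


# The sorted array as recovered from the figure, with delta = 2.
a = [1, 4, 7, 9, 12, 13, 16, 18, 21, 25, 27, 30, 32, 33, 37, 39, 42, 44, 45, 46]
subsequences = split_into_subsequences(a, delta=2)
```

Because the subsequences are disjoint, a corrupted cell lies in exactly one of them, so δ corruptions can spoil at most δ of the 2δ subsequences.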

SLIDE 53

The Magic Step 3

[Figure: the array A with one corrupted cell (27 overwritten by 10) and its subsequences S1, . . . , S4; L and R mark the neighborhoods to the left and to the right of the position where the search ended]

e = 13, δ = 2, cl = 1, cr = 5

  • |L| = |R| = 2δ + 1
  • cl: the number of keys in L smaller than e
  • cr: the number of keys in R larger than e
  • Restart if cl ≤ δ or cr ≤ δ
  • Otherwise, scan all elements between L and R
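The counting test of step 3 is easy to state in code. The sketch below takes the two neighborhoods as plain lists; how L and R are located in A is not spelled out on the slide and is left to the caller. The function name and the sample values are illustrative assumptions.

```python
def search_is_trusted(left, right, e, delta):
    """Return False when the search must restart (cl <= delta or cr <= delta)."""
    assert len(left) == len(right) == 2 * delta + 1
    cl = sum(1 for x in left if x < e)    # keys in L smaller than e
    cr = sum(1 for x in right if x > e)   # keys in R larger than e
    return cl > delta and cr > delta


delta = 2
right = [16, 18, 21, 25, 30]

clean = search_is_trusted([4, 7, 9, 10, 12], right, e=13, delta=delta)
one_fault = search_is_trusted([4, 7, 99, 10, 12], right, e=13, delta=delta)
misled = search_is_trusted([20, 21, 22, 23, 5], right, e=13, delta=delta)
```

Each neighborhood holds 2δ + 1 keys and at most δ of them can be corrupted, so a count above δ cannot be produced by corrupted cells alone.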

SLIDE 56

Analysis

  1. Split the input into $2\delta$ disjoint sequences $S_1, \ldots, S_{2\delta}$
  2. Perform a classic binary search on a random $S_k$
  3. Check whether the search was misled by corruptions
  4. If the search was misled, restart from step 2 with a new $S_k$

  • Step 2: O(log n) time and O(log δ) random bits
  • Step 3: O(δ) time
  • Probability theory: at most two iterations in expectation
  • Altogether: O(log n + δ) expected time, O(log δ) random bits

Note

  • Adaptive adversaries can compute the index k of $S_k$!
  • For adaptive adversaries: O(δ log n) time
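One way to arrive at the "at most two iterations" bound is the following simplified argument. It assumes a non-adaptive adversary, an index k drawn uniformly and independently in every iteration, and that an iteration is accepted whenever the chosen $S_k$ contains no corrupted cell; these assumptions are added here for illustration and are not stated on the slide. Since the subsequences are disjoint, at most δ of the 2δ subsequences contain a corruption, so with p denoting the probability that an iteration is accepted:

$$
p \;\ge\; \frac{2\delta - \delta}{2\delta} = \frac{1}{2},
\qquad
\mathbb{E}[\text{iterations}] = \frac{1}{p} \le 2,
\qquad
\mathbb{E}[\text{time}] \le 2 \cdot \bigl(O(\log n) + O(\delta)\bigr) = O(\log n + \delta).
$$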

SLIDE 57

Deterministic Static Dictionary

SLIDE 58

High Level Picture

  • Adapted binary search
      – Reuse the sub-sequencing idea
      – Perform an adapted binary search on subsequences
      – Change the subsequence when identifying corruptions
      – A corruption forces it to advance one level

  • Verification procedure
      – Checks whether the search was misled by corruptions
      – Upon success takes O(δ) time
      – Upon failure takes O(f) time and identifies Ω(f) errors

  • Final scan
      – Performed once, checks O(δ) elements

SLIDE 59

Structure

[Figure: the input divided into blocks, each made of the segments LV, Q and RV; the figure carries the size label δ + 1]

  • Use different elements for search and verification
  • Query segment Q:
      – Used only by the binary search
      – Defines subsequences $S_0, \ldots, S_{\delta+1}$
      – At least one $S_k$ is corruption-free
  • Verification segments LV and RV:
      – Used only by verification
      – Allow the use of a majority argument

SLIDE 62

Adapted Binary Search

[Figure: the adapted binary search on a subsequence $S_k$, with arrows showing the direction taken at each step]

The search key e = 21.

  • Check the next-to-last element in the pointed direction
  • A corruption would be identified in the next step (unless another corruption occurs)
  • Big idea: each step in the wrong direction corresponds to a corruption
  • Conflict area: either the search key is there or a corruption occurred
  • Call the verification procedure on the conflict area:
      – Succeeds: the search key must be there, scan two blocks
      – Fails: backtrack the search on a different $S_k$

SLIDE 66

Verification procedure

[Figure: two adjacent blocks, $LV_i\ Q_i\ RV_i$ and $LV_{i+1}\ Q_{i+1}\ RV_{i+1}$; the verification scans $LV_i$ and $RV_{i+1}$ while updating the counters cl and cr]

Search key e = 45, δ = 3, number of corruptions found k = 1

  • Performed on $LV_i$ and $RV_{i+1}$
  • cl: confidence that e is to the right of $LV_i$
  • cr: confidence that e is to the left of $RV_{i+1}$
  • Fails if cl = 0 or cr = 0, succeeds otherwise
  • 2f elements are visited in each segment to detect f errors
  • Start 2k positions away from the end of each segment

SLIDE 67

Analysis

  • Adapted binary search:
      – Corruption-free: O(log n)
      – Time spent in the wrong direction: O(f) for f corruptions

  • Verification:
      – A single verification: O(f) time for f corruptions
      – All verifications: O(δ)

  • Final scan: O(δ) time to scan two blocks

Altogether: The resilient static deterministic dictionary supports searches in O(log n + δ) time.

SLIDE 68

Dynamic Deterministic Dictionary

SLIDE 69

Reliable Value

  • Stored in unreliable memory, retrieved reliably
  • Uses O(δ) time and O(δ) space
  • Replicate the given value 2δ + 1 times
  • Retrieve during a scan using a majority argument
      – Keep a candidate element and a counter in safe memory
      – Increase the counter when encountering a matching element
      – Decrease the counter when encountering a different element
      – Discard the candidate when the counter becomes zero
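The candidate-and-counter scan described above is the classic majority vote. A minimal sketch follows, with illustrative function names; the two local variables stand in for the O(1) safe memory cells, and the list stands in for unreliable memory.

```python
def store_reliably(value, delta):
    """Replicate the value 2*delta + 1 times in (unreliable) memory."""
    return [value] * (2 * delta + 1)


def retrieve_reliably(copies):
    """One scan with a candidate and a counter; returns the majority value."""
    candidate, counter = None, 0
    for x in copies:
        if counter == 0:
            candidate, counter = x, 1   # previous candidate discarded
        elif x == candidate:
            counter += 1
        else:
            counter -= 1
    return candidate


delta = 3
copies = store_reliably(42, delta)
for position in (0, 2, 5):              # the adversary corrupts delta copies
    copies[position] = 7
recovered = retrieve_reliably(copies)
```

With at most δ corrupted copies out of 2δ + 1, the stored value remains a strict majority, so the scan always ends with it as the candidate.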

SLIDE 70

Dynamic Dictionary – Structure

[Figure: a top tree above the leaf structures; a leaf structure has a top bucket B and Θ(log n) buckets $B_0, B_1, \ldots, B_{b-1}$ of Θ(δ) elements each; the tree carries an O(1) label]

  • Top tree
      – Introduced in [Brodal et al. ’02]
      – Stores only guiding elements, not input elements

  • Leaf structure
      – Consists of Θ(log n) buckets and a top bucket
      – Only $B_0, \ldots, B_{b-1}$ contain input elements

SLIDE 71

Top Tree

[Brodal et al. ’02]

[Figure: the top tree, with an O(1) label]

  • Common knowledge:
      – Has height $\log |T| + O(1)$, can be laid out in BFS order
      – Supports updates in amortized $O(\log^2 |T|)$ time

  • We store it reliably:
      – The update cost becomes amortized $O(\delta \log^2 |T|)$ time

SLIDE 72

Leaf Structure

[Figure: a leaf structure, with a top bucket B above Θ(log n) buckets $B_0, B_1, \ldots, B_{b-1}$ of Θ(δ) elements each]

  • Stores Θ(δ log n) input elements
  • Each bucket $B_i$ stores Θ(δ) input elements
  • The top bucket contains guiding elements stored reliably

SLIDE 74

Searches

[Figure: the top tree above the leaf structures, each with a top bucket B and buckets $B_0, B_1, \ldots, B_{b-1}$]

  • Search the last level of internal nodes in the top tree and identify two consecutive nodes
  • Reliably search the O(1) remaining nodes
  • Search the top bucket and identify some bucket $B_i$
  • Scan $B_i$ and report the result
  • Time: O(log n + δ) worst case.

SLIDE 75

Updates

[Figure: the top tree above the leaf structures, each with a top bucket B and buckets $B_0, B_1, \ldots, B_{b-1}$]

  • Use standard bucketing techniques
      – Split/merge buckets every Ω(δ) operations
      – Insert/delete elements in B every Ω(δ) operations
      – Insert/delete elements in the top tree every Ω(δ log n) operations
  • Time: O(log n + δ) amortized for insertions and deletions.

SLIDE 76

Conclusion

Will theory catch practice?
