Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency


slide-1
SLIDE 1

Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency

Gerth Stølting Brodal Allan Grønlund Jørgensen Thomas Mølhave University of Aarhus

ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data Structures University Residential Centre of Bertinoro, Italy, September 30-October 5, 2007

slide-2
SLIDE 2

Dictionaries: Fault Tolerance versus I/O Efficiency Brodal, Jørgensen, Mølhave

Binary Searching Fault tolerance I/O Efficiency

This talk Future work

slide-3
SLIDE 3

Search(17)

4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38

Search key: 17

O(log N) comparisons

slide-4
SLIDE 4

Search(17)

4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38

17? 9 (soft memory error)
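
The effect of a single soft memory error on binary search can be sketched as follows. This is a minimal illustration, not taken from the talk: the array is the one on the slide, but the choice of key 16 and of the corrupted position (index 10) are assumptions made so that the failure is visible.

```python
def binary_search(cells, key):
    """Standard binary search. Returns (index or None, number of comparisons)."""
    lo, hi = 0, len(cells) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if cells[mid] == key:
            return mid, comparisons
        if cells[mid] < key:
            lo = mid + 1   # trust the cell: key must be to the right
        else:
            hi = mid - 1   # trust the cell: key must be to the left
    return None, comparisons


clean = [4, 7, 10, 13, 14, 15, 16, 18, 19, 23, 25,
         26, 27, 29, 30, 31, 32, 33, 34, 36, 38]

# Hypothetical fault: the middle cell (true value 25) now reads 9.
faulty = list(clean)
faulty[10] = 9

found_clean, _ = binary_search(clean, 16)
found_faulty, _ = binary_search(faulty, 16)
print(found_clean, found_faulty)
```

In the faulty array the first comparison sends the search into the right half, so the uncorrupted key 16 is never reached even though only one cell was corrupted.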

slide-5
SLIDE 5

Faulty-Memory RAM Model

Finocchi and Italiano, STOC’04

  • Content of memory cells can get corrupted
  • Corrupted and uncorrupted content cannot be distinguished
  • O(1) safe registers
  • Assumption: At most δ corruptions
  • Example: Sorting requires time Θ(N·log N + δ²)

Finocchi, Grandoni, Italiano, ICALP‘06

slide-6
SLIDE 6

Faulty-Memory RAM: Searching

  • Lower bound
  • Upper bound

Θ(log N + δ) comparisons

Finocchi, Grandoni, Italiano, ICALP’06
Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

slide-7
SLIDE 7

Faulty-Memory RAM: Searching

17? 9 4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38

Problem?

High confidence Low confidence

Requirement: If there exists an uncorrupted element equal to the search key, we should find such an element

slide-8
SLIDE 8

Faulty-Memory RAM: Searching

When are we done (δ=3)?

Contradiction, i.e. at least one fault

If the range contains at least δ+1 elements smaller than x and δ+1 elements larger than x, then at least one element of each kind is uncorrupted, i.e. x must be contained in the range
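
The stopping rule can be sketched as a simple check. This assumes the two symbols lost from the slide stand for elements that compare smaller and larger than the search key x; the function name and the linear scan are illustrative only and do not reflect the cost of the actual algorithm.

```python
def range_must_contain(cells, lo, hi, x, delta):
    """True if cells[lo..hi] holds at least delta+1 values < x and delta+1 values > x.

    With at most delta corruptions, at least one value of each kind is then
    uncorrupted, so any uncorrupted copy of x in the sorted array lies
    between them, i.e. inside the range.
    """
    window = cells[lo:hi + 1]
    smaller = sum(1 for v in window if v < x)
    larger = sum(1 for v in window if v > x)
    return smaller >= delta + 1 and larger >= delta + 1


cells = [4, 7, 10, 13, 14, 15, 16, 18, 19, 23, 25,
         26, 27, 29, 30, 31, 32, 33, 34, 36, 38]

# delta = 3 as on the slide: four witnesses are needed on each side.
print(range_must_contain(cells, 3, 10, 17, 3))  # 13 14 15 16 | 18 19 23 25
print(range_must_contain(cells, 5, 8, 17, 3))   # 15 16 | 18 19
```

The first range has four witnesses on each side and passes; the second has only two on each side, so up to three faults could still explain every comparison.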

slide-9
SLIDE 9

Faulty-Memory RAM: Θ(log N + δ) Searching

If verification fails → contradiction, i.e. ≥1 memory fault → ignore the last 4 comparisons → backtrack one level of search

[Figure: search path with steps labelled 1 1 2 2 3 3 4 4 5 5]

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

slide-10
SLIDE 10

Faulty-Memory RAM: Θ(log N + δ) Searching

[Figure: search path with steps labelled 1 1 2 2 3 3 4 4]

  • Standard binary search + verification steps
  • At most δ verification steps can fail and trigger backtracking
  • Detail: Avoid repeated comparison with the same (wrong) element by grouping elements into blocks of size O(δ)

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

slide-11
SLIDE 11

Faulty-Memory RAM: Reliable Values

  • Store 2δ+1 copies of value x; at most δ copies can be corrupted
  • x = majority
  • Time O(δ) using two safe registers (candidate and count)

δ=5
Copies:    y y y x x y x x x y x
Candidate: y y y y y y y – x – x
Count:     1 2 3 2 1 2 1 0 1 0 1

Boyer and Moore ‘91
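
The majority computation over the 2δ+1 copies can be sketched with the Boyer–Moore voting scheme, which needs only the two safe registers named above (candidate and count). The function name and the returned trace are illustrative additions; the input is the δ=5 example from the slide.

```python
def majority_vote(copies):
    """Boyer-Moore majority vote using two registers: candidate and count.

    Returns the final candidate and a per-step trace of (candidate, count).
    The candidate is the true value whenever a strict majority of the
    copies is uncorrupted.
    """
    candidate, count = None, 0
    trace = []
    for value in copies:
        if count == 0:
            candidate, count = value, 1      # adopt a new candidate
        elif value == candidate:
            count += 1                       # one more vote for it
        else:
            count -= 1                       # cancel one vote
            if count == 0:
                candidate = None             # shown as a dash on the slide
        trace.append((candidate, count))
    return candidate, trace


copies = list("yyyxxyxxxyx")   # 2*5+1 = 11 copies, 5 of them corrupted to y
value, trace = majority_vote(copies)
print(value)
print([count for _, count in trace])
```

The trace reproduces the Candidate and Count rows of the example, and the final candidate is x.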

slide-12
SLIDE 12

Faulty-Memory RAM: Dynamic Dictionaries

  • Packed array (Itai, Konheim, Rodeh, 1981)
  • Reliable pointers and keys
  • Updates O(δ·log² N)
  • Searches = fault tolerant O(log N+δ)
  • 2-level buckets of size O(δ·log N)
  • Root: Reliable pointers and keys
  • Bucket search/update amortized O(log N+δ)
  • Search and update amortized O(log N+δ)

[Figure labels: Θ(δ·log N) elements; Θ(δ) elements]

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

slide-13
SLIDE 13

I/O Model

slide-14
SLIDE 14

I/O Model

  • N = problem size
  • M = memory size
  • B = I/O block size
  • One I/O moves B consecutive records to/from disk
  • Complexity = number of I/Os

  • Example: Sorting requires Θ((N/B)·log_{M/B}(N/B)) I/Os

[Figure: CPU, internal memory of size M, external memory, I/O in blocks of size B]

Aggarwal and Vitter 1988
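
To get a feel for the sorting example, the standard I/O-model sorting bound (N/B)·log_{M/B}(N/B) of Aggarwal and Vitter can be evaluated numerically. The function name and the parameter values below are illustrative assumptions, and constant factors are ignored.

```python
import math


def sort_io_bound(n, m, b):
    """Evaluate (N/B) * log_{M/B}(N/B), ignoring constant factors."""
    return (n / b) * math.log(n / b, m / b)


# Hypothetical sizes: 2^30 records, memory for 2^20 records, blocks of 2^10.
n, m, b = 2**30, 2**20, 2**10
print(round(sort_io_bound(n, m, b)))       # I/O-model bound
print(round(n * math.log2(n)))             # N log N, for comparison
```

With these sizes the logarithm equals 2, so the bound is about two passes over the N/B blocks, far below the N·log N cost of touching records one at a time.
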
slide-15
SLIDE 15

B-trees

  • Search and update O(log_B N) I/Os

[Figure: B-tree of height O(log_B N) with node degree Ω(B) and a marked search path]
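
The gain from fanout B over fanout 2 can be illustrated by counting levels. This is a small sketch with hypothetical sizes; `levels` is an illustrative helper, not part of any B-tree library.

```python
def levels(n, fanout):
    """Smallest h such that fanout**h >= n, i.e. the height needed to index n keys."""
    h, reach = 0, 1
    while reach < n:
        reach *= fanout
        h += 1
    return h


n = 10**9
print(levels(n, 2))      # binary search tree: one probe per level
print(levels(n, 1000))   # B-tree with degree 1000: one I/O per level
```

For a billion keys the binary structure needs 30 levels, while a degree-1000 B-tree needs 3.
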
slide-16
SLIDE 16

Fault-Tolerance versus I/O Efficiency

slide-17
SLIDE 17

Lower Bound for Fault-Tolerant External Searching

  • Adversary argument
  • If B^ε slabs per I/O → factor B^ε reduction and B^{1-ε} faults
  • After k I/Os N/(B^ε)^k – k·B^{1-ε} elements remain
  • Ω((1/ε)·log_B N + δ/B^{1-ε}) I/Os required [minimized wrt ε]

[Figure label: Possible values]
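
The role of ε in the bound can be explored numerically. The sketch below assumes the bound has the form (1/ε)·log_B N + δ/B^{1-ε}, which is a reading of the slide's flattened formula; the function names, the grid search over ε in (0, 1], and the parameter values are all illustrative.

```python
import math


def search_io_bound(n, b, delta, eps):
    """Evaluate (1/eps) * log_B N + delta / B^(1 - eps), ignoring constants."""
    return (1 / eps) * math.log(n, b) + delta / b ** (1 - eps)


def best_epsilon(n, b, delta, steps=100):
    """Grid search for the eps in (0, 1] that minimizes the bound."""
    candidates = [i / steps for i in range(1, steps + 1)]
    return min(candidates, key=lambda eps: search_io_bound(n, b, delta, eps))


n, b = 10**9, 1024
for delta in (0, 10**3, 10**6):
    eps = best_epsilon(n, b, delta)
    print(delta, eps, round(search_io_bound(n, b, delta, eps), 1))
```

With no faults the best choice is ε = 1, which gives the plain B-tree cost log_B N; as δ grows, smaller ε pays off because each I/O then tolerates more faults at the price of a smaller fanout.
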
slide-18
SLIDE 18

Randomized Upper Bound for Fault-Tolerant External Searching

  • Sorted array + 2δ identical B-trees (over N/(2δ) elements, stored in BFS layout)
  • Search: Select random tree for each node on search path + verification
  • Probability no faults on path: ∏_{i=1}^{log_B N} (1 − β_i/(2δ)) ≥ 1/2, where Σβ_i ≤ δ
  • Search O(log_B N + δ/B) expected
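
The probability bound can be checked numerically. The sketch assumes that β_i counts the corrupted copies, among the 2δ trees, of the i-th node on the search path, so a uniformly random tree gives an uncorrupted node with probability 1 − β_i/(2δ); this interpretation and the function name are assumptions, not statements from the slides.

```python
import random


def no_fault_probability(betas, delta):
    """Product over the path of (1 - beta_i / (2*delta))."""
    p = 1.0
    for beta in betas:
        p *= 1 - beta / (2 * delta)
    return p


delta = 8
print(no_fault_probability([delta], delta))        # all faults hit one node
print(no_fault_probability([1] * delta, delta))    # faults spread over the path

# Random ways of distributing at most delta faults over a path of 5 nodes.
random.seed(1)
for _ in range(1000):
    betas = [0] * 5
    for _ in range(random.randint(0, delta)):
        betas[random.randrange(5)] += 1
    assert no_fault_probability(betas, delta) >= 0.5
```

The worst case is when all δ faults fall on copies of a single node, which gives exactly 1/2; spreading the faults over several nodes only increases the product.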

slide-19
SLIDE 19

Deterministic Upper Bound for Fault-Tolerant External Searching

  • Sorted array + 2δ/B^{1-ε} identical B-trees of degree B^ε + B^{1-ε} copies of each key + min/max
  • Search: Verify against min/max in each step; if fail, backtrack one level and advance to next copy
  • Search O((1/ε)·log_B N + δ/B^{1-ε}) I/Os

slide-20
SLIDE 20

Dynamic Fault-Tolerant External Dictionaries

Static structure + Packed arrays + Buckets of size O(δ·log³ N)

  • Deterministic: O((1/ε)·log_B N + δ/B^{1-ε}) I/Os search and updates
  • Randomized: Expected O(log_B N + δ/B) I/Os search and updates

[Figure label: Static]

slide-21
SLIDE 21

Conclusion

  • Fault-tolerant external memory searching: Θ((1/ε)·log_B N + δ/B^{1-ε}) I/Os worst-case [minimized wrt ε]
  • Randomized O(log_B N + δ/B) I/Os

slide-22
SLIDE 22

Future Work: Fault Tolerance versus I/O Efficiency

  • Randomized algorithms: Memory faults in internal memory?
  • Sorting: Θ((N/B)·log_{M/B}(N/B) + δ²/B) I/Os?
  • ...

slide-23
SLIDE 23

THANKS