mLSM: Making Authenticated Storage Faster in Ethereum


SLIDE 1

mLSM: Making Authenticated Storage Faster in Ethereum

Pandian Raju¹, Soujanya Ponnapalli¹, Evan Kaminsky¹, Gilad Oved¹, Zachary Keener¹, Vijay Chidambaram¹·², Ittai Abraham²

¹The University of Texas at Austin; ²VMware Research

SLIDE 2

Ethereum

  • Distributed software platform
  • Cryptocurrency applications
  • Key-value store: accounts → balances
  • Trustless, decentralized setting

SLIDE 3

Ethereum – Distributed Decentralized System

[Diagram: blocks B1–B4 replicated across Node1–Node4]

SLIDE 4

Need for Authenticated Storage

[Diagram: a client queries Balance(A). Honest nodes store User A: $500, User B: $5000, User C: $500000, but malicious nodes holding User A: $0 and User C: $0 answer Balance: $0]

SLIDE 5

Authenticated Storage

  • Users can verify the value returned by a node
  • Each read returns the value along with a proof

[Diagram: Balance(A)? is answered with [$500, PROOF]]

SLIDE 6

Authentication Techniques in Ethereum

  • Ethereum authenticated storage suffers from high IO amplification
  • 64x in the worst case
  • IO amplification: the ratio of the amount of IO performed to the amount of user data

[Example: a client writes 10 GB of user data; the server issues 500 GB of total write IO, a write amplification of 50]

SLIDE 7

Why is IO Amplification bad?

  • Reduces the write throughput
  • Directly impacts the life of flash devices
  • Flash devices wear out after a limited number of write cycles

(An Intel SSD DC P4600 can last ~5 years assuming ~5 TB written per day.) For the same SSD life expectancy, with 65x IO amplification, instead of 5 TB we can now write only ~75 GB of user data per day.
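The endurance arithmetic above can be checked directly. The ~5 TB/day budget and the 65x amplification are the slide's figures; the decimal TB-to-GB conversion is an assumption of this sketch.

```python
# Back-of-the-envelope check of the slide's claim (figures from the
# slide; 1 TB = 1000 GB is an assumption of this sketch).
ssd_budget_gb_per_day = 5 * 1000   # ~5 TB/day sustains ~5 years of life
io_amplification = 65              # worst-case amplification in Ethereum

user_data_gb_per_day = ssd_budget_gb_per_day / io_amplification
print(round(user_data_gb_per_day))  # ~77 GB/day, i.e. the slide's "~75 GB"
```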

SLIDE 8

How to design an authenticated storage system that minimizes IO amplification?

Merkelized LSM

  • Maintains multiple mutually independent binary Merkle trees
  • Decouples lookup from authentication
  • Minimizes IO amplification

SLIDE 9

Outline

  • Authentication in Ethereum
  • Why caching doesn't work
  • Merkelized LSM

SLIDE 10

Authenticated Storage in Ethereum

SLIDE 11

Merkle Trees – Fundamental building blocks

[Diagram: leaves K1:v1 … K4:v4 hashed pairwise into H(v1-2) and H(v3-4), then into the root H(v1-4); updating K4 to v4' changes H(v3-4') and the root H(v1-4')]

With a constant-sized root hash, we can authenticate all the key-value pairs. The root hash is publicly available to all clients.
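The tree above can be sketched in a few lines. SHA-256 stands in for Ethereum's Keccak-256; the leaf and node encodings are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Illustrative hash; Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()

def merkle_root(values):
    """Root hash of a binary Merkle tree over the given leaf values."""
    level = [h(v) for v in values]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"v1", b"v2", b"v3", b"v4"])

# Changing any single leaf changes the root, which is why one
# constant-sized hash authenticates every key-value pair.
assert merkle_root([b"v1", b"v2", b"v3", b"v4'"]) != root
```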

SLIDE 12

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value

[Diagram: tree with leaves K1:v1 … K4:v4, parents P1 and P2, and the Root; the server returns K4:v4]

SLIDE 13

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value
  • Along with a Merkle Proof

[Diagram: the proof for K4:v4 consists of K3:v3 and P1, checked against the Root]

SLIDE 14

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value
  • Along with a Merkle Proof

Response: K4:v4 with Merkle proof [K3:v3, P1] against the Root

SLIDE 15

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

Response: K4:v4 with Merkle proof [K3:v3, P1] against the Root
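The verification step above can be sketched as follows. SHA-256 stands in for Keccak-256, and the proof encoding (sibling hash plus side) is an assumption of this sketch.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in for Ethereum's Keccak-256

def verify(value: bytes, proof, expected_root: bytes) -> bool:
    """Recompute the root from a leaf value and its Merkle proof.

    `proof` lists (sibling_hash, side) pairs from leaf to root, where
    side is 'L' if the sibling sits to the left of the running hash.
    """
    node = h(value)
    for sibling, side in proof:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node == expected_root

# The slide's 4-leaf tree, so we can check the proof for K4:v4.
leaf = [h(v) for v in (b"v1", b"v2", b"v3", b"v4")]
p1, p2 = h(leaf[0] + leaf[1]), h(leaf[2] + leaf[3])
root = h(p1 + p2)

proof_for_v4 = [(leaf[2], "L"), (p1, "L")]  # sibling K3:v3, then P1
assert verify(b"v4", proof_for_v4, root)        # honest response verifies
assert not verify(b"v4'", proof_for_v4, root)   # a forged value is caught
```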

SLIDE 16

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

[Diagram: the client hashes the returned K4:v4 with the proof node K3:v3 to recompute P2]

SLIDE 17

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

[Diagram: the client hashes the recomputed P2 with the proof node P1 toward the Root]

SLIDE 18

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value and a Merkle Proof

[Diagram: the client continues combining the proof nodes P1 and P2]

SLIDE 19

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value and a Merkle Proof

[Diagram: the client combines the recomputed hashes up the tree]

SLIDE 20

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

[Diagram: the client arrives at a recomputed Root]

SLIDE 21

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

[Diagram: the client compares the recomputed root against the published Root]

SLIDE 22

Authentication using Merkle Trees

  • Server can no longer lie about the data

[Diagram: the server returns a forged K4:v4' with the proof [K3:v3, P1]]

SLIDE 23

Authentication using Merkle Trees

  • Server can no longer lie about the value

[Diagram: the client verifies the forged K4:v4' against the proof]

SLIDE 24

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

[Diagram: hashing the forged value yields a wrong P2']

SLIDE 25

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value and a Merkle Proof

[Diagram: the client combines the wrong P2' with the proof node P1]

SLIDE 26

Authentication using Merkle Trees

  • Client queries for the value of key k4
  • Server replies with the value and a Merkle Proof

[Diagram: the wrong hash propagates up the tree]

SLIDE 27

Authentication using Merkle Trees

  • Client verifies the value by calculating the root hash from the value and the Merkle proof

[Diagram: the client arrives at a recomputed Root' that differs from the published Root]

SLIDE 28

Authentication using Merkle Trees

  • Server cannot lie about the value

[Diagram: Root' ≠ Root, so the forged value is rejected]

SLIDE 29

Merkle Patricia Trie

  • Similar to Merkle trees
  • Lookup is based on the key structure
  • Considering 4-bit hex key-value pairs:
  • 0x20 – V1
  • 0x2f – V2
  • 0x51 – V3
  • 0x5e – V4

[Diagram: the Root Hash branches on the first nibble (2 → P1, 5 → P2); P1 branches on 0 → V1 and f → V2, P2 branches on 1 → V3 and e → V4; each internal node can branch 0–f]

SLIDE 30

Authenticated Storage in Ethereum

  • The trie is flattened and stored as key-value pairs
  • For every leaf node V, we store [Hash(V) → V]
  • For every parent node P, we store [Hash(P) → [child hashes]]

[Diagram: the same Merkle Patricia Trie as before]

SLIDE 31

Authenticated Storage in Ethereum

[Diagram: the trie next to an empty KEY/VALUE table]

SLIDE 32

Authenticated Storage in Ethereum

[Diagram: the table is filled with the leaf entries]

KEY        VALUE
Hash(V1)   V1
Hash(V2)   V2
Hash(V3)   V3
Hash(V4)   V4

SLIDE 33

Authenticated Storage in Ethereum

[Diagram: the parent entries are added]

KEY        VALUE
Hash(P1)   Hash(V1), Hash(V2)
Hash(P2)   Hash(V3), Hash(V4)

SLIDE 34

Authenticated Storage in Ethereum

KEY        VALUE
Hash(V1)   V1
Hash(V2)   V2
Hash(V3)   V3
Hash(V4)   V4
Hash(P1)   Hash(V1), Hash(V2)
Hash(P2)   Hash(V3), Hash(V4)
Hash(RH)   Hash(P1), Hash(P2)

SLIDE 35

Read Amplification in Ethereum

Get (0x2f)

[Diagram: the trie and its flattened KEY/VALUE table]

SLIDE 36

Read Amplification in Ethereum

Get (0x2f) → Get (Hash(RH))

[Diagram: the lookup starts by fetching the root entry]

SLIDE 37

Read Amplification in Ethereum

Get (0x2f) → Get (Hash(P1))

[Diagram: the first nibble (2) leads to the parent P1]

SLIDE 38

Read Amplification in Ethereum

Get (0x2f) → Get (Hash(V2))

[Diagram: the second nibble (f) leads to the leaf V2; three store gets served one user get]
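The flattened layout and the per-level gets above can be sketched together. SHA-256 and dict-encoded nodes are stand-ins; geth RLP-encodes trie nodes before hashing them into LevelDB.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()  # stand-in for Keccak-256

def node_key(node: dict) -> bytes:
    # Toy node encoding; geth RLP-encodes nodes before hashing.
    return h(repr(sorted(node.items())).encode())

store = {}  # the flat key-value store (LevelDB in geth)

# Leaves: Hash(V) -> V
for v in (b"v1", b"v2", b"v3", b"v4"):
    store[h(v)] = v

# Parents: Hash(P) -> {nibble: child hash}
p1 = {"0": h(b"v1"), "f": h(b"v2")}
p2 = {"1": h(b"v3"), "e": h(b"v4")}
store[node_key(p1)] = p1
store[node_key(p2)] = p2

root = {"2": node_key(p1), "5": node_key(p2)}
root_hash = node_key(root)
store[root_hash] = root  # Hash(RH) -> Hash(P1), Hash(P2)

def get(nibbles: str):
    """One user Get walks the trie, issuing one store read per level."""
    node, reads = store[root_hash], 1          # Get(Hash(RH))
    for nib in nibbles:
        node = store[node[nib]]                # Get(Hash(P1)), Get(Hash(V2))
        reads += 1
    return node, reads

value, reads = get("2f")
print(value, reads)  # b'v2' 3: a read amplification of 3 for this tiny trie
```

On a real state trie the same walk can descend up to 64 levels, which is where the worst-case amplification comes from.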

SLIDE 39

Write Amplification in Ethereum

Update (0x2f, 5) → Update (Hash(V5), V5)

[Diagram: the leaf V2 is replaced by a new leaf V5; the store gains Hash(V5) → V5]

SLIDE 40

Write Amplification in Ethereum

Put (0x2f, 5) → Update (Hash(P1'))

[Diagram: the parent P1 changes to P1' because its child hash changed; the store gains Hash(P1') → Hash(V1), Hash(V5)]

SLIDE 41

Write Amplification in Ethereum

Put (0x2f, 5) → Update (RH')

[Diagram: the Root Hash changes to RH'; the store gains Hash(RH') → Hash(P1'), Hash(P2). One user write rewrites the leaf and every node up to the root]

SLIDE 42

Experimental Setup

  • Private Ethereum network
  • Importing the first 1.6M blocks of the real-world public blockchain
  • geth – the Ethereum Go client
  • Machine
  • 16 GB of RAM
  • 2 TB Intel 750 series SSD

SLIDE 43

IO Amplification in Ethereum

  • State trie – 7x IO amplification
  • getBalance (addr)
  • Returns the ether balance of the account addr
  • 0.22M account addresses
  • 1.4M LevelDB gets

SLIDE 44

IO Amplification in Ethereum

  • State trie – 7x IO amplification
  • Worst case – 64x IO amplification
  • Key: 256 bits
  • Every trie node consumes 4 bits of the key
  • Depth of the trie: 64 in the worst case
  • This ignores the IO amplification introduced by the underlying key-value store
  • Considers the first 1.6M blocks of the blockchain
  • Current size of the blockchain: 5.9M blocks
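The worst-case figure follows directly from the key and node widths stated above:

```python
# A 256-bit key is consumed 4 bits (one nibble) per trie level, so the
# trie can be up to 64 levels deep in the worst case.
key_bits = 256
bits_per_node = 4

worst_case_depth = key_bits // bits_per_node
print(worst_case_depth)  # 64: up to 64 store accesses per user access
```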

SLIDE 45

Caching – Why doesn't it work?

SLIDE 46

Caching key with value, proof

  • Going back to our example
  • 4-bit hex key-value pairs:
  • 0x20 – 1
  • 0x2f – 2
  • 0x51 – 3
  • 0x5e – 4

[Diagram: the same trie, now with values 1–4 at the leaves]

SLIDE 47

Caching key with value, proof

  • For every key, we cache the value and the Merkle proof

Key    Value   Proof
0x2f   2       [1, P2, Root Hash]

SLIDE 48

Caching key with value, proof

  • For every key, we cache the value and the Merkle proof

Key    Value   Proof
0x2f   2       [1, P2, Root Hash]
0x20   1       [2, P2, Root Hash]

SLIDE 49

Caching key with value, proof

  • For every key, we cache the value and the Merkle proof

Key    Value   Proof
0x2f   2       [1, P2, Root Hash]
0x20   1       [2, P2, Root Hash]
0x51   3       [4, P1, Root Hash]

SLIDE 50

Caching key with value, proof

  • For every key, we cache the value and the Merkle proof

Key    Value   Proof
0x2f   2       [1, P2, Root Hash]
0x20   1       [2, P2, Root Hash]
0x51   3       [4, P1, Root Hash]
0x5e   4       [3, P1, Root Hash]

SLIDE 51

A single update invalidates the whole cache

[Diagram: the trie and the full cache table]

Reads can be served from the cache

SLIDE 52

A single update invalidates the whole cache

[Diagram: the update 0x2f → 5 changes the leaf, producing P1' and Root Hash'; the cache still holds the old proofs]

SLIDE 53

A single update invalidates the whole cache

[Diagram: the cached value for 0x2f is updated to 5, but every cached proof still references the old P1 and Root Hash]

SLIDE 54

A single update invalidates the whole cache

[Diagram: the proofs for 0x51 and 0x5e must be rewritten with P1']

SLIDE 55

A single update invalidates the whole cache

[Diagram: every cached proof must be rewritten with the new Root Hash']

SLIDE 56

A single update invalidates the whole cache

[Diagram: the fully rewritten cache table]

Works only for read-only workloads
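The invalidation argument can be demonstrated concretely. SHA-256 stands in for the real hash, and the cache layout mirrors the slides' table.

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()  # stand-in for Keccak-256

def roots(vals):
    """Parents P1/P2 and the root of a 4-leaf binary Merkle tree."""
    l = [h(v) for v in vals]
    p1, p2 = h(l[0] + l[1]), h(l[2] + l[3])
    return p1, p2, h(p1 + p2)

p1, p2, root = roots([b"1", b"2", b"3", b"4"])

# Cache every key with its value and proof, as on the slides; each
# proof is anchored at the current Root Hash.
cache = {
    "0x20": (b"1", [h(b"2"), p2, root]),
    "0x2f": (b"2", [h(b"1"), p2, root]),
    "0x51": (b"3", [h(b"4"), p1, root]),
    "0x5e": (b"4", [h(b"3"), p1, root]),
}

# One write (0x2f: 2 -> 5) produces a new P1' and a new Root Hash' ...
_, _, new_root = roots([b"1", b"5", b"3", b"4"])

# ... so every cached proof, even for keys that were never written,
# now points at a stale root.
stale = [k for k, (_, proof) in cache.items() if proof[-1] != new_root]
print(len(stale))  # 4: a single update invalidated the whole cache
```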

SLIDE 57

Merkelized LSM

SLIDE 58

Why caching didn't work

  • Tight coupling between any two nodes in the tree
  • All nodes form a single tree under the same root node
  • Tight coupling between lookup and authentication
  • Lookup of a value is done by traversing the authenticated data structure

SLIDE 59

Insights behind mLSM

  • Maintaining multiple independent structures
  • Decoupling lookup from authentication

SLIDE 60

Maintaining multiple independent structures

SLIDE 61

Merkelized LSM: Design

In-memory and on-disk layers

[Diagram: an in-memory layer above on-disk storage]

SLIDE 62

Merkelized Log-Structured Merge Tree (mLSM)

In-memory data is periodically written to storage as binary Merkle trees

[Diagram: the in-memory layer flushing trees to storage]

SLIDE 63

Merkelized LSM: Design

  • Binary Merkle trees
  • Reduce the size of the Merkle proof
  • Balance data better than tries

SLIDE 64

Merkelized Log-Structured Merge Tree (mLSM)

Merkle trees on storage are logically arranged in levels

[Diagram: Level 0, Level 1, …, Level n below the in-memory layer]

SLIDE 65

Merkelized Log-Structured Merge Tree (mLSM)

Compaction is performed once the number of trees in a level reaches a threshold

[Diagram: trees in one level being compacted into the next]

SLIDE 66

Merkelized Log-Structured Merge Tree (mLSM)

[Diagram: the compaction continues into the next level]

SLIDE 67

Writes in Merkelized LSM

Writes are handled in-memory

[Diagram: Write(Key, Value) lands in the in-memory layer]

SLIDE 68

Writes in Merkelized LSM

Writes are batched and written onto storage

[Diagram: the in-memory batch flushed as a tree into Level 0]

SLIDE 69

Writes in Merkelized LSM

Compaction is triggered when the number of trees at a level reaches its threshold

[Diagram: Level 0 trees compacted into Level 1]

SLIDE 70

Writes in Merkelized LSM

Compaction proceeds from lower levels to higher levels

[Diagram: compaction cascading down the levels]

SLIDE 71

Authentication in mLSM

[Diagram: the in-memory layer and Levels 0…n]

SLIDE 72

Authentication in mLSM

Every binary Merkle tree on a level has a local root

SLIDE 73

Authentication in mLSM

A global master root is computed dynamically over the local roots, forming a global Merkle tree

SLIDE 74

Authentication in mLSM

The Merkle proof includes both the local and the global Merkle proofs

SLIDE 75

Decoupling lookup from Authentication

SLIDE 76

LevelDB Cache

A LevelDB cache stores (Key, Level → Value, Merkle Proof)

[Diagram: the cache alongside the in-memory layer and Levels 0…n]

SLIDE 77

Reads in mLSM

[Diagram: Get(key) arrives at the mLSM]

SLIDE 78

Reads in mLSM

The in-memory structure is searched for the value first

SLIDE 79

Reads in mLSM

The mLSM is then traversed level by level, in order

SLIDE 80

Reads in mLSM

The first occurrence of the key-value pair is returned

SLIDE 81

Reads in mLSM

<Key, level : value, local Merkle proof> entries are cached

SLIDE 82

Reads in mLSM

NOTE: the global proof is not cached

SLIDE 83

Reads in mLSM

Subsequent reads are served from the cache

SLIDE 84

Reads in mLSM

The LevelDB cache can also be populated whenever a new binary Merkle tree is created, e.g. during compaction

SLIDE 85

Revisiting writes

Writes affect values in a single local tree and the global root

[Diagram: Put(Key, Value) touching one tree plus the master root]

SLIDE 86

Would writes invalidate the whole cache?

  • Global proofs are not cached
  • Writes don't invalidate any existing entries
  • Keys at the same level are overwritten when the binary tree is created
  • The cache is not invalidated on every write

SLIDE 87

Merkelized LSM: Reviewing the design

  • Writes
  • Buffered in memory
  • Then written to storage
  • No in-place updates
  • A write affects one tree and the master root
  • Reads
  • Served from the cache
  • Or by traversing levels from the lowest until the first occurrence of the key is found
  • Return the value and a proof: <local proof, global proof>

SLIDE 88

Merkelized LSM advantages

  • Writes are handled in memory: O(1) complexity
  • Reads:
  • Served from the cache: O(levels in the LevelDB cache)
  • Or by traversing the mLSM: O(height of mLSM × height of a binary Merkle tree)

Reads        Complexity                                      Served by
Cache hit    O(levels in the cache)                          LevelDB cache
Cache miss   O(height of mLSM × height of the binary tree)   Traversing mLSM
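The read and write paths reviewed above can be sketched as a small class. The names and structure here are illustrative, not the authors' implementation; proofs are represented as opaque strings.

```python
# Minimal sketch of the mLSM read/write path (illustrative, not the
# authors' implementation).
class MLSM:
    def __init__(self):
        self.memtable = {}   # buffered writes: O(1) puts
        self.levels = []     # levels[i]: dict key -> (value, local_proof)
        self.cache = {}      # LevelDB-style cache: key -> (level, value, proof)

    def put(self, key, value):
        # No in-place updates; the memtable is later flushed as a new tree.
        self.memtable[key] = value

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        if key in self.cache:                    # cache hit
            return self.cache[key][1]
        for lvl, tree in enumerate(self.levels): # lowest level first
            if key in tree:
                value, local_proof = tree[key]
                # Only the local proof is cached; the global proof is
                # recomputed per read, so writes don't invalidate entries.
                self.cache[key] = (lvl, value, local_proof)
                return value
        return None

db = MLSM()
db.levels = [{"k": ("new", "proofL0")}, {"k": ("old", "proofL1")}]
print(db.get("k"))  # 'new': the first occurrence, in the lowest level, wins
```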

SLIDE 89

Merkelized LSM challenges

  • Handling read amplification
  • The overhead of the LSM structure is significant for applications with little data
  • LevelDB cache misses result in read amplification
  • Deterministic compaction
  • Replicas: multiple nodes storing the data

SLIDE 90

Deterministic Compaction

Compaction changes the local roots

[Diagram: compaction merging trees across levels]

SLIDE 91

Deterministic Compaction

Compaction changes the local roots and the global root

[Diagram: the changed local roots propagate to the global master root]

SLIDE 92

mLSM: Authenticated Data Structure

  • Minimizes IO amplification
  • Maintains multiple mutually independent binary Merkle trees
  • Decouples lookup from authentication