
B+ Tree Structure B+ Tree File Organization In a B+ Tree file - PDF document

11/24/2009. Dynamic Authenticated Index Structures for Outsourced Databases. Feifei Li, Marios Hadjieleftheriou, George Kollios, Leonid Reyzin. Boston University. Outline: The Model, Motivation, Problem, Solution, Background, Papers


  1. 11/24/2009
     Dynamic Authenticated Index Structures for Outsourced Databases
     Feifei Li, Marios Hadjieleftheriou, George Kollios, Leonid Reyzin
     Boston University; AT&T Labs-Research
     Presenter: Nima Najafian

     Outline
     • The Model
     • Motivation
     • Problem
     • Solution
     • Background
     • Papers contributions
     • Experimental validation

     Outsourced Database Model
     • Owner: publishes data
     • Servers: host the data and provide query services
     • Clients: query the owner's data through servers
     [Diagram: owner → servers → clients]

     Motivation
     • Advantages
       – The data owner does not need the hardware / software / personnel to run a DBMS
       – The owner achieves economies of scale
       – The client enjoys better quality of service
     • A main challenge
       – The service provider is not trusted, and may return incorrect query results

     Problem: Un-trusted Servers
     • Lazy: incentives to perform less
     • Curious: incentives to acquire information
     • Malicious
       – Incorrect results (could be bugs)
       – Possibly compromised

  2. Problem 1: Injection
     Query: Select * from T where 5 < A < 11
     The owner's table T holds the A values …, 4 (r i-1), 7 (r i), 9 (r i+1), 11 (r i+2).
     The server returns 7, 8, 9.

     Problem 2: Drop
     Query: Select * from T where 5 < A < 11
     The owner's table T holds the same A values: …, 4, 7, 9, 11.
     The server returns 7.

     Solution
     • The Model
     • Motivation
     • Problem
     • Ability to authenticate without trusting the server (Query Authentication)

     Query Authentication (the dimensions)
     • Query Correctness: results do exist in the owner's database ~ injection
     • Query Completeness: no records have been omitted from the result ~ drop
     • Query Freshness: results are based on the most current version of the database (this will bring a third problem into the picture) ~ omission

     Background
     • Cryptographic essentials

     General Approach: Authenticated Structures
     [Diagram: owner → servers → clients; the server returns the query results together with a Verification Object (VO)]

  3. 1: Collision-resistant hash functions
     • It is computationally hard to find distinct x1 and x2 s.t. h(x1) = h(x2)
     • Computationally hard? Based on well-established assumptions such as discrete logarithms
     • SHA1
     • Observations:
       – Variable input size → 20 bytes
       – Computation cost: 2-3 μs (for up to 500 bytes input)
       – Storage cost: 20 bytes
       – Under Crypto++ [crypto] and OpenSSL [openssl]

     2: Public Key Digital Signature Schemes
     [Diagram: the sender sends message m and signature σ to the recipient over an insecure channel]
     • KeyGen → (SK, PK)
     • Sign(m, SK) → σ
     • Ver(m, PK, σ) → valid?
     • Formally defined by [GMR88]
       – The message has not been changed in any way
       – The message is indeed from the sender (corresponding to the public key)
       – No one except the secret key owner could produce a signature
     • One such scheme: RSA [RSA78]
     • Observations
       – Computation cost: about 3-4 ms for signing and more than 100 μs for verifying
       – Storage cost: 128 bytes

     3: Signature Aggregation (Condensed RSA)
     – Checking one aggregated signature is almost as fast as an individual signature

     4: Merkle Hash Tree [M89] - Amortizing Signature Cost
     • Single signature to sign many messages
     • Hash function is publicly known
     • Collision-resistant hash function → any change in the tree will lead to a different hash value for the root
     • Digital signature of the root → no one except the owner could produce the signature
     [Tree diagram, from root to leaves:
       h1..8
       h1..4, h5..8
       h12, h34, h56, h78, with h12 = H(h1 | h2)
       h1, …, h8 over the messages m1, …, m8
      The owner computes σ = Sign(h1..8, SK); the verifier checks Ver(h1..8, σ, PK) = valid?]

     Contributions
     • Proposed authenticated structures
       – Getting to know B+ trees
       – The idea of changing
       – ASB Tree (based on existing work)
       – MB tree (based on existing work)
       – EMB tree
     • Freshness (third dimension of query authentication)

     Correctness and Completeness
     • Correctness, Completeness:
       – Any change in the tree will lead to a different hash
       – Relative position of values is authenticated
     • Authentication:
       – Signing the root with SK
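The Merkle hash tree on the slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the leaf hashes are h_i = H(m_i), uses SHA-1 only because the slides name it, and the names `H` and `merkle_root` are hypothetical. Signing the root is left out because it needs an RSA library.

```python
import hashlib


def H(data: bytes) -> bytes:
    """SHA-1 digest (20 bytes), the hash function named on the slides."""
    return hashlib.sha1(data).digest()


def merkle_root(messages: list[bytes]) -> bytes:
    """Hash each message, then hash adjacent pairs until one root remains.

    Assumes len(messages) is a power of two, as in the 8-leaf slide example.
    """
    level = [H(m) for m in messages]  # h_1 ... h_8 (assumed h_i = H(m_i))
    while len(level) > 1:
        # h_12 = H(h_1 | h_2), h_34 = H(h_3 | h_4), ...
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]  # h_1..8, the value the owner would sign


messages = [f"m{i}".encode() for i in range(1, 9)]
root = merkle_root(messages)

# Changing any single message changes the root, so one signature on the
# root covers all eight messages.
tampered = list(messages)
tampered[4] = b"m5-modified"
tampered_root = merkle_root(tampered)
```

Here `root` and `tampered_root` differ, which is the property the slide relies on: any change in the tree leads to a different root hash.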

  4. B+-Tree Structure
     • A typical node contains up to n – 1 search key values K1, K2, …, Kn-1, and n pointers P1, P2, …, Pn. The search key values are kept in sorted order.
     • The pointer Pi can point to either a file record or a bucket of pointers which each point to a file record.
     Node layout: P1 | K1 | P2 | … | Pn-1 | Kn-1 | Pn

     B+-Tree File Organization
     In a B+-Tree file organization, the leaf nodes of the tree store the actual records rather than storing pointers to records.

     Range Authentication – A Simple Approach
     • A B+ tree over the records r1, …, r16
     • Each signature sig i is produced by the owner from h(r i)
     • For a query returning r3, r4, r5, r6, the signatures sig3, sig4, sig5, sig6 are sent to the client along with the records
     • This gives correctness but NOT completeness!

     Signature-Based Approach: ASB Tree, based on [PJR05]
     Signed pairs: S(r1|r2), S(r2|r3), …, S(rn-2|rn-1), S(rn-1|rn)
     1. Order the database tuples w.r.t. the query attribute
     2. Sign consecutive pairs
     3. Build a B+ tree on top of it
     4. Return tuples [a-1, b+1] together with signatures in [a-1, b] (the query is [a, b]; a and b here are indexes)
     5. Verify any two consecutive pairs

     Comparing Cryptographic Operations
     • One hashing takes 2-3 μs
       – Modular multiplication: 100 times slower
       – Verifying: 1000 times slower
       – Signing: 10000 times slower
     • t(Hashing) < t(mod_M) < t(ver) < t(Sign)

     Condensed RSA (NDSS'04)
     • Server:
       – Selects records matching the posed query
       – Multiplies the corresponding RSA signatures
       – Returns a single signature to the querier
     • Server: given t record signatures {σ1, σ2, …, σt}, compute the combined signature σ1,t = Π σi mod n and send σ1,t to the querier
     • Querier: given t messages {m1, m2, …, mt} and σ1,t, verify the combined signature: (σ1,t)^e =? Π h(mi) (mod n)
     • n is the RSA modulus of the public key from the owner
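The Condensed RSA check above, (σ1,t)^e = Π h(mi) (mod n), can be tried out with a toy textbook RSA key. This is a sketch under loud assumptions: the key (p = 61, q = 53) is far too small to be secure, reducing SHA-1 modulo n is not a real full-domain hash, and the function names `h`, `sign`, `condense`, and `verify` are hypothetical rather than taken from any library.

```python
import hashlib
from math import prod

# Toy textbook RSA key (p = 61, q = 53); illustration only, not secure.
N = 61 * 53  # RSA modulus n = 3233
E = 17       # public exponent e
D = 2753     # private exponent d, since 17 * 2753 = 1 (mod 3120)


def h(message: bytes) -> int:
    """Hash a message into Z_n (SHA-1 reduced mod n; acceptable only for a toy)."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Owner: sigma_i = h(m_i)^d mod n."""
    return pow(h(message), D, N)


def condense(signatures: list[int]) -> int:
    """Server: sigma_{1,t} = product of all sigma_i, mod n."""
    return prod(signatures) % N


def verify(messages: list[bytes], combined: int) -> bool:
    """Querier: check (sigma_{1,t})^e == product of h(m_i) (mod n)."""
    expected = prod(h(m) for m in messages) % N
    return pow(combined, E, N) == expected


records = [b"r7", b"r8", b"r9"]
sigma = condense([sign(r) for r in records])  # one signature for three records
ok = verify(records, sigma)
```

The server ships a single value `sigma` instead of three signatures, and the querier performs one modular exponentiation plus t hashes and multiplications. With a modulus this small, an altered record set can pass by chance, so the toy demonstrates the algebra rather than the security.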

  5. Signature Chaining Issues
     • A heavy burden on the owner to produce the signatures
     • Overhead on the client to verify the aggregated signature
     • Storage overhead at the server to store the signatures (which potentially leads to higher computational cost to retrieve them)
     • High communication overhead on both the server and the owner, in order to exchange the signatures

     Reduce S/C Communication Cost
     • Aggregation signature: Condensed RSA
       – Messages m1, …, mk with signatures σ1, …, σk become m1, …, mk with a single σ
       – σ = combine(σ1, …, σk)
     • Overhead: computation cost of modular multiplication with a big modular base number, close to 100 μs

     Merkle B (MB) Tree: Natural Extension for Range Query
     • Use a B+-tree instead of a binary search tree
       [Example: index keys 410, 720; leaf keys 250, 320, 410, 600, 720 over tuples t1, …, t5]
     • Extend it with hash information:
       – Index node entries: p0 h0 | p1 k1 h1 | … | pf kf hf
       – Child node entries: p10 h10 | p11 k11 h11 | …
       – h1 = H(h10 | … | h1f)
       – For the root node, σ = Sign(h0 | … | hf)
       – Leaf node entries: Ki, hi = H(ti); Kj, hj = H(tj)

     Extends to Range Query: f=2 (f is the fanout)
     • Query: Select * from T where 5 < A < 11
     • Leaf values: 1, 2, 3, 4, 5, 6, 9, 12
     • The owner computes σ = Sign(h1..8, SK)
     • Query results: 6, 9
     • VO: 5, 12, h1..4, σ
     [Tree diagram: h1..8; h1..4, h5..8; h12, h34, h56, h78; h1, …, h8. LB(q) = 5 and RB(q) = 12 bound the query range q.]

     Client Side Verification
     • Query: Select * from T where 5 < A < 11
     • The client holds 5, 6, 9, 12 and the VO: 5, 12, h1..4, σ
     • Reconstruct the query subtree: h5, h6, h7, h8, then h56 and h78, then h5..8, and finally h1..8 using h1..4 from the VO
     • The subtree under h1..4 is unknown to the client
     • Ver(h1..8, PK, σ): valid?
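The f=2 example (leaf values 1, 2, 3, 4, 5, 6, 9, 12 and query 5 < A < 11) can be replayed as a small Python sketch. Several things here are assumptions made for illustration: tuples are encoded as decimal text before hashing, `trusted_root` stands in for a root digest h1..8 whose signature σ the client has already verified with the owner's public key (signature verification itself is omitted), and the names `leaf`, `root_of`, and `client_verify` are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def leaf(value: int) -> bytes:
    """Leaf hash of one tuple; encoding the key as text is an assumption."""
    return H(str(value).encode())


def root_of(values: list[int]) -> bytes:
    """Owner side: binary Merkle root over sorted values (length a power of 2)."""
    level = [leaf(v) for v in values]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def client_verify(low, high, results, left_b, right_b, h_1_4, trusted_root):
    """Client side check for low < A < high, limited to the slide's layout."""
    # Boundary tuples must lie outside the open range, results inside it.
    if not (left_b <= low and right_b >= high):
        return False
    if any(not (low < r < high) for r in results):
        return False
    leaves = [leaf(v) for v in [left_b, *results, right_b]]
    if len(leaves) != 4:
        return False  # this sketch only rebuilds the 4-leaf subtree h_5..8
    h56 = H(leaves[0] + leaves[1])
    h78 = H(leaves[2] + leaves[3])
    h_5_8 = H(h56 + h78)
    # Combine with h_1..4 from the VO and compare against the signed root.
    return H(h_1_4 + h_5_8) == trusted_root


values = [1, 2, 3, 4, 5, 6, 9, 12]
trusted_root = root_of(values)
h_1_4 = root_of(values[:4])  # sent by the server as part of the VO

honest = client_verify(5, 11, [6, 9], 5, 12, h_1_4, trusted_root)
injected = client_verify(5, 11, [6, 8], 5, 12, h_1_4, trusted_root)
dropped = client_verify(5, 11, [6], 5, 12, h_1_4, trusted_root)
```

The honest answer reproduces the root, while the injected value 8 yields a different h5..8 and fails. The dropped case fails here only because the sketch hard-codes the four-leaf shape; a full implementation would carry position information in the VO.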

  6. Embedded Merkle B (EMB) tree: A fractal structure
     • Index node entries: p0 h0 | p1 k1 h1 | … | pf kf hf
     • Child node entries: p10 h10 | p11 k11 h11 | … | p1f k1f h1f
     • An MB tree with fanout f e is built on this node

     Query Example: f=5 (MB tree)
     [Diagram: index keys 10, 20, 29, 42; leaves 1, 3, 5, 6, 9 | 10, 12, 14, 16 | 20, 22, 23, 25 | …; LB(q) and RB(q) bound the query range q]
     VO: tuples 5, 10; hashes of 1, 3, 12, 14, 16; hashes of entries 20, 29, 42
     8 hashes

     EMB tree Analysis
     • We can show that:
       – Query cost is as an MB tree with fanout f k
       – Authentication cost (c/s comm. cost and client verification cost) is as an MB tree with fanout f e
     • Intuition:
       – f k is smaller than in a normal MB tree given a page size P

     Query Example: f=5 (EMB tree)
     [Diagram: the same keys and leaves as in the MB tree example, with the embedded tree nodes whose hashes enter the VO circled in red]
     VO: tuples 5, 10; hash of red circle node; hash of red circle nodes (2)
     5 hashes

     EMB tree's variants
     • Don't store the embedded tree, build it on the fly: EMB− tree
       – Fanout f k is as in a normal MB tree: better query performance, better storage performance
     • Use a multi-way search tree instead of a B+ tree as the embedded tree: EMB* tree
       – The hash path in the embedded tree can stop at the index level and need not go to the leaf level, which reduces the VO size

     Freshness?
     • "emm, it's correct!"
     [Diagram: the owner sends updates and new signature(s) σ v to the server; the client sends a query; the server returns q + VO constructed based on the previous version: σ v-1 (s)]
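A rough way to see why the EMB tree shrinks the VO is to count, for a single node on the search path, how many sibling hashes the client needs in order to recompute that node's hash. The sketch below is a back-of-the-envelope estimate under simplifying assumptions, not the analysis from the paper: it considers one entry in one node, treats the embedded tree as a complete tree with fanout f_e, and uses the hypothetical names `flat_node_hashes` and `embedded_node_hashes`.

```python
def flat_node_hashes(f: int) -> int:
    """MB tree node: the node hash is H(h_1 | ... | h_f), so recomputing it
    for one known entry needs the other f - 1 entry hashes."""
    return f - 1


def embedded_node_hashes(f: int, f_e: int) -> int:
    """EMB tree node: a small Merkle tree with fanout f_e sits over the f
    entries, so one entry needs up to f_e - 1 sibling hashes per embedded
    level (assumption: complete embedded tree)."""
    levels = 0
    capacity = 1
    while capacity < f:  # number of embedded levels needed to cover f entries
        capacity *= f_e
        levels += 1
    return levels * (f_e - 1)


# Example with a hypothetical node of 100 entries and a binary embedded tree.
flat = flat_node_hashes(100)             # 99 sibling hashes
embedded = embedded_node_hashes(100, 2)  # 7 levels, one sibling hash each
```

Under these assumptions the per-node hash count drops from f - 1 to roughly (f_e - 1) · log_{f_e} f, which matches the slide's claim that the authentication cost behaves like that of an MB tree with fanout f e.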
