Query Processing in a Self-Organized Storage System



SLIDE 1

Query Processing in a Self-Organized Storage System

Hannes Mühleisen, supervised by Robert Tolksdorf

SLIDE 2

Distributed DBs - Goals

  • Scalability
      • Data, Queries, Nodes
  • Robustness
      • Node/Network failure
  • Adaptiveness
      • “Fair” distribution of load

SLIDE 3

Clustered / Federated

[Figure: nodes S1 to S6 and C1]

[Bernstein81, Epstein78]

SLIDE 4

Global Laws

[Figure: nodes S1 to S6, each assigned one key range: 0-1, 1-2, 2-3, 3-4, 4-5, 5-6]

[Harren02,Karnstedt04,Rösch05]
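
The global-law idea can be sketched as a fixed rule that maps every key to exactly one node. The snippet below is only an illustration, assuming each node owns one half-open key range as in the figure. Which node owns which range, and the name `responsible_node`, are assumptions made for this sketch and are not taken from the slides.

```python
from bisect import bisect_right

# Upper bounds of the half-open ranges [0,1), [1,2), ..., [5,6).
# The pairing of ranges with node names is an assumption for this sketch.
UPPER_BOUNDS = [1, 2, 3, 4, 5, 6]
NODES = ["S1", "S2", "S3", "S4", "S5", "S6"]


def responsible_node(key: float) -> str:
    """Return the node whose key range contains `key`."""
    if not 0 <= key < UPPER_BOUNDS[-1]:
        raise ValueError(f"key {key} is outside the key space")
    # bisect_right counts the upper bounds that are <= key,
    # which is exactly the index of the range containing key.
    return NODES[bisect_right(UPPER_BOUNDS, key)]
```

Because the rule is global, any node can compute the owner of a key locally, without consulting a catalog.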

SLIDE 5

Probabilistic Request Routing

[Figure: nodes S1 to S6; a request for item #B is forwarded along links labeled with routing probabilities (70%, 25%, 95%, 50%, 50%, 95%, 10%, 85%)]

[Lindgren03]
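
A forwarding decision of this kind might look like the following sketch. It assumes that each node keeps, per requested item, a probability for every neighbor and picks the next hop at random in proportion to those values. The function name `choose_next_hop` and the table layout are hypothetical; the slides do not specify how the probabilities are stored or updated.

```python
import random


def choose_next_hop(probabilities, rng=random):
    """Pick a neighbor with chance proportional to its routing probability.

    `probabilities` maps neighbor name -> probability for the requested item.
    Returns None when no neighbor has a probability above zero.
    """
    candidates = [(n, p) for n, p in probabilities.items() if p > 0]
    if not candidates:
        return None
    neighbors, weights = zip(*candidates)
    # random.choices normalizes the weights, so they need not sum to 1.
    return rng.choices(neighbors, weights=weights, k=1)[0]
```

Choosing at random rather than always taking the maximum keeps less likely paths in use, which is one common way to let such a system adapt when data moves.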

SLIDE 6

[Wilensky97, NetLogo Ants model]

SLIDE 7

Distribution Paradigms

Paradigm                     Scalability  Adaptability  Robustness  Completeness  Complex Queries
Stand-Alone                  low          high          low         high          ✓
Federated                    high         high          fair        high          ✓
Global-Law                   high         fair          high        high          ✓
Probabilistic (e.g. Swarms)  high         high          high        fair          ?

SLIDE 8

Research Question

Can complex queries be evaluated efficiently in a swarm-based distributed storage system?

SLIDE 9

Mutable Moving Query Plans

  • parse ✓
  • rewrite ✓
  • optimize: based on what? (Catalog ✗)
  • execute: where?
  • move & repeat

[Papadimos03,Battré08]
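
The "execute what you can, then move and repeat" loop can be illustrated with a small sketch. It assumes a plan is a tree of scans, selections, and joins, and that each visited node replaces the parts it can answer from its local store with concrete rows. All class and function names (`Scan`, `Data`, `Select`, `Join`, `step`, `run`) are hypothetical, and the route is given as a fixed list here, whereas the slides leave open how the next node is chosen.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Scan:
    """Unresolved leaf: all rows stored under `key` somewhere in the system."""
    key: str


@dataclass
class Data:
    """Resolved sub-plan: rows that have already been collected."""
    rows: list


@dataclass
class Select:
    predicate: Callable[[dict], bool]
    child: Any


@dataclass
class Join:
    left: Any
    right: Any
    on: str


def step(plan, store):
    """Evaluate what this node can and return the (possibly smaller) plan."""
    if isinstance(plan, Scan):
        return Data(list(store[plan.key])) if plan.key in store else plan
    if isinstance(plan, Select):
        child = step(plan.child, store)
        if isinstance(child, Data):
            return Data([row for row in child.rows if plan.predicate(row)])
        return Select(plan.predicate, child)
    if isinstance(plan, Join):
        left, right = step(plan.left, store), step(plan.right, store)
        if isinstance(left, Data) and isinstance(right, Data):
            return Data([{**l, **r} for l in left.rows for r in right.rows
                         if l[plan.on] == r[plan.on]])
        return Join(left, right, plan.on)
    return plan  # Data is already fully evaluated


def run(plan, route):
    """Carry the plan along `route` (a list of node stores) until it is done."""
    for store in route:
        plan = step(plan, store)
        if isinstance(plan, Data):
            return plan.rows
    return None  # route exhausted before the plan was complete


# Usage: a join that needs data from two different nodes.
node_a = {"people": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]}
node_b = {"orders": [{"id": 1, "item": "book"}, {"id": 3, "item": "pen"}]}
plan = Join(Select(lambda r: r["id"] < 2, Scan("people")), Scan("orders"), on="id")
result = run(plan, [node_a, node_b])
```

After visiting `node_a` the selection has collapsed into a `Data` leaf, so the plan that travels on to `node_b` is smaller than the one that started.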

SLIDE 10

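[Figure: a query plan joining (⋈) selections (σ) over relations r, with evaluation steps numbered ①, ②, ③. The plan is shown three times; the last version is smaller (one σ and one r fewer) and only steps ② and ③ remain.]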

SLIDE 11

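[Figure: a plan with two operands, r(#) and r(*), shown with both possible evaluation orders (①, ②). Routing probabilities at three neighbors:]

  • p(#)= 2%, p(*)=78%
  • p(#)= 2%, p(*)=10%
  • p(#)=53%, p(*)= 3%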
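
One possible reading of this figure is that the routing probabilities decide which operand to resolve first and through which neighbor. The sketch below follows that reading, which is an assumption: it picks the (operand, neighbor) pair with the highest probability among the operands still pending. The neighbor names `n1` to `n3` and the function `best_move` are made up for illustration; only the percentages come from the slide.

```python
def best_move(routing_table, pending):
    """Return (operand, neighbor, probability) with the highest probability.

    `routing_table` maps neighbor -> {operand: probability}.
    `pending` is the set of operands that still need to be resolved.
    Returns None when there is nothing left to choose from.
    """
    candidates = [
        (operand, neighbor, probs[operand])
        for neighbor, probs in routing_table.items()
        for operand in pending
        if operand in probs
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda entry: entry[2])


# Probabilities from the slide; the neighbor names are hypothetical.
table = {
    "n1": {"#": 0.02, "*": 0.78},
    "n2": {"#": 0.02, "*": 0.10},
    "n3": {"#": 0.53, "*": 0.03},
}
first = best_move(table, {"#", "*"})
second = best_move(table, {"#"})
```

With these numbers the sketch resolves r(*) first via `n1` and r(#) afterwards via `n3`.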

SLIDE 12

Handling Routing Failures

[Figure: a request r(#) reaches a node whose neighbors all report p(#)=0%]

What now? Trackback!
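
Tracking back from a dead end can be sketched as a depth-first walk that remembers the path taken so far. The version below is an illustration under assumptions: it always follows the unvisited neighbor with the highest non-zero probability and steps back one hop when none is left. The function name `route_with_trackback`, the greedy choice, and the example topology are not from the slides.

```python
def route_with_trackback(start, has_item, probabilities):
    """Return the path from `start` to a node holding the item, or None.

    `has_item(node)` tells whether a node stores the requested item.
    `probabilities` maps node -> {neighbor: probability for the item}.
    """
    path = [start]
    visited = {start}
    while path:
        current = path[-1]
        if has_item(current):
            return path
        options = [
            (p, neighbor)
            for neighbor, p in probabilities.get(current, {}).items()
            if p > 0 and neighbor not in visited
        ]
        if options:
            _, nxt = max(options)  # follow the most promising neighbor
            visited.add(nxt)
            path.append(nxt)
        else:
            path.pop()  # dead end: track back to the previous node
    return None


# Hypothetical topology: S2 looks promising from S1 but is a dead end.
links = {
    "S1": {"S2": 0.7, "S3": 0.3},
    "S2": {"S4": 0.0},
    "S3": {"S5": 0.9},
}
found = route_with_trackback("S1", lambda node: node == "S5", links)
```

In this example the request first goes to S2, finds only a 0% neighbor there, tracks back to S1, and then reaches S5 via S3.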

SLIDE 13

Evaluation Methodology

[Chart: # Participating Nodes / Query for Query 1 to Query 4 (axis ticks 3.75, 7.5, 11.25, 15), comparing Optimal Plan, Moving Mutable Plans, and Static Plan Routing. One direction of the axis is marked "better".]

Not actual data!

SLIDE 14

Evaluation Methodology

[Chart: # Results / Query for Query 1 to Query 4 (axis ticks 150, 300, 450, 600), comparing Optimal Plan, Moving Mutable Plans, and Static Plan Routing. One direction of the axis is marked "better".]

Not actual data!

SLIDE 15

Thank You!

Web Page: http://hannes.muehleisen.org

Questions?