West Coast Systems - Peter Dempsey - PowerPoint PPT Presentation


SLIDE 1

West Coast Systems

Peter Dempsey, Jonathan Abbett, Archana Joshi, Kevin Hoeschele

SLIDE 2

Schedule

  • 1:40–2:25 CACQ – Peter
  • 2:25–3:10 PSoup – Jonathan
  • 3:10–3:20 Break
  • 3:20–4:05 STREAM – Kevin and Archana
  • 4:05–4:30 Discussion

SLIDE 3

Continuously Adaptive Continuous Queries over Streams (CACQ)

Peter Dempsey

SLIDE 4

References

  • S. Madden, M. Shah, J. Hellerstein, and V. Raman. Continuously adaptive continuous queries over streams. In Proc. ACM SIGMOD Intl. Conf. on Management of Data, pages 49-60, Madison, Wisconsin, May 2002.
  • J. Hellerstein. Query Processing and Network Infrastructures. Talk at M.I.T., November 2002.
  • R. Avnur and J. Hellerstein. Eddies: Continuously adaptive query processing. In ACM SIGMOD, Dallas, TX, May 2000.

SLIDE 5

CACQ Introduction

  • A Data Stream Management System.
  • Leverages earlier work at Berkeley (Telegraph).
  • Adaptive approach to query processing.
  • Implements cross-query sharing of work and space.

SLIDE 6

CACQ Overview

  • Eddies
  • Lineage
  • Predicate Index
  • SteMs

SLIDE 7

Eddies

  • Developed for Telegraph.
  • An improvement on static query plans.
  • Provide continuous adaptivity.
  • Route tuples through operators.
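
To make the routing idea concrete, here is a minimal sketch of an eddy that chooses the next operator separately for each tuple. The class names `Filter` and `Eddy` and the greedy "lowest observed pass rate first" policy are assumptions made for illustration; this is not the routing policy from the cited papers.

```python
class Filter:
    """A selection operator that records how often it passes tuples."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate
        self.seen = 0
        self.passed = 0

    def apply(self, tup):
        self.seen += 1
        ok = self.predicate(tup)
        if ok:
            self.passed += 1
        return ok

    def pass_rate(self):
        # An operator that has seen nothing yet is assumed to pass everything.
        return self.passed / self.seen if self.seen else 1.0


class Eddy:
    """Routes each tuple through all operators in an order chosen at run time."""

    def __init__(self, operators):
        self.operators = operators

    def route(self, tup):
        """Return True if the tuple satisfies every operator."""
        remaining = list(self.operators)
        while remaining:
            # Hypothetical policy: try the most selective operator first,
            # so tuples that will be dropped are dropped early.
            op = min(remaining, key=Filter.pass_rate)
            remaining.remove(op)
            if not op.apply(tup):
                return False
        return True


positive = Filter("positive", lambda x: x > 0)
multiple_of_ten = Filter("multiple_of_ten", lambda x: x % 10 == 0)
eddy = Eddy([positive, multiple_of_ten])
results = [x for x in range(1, 101) if eddy.route(x)]
```

Unlike a static plan, the order is not fixed up front: after the first tuple the eddy learns that `multiple_of_ten` rejects most input and routes to it first, so `positive` is applied only to the tuples that survive.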

SLIDE 8

Static Query Plan

SLIDE 9

Eddy

SLIDE 10

Eddy

SLIDE 11

Eddy

SLIDE 12

Eddy

SLIDE 13

Eddy

SLIDE 14

Eddy

SLIDE 15

Eddy

SLIDE 16

CACQ Overview

  • Eddies
  • Lineage
  • Predicate Index
  • SteMs

SLIDE 17

Lineage

  • Each tuple’s path through the eddy is stored in the tuple.
  • Each operator can handle tuples with different lineages.

SLIDE 18

When do we output a tuple?

  • Each query has a completionMask, which is simply a bitmask.
  • Each query checks its completionMask ANDed with a tuple’s done bits.
  • If the result of that AND equals the completionMask, then the tuple is output to that query.
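
The completionMask test can be sketched in a few lines. The function names, the query names, and the assignment of one bit per operator are assumptions for illustration; a set done bit is taken to mean "this operator has handled the tuple and it passed".

```python
def is_complete(done_bits: int, completion_mask: int) -> bool:
    """True when every operator bit the query requires is set in done_bits."""
    return (done_bits & completion_mask) == completion_mask


def queries_to_output(done_bits, completion_masks):
    """Return the queries whose completionMask is covered by the done bits."""
    return [q for q, mask in completion_masks.items()
            if is_complete(done_bits, mask)]


# Hypothetical setup: bit i stands for operator i.
completion_masks = {
    "Q1": 0b0011,  # Q1 needs operators 0 and 1
    "Q2": 0b0110,  # Q2 needs operators 1 and 2
}

partial = queries_to_output(0b0011, completion_masks)  # operators 0 and 1 done
full = queries_to_output(0b0111, completion_masks)     # operators 0, 1, 2 done
```

With operators 0 and 1 done, only Q1 receives the tuple; once operator 2 is also done, the same tuple is output to Q2 as well.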

SLIDE 19

CACQ Overview

  • Eddies
  • Lineage
  • Predicate Index
  • SteMs

SLIDE 20

Predicate Index: Grouped Filter

  • An index is maintained for each attribute that appears in a query.
  • Used to improve efficiency in overlapping range queries.
  • Allows the system to apply multiple selections at once.
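
A rough sketch of a grouped filter for one attribute follows. The sorted-list layout, the class name `GroupedFilter`, and its methods are assumptions chosen for brevity, not the data structure used in CACQ. The point it illustrates is that one lookup per incoming value evaluates the predicates of all registered queries together.

```python
import bisect
from collections import Counter


class GroupedFilter:
    """Index over the `attr > c` and `attr < c` predicates of many queries."""

    def __init__(self):
        self._gt_consts, self._gt_ids = [], []  # attr > c, sorted by c
        self._lt_consts, self._lt_ids = [], []  # attr < c, sorted by c
        self._needed = Counter()                # predicates per query

    def add_greater_than(self, query_id, constant):
        i = bisect.bisect_right(self._gt_consts, constant)
        self._gt_consts.insert(i, constant)
        self._gt_ids.insert(i, query_id)
        self._needed[query_id] += 1

    def add_less_than(self, query_id, constant):
        i = bisect.bisect_right(self._lt_consts, constant)
        self._lt_consts.insert(i, constant)
        self._lt_ids.insert(i, query_id)
        self._needed[query_id] += 1

    def matching(self, value):
        """Return the queries whose predicates on this attribute all accept value."""
        # attr > c holds for every constant strictly below value.
        i = bisect.bisect_left(self._gt_consts, value)
        # attr < c holds for every constant strictly above value.
        j = bisect.bisect_right(self._lt_consts, value)
        satisfied = Counter(self._gt_ids[:i]) + Counter(self._lt_ids[j:])
        return {q for q, n in satisfied.items() if n == self._needed[q]}


# Three hypothetical queries with overlapping ranges on one attribute.
index = GroupedFilter()
index.add_greater_than("q1", 10)
index.add_less_than("q1", 50)     # q1: 10 < attr < 50
index.add_greater_than("q2", 30)
index.add_less_than("q2", 80)     # q2: 30 < attr < 80
index.add_greater_than("q3", 60)  # q3: attr > 60
```

Here `index.matching(40)` finds q1 and q2 with two binary searches, instead of testing each query's range separately.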

SLIDE 21

Predicate Indexes Illustrated

SLIDE 22

CACQ Overview

  • Eddies
  • Lineage
  • Predicate Index
  • SteMs

SLIDE 23

SteMs: State Modules

  • Used to help compute joins.
  • An index that is built on the fly.
  • Joins are no longer a binary operation.
  • Enforces a window on the join.
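
The sketch below shows one way a SteM could behave: a per-source hash index that is built as tuples arrive, probed by tuples from other sources, and trimmed to a time window. The `build`/`probe` method names, the dict-based index, and the timestamp-based window are assumptions for illustration only.

```python
from collections import defaultdict, deque


class SteM:
    """State module: an index over one source's recent tuples."""

    def __init__(self, key, window):
        self.key = key                    # function extracting the join key
        self.window = window              # maximum tuple age, in time units
        self._by_key = defaultdict(list)  # join key -> tuples
        self._arrivals = deque()          # (timestamp, tuple), oldest first

    def build(self, tup, now):
        """Insert a tuple from this SteM's own source."""
        self._evict(now)
        self._by_key[self.key(tup)].append(tup)
        self._arrivals.append((now, tup))

    def probe(self, key_value, now):
        """Return the stored tuples that match a key from another source."""
        self._evict(now)
        return list(self._by_key.get(key_value, []))

    def _evict(self, now):
        # Enforce the window: drop tuples older than `window` time units.
        while self._arrivals and now - self._arrivals[0][0] > self.window:
            _, old = self._arrivals.popleft()
            bucket = self._by_key[self.key(old)]
            bucket.remove(old)
            if not bucket:
                del self._by_key[self.key(old)]


def windowed_join(events, window):
    """Join streams S and T on `id` using one SteM per source."""
    stems = {name: SteM(lambda t: t["id"], window) for name in ("S", "T")}
    out = []
    for now, source, tup in events:
        other = "T" if source == "S" else "S"
        stems[source].build(tup, now)                    # build into own SteM
        for match in stems[other].probe(tup["id"], now):  # probe the other
            out.append((tup, match) if source == "S" else (match, tup))
    return out


events = [
    (0, "S", {"id": 1, "a": "s0"}),
    (1, "T", {"id": 1, "b": "t1"}),
    (5, "T", {"id": 1, "b": "t5"}),  # arrives after s0 has left the window
]
joined = windowed_join(events, window=3)
```

Because each source has its own SteM, the join is a sequence of independent build and probe steps rather than one binary operator, and adding a third source only requires adding another SteM.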

SLIDE 24

SteMs: Illustrated

SLIDE 25

A few extra details with joins

  • With k input sources there can be 2^k possible types of intermediate tuples, one per subset of sources.
  • A Virtual Source is used to encode a subset of sources.
  • The sourceId bitmap is used to denote the virtual source.
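
A small sketch of the sourceId bitmap idea, assuming one bit per base source; the source names R, S, T and the helper functions are hypothetical.

```python
SOURCES = ["R", "S", "T"]  # hypothetical base sources
BIT = {name: 1 << i for i, name in enumerate(SOURCES)}


def source_id(*names):
    """Bitmap for the virtual source made up of the given base sources."""
    sid = 0
    for name in names:
        sid |= BIT[name]
    return sid


def joined_source_id(left_sid, right_sid):
    """A joined tuple covers the union of its inputs' sources."""
    return left_sid | right_sid


def sources_in(sid):
    """Decode a sourceId bitmap back into base source names."""
    return [name for name in SOURCES if sid & BIT[name]]


rs = joined_source_id(source_id("R"), source_id("S"))
rst = joined_source_id(rs, source_id("T"))
virtual_source_count = 1 << len(SOURCES)  # 2^k subsets for k sources
```

With k = 3 sources the bitmap has 3 bits and therefore 2^3 = 8 possible values, which is where the 2^k bound on the slide comes from.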

SLIDE 26

Putting it all together: Examples

  • Single query without joins.
  • Multiple queries without joins.
  • Multiple queries with joins.

SLIDE 27

Putting it all together: Examples

  • Single query without joins.
  • Multiple queries without joins.
  • Multiple queries with joins.
  • Eddy.

SLIDE 28

Putting it all together: Examples

  • Single query without joins.
  • Multiple queries without joins.
  • Multiple queries with joins.
  • Eddy.
  • Eddy with Predicate Index.

SLIDE 29

Putting it all together: Examples

  • Single query without joins.
  • Multiple queries without joins.
  • Multiple queries with joins.
  • Eddy.
  • Eddy with Predicate Index.
  • Eddy with SteMs.

SLIDE 30

Putting it all together: Examples

  • Single query without joins.
  • Multiple queries without joins.
  • Multiple queries with joins.
  • Eddy.
  • Eddy with Predicate Index.
  • Eddy with SteMs.

SLIDE 31

Conclusions

  • The strict rules of fixed query paths no longer apply to continuous queries.
  • Operators are shared much more aggressively than in other systems.
  • Multiple selections are computed at once.
  • CACQ improves performance and space usage by sharing resources across queries.