An Introduction To Data Stream Query Processing (Neil Conway)


slide-1
SLIDE 1

An Introduction To Data Stream Query Processing

Neil Conway <nconway@truviso.com>

Truviso, Inc.

May 24, 2007

Neil Conway (Truviso) Data Stream Query Processing May 24, 2007 1 / 45

slide-2
SLIDE 2

Outline

1. The Need For Data Stream Processing

2. Stream Query Languages

3. Query Processing Techniques For Streams: System Architecture, Shared Evaluation, Adaptive Tuple Routing, Overload Handling

4. Current Choices For A DSMS: Open Source, Proprietary

5. Demo

6. Q & A

slide-6
SLIDE 6

The Need For Data Stream Processing

What’s wrong with database systems?

Nothing, but they aren’t the right solution to every problem

What are some problems for which a traditional DBMS is an awkward fit?

slide-8
SLIDE 8

Financial Analysis

Electronic trading is now commonplace

Trading volume continues to increase rapidly

Algorithmic trading: detect advantageous market conditions, automatically execute trades

Latency is key

Visualization

A hard problem in itself

Typical Queries

5-minute rolling average, volume-weighted average price (VWAP)

Comparison between sector averages and portfolio averages over time

Implement models provided by quantitative analysis

slide-9
SLIDE 9

Network Monitoring

Network volume continues to increase rapidly

Custom solutions are possible, but roll-your-own is expensive

Ad-hoc queries would be nice

Can we build generic infrastructure for these kinds of monitoring applications?

slide-10
SLIDE 10

Sensor Networks

Pervasive Sensors

“As the cost of micro sensors continues to decline over the next decade, we could see a world in which everything of material significance gets sensor-tagged.” – Mike Stonebraker

Military applications: real-time command and control

Healthcare

Habitat monitoring

Manufacturing

slide-13
SLIDE 13

Other Examples

Real-Time Decision Support

Turnaround-time for traditional data warehouses is often too slow

“Business Activity Monitoring” (BAM)

Fraud Detection

Sophisticated, cross-channel fraud

Real-time

Online Gaming

Detect malicious behavior

Monitor quality of service

slide-15
SLIDE 15

Data Stream Management Systems

Database Systems

Mostly static data, ad-hoc one-time queries

Fire the queries at the data, return result sets

“Store and query”

Focus: concurrent reads & writes, efficient use of I/O, maximize transaction throughput, transactional consistency, historical analysis

Data Stream Systems

Mostly transient data, continuous queries

Fire the data at the queries, incrementally update result streams

Data rates often exceed disk throughput

slide-16
SLIDE 16

Complex Event Processing (CEP)

Data stream processing emerged from the database community

Early 90’s: “active databases” with triggers

Complex Event Processing is another approach to the same problems

Different nomenclature and background

Often similar in practice

slide-18
SLIDE 18

Data Streams

A stream is an infinite sequence of (tuple, timestamp) pairs

Append-only

New type of database object

The timestamp defines a total order over the tuples in a stream

In practice: require that stream tuples have a special CQTIME column

Different approaches to building stream processing systems

This talk: relation-oriented DSMS. Specifically, TelegraphCQ, Truviso, StreamBase, . . .


slide-19
SLIDE 19

CREATE STREAM

Exactly 1 column must have a CQTIME constraint

CQTIME can be system-generated or user-provided

With user-provided timestamps, system must cope with out-of-order tuples

“Slack” specifies maximum out-of-orderness

Example Query

CREATE STREAM trades (
    symbol varchar(5),
    price real,
    volume integer,
    tstamp timestamp CQTIME USER GENERATED SLACK '1 minute'
) TYPE UNARCHIVED;
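One way to picture what “slack” buys the system is a small reordering buffer. The sketch below is an illustration in Python, not the engine's actual mechanism: the function name `reorder_with_slack`, the `(timestamp, payload)` tuple layout, and the policy of silently dropping tuples that arrive later than the slack allows are all assumptions made for the example.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

def reorder_with_slack(tuples: Iterable[Tuple[float, str]],
                       slack: float) -> Iterator[Tuple[float, str]]:
    """Emit (timestamp, payload) pairs in timestamp order.

    A tuple is held back until a tuple at least `slack` time units newer
    has been seen; tuples that arrive more than `slack` behind the newest
    timestamp are dropped (an assumed policy for this sketch).
    """
    pending: List[Tuple[float, str]] = []   # min-heap ordered by timestamp
    newest = float("-inf")                  # largest timestamp seen so far
    for ts, payload in tuples:
        if ts < newest - slack:
            continue                        # too late: outside the slack bound
        heapq.heappush(pending, (ts, payload))
        newest = max(newest, ts)
        # Everything at or below newest - slack can no longer be overtaken.
        while pending and pending[0][0] <= newest - slack:
            yield heapq.heappop(pending)
    while pending:                          # end of input: flush the buffer
        yield heapq.heappop(pending)

arrivals = [(1, "a"), (3, "b"), (2, "c"), (10, "d"), (4, "e"), (11, "f")]
ordered = list(reorder_with_slack(arrivals, slack=2))
```

A larger slack tolerates more disorder at the cost of holding tuples longer before they become visible to queries.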

slide-21
SLIDE 21

Types of Streams

Raw Streams

Stream tuples are injected into the system by an external data source

E.g. stock tickers, sensor data, network interface, . . .

Both push and pull models have been explored

Derived Streams

Defined by a query expression that yields a stream

Archived Streams

Allows historical and real-time stream content to be combined in a single database object

slide-23
SLIDE 23

Language Design Philosophy

Pragmatism: relational query languages are well-established

Relational query evaluation techniques are well-understood

Everyone knows SQL

Therefore, add stream-oriented extensions to SQL

Pioneering work: CQL from Stanford STREAM project

Kinds Of Operators

Relation → Relation: Plain Old SQL

Stream → Relation: Periodically produce a relation from a stream

Relation → Stream: Produce stream from changes to a relation

Note that S → S operators are not provided.

slide-25
SLIDE 25

Continuous Queries

Fundamental Difference

The result of a continuous query is an unbounded stream, not a finite relation

Typical Query

1. Split infinite stream into pieces via windows (S → R)

2. Compute analysis for the current window, comparison with prior windows or historical data (R → R)

3. Convert result of analysis into result stream (R → S); often implicit (use defaults)

slide-27
SLIDE 27

Stream → Relation Operators: Windows

Streams are infinite: at any given time, examine a finite sub-set

Apply window operator to stream to periodically produce visible sets of tuples

Properties of Sliding Windows

Range: “Width” of the window. Units: rows or time.

Slide: How often to emit new visible sets. Units: rows or time.

Start: When to start emitting results.
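The three window properties can be made concrete with a small time-based sketch. This is an illustration in Python rather than any system's implementation: the function name `sliding_windows`, the choice that a window ending at time t covers the half-open interval (t - visible, t], and the sample data are assumptions. It works on a finite, timestamp-ordered prefix of a stream.

```python
from typing import Iterator, List, Sequence, Tuple

def sliding_windows(stream: Sequence[Tuple[float, float]],
                    visible: float,
                    advance: float,
                    start: float = 0.0) -> Iterator[Tuple[float, List[float]]]:
    """Yield (window_end, values) for a time-based sliding window.

    visible: the range ("width") of each window
    advance: the slide (how often a new visible set is emitted)
    start:   the time from which window ends are counted
    """
    if not stream:
        return
    last_ts = stream[-1][0]
    end = start + advance
    while True:
        # Assumed convention: a window covers (end - visible, end].
        rows = [value for ts, value in stream if end - visible < ts <= end]
        yield end, rows
        if end >= last_ts:      # the last tuple has been covered
            break
        end += advance

# (timestamp, volume) pairs, made up for the example
trades = [(1, 100), (2, 200), (4, 50), (7, 300)]
totals = [(end, sum(rows)) for end, rows in sliding_windows(trades, visible=5, advance=3)]
```

When the slide is smaller than the range, as here, consecutive visible sets overlap and a tuple contributes to more than one result.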

slide-28
SLIDE 28

Example Query

Description

Every second, return the total volume of trades in the previous second.

Query

SELECT sum(volume) AS volume,
       advance_agg(qtime) AS windowtime
FROM trades < VISIBLE '1 second' ADVANCE '1 second' >

slide-29
SLIDE 29

Another Example

Description

Every 5 seconds, return the volume-weighted average price (VWAP) of MSFT for the last 1 minute of trades.

Query

SELECT sum(price * volume) / sum(volume) AS vwap,
       sum(volume) AS volume,
       advance_agg(qtime) AS windowtime
FROM trades < VISIBLE '1 minute' ADVANCE '5 seconds' >
WHERE symbol = 'MSFT'
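The arithmetic behind the vwap column can be checked by hand on one visible set. The Python sketch below is only an illustration: the function name `window_vwap`, the `(symbol, price, volume)` row layout, and the sample rows are made up, and returning `None` for an empty window is an assumed convention.

```python
from typing import List, Optional, Tuple

def window_vwap(rows: List[Tuple[str, float, int]], symbol: str) -> Optional[float]:
    """Volume-weighted average price of `symbol` over one window's rows.

    Mirrors sum(price * volume) / sum(volume) with a WHERE filter on symbol.
    """
    matching = [(price, volume) for sym, price, volume in rows if sym == symbol]
    total_volume = sum(volume for _, volume in matching)
    if total_volume == 0:
        return None             # no trades for this symbol in the window
    return sum(price * volume for price, volume in matching) / total_volume

window = [("MSFT", 10.0, 100), ("IBM", 50.0, 10), ("MSFT", 12.0, 300)]
msft_vwap = window_vwap(window, "MSFT")   # (10*100 + 12*300) / 400
```

The larger trade pulls the result toward its price: the plain average of 10 and 12 is 11, while the volume-weighted figure is 11.5.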

slide-31
SLIDE 31

More About Windows

Aggregation

Useful aggregate: advance_agg(CQTIME)

Timestamp that marks the end of the current window

Similar aggregates for “beginning of window”, “middle of window” might also be useful

Other Window Types

Landmark: Fixed left edge, “elastic” right edge. Periodically reset. (“All stock trades after 9AM today.”)

Partitioned: Divide stream into sub-streams based on partitioning key(s), then apply another S → R operator to the sub-streams.

slide-33
SLIDE 33

Relation → Stream Operators

Types of Operators

ISTREAM: the tuples added to a relation

RSTREAM: all the tuples in a relation

DSTREAM: the tuples removed from a relation

Defaults

ISTREAM for queries without aggregation/grouping

RSTREAM for queries with aggregation/grouping

DSTREAM is rarely useful
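The three relation-to-stream operators can be sketched as differences between the relation at the previous window and the relation at the current one. The Python below is an illustration only: it treats relations as multisets via `collections.Counter`, and the function names and sorted output order are choices made for the example, not part of any query language.

```python
from collections import Counter
from typing import Hashable, List, Sequence

def istream(prev: Sequence[Hashable], curr: Sequence[Hashable]) -> List[Hashable]:
    """Tuples present in the current relation but not in the previous one."""
    return sorted((Counter(curr) - Counter(prev)).elements())

def dstream(prev: Sequence[Hashable], curr: Sequence[Hashable]) -> List[Hashable]:
    """Tuples present in the previous relation but not in the current one."""
    return sorted((Counter(prev) - Counter(curr)).elements())

def rstream(prev: Sequence[Hashable], curr: Sequence[Hashable]) -> List[Hashable]:
    """Every tuple in the current relation, regardless of the previous one."""
    return sorted(curr)

previous = ["a", "b", "b"]   # relation at the previous window (multiset)
current = ["b", "c"]         # relation at the current window

added = istream(previous, current)
removed = dstream(previous, current)
everything = rstream(previous, current)
```

This also suggests why RSTREAM is a natural default for aggregates: an aggregate row is recomputed for each window, so emitting the whole current relation reports the latest value even when it happens to equal the previous one.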

slide-35
SLIDE 35

Mixed Joins

Common Requirement

Compare stream tuples with historical data

System must provide both tables and streams!

Elegantly modeled as a join between a table and a stream

Implementation

Stream is the right (outer) join operand; left (inner) operand is an arbitrary Postgres subplan

For each stream tuple, join against non-continuous subplan
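The “for each stream tuple, join against the table side” strategy can be sketched with in-memory stand-ins. This Python is an illustration under assumptions: the table is modeled as a set of symbols, one window's worth of stream tuples is a list of `(symbol, price, volume)` rows, and the function name `mixed_join_window` and the volume threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

def mixed_join_window(table_symbols: Set[str],
                      window_trades: Iterable[Tuple[str, float, int]],
                      min_volume: int = 5000) -> Dict[str, float]:
    """Join one window of stream tuples against a table and aggregate.

    The stream drives the join: each stream tuple probes the table side,
    which stands in for the non-continuous subplan.
    """
    totals: Dict[str, float] = defaultdict(float)
    for symbol, price, volume in window_trades:
        if volume > min_volume and symbol in table_symbols:   # probe the table
            totals[symbol] += price * volume
    return dict(totals)

index_members = {"MSFT", "IBM"}          # stand-in for a table of index symbols
window = [("MSFT", 30.0, 6000), ("MSFT", 31.0, 1000), ("XYZ", 5.0, 9000),
          ("IBM", 100.0, 7000), ("MSFT", 32.0, 10000)]
value_by_symbol = mixed_join_window(index_members, window)
```

Because the table side is re-probed for every stream tuple, a change to the table would be picked up by later tuples without restarting the continuous query, at least in this simplified model.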

slide-36
SLIDE 36

Mixed Join Example

Description

Every 3 seconds, compute the total value of high-volume trades made on stocks in the S&P 500 in the past 5 seconds.

Example Query

SELECT T.symbol, sum(T.price * T.volume)
FROM s_and_p_500 S, trades T < VISIBLE '5 sec' ADVANCE '3 sec' >
WHERE T.symbol = S.symbol AND T.volume > 5000
GROUP BY T.symbol

slide-38
SLIDE 38

Composing Streams

The tuples in a stream can be viewed as a series of events

E.g. “The temperature in the room is 20°”, 25°, 30°, . . .

The output of a continuous query is another series of events, typically higher-level or more complex

E.g. “The room is on fire.”

Therefore, streams can be composed in various ways:

Stream views

Macro semantics

Derived streams

Subqueries

Active tables

slide-39
SLIDE 39

Derived Streams

A derived stream is a database object defined by a persistent continuous query

Unlike a streaming view, a derived stream is always active

Similar to a materialized view

slide-40
SLIDE 40

Example Query

Description

Every 3 seconds, compute the “volume-weighted average price” (VWAP) for all stocks traded in the past 5 seconds.

Query

CREATE STREAM vwap (symbol varchar(5), vwap float, vtime timestamp cqtime) AS
    (SELECT symbol,
            sum(price * volume) / sum(volume),
            advance_agg(qtime)
     FROM trades < VISIBLE '5 seconds' ADVANCE '3 seconds' >
     GROUP BY symbol);

slide-41
SLIDE 41

Subqueries

One-time subqueries can be used in continuous queries, of course

Continuous subqueries are planned and executed as independent queries

Essentially inline derived streams

Require that subqueries yielding streams specify CQTIME

Planned: WITH-clause subqueries

slide-42
SLIDE 42

Active Tables

An active table is a table with an associated continuous query

Two modes of operation:

Append: New stream tuples appended to table at each window

Replace: At each new window, truncate previous table contents
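The difference between the two modes is easiest to see on a toy model. The Python sketch below is an illustration only: the class name `ActiveTable`, its `on_window` method, and the in-memory row list are hypothetical stand-ins for a real table fed by a continuous query.

```python
from typing import List, Tuple

class ActiveTable:
    """Toy model of a table that is refreshed by a continuous query."""

    def __init__(self, mode: str) -> None:
        if mode not in ("append", "replace"):
            raise ValueError("mode must be 'append' or 'replace'")
        self.mode = mode
        self.rows: List[Tuple] = []

    def on_window(self, window_rows: List[Tuple]) -> None:
        """Apply the continuous query's output for one window."""
        if self.mode == "replace":
            self.rows.clear()            # truncate the previous contents
        self.rows.extend(window_rows)    # then store this window's rows

history = ActiveTable("append")    # keeps every window's results
latest = ActiveTable("replace")    # keeps only the most recent window
for window_rows in ([("MSFT", 30.1)], [("MSFT", 30.4)]):
    history.on_window(window_rows)
    latest.on_window(window_rows)
```

Append mode builds up a history that one-time queries can analyze later; replace mode behaves like a continuously refreshed snapshot.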

slide-43
SLIDE 43

Event Language

Example Query

SELECT 'Shoplifting!', D.loc, D.id
FROM Store S C D PARTITION BY id
WHERE S.loc = 'shelf' AND C.loc = 'checkout' AND D.loc = 'door'
EVENT AND (FOLLOWS(S, D, '1 hour'),
           NOT PRECEDES(C, D, '1 hour'));

slide-47
SLIDE 47

Basic Requirements

Adaptivity

Static query planning is undesirable for long-running queries

Either replan or use adaptive planning

Shared Processing

Essential for good performance: 100s of queries not uncommon

Long-lived queries make this more feasible

Graceful Overload Handling

Stream data rates are often highly variable

Often too expensive to provision for peak data rate

Therefore, must handle overload gracefully

slide-49
SLIDE 49

System Architecture

Modified version of PostgreSQL

One-time queries executed normally

Continuous queries planned and executed by the CqRuntime process

Stream input: COPY, or submitted via TCP to CqIngress process

libevent-based, simple COPY-like protocol

Stream output: cursors, active tables, CqEgress process

Communication between processes done via shared memory queue infrastructure

Message passing done via SysV shmem and locks

slide-50
SLIDE 50

Shared Runtime

When a new continuous query is defined, it is sent to the shared runtime via shared memory

Runtime plans the query, folds query into single shared query plan

Not a traditional plan tree; graph of operators

Shared Runtime Main Loop

1. Check for control messages: add new CQ, remove CQ, . . .

2. Check for new stream tuples

Route each stream tuple through the operator graph (CPS)

Push output tuples to result consumers

slide-51
SLIDE 51

Shared Evaluation

Continuous query evaluation done by a network of operators in the shared runtime

If multiple queries reference the same operator, we can evaluate it only once

Better than linear scalability!

Each operator keeps track of the queries it helps to implement

slide-53
SLIDE 53

Implementing Shared Evaluation

Sharing Predicates

Simple cases: <, ≤, =, >, ≥, ≠

Construct a tree that divides domain of type into disjoint regions

For each tuple: walk the tree to find the region the tuple belongs in

Region implies which queries the tuple is still visible to

Immutable functions can also be shared relatively easily

Sharing Joins, Aggregates

Can also be done

Even between queries with varying windows and predicates

Requires some thought (say, a PhD thesis or two)
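The predicate-sharing idea can be sketched for numeric comparisons against constants. The Python below is an illustration, not the structure any particular system uses: it swaps the tree for a sorted array searched with `bisect`, handles only the operators `<`, `<=`, `=`, `>`, `>=`, and the class name `PredicateIndex` and the query ids are hypothetical. The constants split the domain into disjoint regions (each constant itself, plus the open gaps around them), and the set of satisfied queries is precomputed once per region.

```python
import bisect
import operator
from typing import Dict, FrozenSet, List, Tuple

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">": operator.gt, ">=": operator.ge}

class PredicateIndex:
    """Shared index over predicates of the form `column OP constant`."""

    def __init__(self, predicates: Dict[str, Tuple[str, float]]) -> None:
        self.consts: List[float] = sorted({c for _, c in predicates.values()})
        # One representative value per region: below the first constant,
        # then alternating "exactly on a constant" and "in the gap above it".
        reps: List[float] = [self.consts[0] - 1] if self.consts else [0]
        for i, c in enumerate(self.consts):
            reps.append(c)
            if i + 1 < len(self.consts):
                reps.append((c + self.consts[i + 1]) / 2)
            else:
                reps.append(c + 1)
        # Evaluate every predicate once per region, at build time.
        self.regions: List[FrozenSet[str]] = [
            frozenset(q for q, (op, c) in predicates.items() if OPS[op](rep, c))
            for rep in reps
        ]

    def matching_queries(self, value: float) -> FrozenSet[str]:
        """Find the region containing `value`; return the queries it satisfies."""
        i = bisect.bisect_left(self.consts, value)
        on_constant = i < len(self.consts) and self.consts[i] == value
        return self.regions[2 * i + (1 if on_constant else 0)]

index = PredicateIndex({"q1": (">", 5000), "q2": ("<=", 100),
                        "q3": ("=", 5000), "q4": (">=", 100)})
visible_to = index.matching_queries(5000)
```

Each tuple costs one binary search no matter how many queries are registered, which is where the better-than-linear scaling in the number of queries comes from in this simplified setting.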

slide-56
SLIDE 56

Adaptive Tuple Routing

Given a new tuple, how do we route it through the graph of operators?

Traditional approach: statically choose an “optimal” route for each stream

Hard optimization problem

Need to re-optimize when new queries are defined or system conditions change (e.g. operator selectivity)

TelegraphCQ approach: adaptive per-tuple routing

Push tuples one at a time through the operator graph; choose order of operators at runtime

slide-57
SLIDE 57

Implementing Adaptive Routing

For each tuple, maintain lineage

“What operators has this tuple visited?”

“Which queries can still see this tuple?”

Implication: can’t push down projections

Make routing decisions on the basis of simple run-time statistics
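Per-tuple lineage and statistics-driven routing can be sketched together in a few lines. The Python below is a simplified illustration, not TelegraphCQ's routing policy: operators are modeled as shared filter predicates, the routing rule (“visit the pending operator with the lowest observed pass rate first”) is an assumed stand-in for whatever statistics a real system keeps, and the names `Operator` and `route` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

class Operator:
    """A shared filter that remembers which queries it helps implement."""

    def __init__(self, name: str, predicate: Callable[[Dict], bool],
                 queries: Iterable[str]) -> None:
        self.name = name
        self.predicate = predicate
        self.queries = frozenset(queries)
        self.seen = 0      # run-time statistic: tuples examined
        self.passed = 0    # run-time statistic: tuples that passed

    def pass_rate(self) -> float:
        return self.passed / self.seen if self.seen else 1.0

    def apply(self, tup: Dict) -> bool:
        ok = self.predicate(tup)
        self.seen += 1
        self.passed += ok
        return ok

def route(tup: Dict, operators: List[Operator],
          all_queries: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Route one tuple; return (queries that still see it, operators visited)."""
    visited: Set[str] = set()        # lineage: operators already visited
    live: Set[str] = set(all_queries)  # lineage: queries that can still see it
    while True:
        pending = [op for op in operators
                   if op.name not in visited and op.queries & live]
        if not pending:
            return live, visited
        op = min(pending, key=Operator.pass_rate)   # most selective first
        visited.add(op.name)
        if not op.apply(tup):
            live -= op.queries       # those queries can no longer see the tuple

ops = [Operator("a", lambda t: t["volume"] > 5000, ["q1", "q2"]),
       Operator("b", lambda t: t["symbol"] == "MSFT", ["q1"]),
       Operator("c", lambda t: t["price"] < 10, ["q2"])]
first = route({"symbol": "MSFT", "price": 30.0, "volume": 6000}, ops, ["q1", "q2"])
second = route({"symbol": "IBM", "price": 5.0, "volume": 100}, ops, ["q1", "q2"])
```

The tuple is passed around whole because different queries may still need different columns, which is one way to read the “can’t push down projections” implication. Note also that the second tuple never visits operator b: once no live query depends on an operator, there is no reason to route the tuple there.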

slide-62
SLIDE 62

Handling Overload

Common scenario: peak stream rate >> average stream rate (“bursty”)

The system should cope gracefully

Three alternatives:

1. Spool tuples to disk, process later

But stream rates often exceed disk throughput

2. Drop excess tuples

3. Substitute statistical summaries for dropped stream tuples

Quality of Service (QoS)
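The third alternative can be illustrated with a bounded queue that folds overflow into a tiny summary instead of discarding it outright. This Python sketch rests on assumptions: the class name `SheddingQueue` is hypothetical, and the summary is just a count and a sum, which is enough to keep `count` and `sum` aggregates exact but would not help a query that needs the individual tuples.

```python
from collections import deque
from typing import Deque, List, Tuple

class SheddingQueue:
    """Bounded input queue that summarizes the tuples it has to shed."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.queue: Deque[float] = deque()
        self.dropped_count = 0     # summary: how many values were shed
        self.dropped_sum = 0.0     # summary: their total

    def offer(self, value: float) -> bool:
        """Enqueue `value` if there is room; otherwise fold it into the summary."""
        if len(self.queue) < self.capacity:
            self.queue.append(value)
            return True
        self.dropped_count += 1
        self.dropped_sum += value
        return False

    def drain(self) -> Tuple[List[float], Tuple[int, float]]:
        """Return the queued values plus the summary, then reset both."""
        kept = list(self.queue)
        summary = (self.dropped_count, self.dropped_sum)
        self.queue.clear()
        self.dropped_count, self.dropped_sum = 0, 0.0
        return kept, summary

q = SheddingQueue(capacity=3)
for volume in (10, 20, 30, 40, 50):    # a burst larger than the queue
    q.offer(volume)
kept, (n_dropped, sum_dropped) = q.drain()
total_volume = sum(kept) + sum_dropped  # still accounts for every tuple
```

Richer summaries (histograms, samples, sketches) trade more memory for the ability to approximate more kinds of queries; which trade-off is acceptable is a quality-of-service decision.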

slide-64
SLIDE 64

Open Source DSMS

Esper

DSMS engine written in Java (GPL).

SQL-like stream query language.

http://esper.codehaus.org

TelegraphCQ

Academic prototype from UC Berkeley, based on PostgreSQL 7.3

PostgreSQL’s SQL dialect, plus stream-oriented extensions

BSD licensed; http://telegraph.cs.berkeley.edu

StreamCruncher

DSMS engine written in Java.

Free for commercial use (not open source).

http://www.streamcruncher.com

slide-67
SLIDE 67

Proprietary DSMS

StreamBase

A Stonebraker company. Founded in 2003.

Other Startups

Coral8

Apama (purchased by Progress Software in 2005)

and more . . .

Established Companies

TIBCO BusinessEvents, Oracle BAM

slide-68
SLIDE 68

Truviso

Based on the experience gained from TelegraphCQ

New codebase

Application components:

1. Continuous Query Engine

Modified version of PostgreSQL (currently 8.2.4+)

2. Integration Framework

Connectors, input/output converters, query management

3. Visualization

Closed Series A funding in June 2006

1.0 release will be available Real Soon Now (currently RC3)

Lesson: PostgreSQL is a huge competitive advantage

We’re hiring :-)

slide-71
SLIDE 71

Q & A

Thank You. Any Questions?
