Thomas Heinis* Eleni Tzirita Zacharatou‡ Farhan Tauheed§ Anastasia Ailamaki‡
RUBIK: Efficient Threshold Queries on Massive Time Series
§Oracle Labs, Zurich
*Imperial College London
‡École Polytechnique Fédérale de Lausanne
2
[Figure: voltage vs. time plots; panels: Temporal Resolution, Model Resolution]
3
[Figure: voltage vs. time]
4
[Figure: axes: time series id, voltage, time step]
Trends | Correlation | Opportunity to scale with
Increased simulation duration | Across time | increase in temporal resolution
Increasingly detailed models | Across time series | increase in spatial resolution
5
[Figure: bitmap index example; only the five set bits survive in the extraction]
Timestep values: 17, 9, 5, 2
Bins: 0: [0-5), 1: [5-10), 2: [10-15), 3: [15-20)
Bitmap columns: ≥ 5, ≥ 10, ≥ 15, ≥ 20
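The column headers ≥ 5, ≥ 10, ≥ 15, ≥ 20 suggest a range-encoded (cumulative) bitmap, where each column marks the timesteps whose value reaches that threshold. The following is a minimal sketch under that assumption, using the four values from the slide; the function names `build_range_bitmap` and `threshold_query` are hypothetical and not taken from the presentation.

```python
from typing import Dict, List


def build_range_bitmap(values: List[int], thresholds: List[int]) -> Dict[int, List[int]]:
    """Build one bit column per threshold t: bit i is 1 iff values[i] >= t."""
    return {t: [1 if v >= t else 0 for v in values] for t in thresholds}


def threshold_query(bitmap: Dict[int, List[int]], threshold: int) -> List[int]:
    """Return the timestep positions whose value is >= threshold.

    Assumption: the threshold coincides with one of the indexed bin boundaries,
    so a single column answers the query.
    """
    return [i for i, bit in enumerate(bitmap[threshold]) if bit]


# Values and thresholds as they appear on the slide.
values = [17, 9, 5, 2]
thresholds = [5, 10, 15, 20]
bitmap = build_range_bitmap(values, thresholds)

for t in thresholds:
    print(f">= {t}: {bitmap[t]}")
print("timesteps with value >= 10:", threshold_query(bitmap, 10))
```

Under this encoding the four values produce five set bits in total, and a threshold query on a bin boundary reads a single column.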
8
[Figure: bitmap; only the set bits survive in the extraction]
9
[Figure, built up over slides 9 to 11: bitmap over the dimensions Timestep, Time series, and Bins, with regions labeled Mix, All 0, or All 1]
Node labels, by level:
Mix
All 0, All 1, All 1, Mix
All 0, All 1, Mix, All 0
12
[Figure: bitmap with node labels]
Mix, Mix, All 1, All 1
Mix, All 0, All 1, All 1
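The All 0 / All 1 / Mix labels, four per level, read like a quadtree-style summary of a bitmap: uniform blocks collapse into a single node and only mixed blocks are subdivided. The sketch below illustrates that idea on a small 2D bitmap. It is an assumption drawn from the labels, not the presenters' implementation, and `build` and `lookup` are hypothetical names.

```python
from typing import List, Tuple, Union

# A node is either a uniform leaf ("All 0" / "All 1") or ("Mix", [four children]).
Node = Union[str, Tuple[str, list]]


def build(bitmap: List[List[int]], row: int, col: int, size: int) -> Node:
    """Summarize the size x size block whose top-left corner is (row, col).

    Assumes size is a power of two, so blocks halve cleanly down to single bits.
    """
    bits = {bitmap[r][c] for r in range(row, row + size) for c in range(col, col + size)}
    if bits == {0}:
        return "All 0"
    if bits == {1}:
        return "All 1"
    half = size // 2
    # Child order: top-left, top-right, bottom-left, bottom-right.
    children = [build(bitmap, row + dr, col + dc, half) for dr in (0, half) for dc in (0, half)]
    return ("Mix", children)


def lookup(node: Node, row: int, col: int, size: int) -> int:
    """Read the bit at (row, col) back from the tree without touching the bitmap."""
    if node == "All 0":
        return 0
    if node == "All 1":
        return 1
    half = size // 2
    index = (2 if row >= half else 0) + (1 if col >= half else 0)
    return lookup(node[1][index], row % half, col % half, half)


bitmap = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
tree = build(bitmap, 0, 0, 4)
print(tree)
```

In this example three of the four quadrants are uniform and are stored as a single label each; only the bottom-right quadrant is split further.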
13
In-memory indexes: FastBit (WAH-compressed bitmap index) and RUBIK
Configuration: 128 bins
Hardware: AMD Opteron CPU @ 2.7GHz, 32GB RAM
Time series data: 1000 time steps, 1.2GB – 4.8GB
#queries: 60
[Figure: total execution time (s) vs. # time series (312K, 624K, 1.25M), FastBit vs. RUBIK]
[Figure: index size (MB) vs. # time series (312K, 624K, 1.25M), FastBit vs. RUBIK]
Datasets: 500K – 2M time series, 1024 time steps, 2.1GB – 8.4GB
[Figure: Index Size vs. Dataset Size, in GB, for the small, medium (2x), and large (4x) datasets]
14
Hardware: AMD Opteron, 2.7GHz, 32GB RAM
Benchmark: 60 threshold queries, random thresholds, up to 15% selectivity
Configuration: 128 bins
[Figure: query execution time (s) per dataset (small, medium (2X), large (4X)), split into 2D range query and Filtering; bars annotated 6.7X, 5.8X, 7.5X]
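The split of query time into an index phase and a Filtering phase is consistent with how binned bitmap indexes usually handle thresholds that fall inside a bin: bins entirely above the threshold are accepted from the index alone, and only the boundary bin needs a check against the raw values. The sketch below illustrates that general two-phase pattern. Whether the Filtering bar on the slide corresponds to exactly this step is an assumption, and `binned_threshold_query` is a hypothetical name.

```python
from bisect import bisect_right
from typing import List


def binned_threshold_query(values: List[float], edges: List[float], threshold: float) -> List[int]:
    """Return positions i with values[i] >= threshold.

    edges holds the ascending lower bin edges; bin b covers [edges[b], edges[b + 1]).
    """
    # Index-build step: map every value to its bin number.
    bins = [bisect_right(edges, v) - 1 for v in values]
    # The bin that contains the threshold itself.
    boundary = bisect_right(edges, threshold) - 1

    hits = []
    for i, b in enumerate(bins):
        if b > boundary:
            # The whole bin lies above the threshold: answered by the index alone.
            hits.append(i)
        elif b == boundary and values[i] >= threshold:
            # Candidate in the boundary bin: filtered against the raw value.
            hits.append(i)
    return hits


edges = [0, 5, 10, 15]
values = [17, 9, 5, 2]
print(binned_threshold_query(values, edges, 7))
print(binned_threshold_query(values, edges, 10))
```

With a threshold of 7, the values 9 and 5 share the boundary bin [5-10), so both must be checked against the raw data; with a threshold of 10, no value falls in the boundary bin and the index alone decides.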
15